From: mAsterdam on
Patrick May wrote:
> mAsterdam writes:
>> Patrick May wrote:
>>> If the application implements the solution using a DAG, for
>>> example, and the state of each node in the DAG is stored in a
>>> relational database, then a mapping between the two models must
>>> take place.
>> The data is regrouped. Not a huge decoupling, but yes, more than
>> just aliasing.
>
> It's more than regrouping. A different set of behaviors (graph
> traversal, etc.) are being supported explicitly.

From the behaviour POV (point-of-view) there is some
decoupling (effort /and/ gain), but it is completely
done within the behavioural realm, not at the border
(the schema). The set of behaviours is not relevant to
the data itself to begin with, so from the data POV
no decoupling is gained at all.
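
As a minimal sketch of the mapping being discussed, the example below loads
node states from a relational store into an in-memory DAG and runs a
traversal over it. The table and column names (node, edge, id, state, src,
dst) and the helper functions are assumptions made up for illustration;
nothing in the thread specifies them.

```python
import sqlite3
from collections import defaultdict

# Hypothetical schema: all table and column names are assumptions.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE node (id INTEGER PRIMARY KEY, state TEXT NOT NULL);
    CREATE TABLE edge (src INTEGER REFERENCES node(id),
                       dst INTEGER REFERENCES node(id));
    INSERT INTO node VALUES (1, 'ready'), (2, 'blocked'), (3, 'blocked');
    INSERT INTO edge VALUES (1, 2), (1, 3), (2, 3);
""")

def load_dag(conn):
    """Map the relational representation onto an adjacency list."""
    states = dict(conn.execute("SELECT id, state FROM node"))
    children = defaultdict(list)
    for src, dst in conn.execute("SELECT src, dst FROM edge ORDER BY src, dst"):
        children[src].append(dst)
    return states, children

def topological_order(states, children):
    """Graph behaviour that exists only in the application (Kahn's algorithm)."""
    indegree = {n: 0 for n in states}
    for dsts in children.values():
        for d in dsts:
            indegree[d] += 1
    ready = sorted(n for n, deg in indegree.items() if deg == 0)
    order = []
    while ready:
        n = ready.pop(0)
        order.append(n)
        for d in children.get(n, []):
            indegree[d] -= 1
            if indegree[d] == 0:
                ready.append(d)
    return order

states, children = load_dag(conn)
order = topological_order(states, children)
```

The traversal is added entirely on the application side, while the rows in
node and edge are the same data before and after the mapping.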

>> Barring design flaws, your example deals with two applications
>> using the same data: the one that deals with the individual nodes
>> and the one that deals with the DAG as a whole. Should changes in
>> the individual node states outside the influence of the current
>> application be reflected in it? Not a big decoupling, yet there is
>> already a price-tag.
>
> Depending on the transactional context, it may be necessary to
> update the application state based on database changes.

We already have more specific knowledge: if (and only if)
the change is relevant to subsequent actions within the context of
this instance of the running application, the application state
needs to reflect the mutations of the data in the database.
In that sense the application only borrows the data.
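
One way to picture "borrowing" is an application object that re-reads its
copy only when the stored row has changed. This is a rough sketch under
assumptions: the version column, the BorrowedNode class and its method names
are hypothetical and are not taken from the thread.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table; the version counter is an assumed convention.
conn.execute(
    "CREATE TABLE node (id INTEGER PRIMARY KEY, state TEXT, version INTEGER)"
)
conn.execute("INSERT INTO node VALUES (1, 'blocked', 1)")

class BorrowedNode:
    """Application-side copy of one row, refreshed only when it is stale."""

    def __init__(self, conn, node_id):
        self._conn = conn
        self.node_id = node_id
        self.reloads = 0
        self._load()

    def _load(self):
        self.state, self.version = self._conn.execute(
            "SELECT state, version FROM node WHERE id = ?", (self.node_id,)
        ).fetchone()
        self.reloads += 1

    def refresh_if_stale(self):
        (current,) = self._conn.execute(
            "SELECT version FROM node WHERE id = ?", (self.node_id,)
        ).fetchone()
        if current != self.version:
            self._load()

node = BorrowedNode(conn, 1)
node.refresh_if_stale()  # nothing changed, so the copy is kept

# A change made outside this application instance.
conn.execute("UPDATE node SET state = 'ready', version = version + 1 WHERE id = 1")
node.refresh_if_stale()  # the copy is stale, so it is reloaded
```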

> Even if it
> is, there is a large benefit to having a representation that is more
> expressive in the context of the application's solution domain.
>
>>> If the application changes the way it uses its internal model, that
>>> shouldn't have an impact on the schema.
>> There is no "should" or "shouldn't" here.
>
> Actually, there is. Managing dependencies is essential to
> creating maintainable software.

As you might have suspected by now, we are in complete agreement
on the general statement. What I am arguing is that there is a
limit to the level of decoupling achievable. It is impossible
to decouple from the data itself at the application level. Let me
hasten to add that it /is/ possible to use libraries (consider e.g.
data-structure organized libraries, amongst which are class
libraries), including those specifically built for use in this
application, within the application.

>> As long as the application uses the same data, the change won't
>> impact the schema, and if there are changes in the data
>> requirements, it will. No design approach or tool-stack will help
>> you avoid the inevitable.
>
> True. However, the application and the database schema change at
> different rates and for different reasons. That's a good reason to
> decouple them.

Yes, no argument here. Please acknowledge the limits.

It is predictable which changes necessarily impact both
schema and application: the ones driven by changes in the
data-requirements.

>>>>> application can be decoupled from the schema. You are suggesting
>>>>> using views to do so. That's one possible mechanism. OO
>>>>> languages provide others.
>>>> Which?
>>> ORM tools, DAOs, and caches, for example.
>> These provide decoupling in names, shape and actuality of the data.
>> I would not call that decoupling from the schema - the schema still
>> describes the data itself. Is that the whole problem? Labeling
>> something as decoupling or not?
>
> Those techniques and general patterns like Dependency Inversion
> allow both the application implementation and the database
> implementation to be completely replaced without impact to the other
> component. That's decoupling.

From implementation, not from the schema.
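
The distinction can be sketched with a small DAO behind an abstraction, in
the spirit of Dependency Inversion. Everything named here (NodeState,
NodeStore, the two store classes, unblock, the node table) is a hypothetical
illustration rather than anything either poster wrote.

```python
import sqlite3
from dataclasses import dataclass
from typing import Protocol

@dataclass(frozen=True)
class NodeState:
    """The data the application needs, whatever holds it."""
    node_id: int
    state: str

class NodeStore(Protocol):
    """Abstraction the application depends on (names are assumptions)."""
    def get(self, node_id: int) -> NodeState: ...
    def put(self, node: NodeState) -> None: ...

class InMemoryNodeStore:
    def __init__(self):
        self._rows = {}

    def get(self, node_id):
        return self._rows[node_id]

    def put(self, node):
        self._rows[node.node_id] = node

class SqliteNodeStore:
    def __init__(self, conn):
        self._conn = conn
        conn.execute(
            "CREATE TABLE IF NOT EXISTS node "
            "(id INTEGER PRIMARY KEY, state TEXT NOT NULL)"
        )

    def get(self, node_id):
        row = self._conn.execute(
            "SELECT id, state FROM node WHERE id = ?", (node_id,)
        ).fetchone()
        if row is None:
            raise KeyError(node_id)
        return NodeState(*row)

    def put(self, node):
        self._conn.execute(
            "INSERT OR REPLACE INTO node (id, state) VALUES (?, ?)",
            (node.node_id, node.state),
        )

def unblock(store: NodeStore, node_id: int) -> NodeState:
    """Application logic: knows only the NodeStore abstraction."""
    store.put(NodeState(node_id, "ready"))
    return store.get(node_id)

stores = [InMemoryNodeStore(), SqliteNodeStore(sqlite3.connect(":memory:"))]
results = [unblock(store, 7) for store in stores]
```

Either store can be replaced without touching unblock, which is decoupling
from the implementation. Both stores still carry node_id and state, so a new
data requirement would reach NodeState, every store, and the table alike.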


--
What you see depends on where you stand.
From: Patrick May on
"Brian Selzer" <brian(a)selzer-software.com> writes:
> "Patrick May" <pjm(a)spe.com> wrote in message news:m27ifp5yyt.fsf(a)spe.com...
>> Ah, you may have identified the source of our miscommunication
>> here. We seem to be using "schema" differently, despite both of us
>> repeatedly attempting to clarify. Let's consider first an
>> application that doesn't use a relational database at all. This
>> application still has a schema in the sense of the data that it
>> uses internally to support the behaviors it exhibits. Are we in
>> agreement so far?
>
> I wouldn't put it that way, but I think I understand what you mean.
>
>> The implementation of this internal schema may change. For a
>> simple example, an array may be changed to a queue or a stack. In
>> this case the same data is being stored, but the behavior of the
>> data structure is different. A more complex example would be
>> changing a data member to a computation or vice versa. In these
>> scenarios, the application logic cannot be decoupled from the
>> schema representation.
>
> The potential information content, however, may not be different.
> Whether you use an array, or a queue, or a linked list, or a doubly
> linked list, or a binary tree, the information that populates those
> structures may be exactly the same information, regardless of how it
> is laid out.

I'm not positive that this is the case when considering only
data, because some information may be either generated on demand
(possibly with reference to components outside the scope of the
system) or stored locally. Your equivalence also ignores the
capabilities of the structures themselves. To a first approximation,
though, I see your point.
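
A small sketch of both halves of this exchange: the same items placed in two
structures carry the same information, yet the structures offer different
capabilities. The variable names are illustrative assumptions.

```python
from collections import deque

items = ["a", "b", "c"]   # the information content

stack = list(items)       # last in, first out
queue = deque(items)      # first in, first out

# Before anything is removed, both structures hold the same information.
same_content = list(stack) == list(queue)

# The behaviour differs: each structure hands back a different item first.
first_out_of_stack = stack.pop()
first_out_of_queue = queue.popleft()
```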

>> Now consider the same application modified to use a relational
>> database. The application logic continues to use the same data
>> structures (stacks, queues, DAGs, etc.) and classes (we'll assume
>> it's an OO application, since this is comp.object) it was using
>> before. The implementation of the schema in the relational
>> database presents a particular interface. This interface is based
>> on relations. Some form of mapping must take place to convert the
>> data provided via that interface into the model used internally by
>> the application.
>
> This is where we diverge. The data is the same data whether it is
> represented as a relation or as a binary tree. It is not the data
> that is subject to conversion, but rather its representation. This
> may seem like splitting hairs, but the difference is in my opinion
> critical.

Split away. I agree that you've identified the difference.
Imagine that, constructive conversation on Usenet. ;-)

> A schema specifies what is to be and can be recorded. It also
> suggests a structure, but that structure implies and is implied by a
> set of constraints that describes what the data is rather than how
> the data is laid out. That implied structure may differ
> significantly from what is optimal for a particular application,
> which is concerned less with what the data is than with how it can
> be used.

Our definitions of "schema" differ. I do include the structure
as well as the data in my definition. That's not particularly
important, however, because it's easy enough to rephrase the original
point of contention. Would you agree that the data+structure used by
an application can be decoupled from the data+structure exposed by a
relational database?

Regards,

Patrick

------------------------------------------------------------------------
S P Engineering, Inc. | Large scale, mission-critical, distributed OO
| systems design and implementation.
pjm(a)spe.com | (C++, Java, Common Lisp, Jini, middleware, SOA)
From: Patrick May on
mAsterdam <mAsterdam(a)vrijdag.org> writes:
> Patrick May wrote:
>>>> If the application implements the solution using a DAG, for
>>>> example, and the state of each node in the DAG is stored in a
>>>> relational database, then a mapping between the two models must
>>>> take place.
>>>
>>> The data is regrouped. Not a huge decoupling, but yes, more than
>>> just aliasing.
>>
>> It's more than regrouping. A different set of behaviors (graph
>> traversal, etc.) are being supported explicitly.
>
> From the behaviour POV (point-of-view) there is some decoupling
> (effort /and/ gain), but it is completely done within the
> behavioural realm, not at the border (the schema). The set of
> behaviours is not relevant to the data itself to begin with, so from
> the data POV no decoupling is gained at all.

I think we might have the same disconnect that Mr. Selzer
identified. I am using the word "schema" to refer to both the data
and the data structures that hold it. Are you using that word to
refer to the data only?

>>>> If the application changes the way it uses its internal model,
>>>> that shouldn't have an impact on the schema.
>>>
>>> There is no "should" or "shouldn't" here.
>>
>> Actually, there is. Managing dependencies is essential to
>> creating maintainable software.
>
> As you might have suspected by now, we are in complete agreement on
> the general statement. What I am arguing is that there is a limit to
> the level of decoupling achievable. It is impossible to decouple
> from the data itself at the application level. Let me hasten to add
> that it /is/ possible to use libraries (consider
> e.g. data-structure organized libraries, amongst which are class
> libraries), including those specifically built for use in this
> application, within the application.

I believe we're mostly in agreement. The decoupling I'm talking
about is in terms of the data structures, not the data. There can, of
course, be other sources of data than a relational database.

>>> As long as the application uses the same data, the change won't
>>> impact the schema, and if there are changes in the data
>>> requirements, it will. No design approach or tool-stack will help
>>> you avoid the inevitable.
>>
>> True. However, the application and the database schema change
>> at different rates and for different reasons. That's a good reason
>> to decouple them.
>
> Yes, no argument here. Please acknowledge the limits.
>
> It is predictable which changes necessarily impact both schema and
> application: the ones driven by changes in the data-requirements.

Data requirements of the application, not of other applications
that may use the same database.

>> Those techniques and general patterns like Dependency
>> Inversion allow both the application implementation and the
>> database implementation to be completely replaced without impact to
>> the other component. That's decoupling.
>
> From implementation, not from the schema.

It's decoupling between the two representations that hold
overlapping subsets of the same data.

Sincerely,

Patrick

From: mAsterdam on
Patrick May schreef:
> mAsterdam writes:
>> Patrick May wrote:
>>>>> If the application implements the solution using a DAG, for
>>>>> example, and the state of each node in the DAG is stored in a
>>>>> relational database, then a mapping between the two models must
>>>>> take place.
>>>> The data is regrouped. Not a huge decoupling, but yes, more than
>>>> just aliasing.
>>> It's more than regrouping. A different set of behaviors (graph
>>> traversal, etc.) are being supported explicitly.
>> From the behaviour POV (point-of-view) there is some decoupling
>> (effort /and/ gain), but it is completely done within the
>> behavioural realm, not at the border (the schema). The set of
>> behaviours is not relevant to the data itself to begin with, so from
>> the data POV no decoupling is gained at all.
>
> I think we might have the same disconnect that Mr. Selzer
> identified. I am using the word "schema" to refer to both the data
> and the data structures that hold it. Are you using that word to
> refer to the data only?

That is a loaded question. The schema, as seen from the application,
describes data. It does use structures to do so.
The structures holding the data, however, are invisible
to the application.

>>>>> If the application changes the way it uses its internal model,
>>>>> that shouldn't have an impact on the schema.
>>>> There is no "should" or "shouldn't" here.
>>> Actually, there is. Managing dependencies is essential to
>>> creating maintainable software.
>> As you might have suspected by now, we are in complete agreement on
>> the general statement. What I am arguing is that there is a limit to
>> the level of decoupling achievable. It is impossible to decouple
>> from the data itself at the application level. Let me hasten to add
>> that it /is/ possible to use libraries (consider
>> e.g. data-structure organized libraries, amongst which are class
>> libraries), including those specifically built for use in this
>> application, within the application.
>
> I believe we're mostly in agreement. The decoupling I'm talking
> about is in terms of the data structures, not the data. There can, of
> course, be other sources of data than a relational database.

Sure. However, they are irrelevant to the topic.
So - no decoupling from the data, needed by the application,
as described by (the application-relevant subset of) the schema.

>>>> As long as the application uses the same data, the change won't
>>>> impact the schema, and if there are changes in the data
>>>> requirements, it will. No design approach or tool-stack will help
>>>> you avoid the inevitable.
>>> True. However, the application and the database schema change
>>> at different rates and for different reasons. That's a good reason
>>> to decouple them.
>> Yes, no argument here. Please acknowledge the limits.
>>
>> It is predictable which changes necessarily impact both schema and
>> application: the ones driven by changes in the data-requirements.
>
> Data requirements of the application, not of other applications
> that may use the same database.

Yes, exactly: changes in the data-requirements of the application.
Again: no approach and no tool-stack influences that impact.

>>> Those techniques and general patterns like Dependency
>>> Inversion allow both the application implementation and the
>>> database implementation to be completely replaced without impact to
>>> the other component. That's decoupling.
>> From implementation, not from the schema.
>
> It's decoupling between the two representations that hold
> overlapping subsets of the same data.

It looks like you are reluctant (less strongly than Dmitry,
though) to acknowledge application-independent and
representation-independent existence, meaning and value of data.

From: topmind on


mAsterdam wrote:
> Patrick May schreef:
> > mAsterdam writes:
> >> Patrick May wrote:
> >>>>> If the application implements the solution using a DAG, for
> >>>>> example, and the state of each node in the DAG is stored in a
> >>>>> relational database, then a mapping between the two models must
> >>>>> take place.
> >>>> The data is regrouped. Not a huge decoupling, but yes, more than
> >>>> just aliasing.
> >>> It's more than regrouping. A different set of behaviors (graph
> >>> traversal, etc.) are being supported explicitly.
> >> From the behaviour POV (point-of-view) there is some decoupling
> >> (effort /and/ gain), but it is completely done within the
> >> behavioural realm, not at the border (the schema). The set of
> >> behaviours is not relevant to the data itself to begin with, so from
> >> the data POV no decoupling is gained at all.
> >
> > I think we might have the same disconnect that Mr. Selzer
> > identified. I am using the word "schema" to refer to both the data
> > and the data structures that hold it. Are you using that word to
> > refer to the data only?
>
> That is a loaded question. The schema, as seen from the application,
> describes data. It does use structures to do so.
> The structures holding the data, however, are invisible
> to the application.

It's a matter of interpretation. It can be seen in terms of an
interface, which is a set of conventions. One can also think of it as
a physical "structure", but that is not the only way.

The issue is that OO'ers often don't like the relational "interface"
because it is not OOP, and thus want to wrap it with an interface of
their own. It is *not* "hiding the hardware layer", though, as some
OO'ers claim or suggest, because RDBMS are not hardware, nor "low
level".

In my opinion, wrapping the RDBMS all the time is usually not a good
thing because it merely translates between paradigms rather than adds
anything real. It just complicates the project to translate back and
forth between two high-level conventions, like a butler carrying hand-
written messages back and forth between a fighting couple.

-T-