From: H. S. Lahman on
Responding to Frebe...

>>>>SQL represents a
>>>>solution to persistence access that is designed around a particular
>>>>model of persistence itself.
>>>
>>>Since when is the relational model a "model of persistence"? Can you
>>>provide any pointer showing that the relational model is supposed to be
>>>a "persistence model"?
>>
>>I didn't say that. The RDM is a model of static data so it can be
>>applied to UML Class Diagrams as well. Note that I was careful to say
>>that SQL is a solution to persistence /access/ when the data is
>>represented in RDB form. As you have pointed out elsewhere one could
>>create a RAM-based RDB and use SQL to access it with no persistence.
>
>
> You agree that the relational model is not only about persistence? But
> why is the SQL language limited to only the persistence features of the
> relational model?

As my last sentence indicates, SQL is not limited to persistence.
However, that is probably where 99.99% of the usage lies.

>>>>Try using SQL vs. flat files if you think it is independent of the
>>>>actual storage mechanism. (Actually, you probably could if the flat
>>>>files happened to be normalized to the RDM, but the SQL engine would be
>>>>a doozy and would have to be tailored locally to the files.) SQL
>>>>implements the RDB view of persistence and only the RDB view.
>>>
>>>Yes, the files need to be normalized to the RDM, but why do you make
>>>the conclusion that SQL needs an RDB?
>>
>>SQL requires the data to be in tables and tuples with embedded identity.
>
>
> SQL does not require the data to be in tables. The data may reside in
> flat files or RAM structures. Just because the SQL language uses the
> keyword "table" does not mean that it needs to be backed by a physical
> table. SQL is an interface, remember?

SQL is specifically designed for the RDB implementation paradigm of the
RDM. If you want to use SQL for flat files, those files will have to be
specially formatted (e.g., with embedded identity keys) and normalized.
You could develop a SQL driver that uses file names as table identity
and reads lines via an implied line number as a key, but good luck
dealing correctly with line insertions and deletions without an embedded
key.
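
To see the problem concretely, here is a minimal sketch (Java; the
flat-file "table" and all names are purely illustrative, not any real
driver's API) of a driver that uses the implied line number as the key:

    import java.io.IOException;
    import java.nio.file.Files;
    import java.nio.file.Path;
    import java.util.List;

    public class FlatFileTable {
        // Each line of the file is a "row"; the implied key is the line index.
        static String selectByKey(Path file, int key) throws IOException {
            List<String> rows = Files.readAllLines(file);
            return rows.get(key);
        }

        // Deleting a row shifts the implied key of every subsequent row,
        // silently invalidating any key a caller has already stored.
        static void deleteByKey(Path file, int key) throws IOException {
            List<String> rows = Files.readAllLines(file);
            rows.remove(key);
            Files.write(file, rows);
        }
    }

After deleteByKey(file, 3), a caller still holding key 7 silently reads
what used to be row 8. An embedded key column survives insertions and
deletions; an implied line number does not.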

>>The RDM, when applied in a broader context than Codd, does not require
>>that. SQL also assumes a very specific paradigm for navigating table
>>relationships.
>
>
> I think you are trying to hijack relational theory here. Do you have
> any pointers to this second definition of relational theory?

The RDM is basic set theory. Codd was explicitly dealing with
persistence in a computing environment, so he expressed the rules in
terms of embedded identity attributes (keys). However, the set theory
only requires that each tuple have unique identity. Similarly, Codd was
dealing only with data properties, but there is nothing in the
underlying set theory to preclude behavior properties, so his view
represents a specialization. Thus you will see a discussion of
normalization of Class Models in most standard OOA/D books (e.g.,
"Executable UML" by Mellor and Balcer, p. 77, which happened to be the
first one I pulled off my bookshelf).
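
To illustrate the class-model side of that (a sketch of my own, not an
excerpt from Mellor and Balcer): object identity plays the role of the
tuple's unique identity, so no key attribute is needed, and
normalization shows up as refactoring out repeating groups:

    // Unnormalized: a repeating group embedded in the class
    // (the class-model analog of a 1NF violation).
    class UnnormalizedOrder {
        String[] itemNames;      // parallel arrays standing in for
        int[] itemQuantities;    // repeated attributes
    }

    // Normalized: the repeating group becomes its own abstraction,
    // reached by navigating the relationship rather than through an
    // embedded foreign key.
    class Order {
        private final java.util.List<LineItem> items = new java.util.ArrayList<>();
        void add(LineItem item) { items.add(item); }
        // Identity is the object reference itself; no key attribute needed.
    }

    class LineItem {
        private final String name;
        private final int quantity;
        LineItem(String name, int quantity) {
            this.name = name;
            this.quantity = quantity;
        }
    }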

*************
There is nothing wrong with me that could
not be cured by a capful of Drano.

H. S. Lahman
hsl(a)pathfindermda.com
Pathfinder Solutions -- Put MDA to Work
http://www.pathfindermda.com
blog: http://pathfinderpeople.blogs.com/hslahman
(888)OOA-PATH



From: Patrick May on
"topmind" <topmind(a)technologist.com> writes:
> H. S. Lahman wrote:
> > Go look at an SA/D Data Flow Diagram or a UML Activity Diagram. They
> > express data store access at a high level of abstraction that is
> > independent of the actual storage mechanism. SQL, ISAM, CODASYL, gets,
> > or any other access mechanism, is then an implementation of that generic
> > access specification.
[ . . . ]
> Plus, what such UML models often do is something like:
>
> method getGreenScarvesCostingLessThan100dollars(...) {
>     sql = "select * from products where prod = 'scarves'
>            and color = 'green' and price < 100"
>     return foo.execute(sql)
> }

Please provide a cite to any commercial or open source tool that
creates such code from UML models.

Sincerely,

Patrick

------------------------------------------------------------------------
S P Engineering, Inc. | The experts in large scale distributed OO
| systems design and implementation.
pjm(a)spe.com | (C++, Java, Common Lisp, Jini, CORBA, UML)
From: Patrick May on
"Mikito Harakiri" <mikharakiri_nospaum(a)yahoo.com> writes:
> Patrick May wrote:
> > "topmind" <topmind(a)technologist.com> writes:
> > > Again, I have yet to see an objective or even semi-objective way to
> > > measure "complexity".
> >
> > I suggest you Google for "software complexity" and you'll
> > find several million links. Starting from a page like
> > http://yunus.hun.edu.tr/~sencer/complexity.html will give you
> > pointers to other research if you are genuinely interested in
> > learning.
>
> Hmm. I tried to find "software complexity" in wikipedia and failed.
> Apparently this topic (and the link you supplied) is a typical
> example of junk science.

Apparently your research skills are on a par with your logic.
The point I was making is that there are objective measures of
software complexity. The most appropriate metrics vary depending on
the domain, environment, and other factors, but the metrics themselves
are objective.

This is an area of active research, as you would know had you
done your homework before making spurious claims about "junk science."

Sincerely,

Patrick

------------------------------------------------------------------------
S P Engineering, Inc. | The experts in large scale distributed OO
| systems design and implementation.
pjm(a)spe.com | (C++, Java, Common Lisp, Jini, CORBA, UML)
From: H. S. Lahman on
Responding to Parker...

>>>>Of course it's an implementation! It implements access to physical
>>>>storage.
>>>
>>>That literally doesn't make sense. It's like saying that a Java interface
>>>is an implementation because it implements access to the properties of a
>>>physical instantiation.
>>
>>You have to step up a level in abstraction. Imagine you are a code
>>generator and think of it in terms of invariants and problem space
>>abstraction.
>>
>>The invariant is that all physical storage needs to be accessed in some
>>manner. There are lots of ways to store data and lots of ways to
>>access. Therefore ISAM, SQL, CODASYL, and C's gets all represent
>>specific implementations of access to physical storage that resolve the
>>invariant.
>
>
> It's not SQL that's the implementation, even in this context; it's the
> driver that implements the connectivity to the data provider. Any data
> provider that is capable of supplying data in conformance with the SQL
> data model will do. A typical driver supports connectivity to a

The operative phrase is "in conformance with". SQL reflects a very
narrowly defined model for data representation...

> relational database, but it's common these days for middleware vendors
> to provide drivers that will adapt web service output to SQL, according
> to certain rules. You can get drivers that will adapt flat file
> content to SQL, XML to SQL, and dynamic content from a C++ application
> server to SQL. Why do vendors provide such things? Because SQL is a
> standard querying language that is widely supported by a whole host of
> tools, because it provides a standard interface to data that conforms
> to a particular data model. The physical origin of that data is

Quite so. Standardization is a Good Thing, so using SQL _when the data
model conforms_ can be a good thing. However, outside CRUD/USER
processing the problem solution data model often does not conform. If
the persistence data model conforms but the solution data model does
not, then one needs a conversion between the two views, and I suggest
that the conversion to SQL be encapsulated in an application subsystem.
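
Something like the following sketch shows the sort of encapsulation I
have in mind (Java; Product, ProductStore, and the schema are
hypothetical, chosen only to illustrate the boundary):

    import java.sql.*;
    import java.util.*;

    record Product(String name, String color, double price) {}

    // The solution side of the application depends only on this view...
    interface ProductStore {
        List<Product> findByColorUnderPrice(String color, double limit)
            throws SQLException;
    }

    // ...while the conversion to the SQL view lives in one subsystem class.
    class SqlProductStore implements ProductStore {
        private final Connection conn;
        SqlProductStore(Connection conn) { this.conn = conn; }

        public List<Product> findByColorUnderPrice(String color, double limit)
                throws SQLException {
            String sql = "SELECT name, color, price FROM products"
                       + " WHERE color = ? AND price < ?";
            try (PreparedStatement ps = conn.prepareStatement(sql)) {
                ps.setString(1, color);
                ps.setDouble(2, limit);
                try (ResultSet rs = ps.executeQuery()) {
                    List<Product> result = new ArrayList<>();
                    while (rs.next()) {
                        result.add(new Product(rs.getString("name"),
                                               rs.getString("color"),
                                               rs.getDouble("price")));
                    }
                    return result;
                }
            }
        }
    }

Swapping in a flat-file or XML implementation of ProductStore then
touches nothing outside the subsystem, which is the point of doing the
conversion exactly once at a boundary.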

I also suggest that when neither the persistence data model (e.g., flat
files) nor the solution data model conforms, converting to SQL as an
intermediary is just a waste of time. One would be better off converting
directly between the two views once rather than performing two
conversions. [As far as reuse is concerned, file managers provided that
sort of reuse for flat files in the '70s, and one can provide quite
generic interfaces tailored to an XML model. But don't get me started on
DOMs, which strike me as akin to providing a SQL interface for flat
files.]

> irrelevant. It's not just a question of conveniencing users with a
> syntax that they already know; it's a question of supporting automatic
> binding and reusing existing tooling.
>
> In your code generator example, you could easily have multiple drivers
> supporting SQL that resolve against very different physical data
> sources. (Not to mention XQuery drivers that resolve against the same
> sources.)

But why? Surely when one does not have to maintain the code it makes no
difference how one performs the conversion, so SQL standardization is
irrelevant since the application developer will never see SQL. Since
one has to provide a driver for each paradigm anyway in the
transformation engine, why not provide one that efficiently represents
the paradigm directly?

>>>I think it's fair to say that SQL has, for all its faults, been enormously
>>>successful, to the tune of a multi-multi-billion dollar industry, and that
>>>the UML translationist approach has not. It has been over ten years
>>>since the translationist industry claimed to have solved the problem
>>>of 100 percent translation, but where is it? It's niche; it's nowhere. Other
>>>technologies have arrived, e.g. the W3C XML stack and particularly XSLT
>>>transformation, that dwarf executable UML in application. Why do you think
>>>that is? What do you think it is about software development that makes
>>>executable UML marginal, and other technologies like SQL important?
>>
>>All the world loves a straight man. B-)
>>
>
> :-)
>
>
>>However, the big translation demo lies in CRUD/USER processing. Any
>>time one develops an application using a RAD IDE like Access or Delphi,
>>one is essentially using translation. That's a multi-billion dollar
>>niche that has been around since the '80s.
>
>
> Right, but Access doesn't do this with executable UML. I don't think
> Access would benefit from going in this direction.

It wouldn't. UML is an OOA/D notation and I think OO is overkill in the
CRUD/USER realm precisely because translation automation is already
provided for most of the stuff OO would deal with in that niche via IDEs
like Access. It was just an example of translation at work in a major
industry segment; I was just addressing your assertion that translation
isn't being used. Access is a translation tool; it is just limited to
CRUD/USER processing while UML is a bona fide general purpose 4GL.

>>A fourth reason is the lack of standardization. Until OMG's MDA effort
>>all translation tools were monolithic; modeling, code generation,
>>simulation, and testing were all done in the same tool with proprietary
>>repositories, AALs, and supporting tools. (Prior to UML, they each had
>>unique modeling notations as well.) That effectively marries the shop
>>to a specific vendor. In '95 Pathfinder was the first company to
>>provide plug & play tools that would work with other vendor's drawing
>>tools. MDA has changed that in the '00s so now plug & play is a reality.
>>
>
> I grant that we are moving towards a service-based world, with
> graphical interfaces playing a role in the assembly and orchestration
> of services, but I see no evidence that the glue is going to be
> executable UML. I don't think that many people care about what the OMG
> is doing anymore. Events have overtaken them. I don't think MDA is
> very important. There are other evolving standards for plug and play.

Plug & play is an issue for supporting tools. I see monolithic
I-am-the-development-environment tools as being a bar to automation in
general and translation in particular, which was the issue here. MDA
has been very helpful in bringing in the necessary conceptual
standardization.

Meanwhile eUML is an OO design methodology. An eUML OOA model built for
translation should be indistinguishable from an OOA model built for
traditional elaboration. [They usually are distinguishable because
elaboration OOA models are typically done in a very sloppy manner,
since errors, missing processing, etc. can be "fixed later". You can't
get away with that sort of sloppiness in translation because the code
generator does what you say, not what you meant.]

Translation is just a merger of a rigorous OO design methodology with a
suite of automation tools for the computing space.

I think MDA is very important today. Entire highly-tailorable
development environments like Eclipse have been enabled by it. I agree
it doesn't directly matter a whole lot to application developers because
they still do OOA/D the same way. [Using eUML if they are serious about
doing OOA properly. B-)] But it has a great effect on the tools that
make their lives easier. Standardization breeds plug & play and that
fosters competition and economies of scale through specialization. In
the end developers will be much better off for MDA regardless of whether
they use translation or elaboration. (Note that most of the traditional
round-trip elaboration vendors are Major Players in the MDA initiative.)

Translation just takes that further so that the application developer
never has to worry about details like SQL, EJB, XML, TCP/IP, or a host
of other deterministically defined computing space technologies and
techniques. Those things are completely transparent to the
translationist. Not having to think about stuff like that when solving
the customer's problem improves productivity a great deal.

One final note. Major software houses have been buying traditional
translation vendors for the past 3-4 years to establish strategic
positions in translation. (Some, like IBM/Rational/ObjectTime, are
two-tiered purchases!) Project Technology, Mellor's firm that started
it all, was bought just last year. IBM, MS, CA, Mentor et al. are all
getting into translation. I believe Pathfinder and Kennedy-Carter are
the only two pure translation vendors that are still independent. So
the Deep Pockets Guys seem to think there is a future there. B-)


*************
There is nothing wrong with me that could
not be cured by a capful of Drano.

H. S. Lahman
hsl(a)pathfindermda.com
Pathfinder Solutions -- Put MDA to Work
http://www.pathfindermda.com
blog: http://pathfinderpeople.blogs.com/hslahman
(888)OOA-PATH



From: Patrick May on
"topmind" <topmind(a)technologist.com> writes:
> > The ability to model behavior as well as data makes general
> > purpose languages better able to model the problem domain than is
> > SQL.
>
> If you design right, you can *shift* much behavior to being data and
> DB operations instead.

Depending on the requirements, some functionality can be
implemented using set operations, certainly. "Much" is pushing it,
especially when one limits oneself to non-gratuitous use of those
operations.
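
For instance (a hedged sketch against a hypothetical accounts table), an
overdraft fee can be shifted from a row-at-a-time loop into a single set
operation:

    import java.sql.*;

    class OverdraftFees {
        static final double OVERDRAFT_FEE = 25.00;

        // One set-oriented statement replaces the per-object loop:
        // every overdrawn account is charged in a single operation.
        static int chargeOverdrawnAccounts(Connection conn) throws SQLException {
            String sql = "UPDATE accounts SET balance = balance - ?"
                       + " WHERE balance < 0";
            try (PreparedStatement ps = conn.prepareStatement(sql)) {
                ps.setDouble(1, OVERDRAFT_FEE);
                return ps.executeUpdate();  // number of accounts charged
            }
        }
    }

That works nicely when the behavior is a uniform transformation over a
set; the disagreement is over how much real behavior actually fits that
shape.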

> SQL is close to being Turing Complete.

In other words, SQL is not Turing complete. That addresses your
original question:

> > > But how are tables less close to the domain than classes,
> > > methods, and attributes?

We're done with that one.

> Plus, OO is usually crappy at modeling behavior, at least in the biz
> domain. OO is only nice when things split up into nice hierarchical
> taxonomies. Most things don't in reality, so what is left is a mess.

You've been challenged on this assertion in the past and failed
to defend it. The history is available via Google for anyone to see.
Unless you've got more to back up your nonsense than you did before,
repeating this is intellectually dishonest.

> > > > Proliferation of get/set methods is a code smell.
> > > > Immutable objects are to be preferred.
> > >
> > > This is not the consensus in the OO community.
> >
> > Yes, it is. Josh Bloch recommends immutability explicitly in
> > "Effective Java" and gives solid reasons for his position.
> > Proliferation of getters and setters violates encapsulation, one
> > of the defining characteristics of object technology. Some
> > research will show you that OO designs focus on behavior, not
> > state. You should also check out the Law of Demeter and similar
> > guidelines that provide further evidence that excessive use of
> > accessors and mutators is not good OO form.
>
> Almost all of these have a fair amount of disagreement among OO
> proponents. Check out c2.com.

Interesting. I provide explicit examples of what are generally
accepted as good OO principles and practices and you refer to a random
website. If you have real documentation of getter/setter
proliferation being an accepted OO technique, produce it.
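
For reference, the contrast Bloch draws looks roughly like this (my own
sketch, not code from "Effective Java"):

    // Getter/setter proliferation: state leaks out and invariants
    // cannot be enforced, which is what violates encapsulation.
    class MutablePoint {
        private double x, y;
        public double getX() { return x; }
        public void setX(double x) { this.x = x; }
        public double getY() { return y; }
        public void setY(double y) { this.y = y; }
    }

    // Immutable alternative: state is fixed at construction and the
    // interface exposes behavior rather than raw state.
    final class Point {
        private final double x, y;
        Point(double x, double y) { this.x = x; this.y = y; }
        Point translate(double dx, double dy) { return new Point(x + dx, y + dy); }
        double distanceTo(Point other) { return Math.hypot(x - other.x, y - other.y); }
    }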

> > > I would note that a lot of the issues you mentioned, such as
> > > performance, scalability, resiliency, and recoverability can be
> > > obtained by purchasing a "big-iron" RDBMS such as Oracle or
> > > DB2. The configuration and management of those issues is then
> > > almost a commodity skill and not as tied to the domain as a
> > > roll-your-own solution would be (which OO'ers tend to do).
> >
> > It is statements like this that strongly suggest that you
> > have never developed a large, complex system.
>
> No, because I break them up into pieces so that they don't grow to
> be one big fat EXE. The Big-EXE methodology has a high failure rate.

Modularity is not exclusive to imperative programming. It is
also not the silver bullet that slays the complexity lycanthrope.

> > The vast majority of businesses that need systems of this
> > complexity have legacy software consisting of a number of COTS
> > applications and custom components, none of which were designed to
> > work with each other. These have been selected or developed for
> > good business reasons and cannot be aggregated and run on a single
> > piece of kit, no matter how large.
>
> Agreed, but what does the existence of legacy apps have to do with
> your scaling claim?

The existence of legacy systems is just one reason why your
suggestion of using '"big-iron" RDBMS such as Oracle or DB2' cannot
solve the complex problems of large organizations.

> I did not mean to suggest that everything should be centralized.

That is what you suggested above.

> It depends on the kind of business. If the vast majority of
> operations/tasks needed are per location, then regional partitioning
> works well. If not, then a more centralized approach is needed. For
> example, airline reservation and scheduling systems would make a
> poor candidate to partition by location because of the
> interconnectedness of flights. However, individual stores in a big
> franchise can operate most independently.

Your lack of experience with large, complex systems is showing,
again. Basically you're suggesting one or more monolithic processing
hubs -- the simple CRUD/USER stuff you're used to writ large. Are
those the only kinds of problems you're used to?

> > > "Security" is mostly just massive ACL tables.
> >
> > That is profoundly . . . naive. I strongly urge you to read
> > everything you can find by Bruce Schneier, join the cryptography
> > mailing list run by Perry Metzger, and not say another word about
> > security until you understand why your statement is so deeply
> > embarrassing to you. For a quick, very small taste of why ACL
> > tables don't even begin to scratch the surface of the problem,
> > read http://www.isi.edu/gost/brian/security/kerberos.html.
>
> I see no mention of ACL's there.

That's my point.

> If you have a specific case of ACL's crashing and burning, post it
> here and let's take a look at it. (Note that there are a lot of
> variations of ACL's, so a flaw in one kind is not necessarily a
> general flaw in ACL concepts.)

I never claimed that ACL's crash and burn. I said that ACLs
barely scratch the surface of the security requirements of a large
distributed system. Clearly you don't have any experience with such.

> Again, use evidence instead of patronizing insults. It is a bad
> habit of yours.

I can see how my habits of calling bullshit when I smell it and
not suffering fools gladly would be considered "bad" by someone with
your proclivities. That doesn't change the fact that your claim that
"massive ACL tables" address the security requirements of large
distributed systems is ridiculous on its face. Patronization is the
best you can expect when you spew nonsense like that.

> > CRUD applications are, however, not particularly complex as
> > software systems go. Your claims otherwise indicate a lack of
> > experience with anything else.
>
> Again, please use evidence to prove me wrong instead of patronizing
> insults. It is a bad habit of yours.

How is that patronizing? It's a simple statement of fact. There
is a reason why the CRUD work is typically given to new hires and
junior developers.

> Propose a way to measure "complexity", and then apply it to CRUD
> apps. That is how you make a point. Anecdotes and private personal
> opinions mean very little here. They are a dime a dozen.

If you're seriously suggesting that CRUD applications are equal
in complexity to compilers, telco OSS/BSS, MRP/ERP, or risk analytics,
just to pull a few examples off the top of my head, then that reflects
more on your experience than on the veracity of your claim.

> > On the other hand, there are some delightfully complex
> > software systems that consist of only a few hundred lines of code.
> > Functional languages seem especially good for this. See one of
> > Peter Norvig's books for a few examples.
>
> Most FP demonstrations of such are "toy" or "lab" examples.

Dismissing out of hand systems of which you know nothing. That's
a bad habit of yours.

I'd be tempted to ascribe your apparent need to continue
discussions ad nauseam to some form of obsessive-compulsive disorder,
but I don't have a background in psychology so I won't. You should
try that not-talking-about-things-you-know-nothing-about approach
sometime.

Sincerely,

Patrick

------------------------------------------------------------------------
S P Engineering, Inc. | The experts in large scale distributed OO
| systems design and implementation.
pjm(a)spe.com | (C++, Java, Common Lisp, Jini, CORBA, UML)