From: Alfredo Novoa on
Hi,

Christian Brunschen wote:

>The relational model, and SQL, were developed specifically for persistent
>databases.

Wrong, the Relational model was created for databases, persistent or
not.

On the other hand persistence is something relative.

>I agree that code does some things better and DB other things, and one
>uses them *together* in a Yin-Yang fashion. They compliment each other.

DBMS are for data management (business rules) and code application is
for presentation and communication.

>Relational Databases and SQL are tools for the specific task of storing,
>accessing, modifying data

And to enforce data integrity and to derive new data (calculations).

> - they are single-purpose tools

Indeed, its purpose is data management, any data management. Thats why
they are called Database Management Systems.

>A 'general-purpose' programming language is
>one that allows one to write solutions to essentially arbitrary problems
>using it, possibly with some specific exceptions (such as, 'python is a
>general-purpose programming language, but due to its interpreted nature,
>it shouldn't be used for writing interrupt handlers').

In the same way, general purpose programming languages should not be
used for data management (business rules included) because there are
far better tools for this called DBMS.

>One thing to remember is that a RDBMS does _not_ do _anything_ that one
>can't do in code on one's own

Do you mean something like to show windows controls?

> - they are essentially just a pre-written
>library, with a little domain-specific language as part of its interface -
>whereas on the other hand, _most_ of the things you can fo in _code_,
>_cannot_ be done in an RDBMS.

Nonsenses. DBMS must be extendable. You can do any data processing with
a DBMS

>Procedural, funtional, object-oriented languages are all _general-purpose_
>tools for programming computers, for writing essentially arbitrary
>programs.

OO languages are procedural or functional.

> SQL _isn't_. If it were, then why do database vendors create
>languages to extend or 'hook into' the database (Oracle's PL/SQL), or

PL/SQL is a replacement for PSM, the procedural part of ANSI SQL. ANSI
SQL is also Turing complete.

>And again, 3GL can be used to _write_ RDBMS; the converse is _not_ true.

The converse is also true. I could write a compiler for a 3GL in ANSI
SQL or Oracle SQL


Regards
Alfredo

From: Alfredo Novoa on
>One could point to Turing completeness, but clearly, it isn't a real,
>immediate loss. Completeness is rather a precondition. It does not imply
>anything. If my application area does not require something a Turing
>machine can, then I don't care.

ANSI SQL is Turing complente, like most SQL dialects.


Regards

From: Alfredo Novoa on
Christian Brunschen wrote:

>Well, the features of a relational database are primarily that it allows
>you to store data, organised as rows and columns in tables according (more
>or less) to the relational model, and and it does so in a persistent
>manner (what you put into the database isn't going to disappear unless you
>explicitly remove it).

A relational database does not allow to store data because it is a data
set.

The features of a Relational Database Management System are primarily
that it allows to manage data enforcing its integrity.

>If we ignore the persistence aspect, what remains is the organization of
>data according to the relational model.

What remains is the data management capabilities. Almost all.

>But the filesystem remains a persistence mechanism, even though it has
>been used for its 'size' aspect rather than its 'persistence' aspect. So,
>the mere fact that you can use a database in a non-persistent manner
>doesn't make it any less of a persistence mechanism.

A DBMS is not a persistence mechanism, but it might use different
persistence mechanisms like file systems or RAM managers.

>Interesting assertion - do you have anything to back it up with? From
>everything that I have read, it has been extremely clear that he
>relational model was developed for _persistent_ databases, not for
>_transint_ ones.

What did you readed?

>Well, the potential usage of relational databases is the storage,
>organization and access to (persistent or non-persistent) data. that still
>doesn't solve the _vast_ majority of problems out there, because you
>usually have to _do somethin_ with the data (process it somehow), which
>SQL doesn't do.

This statement shows profound ignorance. SQL is intended to manage
data, not to store it.
We can apply any transformation we can imagine (among other things) to
data using SQL .

Regards
Alfredo

From: Alfredo Novoa on

>Well, what about extensible DBMS engines, where you can add new type
>definitions?

A non extensible DBMS does not deserve to be called DBMS :-)

Regards

From: Christian Brunschen on
In article <1138187413.103226.20080(a)g43g2000cwa.googlegroups.com>,
frebe <fredrik_bertilsson(a)passagen.se> wrote:
>> Transactions, as offered by RDBMS:es, are limited to the data stored
>witin the RDBMS,
>> so that if you want to use its transaction capability, you need to store the
>> appropriate data in the RDBMS.
>
>Many RDBMS vendors supports distributed transactions (like XA). Other
>resources, such as messages may also be part of the same transaction.

But those aren't part of what the _RDBMS_ offers.

>The component that controls the transaction is indeed outside the
>RDBMS,

.... as will all the other transaction participants that are not about the
data in the RDBMS itself ...

>but the RDBMS is able to participate in transaction, in opposite
>to a file system.

You could certainly add transctions to a file system, just like they were
once added to RDBMS:es.

>> but if you wanted to use one or maybe two of them _without_ wanting to use its
>> data storage model
>Which other data storage model do you have in mind? XML files? Flat
>files? In most enterprise scenarios these kind of low-level storage
>models is simply not enough.

What I was referring to above was that if, say, you needed transactions in
your application, but didn't need persistence, or queries, or your data
organized in a relational fashion, or referential integrity (because your
transactions involve only computations perhaps), then an RDBMS doesn't
help you at all. So, RDBMS:es only offer transaction support _for things
within their domain_ - i.e., they offer transactions as an addition to
their data storage model, rather than as a general tool for you to use.

Also, since 'data storage model' doesn't necessary have to refer to
persistent storage, a suitable storage model might be one of in-memory
records with pointers between them, possibly in sets, maps, arrays, or
similar; of, if you wish, objects with references to other objects and
collections thereof.

>> both queries, transactions and referential integrity
>> are just as applicable to persistent as to non-persistent data,
>Exactly my point. You need queries even if you don't need persistence.
>A RDBMS may be useful even if you don't have any persistence needs.

True - but only if you are working with a RDBMS-like data model to start
with.

>> So, if all you want is transactions, an RDBMS probably shouldn't be the first
>> place to go.
>So, where should I go?

That I don't know; I do know that J2EE, for all its many faults, does also
include transactions in things like JMS (for messaging), so that might be
a starting point.

>> What precisely is your definition of an 'enterprise application'?
>I don't have a clear definition. I use the word to make people
>understand that I am not talking about MP3 players, FTP clients etc. I
>am mainly talking about applications for accounting, logistics
>management, production control, etc.

OK.

>> but from your statement above it sounds like you would characterise enterprise
>> applications as using databases _not_ for incoming our outgoing data, but
>> _mainly_ for transient data used only in the process of whatever they are
>> doing?
>An average enterprise application need persistence. But they also need
>a lot of features provided by a RDBMS that is not related to
>persistence (such as quieries).

'queries' will only really be needed if you are working with something
that is already in a RDBMS data model; if your data aren't in that model
already, queries can be replaced by other data access methods. If you
remove the presumption of having a RDBMS in place, then a lot of its
corollaries go away as well.

>> So, databases were used to overcome the deficiencies in COBOL's support
>> for data structures?
>No, the creators of COBOL did not make any advanced collection features
>in the language simply because it was not necessary.

Because they decided it wasn't necessary, or because they didn't yet
percieve the necessity, or perhaps because they thought that oher things
were _more_ necessary?

>A high-level
>language was supposed to not handle data in a low-level way. Collection
>handling was supposed to be done in a high-level way (SQL).

Was SQL then defined before COBOL was?

Also, what is considered to be 'high' and 'low' level, can sometimes
depend a bit on your point of view. I have a feeling that a lot of what
some COBOL business applications do, would be considered 'low level' from
some points of view.

>>>If you look at enterprise applications outside the OO world, you will
>>>find that they heavily use embedded SQL.
>>Please, give me some more specific pointers.
>Do you doubt that pre-OO applications make heavy use of embedded SQL?
>Look at the Oracle products Pro*C or Pro*COBOL for example.

The use of embedded SQL (SQL code embedded in COBOL or C code, IIRC,
usually with a preprocessor of some kind which resolves the embedded SQL
into library calls that talk to the database, right?) is only a different
interface to using a RDBMS, than using, say, JDBC from within Java, and
has more to do with making the database quickly and easily accessible
for the developer by creating a variant of C and COBOL, respectively,
where SQL code can be written directly rather than having to write
explicit library calls. It probably also had a lot to do with the fact
that different database vendors could then offer compilers for embedded
SQL which would rtanslate it to _their_ subroutine calls to access the
database, when there was no standardisation on how databases should be
accessed (a la JDBC or ODBC).

>The
>corresponing product for java, SQLJ, has gain very little attention
>because the OO world rejects the use of embedded SQL.

... but not the use of _SQL_, just of that particular way of using SQL
within Java. I would consider that more of a case of saying ' let's use a
library explicitly, rather than creating a variant of the language' (which
is was 'embedded SQL' does, essentially).

'Embedded SQL' does _not_ equate that it was used for non-persistent uses.
I did, in fact, a long time ago write some simple reporting code using
embedded SQL in C, for an oracle database, and it was definitely using
embedded SQL to access persistent data, and persistent data only. Not that
this says that embeddes SQL can't _also_ be used for non-persistent data,
but it just shows that 'embedded SQL' is simply one way to use SQL,
whether in a persistent or transient manner.

>> If you package up your data structures appropriately and offer suitable
>> operations on them, you can end up with a system that becomes similarly
>> easy to use as a database,
>Lets say i want to find every customer order from a customer located in
>a given city I use this select statement:
>select *
>from order
>join customer on order.customerid=customer.id
>where customer.city=?
>
>How would your code look like?

Presuming I was using, say, Apple's Cocoa frameworks, and had written my
City, Person, and Order classes to follow the coding guidelines for things
like their Key-Value Coding, I would write:

City *city = /* ... */;
NSSet *orders =
[city valueForKeyPath:@"customers.(a)unionofSets.orders"];

This will ask the 'city' object for the value for its 'customers' key,
which will return a set, upon which we will calculate the union of the
sets that will be retrieved by looking up the value for each customer's
'orders' key, thus giving us, in total, the orders placed by all the
customers in the given city.

You might note that this doesn't need to perform a join or similar, as
each object references (through a set, for to-many relationships) its own
related objects. I suspect this would be faster than your SQL query, even
if you are using an in-memory database. Conversely, of course, this
solution may be more memory-intensive, as each city will have to maintain
a set of all the customers that live there, etc, so it may be a tradeoff
for speed vs memory.

Cocoa's key-value coding is described at
<http://developer.apple.com/documentation/Cocoa/Conceptual/KeyValueCoding/> ,
and the set and array operators are described at
<http://developer.apple.com/documentation/Cocoa/Conceptual/KeyValueCoding/Concepts/ArrayOperators.html>
..

You may also want to take a look at Cocoa's 'Core Data' framework, at
<http://developer.apple.com/documentation/Cocoa/Conceptual/CoreData>
, as well as Enterprise Objects Foundation (EOF), part of WebObjects, at
<http://developer.apple.com/documentation/WebObjects/Enterprise_Objects/>
..

>> It may well be that even an in-memory RDBMS might
>> be too slow for your application.
>But not very likely for enterprise applications. The most of time
>overhead with using a RDBMS is in the inter-process and network
>communication. Using stored procedures gives you a huge performance
>gain.

It does mean, however, that you are placing your business logic no longer
in your application, but in the RDBMS. With RDBMS:es like hsqldb, both
application and RDBMS may all live together in a single JVM, but it is
still a conceptual move of the business logic. A bespoke data structure
(objects referring to other objects, with their business logic embedded in
the objects themselves) may well be a faster option still.

>> Of course, looking at hsqldb, stored procedures etc would be written in Java,
>> just as the rest of the program, and executed potentially within the same
>> virtual machine ... Intersting things to think about.
>Done it already. Love it. But other RDBMS have support of java stored
>procedures too.

But then we still have the placement of business logic in the database,
which is supposed to be about storing the data. Certainly, it keeps data
and operations together (similarly to what OO does), but it does so in a
way which I personally feel is more 'forced' and less 'elegant' (being
fully aware that I am using those two terms in a perfectly subjective
way).

I guess in a way one can debate about whether you prefer to have the
database 'do it all' and the application just controlling the database
from a very high level, or whether you want the application to 'do it all'
and just use a database for low-level storage. From an OO point of view,
the 'natural' representation is objects, with methods for business logic,
and the database would just hold a copy of the data for persistence; from
a DB point of view, the database holds the data and handles all operations
on it including its business logic, and the application just adds a thin
layer of control or presentation. Both views are interesting; I personally
prefer the separation of the storage and operations that the OO approach
offers, though I will not in any way shape or form suggest that the
DB-oriented view is 'wrong' or 'invalid'.

>Fredrik Bertilsson
>http://butler.sourceforge.net

Best wishes,

// Christian Brunschen