From: Mikito Harakiri on
Dmitry A. Kazakov wrote:
> >> If you want a challenge, fine. Take any machine learning method. Training
> >> sets are ideal tables, rows and columns, nothing else. Take any method of
> >> your choice and implement it in SQL! For introduction to existing methods,
> >> see excellent tutorials by Andrew Moor:
> >>
> >> http://www.autonlab.org/tutorials/
> >
> > How about something from the domain of custom biz apps. I have already
> > conceded out of ignorance of the domain that DB's may not be good for
> > heavy-duty numerical analysis.
>
> This is not numerical analysis. This a challenge to the relational approach
> in general. There is a problem specified relationally, that has an obvious
> relational solution, yet nobody would even try to use RDBMS for that. How
> could THAT happen?
>
> [ Answer: RDBMS are completely unsuitable either for matching in
> multidimensional spaces, or for best-match criteria. The idea of separating
> data and using normalized search schemas simply does not work beyond
> trivial cases. Now, you will tell me that all biz-applications are trivial,
> but I won't buy it. And for all I won't buy a tool limited to trivial
> cases! ]

It is good that you already know the answer. Let me provide the
relational one. Assume you have the learning sample as a binary
relation S:

X Y
------
(1,1)
(2,4)
(3,9)
(4,16)

Your AI program is asked to answer what the value Y is when you input
X=5. Easy in relational world: just construct a relation A of all the X
values you want the answer, e.g.

X
--
5
6
7

and outer join it with the learning relation S

select A.X, S.Y from S left outer join A
on S.X = A.X

The output is

X Y
------
(1,1)
(2,4)
(3,9)
(4,16)
(5, NULL)
(6, NULL)
(7, NULL)

Therefore, the standard outer join can be viewed as a primitive machine
learning method. More sophisticated methods (eg listed on the page
http://www.autonlab.org/tutorials/ the referenced) are just variants of
outer join.

From: Patrick May on
"topmind" <topmind(a)technologist.com> writes:
> > You're renaming the combination of relational and one or more
> > additional approaches to be "relational." When you change the
> > meaning of your terms like that, mid-discussion, it makes rational
> > interaction impossible and gives every appearance that you would
> > rather knowingly equivocate than admit that you were wrong.
>
> This is why I wish to move to an example and scenarios. I feel we
> have stretched the linguistic approach to evidence as far as we can.

Wow. You commit a textbook example of the fallacy of
equivocation and not only refuse to acknowledge it but blame your
failure of logic on . . . well I'm not sure exactly what you're
blaming it on, but you sure aren't taking responsibility for your own
statements.

This thread is long past its useful life and it is painfully
clear, based on replies like the above, that your purpose is not
rational discussion. I don't really know what you think your goal is,
but all external appearances indicate that you're trolling.
Whether you're doing it consciously or not, it's pretty distasteful.

In any case, I've made my points to my satisfaction. You
obviously aren't going to be convinced by any amount of reason or
evidence. The last word is yours.

Sincerely,

Patrick

------------------------------------------------------------------------
S P Engineering, Inc. | The experts in large scale distributed OO
| systems design and implementation.
pjm(a)spe.com | (C++, Java, Common Lisp, Jini, CORBA, UML)
From: topmind on

Patrick May wrote:
> "topmind" <topmind(a)technologist.com> writes:
> > > You're renaming the combination of relational and one or more
> > > additional approaches to be "relational." When you change the
> > > meaning of your terms like that, mid-discussion, it makes rational
> > > interaction impossible and gives every appearance that you would
> > > rather knowingly equivocate than admit that you were wrong.
> >
> > This is why I wish to move to an example and scenarios. I feel we
> > have stretched the linguistic approach to evidence as far as we can.
>
> Wow. You commit a textbook example of the fallacy of
> equivocation and not only refuse to acknowledge it but blame your
> failure of logic on . . . well I'm not sure exactly what you're
> blaming it on, but you sure aren't taking responsibility for your own
> statements.
>
> This thread is long past its useful life and it is painfully
> clear, based on replies like the above, that your purpose is not
> rational discussion. I don't really know what you think your goal is,
> but all external appearances indicate that you're trolling.
> Whether you're doing it consciously or not, it's pretty distasteful.
>
> In any case, I've made my points to my satisfaction. You
> obviously aren't going to be convinced by any amount of reason or
> evidence. The last word is yours.

I was honestly interested in seeing scenarios where RDB/ACL's don't
work very well. Honest. Apparently I won't get any from you. You seem
to want to play lawyer instead of talk tech. That is how it comes
across. You seem to have a Fear of Scenarios. Scenarios are good
things.

Does anyone else want to volunteer such scenarios, if you know of any?

>
> Sincerely,
>
> Patrick
>

-T-

From: topmind on

Dmitry A. Kazakov wrote:
> On 5 Feb 2006 14:00:22 -0800, topmind wrote:
>
> > Dmitry A. Kazakov wrote:

> > When I see an actual case with code from my domain, I might change my
> > mind. Types don't model biz things very well IMO.
>
> They need not to. The whole point is that relational is typed as well. Full
> stop.

For a loose or wide definition of "types", perhaps.

>
> Whatever advantages of modern typed systems could be used to improve (sic!)
> existing typed (sic!) relational systems is another question. It is up to
> developers.

Go for it. Improve away.

>
> >>> How specificly is "types" different from "validation" in this example?
> >>>
> >>> x = add("123","99.28");
> >>> y = add("foo","7");
> >>>
> >>> The language does not care what "foo" is here. Only the "add" function
> >>> will care when it checks to see that the first parameter is a valid
> >>> number.
> >>
> >> The above isn't properly typed. "123" has the type String. Strings aren't
> >> additive.
> >
> > Language can and do this. Label it however you want, it works.
>
> Really? What is the result of add("fee", "123") ?
>
> "1069"? "4201"? "fee123"? "sysadmin(a)error.org"?

An error. Unless perhaps we have something optional like:

x = add(a, b, valueOnError=0)

Or

x = add(a, b, onError="raiseErr('std')")

>
> >>> There are other things it might check such as range because it
> >>> may not be able to add large numbers. Types cannot do this very well
> >>> unless we either pick arbitrary chunk sizes, or create a type for every
> >>> possible length/size, which is dumb.
> >>
> >> Types do it perfectly. I can have a wide set of numeric types having
> >> different models.
> >
> > And long-winded confusing code to use it.
>
> Huh, the code you presented is 1) longer, 2) far more confused.

I don't have to define and hunt down definitions of types. You will
have something like:

a = new integer(min=-223452345, max=234234234);
b = new integer(min=-09809809809, max=242093423423);
x = a.add(b);

>
> As I have shown, it is not type free. It is typed in a definite way.

I suspect we would need a clear definition of "types" to settle this.
Either way, it is not relying on "types" in the usual strong-typed way.
By the way, many scripting languages make no internal distinction
between:

x = 123

and

x = "123"

Some keep an internal "flag" to track how it was defined, and some
don't, treating everything more or less as strings with no internal
flags to indicate origin or history of usage.


> How do
> you know that this way is the proper one? Do you have use-tests to verify
> that? Did you count all these tests? Where are use tests for accuracy,
> rounding errors?

Rounding errors for "add"?

(I never liked how floating point did division anyhow.)

>
> > Businesses are run by scrooges: they want productivity at a low cost.
>
> I'm sure, they know pretty well what risk management is.

Indeed, and often don't want to pay for too much of it.

>
> > Strong typing does not provide this.
>
> SQL is typed, so I see no point.

Perhaps, but it is possible to create a nearly type-free version of
SQL.

SQL-Lite (sqlite.org) is allegedly one such tool, although I've never
tried it.


>
> >>>> but also utterly inefficient with respect of
> >>>> space and time required for may operations.
> >>>
> >>> When I encounter enough of such scenarios,
> >>
> >> Put an image in relational table, so that each pixel would be a cell.
> >> Put a program code in a relational table and write compiler in SQL
> >> ...
> >
> > Why SQL? SQL is one of an infinite possible relational languages.
>
> Fine, show us an outline of a better language limited to solely relations
> and we'll see.

Hold on, are we talking about implementation here, or the language? I
don't see how the language relates to compactly representing pixels
internally.

As far as implementation, one could create a table such as:

table: images
--------------
imageRef // image ID
x
y
pixelValue

It could be mapped directly to a RAM image if we want speed and
compactness. (Perhaps it should be called a "view" and not a table,
although philosophically they are perhaps the same thing.)

>
> >>> Further, relational does not dictate implementation.
> >>
> >> Wrong, it puts definite limits of the implementation.
> >
> > Show me a definite limit.
>
> 1. Row ID. It falls out of model, completely. In a truly relational model
> you cannot mark either cells or rows, they have no identity of their own.

Please clarify. Relational does not limit auto-generated keys. Some say
it "encourages" one to use domain keys, but this has never been
settled, and some argue that auto-gen keys are or can be domain keys.

>
> 2. Tables of tables.

This is a *good* restriction of the relational model. Hard-nesting is
hard-wiring a particular viewpoint or (access path) into the model,
which goes against the relativistic philosophy of relational. Dr. Codd
set out to purposely avoid hard-wired access paths when he started
thinking about relational due to the navigational messes that others
were creating. However, it does not limit what can be modelled
externaly. One can still model nested stuff using non-nested tables.

It is not a technical "limit", but a philosophical guide-wall.
Relational imposes rules to be called "relational". Otherwise, it would
turn into the navigational messes that motivated the creation of
relational to begin with.

>
> 3. Generic programming in terms of sets of tables sharing common
> properties.

You seem to want to model paradigm X in paradigm Y. That is the wrong
approach. You *use* paradigm X, not change it to look like another
paradigm. Relational is a paradigm. If you don't like its rules, then
leave it.

It is like saying the "no side-effects" philosophy of Functional
Programming is limiting, and thus one should dump it. If you dump it,
you don't have FP anymore.


> Referential GPS resolution is about 30cm horizontally. Let's take, say, a
> Germany map (not such a big country) with 30x30cm mesh... [Hint: roads have
> limited width.]

One could model roads as polygons if you want that precision. GIGO in
effect here.

>
> > There are other, perhaps better, ways to do this. I added a "Point"
> > table so intersection info may be easier to represent. This is just a
> > quickie demo.
>
> ... of why this cannot work. Not to mention that GPS navigation system
> calculates from a stream of longitude, latitude, height, mileage, speed
> from the car, not only the road position, but relative positions of close
> objects and the direction you are driving. All this is done with a high
> level or error tolerance. GPS is very unreliable.

Are you bashing relational or GPS, I cannot tell here.

> Forgetting for a while,
> that searching the DB you proposed would require several days (years?) per
> query and a large truck to load the DB onto. Just write down a SELECT
> statement for the model above!

If the properties you are searching on are indexed, I see no reason why
queries should run long. Plus, it is probably practical to internally
partition the tables into states or counties to speed up
location-specific queries, which are probably the most common for
traveling. (FBI queries, for example, may be different in that aspect
because they may look all over in a given query.) I believe Oracle and
other vendors has the ability to internally partition like this without
changing the external view (such as table name.

And, I don't know where you are getting your size estimates.

>
> > One of the biggest selling points of RDBMS is that multiple languages
> > can use them.
>
> You mean the languages, without which one couldn't be able to use these
> RDBMS. I don't see how relying on concurrent products makes yours any
> better...

Sharing lots of company info in multi languages and tools without using
a DB is a royal pain.

>
> >>> Show with code.
> >>
> >> But you said that you aren't interested in static program correctness
> >> analysis.
> >>
> >> If you want a challenge, fine. Take any machine learning method. Training
> >> sets are ideal tables, rows and columns, nothing else. Take any method of
> >> your choice and implement it in SQL! For introduction to existing methods,
> >> see excellent tutorials by Andrew Moor:
> >>
> >> http://www.autonlab.org/tutorials/
> >
> > How about something from the domain of custom biz apps. I have already
> > conceded out of ignorance of the domain that DB's may not be good for
> > heavy-duty numerical analysis.
>
> This is not numerical analysis. This a challenge to the relational approach
> in general. There is a problem specified relationally, that has an obvious
> relational solution, yet nobody would even try to use RDBMS for that. How
> could THAT happen?


I don't know. I do notice that academics tend to be poorly trained in
RDBMS. They could simply be afraid of them out of unfamiliarity.
Second, RDBMS had been fairly expensive until the recent advent of
usable OSS solutions. When pressed on budget, people tend to roll their
own.


>
> [ Answer: RDBMS are completely unsuitable either for matching in
> multidimensional spaces, or for best-match criteria. The idea of separating
> data and using normalized search schemas simply does not work beyond
> trivial cases. Now, you will tell me that all biz-applications are trivial,
> but I won't buy it. And for all I won't buy a tool limited to trivial
> cases! ]

Further, if there are specialized engines already built to process
things such as Bayesean networks or neural nets, obviously it makes
sense to go with those specific already-built solutions. RDBMS shine
where you have *different uses* for the *same* info. Those things you
list tend to be *same* uses for the same info. See the difference? One
simply imports the load of data into those kits, chomps on it for a
while, and then dumps out the answers.

>
> > Again, I don't expect SQL or relational to solve the *entire* problem
> > by itself.
>
> From this is derive that you consider relational is a niche solution for a
> limited number of problems. As such it cannot be a concurrent to OO.

Perhaps RDBMS suck outside of custom biz apps and OO sucks inside.

>
> But actually, they are incomparable not because relational is incomplete
> and thus bad. It isn't bad. They are incomparable, because relational is of
> lower level. You cannot compare a vehicle with a wheel, though a circus
> rider could use the later. I, personally, better take a car. Some would a
> jet or ship.

No, it is actually higher-level than OOP. (This will turn into "level"
definition fight I fear, similer to the "complexity" metrics mess.)

>
> > But to give an analogy, program code is merely data to the
> > interpreter/compiler.
>
> I don't care about the CPU.

I said compiler/interpreter, not the CPU. However, CPU is a similar
example: the machine code is just data to it.

>
> > A developer may think of a function as
> > "behavior", but the interpreter treats it more like data if we look at
> > other processes that read what we normally call "data".
>
> That the point, software is developed, maintained and finally scraped by
> humans.

I don't see how this relates to the relativistic viewpoint of behavior
and data. If a robot, evolution, or God wrote the interpreter instead
of humans, would that change anything?

> >
> > Perhaps you should make a distinction between "is a set of" and "can be
> > defined as a set of". Claiming something "is" is a strong ascertion.
>
> For you, who believes in existence of data. To me, there is no data without
> something that interprets them exposing a behavior. There is no information
> without the receiver.

Yes, but one can focus their efforts on one or the other. I mentally
try to use tables to codify information and (potential) behavior such
that the "interpreter" (table processor) is a fairly trivial process,
or at least one that can be mentally seperated from the data config to
help divide up the effort into clean sub-parts. This is similar to the
music composer versus the player-piano engineer, who only cares that
the holes in the paper roll are "executed" according to expected rules
and conventions of music. The music composer does not want to have to
concern him/herself with the machanics of the player-piano. In practice
the seams are not always clean, such as when the composer is limited to
a max number of notes because brand X cannot play that many holes at
the same time or cannot reach high notes. But 90% seperation is still
better than zero. OOP fails to take advantage of this concept in my
opinion. Most biz OO code is or can be about 90% attributes anyhow. The
Grand Behavior machine that OO allegedly is is *wasted* on hard-coded
navigational DB-like code.

> --
> Regards,
> Dmitry A. Kazakov

-T-

From: Christian Brunschen on
In article <1139283131.650330.173930(a)o13g2000cwo.googlegroups.com>,
topmind <topmind(a)technologist.com> wrote:
>Dmitry A. Kazakov wrote:

[ much deletia ]

>>
>> SQL is typed, so I see no point.
>
>Perhaps, but it is possible to create a nearly type-free version of
>SQL.
>
>SQL-Lite (sqlite.org) is allegedly one such tool, although I've never
>tried it.

SQlite is not in fact untyped:

<quote src="http://sqlite.org/different.html">

Manifest typing

Most SQL database engines use static typing. A datatype is associated with
each column in a table and only values of that particular datatype are
allowed to be stored in that column. SQLite relaxes this restriction by
using manifest typing. In manifest typing, the datatype is a property of
the value itself, not of the column in which the value is stored. SQLite
thus allows the user to store any value of any datatype into any column
regardless of the declared type of that column. (There are some exceptions
to this rule: An INTEGER PRIMARY KEY column may only store integers. And
SQLite attempts to coerce values into the declared datatype of the column
when it can.)

</quote>

So SQLite certainly is typed, it's just not completely statically typed.
The author also notes that this is a difference from stanard SQL.

[ more deletia ]

>>-T-

Best wishes,

// Christian Brunschen