From: AndyW on 28 Jan 2008 00:36

On Sun, 27 Jan 2008 01:57:37 -0800 (PST), frebe <frebe73(a)gmail.com> wrote:

>> >> > select i.invoiceid
>> >> > from invoice i join payment p on i.invoiceid=p.invoiceid
>> >> > group by i.invoiceid
>> >> > having sum(p.amount) < i.amount and datediff(now(), i.duedate) >= 10
>> >> >
>> >> > This is the kind of code I write every day. Even though the number of
>> >> > invoices and payments is very high, the queries perform within a
>> >> > number of millis. The customer is happy, I am happy.
>> >>
>> >> Once again, if that is all you are doing, that is CRUD/USER processing;
>> >> you are just moving piles of data back and forth between the UI and the RDB.
>> >>
>> >> > Maybe you can show your code performing the same task.
>> >>
>> >> You want code for inventory forecasts? A linear program to allocate
>> >> advertising budget to various markets' media? A simulation model of
>> >> atmospheric diffusion?
>> >
>> >No. I was asking for the OO equivalence for the SQL statement above.
>>
>> There isn't one. But there is a way to achieve the end result -
>> locating an invoice.
>
>The end result in my query is to find invoices, so please show the OO
>equivalence.
>
>> One has to remember that OO isn't about lists of related data as
>> relational DBs are. It's about collections of unique objects.
>
>Is that an advantage or a disadvantage?
>
>> To me in OO every instance of an object is unique and that I feel
>> means if one knows enough of the attributes one can go directly to the
>> item without searching for it
>
>By pointer traversal? Don't you need some kind of hashmap etc. in
>order to "go directly"? Isn't this also a kind of "search"?
>
>> (in the rare case there are duplicate
>> items they are reference counted). In close approximation to
>> relational, I think it's like knowing the primary key but in OO it's a
>> composite key based on the sum of all of the attributes and methods at
>> a given point in time (this means that an object's identity can
>> change).
>
>Object identity is a memory address.
>
>> In OO, I would suggest that we also don't have collections of related
>> data, we have schemas of objects (a set of rules that define the
>> collection). Schemas can also be composite. In the SQL statement above
>> as I interpreted it one has a collection of invoices each with
>> possibly a collection of payments. The most ideal way of searching
>> is to go for the one that is easier to find - so if one knows the
>> payment find that, then traverse to its invoice (like using a
>> foreign key). In other words, sometimes it's better to reverse the
>> search order given the information known.
>
>A SQL database is very good at making this kind of decision.
>Depending on the query and current statistics, it will choose the best
>set of indexes to use.

I disagree. SQL only works if the data has been normalised (ordered)
and the statements are optimised. It does not work on non-ordered,
disjoint data. In other words it's good at lists, but not disjoint
patterns.

>> Think of it like Bob going into a crowded hall and
>> shouting out, can Wilma of 43 someplace street please go to the
>> information kiosk and fill in the appropriate form. Then a short time
>> later, a faint voice shouts back, ok I am done what next. (this is the
>> observer pattern)
>
>In order to "shout out" to every object, you need to traverse every
>object. The complexity of the algorithm will be O(N). In a SQL
>database, with appropriate indexes, most searches will be O(log N).

This is an incorrect assumption. You simply need to set a flag that
the appropriate items are looking out for. This is a common way of
implementing an event model.

>> So to me in reference to the select statement above, the request is
>> simply - can everyone who has a sum of money that is now due for
>> payment, please now execute their request payment routines (remember,
>> objects contain methods based on their classification as well). If
>> one were being smart one could actually provide them a method
>> invocation to execute - which is called dynamic method invocation.
>
>That means that every object needs to calculate the sums and decide
>whether to call the callback or not. The complexity will be O(N). Using
>SQL, you could create a materialized view with the sum columns, create
>indexes for them, and you will have O(log N).

Yes, this is correct. Remember that objects contain the code needed to
perform the actions on their own data. This code is encapsulated in
'methods'. A single method can have an instance for each object or it
can have a single instance across many objects - this is what we mean
when we say a method is static.

>> So I think the techniques in OO are that objects can find themselves,
>> objects can manipulate themselves and it can all work in parallel.
>
>That is why so many OO applications have such bad performance.

They don't; in general they are usually hacked by a poorly designed
relational database. Any application both OO and non-OO will suffer

----------------
AndyW,
Mercenary Software Developer

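For readers following the request for an "OO equivalence" of the query, here is one possible in-memory reading. It is a minimal sketch, not code from either poster: the class names `Invoice` and `Payment`, the `by_id` dictionary, the `add_payment` helper, and the sample data are all assumptions made for illustration. It shows the two points argued in the post: with a known key a dictionary gives direct access and a payment can be traversed back to its invoice, while answering the query's predicate still visits every invoice.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class Invoice:
    invoice_id: int
    amount: float
    due_date: date
    payments: list = field(default_factory=list)


@dataclass
class Payment:
    amount: float
    invoice: Invoice  # back-reference, playing the role of a foreign key


def add_payment(invoice, amount):
    """Create a payment and link it to its invoice in both directions."""
    payment = Payment(amount, invoice)
    invoice.payments.append(payment)
    return payment


def underpaid_and_overdue(invoices, today, min_days=10):
    """Mirror the SQL: joined to at least one payment, paid less than
    the invoice amount, and at least min_days past the due date."""
    return [
        inv.invoice_id
        for inv in invoices  # visits every invoice: O(N)
        if inv.payments
        and sum(p.amount for p in inv.payments) < inv.amount
        and (today - inv.due_date).days >= min_days
    ]


# Hypothetical sample data.
invoices = [
    Invoice(1, 100.0, date(2008, 1, 1)),
    Invoice(2, 200.0, date(2008, 1, 1)),
    Invoice(3, 300.0, date(2008, 1, 25)),
]
by_id = {inv.invoice_id: inv for inv in invoices}  # hash lookup by key

add_payment(by_id[1], 100.0)                  # fully paid
known_payment = add_payment(by_id[2], 50.0)   # partly paid, overdue
add_payment(by_id[3], 10.0)                   # partly paid, only 3 days late

today = date(2008, 1, 28)
result = underpaid_and_overdue(invoices, today)
owner_id = known_payment.invoice.invoice_id   # payment -> invoice traversal
print(result, owner_id)
```

The dictionary lookup and the back-reference only help when the key or the payment is already known. The predicate search has no such starting point in this sketch, so it falls back to a full scan.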
From: frebe on 28 Jan 2008 02:32

> >> In OO, I would suggest that we also don't have collections of related
> >> data, we have schemas of objects (a set of rules that define the
> >> collection). Schemas can also be composite. In the SQL statement above
> >> as I interpreted it one has a collection of invoices each with
> >> possibly a collection of payments. The most ideal way of searching
> >> is to go for the one that is easier to find - so if one knows the
> >> payment find that, then traverse to its invoice (like using a
> >> foreign key). In other words, sometimes it's better to reverse the
> >> search order given the information known.
>
> >A SQL database is very good at making this kind of decision.
> >Depending on the query and current statistics, it will choose the best
> >set of indexes to use.
>
> I disagree. SQL only works if the data has been normalised (ordered)
> and the statements are optimised.

Data should obviously be normalised. SQL works on non-normalised data
too, but you risk getting inconsistent data. What do you mean by
optimised statements? Are you referring to statements that can't
utilize indexes when executed?

> It does not work on non-ordered,
> disjoint data.

Tuples in a relation are unordered, so I have a problem understanding
your statement.

> In other words it's good at lists, but not disjoint patterns.

Isn't this a disjoint pattern?

  create table vehicle (
    vehicleid integer,
    primary key (vehicleid)
  );

  create table car (
    vehicleid integer,
    primary key (vehicleid),
    foreign key (vehicleid) references vehicle (vehicleid)
  );

  create table boat (
    vehicleid integer,
    primary key (vehicleid),
    foreign key (vehicleid) references vehicle (vehicleid)
  );

> >> Think of it like Bob going into a crowded hall and
> >> shouting out, can Wilma of 43 someplace street please go to the
> >> information kiosk and fill in the appropriate form. Then a short time
> >> later, a faint voice shouts back, ok I am done what next. (this is the
> >> observer pattern)
>
> >In order to "shout out" to every object, you need to traverse every
> >object. The complexity of the algorithm will be O(N). In a SQL
> >database, with appropriate indexes, most searches will be O(log N).
>
> This is an incorrect assumption. You simply need to set a flag that
> the appropriate items are looking out for. This is a common way of
> implementing an event model.

But the items (objects) have to do some polling? Doesn't every object
need to do the polling? Doesn't that imply O(N)? If not, please show
some source code proving the opposite.

> >> So to me in reference to the select statement above, the request is
> >> simply - can everyone who has a sum of money that is now due for
> >> payment, please now execute their request payment routines (remember,
> >> objects contain methods based on their classification as well). If
> >> one were being smart one could actually provide them a method
> >> invocation to execute - which is called dynamic method invocation.
>
> >That means that every object needs to calculate the sums and decide
> >whether to call the callback or not. The complexity will be O(N). Using
> >SQL, you could create a materialized view with the sum columns, create
> >indexes for them, and you will have O(log N).
>
> Yes, this is correct. Remember that objects contain the code needed to
> perform the actions on their own data. This code is encapsulated in
> 'methods'.

That is why encapsulation is a bad thing in this case. Since the code
(and data) necessary to tell whether the condition is met or not is
encapsulated, the only thing to do is to actually ask every object.
And the complexity will be O(N). I have seen many OO geeks building
applications this way, and believe me, it is a disaster.

> >> So I think the techniques in OO are that objects can find themselves,
> >> objects can manipulate themselves and it can all work in parallel.
>
> >That is why so many OO applications have such bad performance.
>
> They don't; in general they are usually hacked by a poorly designed
> relational database. Any application both OO and non-OO will suffer

And who did the poor schema design? Probably an OO geek. And if we have
a poorly designed schema, the OO geek will say: No problem, I will
separate the schema from the business logic in different layers, and
everything will be OK. The database guy would say: The schema is the
foundation for your application; if you don't fix the schema, the
application will suck forever.

//frebe

From: AndyW on 29 Jan 2008 03:19

On Sun, 27 Jan 2008 23:32:51 -0800 (PST), frebe <frebe73(a)gmail.com> wrote:

>> >> In OO, I would suggest that we also don't have collections of related
>> >> data, we have schemas of objects (a set of rules that define the
>> >> collection). Schemas can also be composite. In the SQL statement above
>> >> as I interpreted it one has a collection of invoices each with
>> >> possibly a collection of payments. The most ideal way of searching
>> >> is to go for the one that is easier to find - so if one knows the
>> >> payment find that, then traverse to its invoice (like using a
>> >> foreign key). In other words, sometimes it's better to reverse the
>> >> search order given the information known.
>>
>> >A SQL database is very good at making this kind of decision.
>> >Depending on the query and current statistics, it will choose the best
>> >set of indexes to use.
>>
>> I disagree. SQL only works if the data has been normalised (ordered)
>> and the statements are optimised.
>
>Data should obviously be normalised. SQL works on non-normalised data
>too, but you risk getting inconsistent data. What do you mean by
>optimised statements? Are you referring to statements that can't
>utilize indexes when executed?

Optimised as in fast performance.

>> It does not work on non-ordered,
>> disjoint data.
>
>Tuples in a relation are unordered, so I have a problem understanding
>your statement.

Ordered data, as in it has a relation (name, address, phone); disjoint,
for example (a car and a fish).

>> This is an incorrect assumption. You simply need to set a flag that
>> the appropriate items are looking out for. This is a common way of
>> implementing an event model.
>
>But the items (objects) have to do some polling? Doesn't every object
>need to do the polling? Doesn't that imply O(N)? If not, please show
>some source code proving the opposite.

Not really. Polling would indicate to me the use of the semaphore
pattern, but it is not necessarily the only way. The observer pattern
using callbacks, for example, is another.

>> >> So to me in reference to the select statement above, the request is
>> >> simply - can everyone who has a sum of money that is now due for
>> >> payment, please now execute their request payment routines (remember,
>> >> objects contain methods based on their classification as well). If
>> >> one were being smart one could actually provide them a method
>> >> invocation to execute - which is called dynamic method invocation.
>>
>> >That means that every object needs to calculate the sums and decide
>> >whether to call the callback or not. The complexity will be O(N). Using
>> >SQL, you could create a materialized view with the sum columns, create
>> >indexes for them, and you will have O(log N).
>>
>> Yes, this is correct. Remember that objects contain the code needed to
>> perform the actions on their own data. This code is encapsulated in
>> 'methods'.
>
>That is why encapsulation is a bad thing in this case. Since the code
>(and data) necessary to tell whether the condition is met or not is
>encapsulated, the only thing to do is to actually ask every object.
>And the complexity will be O(N). I have seen many OO geeks building
>applications this way, and believe me, it is a disaster.

I have seen many examples of poor application architecture as well.
It's surprising how many developers will try to emulate the
functionality of an OO database system in an object-relational system
without any real knowledge of the features of the OO DB, or of how to
implement those techniques even if they did know what they were.

----------------
AndyW,
Mercenary Software Developer

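The callback variant of the observer pattern mentioned in this post can be sketched as follows. This is a hypothetical illustration: `QueryBus`, `InvoiceObserver`, the `evaluations` counter, and the generated sample data are invented names and values, not taken from any library or from either poster's code. It counts how many objects evaluate the criterion when one request is published, so the cost of the broadcast can be measured rather than argued.

```python
class QueryBus:
    """Minimal subject: keeps subscribers and notifies each of them."""

    def __init__(self):
        self._subscribers = []

    def subscribe(self, callback):
        self._subscribers.append(callback)

    def publish(self, criterion, reply):
        for callback in self._subscribers:  # one call per subscriber
            callback(criterion, reply)


class InvoiceObserver:
    evaluations = 0  # class-wide counter of criterion checks

    def __init__(self, invoice_id, outstanding, days_overdue):
        self.invoice_id = invoice_id
        self.outstanding = outstanding
        self.days_overdue = days_overdue

    def on_query(self, criterion, reply):
        InvoiceObserver.evaluations += 1
        if criterion(self):          # the object checks its own data
            reply(self.invoice_id)   # "ok, I am done, what next"


bus = QueryBus()
invoices = [
    InvoiceObserver(i, outstanding=float(i % 3), days_overdue=i % 20)
    for i in range(1000)
]
for inv in invoices:
    bus.subscribe(inv.on_query)

matches = []
bus.publish(
    lambda inv: inv.outstanding > 0 and inv.days_overdue >= 10,
    matches.append,
)
print(len(matches), InvoiceObserver.evaluations)
```

In this sketch every subscriber runs the criterion once per published request, so the evaluation count equals the number of invoices no matter how few of them match. Avoiding that would need some structure outside the objects that narrows down the recipients, which is the role an index plays in a database.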
From: frebe on 29 Jan 2008 04:09

> >> >> In OO, I would suggest that we also don't have collections of related
> >> >> data, we have schemas of objects (a set of rules that define the
> >> >> collection). Schemas can also be composite. In the SQL statement above
> >> >> as I interpreted it one has a collection of invoices each with
> >> >> possibly a collection of payments. The most ideal way of searching
> >> >> is to go for the one that is easier to find - so if one knows the
> >> >> payment find that, then traverse to its invoice (like using a
> >> >> foreign key). In other words, sometimes it's better to reverse the
> >> >> search order given the information known.
> >>
> >> >A SQL database is very good at making this kind of decision.
> >> >Depending on the query and current statistics, it will choose the best
> >> >set of indexes to use.
> >>
> >> I disagree. SQL only works if the data has been normalised (ordered)
> >> and the statements are optimised.
> >
> >Data should obviously be normalised. SQL works on non-normalised data
> >too, but you risk getting inconsistent data. What do you mean by
> >optimised statements? Are you referring to statements that can't
> >utilize indexes when executed?
>
> Optimised as in fast performance.

So what is the conclusion? SQL only works when the statements are fast?
But the statements can execute fast, in O(log N), so we don't have a
problem. Your solution, on the contrary, performs in O(N).

> >> It does not work on non-ordered,
> >> disjoint data.
> >
> >Tuples in a relation are unordered, so I have a problem understanding
> >your statement.
>
> Ordered data, as in it has a relation (name, address, phone);

But the attributes don't have to be ordered in that way:

  insert into employee (phone, name, address) values (...)

> disjoint, for example (a car and a fish).

Do you claim that the relational model can't model cars and fishes?

> >> This is an incorrect assumption. You simply need to set a flag that
> >> the appropriate items are looking out for. This is a common way of
> >> implementing an event model.
> >
> >But the items (objects) have to do some polling? Doesn't every object
> >need to do the polling? Doesn't that imply O(N)? If not, please show
> >some source code proving the opposite.
>
> Not really. Polling would indicate to me the use of the semaphore
> pattern, but it is not necessarily the only way. The observer pattern
> using callbacks, for example, is another.

So, all items (invoices) subscribe to events (query requests). When you
want to find all items matching a given criterion (unpaid invoices), you
send a notification to all invoices, and the invoices that match the
criterion execute the callback. Since you need to notify all items and
all items have to execute the criterion evaluation, you will have O(N).
OO philosophy doesn't work on large amounts of data.

> >> >> So to me in reference to the select statement above, the request is
> >> >> simply - can everyone who has a sum of money that is now due for
> >> >> payment, please now execute their request payment routines (remember,
> >> >> objects contain methods based on their classification as well). If
> >> >> one were being smart one could actually provide them a method
> >> >> invocation to execute - which is called dynamic method invocation.
> >>
> >> >That means that every object needs to calculate the sums and decide
> >> >whether to call the callback or not. The complexity will be O(N). Using
> >> >SQL, you could create a materialized view with the sum columns, create
> >> >indexes for them, and you will have O(log N).
> >>
> >> Yes, this is correct. Remember that objects contain the code needed to
> >> perform the actions on their own data. This code is encapsulated in
> >> 'methods'.
> >
> >That is why encapsulation is a bad thing in this case. Since the code
> >(and data) necessary to tell whether the condition is met or not is
> >encapsulated, the only thing to do is to actually ask every object.
> >And the complexity will be O(N). I have seen many OO geeks building
> >applications this way, and believe me, it is a disaster.
>
> I have seen many examples of poor application architecture as well.
> It's surprising how many developers will try to emulate the
> functionality of an OO database system in an object-relational system
> without any real knowledge of the features of the OO DB, or of how to
> implement those techniques even if they did know what they were.

The features of an OO database are basically the same as those of a
network database. It is surprising that OO people try to reinvent a
database type that has been obsolete for more than 20 years. The lack of
success of OO databases speaks for itself. It is also surprising to find
so many software developers so ignorant of data management basics.

//frebe
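The materialized-view idea quoted in this thread can be tried with Python's built-in `sqlite3` module. SQLite has no materialized views, so this sketch substitutes a trigger-maintained `paid` column on the invoice table; that substitution is an assumption of the example, not something either poster described. The `invoice` and `payment` tables and the `invoiceid`, `amount`, and `duedate` columns follow the query in the thread, while `paymentid`, `paid`, the trigger name, the index name, and the sample rows are invented. Due dates are stored as ISO strings so they compare correctly as text.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
create table invoice (
  invoiceid integer primary key,
  amount    real not null,
  duedate   text not null,            -- ISO date, e.g. '2008-01-01'
  paid      real not null default 0   -- running sum of payments
);
create table payment (
  paymentid integer primary key,
  invoiceid integer not null references invoice (invoiceid),
  amount    real not null
);
-- Keep the stored sum current on every new payment.
create trigger payment_ai after insert on payment
begin
  update invoice set paid = paid + new.amount
  where invoiceid = new.invoiceid;
end;
-- Index the column the search filters on.
create index invoice_duedate on invoice (duedate);
""")

conn.executemany(
    "insert into invoice (invoiceid, amount, duedate) values (?, ?, ?)",
    [(1, 100.0, "2008-01-01"),
     (2, 200.0, "2008-01-01"),
     (3, 300.0, "2008-01-25")],
)
conn.executemany(
    "insert into payment (invoiceid, amount) values (?, ?)",
    [(1, 100.0), (2, 50.0), (3, 10.0)],
)

# Invoices at least 10 days past due on 2008-01-28 and not fully paid.
rows = conn.execute(
    """
    select invoiceid from invoice
    where duedate <= date(?, '-10 days') and paid < amount
    order by invoiceid
    """,
    ("2008-01-28",),
).fetchall()
overdue_ids = [r[0] for r in rows]
print(overdue_ids)
```

Two caveats apply. Unlike the inner join in the thread's query, this version would also report overdue invoices that have no payments at all. And the index only gives the query planner the option of a range scan on `duedate`; whether it is used, and how close the search gets to O(log N) plus the rows returned, depends on the data and the SQLite version.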