|
From: jimbalo22 on 19 Jan 2008 02:37 > What about metrics for your SQL queries? I have no metrics. I started rewriting it using datasets instead of business objects, but when I hit the problem I posted about earlier I continued working on the business object based version of the application. It is now almost completed and is working very well so far, so it is unlikely that I will rewrite it. But I am still interested in learning more about your approach since that could come in handy for the next project. Jim
From: frebe on 19 Jan 2008 03:51 > > What about metrics for your SQL queries? > > I have no metrics. I started rewriting it using datasets instead of > business objects, but when I hit the problem I posted about earlier I > continued working on the business object based version of the > application. It is now almost completed and is working very well so > far, so it is unlikely that I will rewrite it. But I am still > interested in learning more about your approach since that could come > in handy for the next project. You can look at the source code of http://www.sourceforge.net/projects/therp.. Unfortunately the application targets MySQL version 4, so view or stored procedures is not used. But otherwise, it is a good example of putting as much logic possible in SQL statement, and let the host language focus on presentation logic. //frebe
From: H. S. Lahman on 21 Jan 2008 16:20 Responding to Frebe... >>>>> By the way, what is your RAD IDE of choice? >>>> For the last 20 years before retiring I was an R-T/E guy so the only RAD >>>> problems we saw were support tools for the software process (e.g., >>>> defect tracking). For that mundane stuff I usually used Access. I >>>> haven't looked at the market so I wouldn't know what to use for >>>> something like a web-based POS where one has both networking and server >>>> issues. (There wasn't much around back in the '70s when I was doing IT, >>>> but I'm sure we've come a long way since.) >>> When you say "When I need to solve a CRUD/USER problem I forget about >>> OO and use a RAD IDE", you are talking about MS Access? Since you >>> frequently refer to RAD IDE as the silver bullet for everything that >>> the OO model can't solve, it is a little big surprisingly that the >>> only RAD IDE you have experience from is MS Access. Don't you think >>> there are something between the OO approach and MS Access? >> To clarify, when I refer to RAD IDE I also include the object-based >> "canned" infrastructures in things like .NET and J2EE that support the >> 1D Presentation/Business/Data models. Those are certainly between Access >> and an OO development. > > In what way does J2EE support 1D Presentation/Business/Data? I forget the buzzwords because I don;t do Java, but it supports the equivalent of MS' DAO and whatnot. > >>> create view invoice_balance as >>> select i.invoiceid, i.amount - sum(p.amount) >>> from payment p >>> join invoice i on i.invoiceid=p.invoiceid >>> group by i.invoiceid >> How is this different than a VBA routine tied to a predefined event in >> Access and stored by Access for reuse? > > If you show the VBA routine, I can tell the difference... The only difference is the syntax details. >> The VBA syntax will be very >> similar since it is designed (i.e., has specialized constructs) to be >> compatible with and easily converted to SQL. > > Are you saying that Visual Basic and SQL has similar syntax? If yes, > please show it to me. VBA has similar syntax to SQL for queries and table access. But you are pulling a Topmind and deflecting onto a tangent that has nothing to do with the thread. > >>> Using such view, you call "select balance from invoice_balance where >>> invoiceid=?", whereever in your application, reusing the business >>> logic above. >> I am missing the point. This seems like basic modularization that goes >> bas to function libraries under SA/D/P. Different development >> environments will have different mechanisms, but they all do the same >> thing. (The OOP-based agile processes argue that the primary goal of >> refactoring is to eliminate code duplication and ensure reuse.) > > Yes, you almost got it. Views corresponds to functions or classes. It > is all about modularization. But what is the point in this context? Almost all 3GLs have the ability to encode procedures and the ability to reuse those procedures. What has that got to do with CRUD/USER processing and RAD IDEs? -- There is nothing wrong with me that could not be cured by a capful of Drano. H. S. Lahman hsl(a)pathfindermda.com Pathfinder Solutions http://www.pathfindermda.com blog: http://pathfinderpeople.blogs.com/hslahman "Model-Based Translation: The Next Step in Agile Development". Email info(a)pathfindermda.com for your copy. Pathfinder is hiring: http://www.pathfindermda.com/about_us/careers_pos3.php. (888)OOA-PATH
From: H. S. Lahman on 21 Jan 2008 16:28 Responding to Frebe... >> It is trivial to develop a schema tailored to a >> specific application that will be unusable by another application >> solving a different problem and needing the same data. > > It is also trivial not to do. That is only true if one does NOT customize the schema to a particular problem. That's why RDBs should be constructed around the problem domain rather than specific problems. Whenever the schema is customized to a particular problem, one can usually find another problem in the problem domain where the schema is not usable. >> Do you have references to a definition of CRUD/USER processing? If we >>>>> don't have a working definition, we use of the term seem to be rather >>>>> pointless. >>>> Google it. You'll find plenty references to CRUD and USER acronyms. >>> I did. But didn't find any definition that could be useful for telling >>> if an application is CRUD or not. But obviously you would not have any >>> problem providing such definition. >> If the primary tasks of the application are the acronym activities, it >> is CRUD/USER. Order entry for a POS system is CRUD/USER. If the acronym >> activities are peripheral to the problem being solved, the application >> is not CRUD/USER. The forecasting, transportation, and other functions >> of an inventory control system are not CRUD/USER. > > Is this CRUD? > > select i.invoiceid > from invoice i join payment p on i.invoiceid=p.invoiceid > group by i.invoiceid > having sum(p.amount) < i.amount and datediff(now(), i,duedate) >= 10 If that's all the application is doing, yes. >>>>> How do you apply 2NF to your classes? >>>> The same as you do. Every non-identity responsibility must be a simple >>>> domain (1NF) and must be fully dependent on the object identity. >>> "each nonkey attribute in the relation must be functionally dependent >>> upon the primary key." >>> Object identity is not the same thing as a primary key. >> OTOH, a primary key is a tuple identity. They are semantically the same >> thing in the RDM context. > > The primary key is part of the tuple. Object identity (RAM pointer) is > not. Whether it is embedded in the tuple or not, it still represents unique identity for the tuple and that is that is what is semantically important to the RDM. Only when the RDM is applied to RDBs does one need embedded keys. -- There is nothing wrong with me that could not be cured by a capful of Drano. H. S. Lahman hsl(a)pathfindermda.com Pathfinder Solutions http://www.pathfindermda.com blog: http://pathfinderpeople.blogs.com/hslahman "Model-Based Translation: The Next Step in Agile Development". Email info(a)pathfindermda.com for your copy. Pathfinder is hiring: http://www.pathfindermda.com/about_us/careers_pos3.php. (888)OOA-PATH
From: H. S. Lahman on 21 Jan 2008 17:52
Responding to Frebe... >>> I solve particular problems using "the RDB query model" every day, and >>> I have not noticed it being "very inefficient". >> That's because you are either doing CRUD/USER applications or you >> haven't benchmarked your solutions... > > Once again, is the statemet below CRUD? > > select i.invoiceid > from invoice i join payment p on i.invoiceid=p.invoiceid > group by i.invoiceid > having sum(p.amount) < i.amount and datediff(now(), i,duedate) >= 10 > > This is the kind of code I write every. Even though the number of > invoices and payments are very high, the queries perform within a > number of millis. The customer is happy, I am happy. Once again, if that is all you are doing, that is CRUD/USER processing; you are just moving piles of data back and forth between the UI and the RDB. > Maybe you can show your code performing the same task. You want code for inventory forecasts? A Linear programs to allocate advertising budget to various markets' media? A simulation model of atmospheric diffusion? You are basically asking me to show you code for an entire application that solves some other problem that view conversion between UI and RDB. I don't think so. [Note that the smallest application I have worked on in the past four decades, inside IT and out, was 250 KLOC. Non-CRUD/USER applications are not the sort of thing one tucks into eMails.] >>>> That paradigm is fine for generic data storage and >>>> access but searching large sets sucks for algorithmic processing. >>> SQL databases sucks for searching large data sets, come on... >> You don't deny my assertion that I can perform the same O(log N) >> optimization in the implementation of a <reusable> collection class. > > Of course not. The difference is that you have to do it by yourself. I > can reuse existing tools instead. That is the main difference between > using a database and not using a database. And, as I have said at least twice, the price one pays for superior performance is hand crafting of the optimization. In CRUD/USER processing that is largely irrelevant because (a) data is accessed once by each solution and (b) seek time dominates the access performance issues in those situations. > >> But >> if OO relationships are instantiated at the object (tuple) level, then >> the N in OO searches will usually be much smaller than the N in table >> level searches. > > Since you qualify with "usually", this statement is pretty pointless. I don't agree. The 'usually' applies in the relatively rare situations where the requirements specifically require class/table level searches. Unless the application is just a pipeline between UI and RDB, that situation rarely arises. >> So one /must/ be able to achieve more efficient searches >> given OO's object-level instantiation. The price one pays for that >> efficiency is that the object-level instantiation has to be hand-crafted >> based on the particular problem context. > > Yes, I agree that by writing the corresponing code by your self, you > might get faster applications than you would if you had used a SQL > database. > > But "usually" development time, cost more than CPU time. And the only > thing you have showed is that databases has complexity O(log A) and a > tailored solution O(log B), there B < A. This is not a very big deal, > compared to the "extra tons of keystrokes", you would have to do. It /is/ a big deal when the same data aggregate is accessed many times. Besides, that was just one example chosen because the implications were so obvious. >> Why do you think that designers of large, complex applications spend so >> much effort trying to minimize persistence access, regardless of whether >> they are doing OO or not? They introduce elaborate caching schemes, >> deliberately present themselves with major data integrity problems, and >> whatnot because persistence access is almost always the performance >> bottleneck of such applications. > > Does > "insert into payment (invoiceid, date, amount) values (5, > '2008-01-19', 500)" > has anything to do with persitance? > > Caching is managed by the database. Caching is not the concern for the > application designer. Nonsense. Every non-CRUD/USER application I have ever seen had a client-side cache in some form. By the time the requests get to the DBMS, it is too late. The DBMS caches just manage server access and resources (unless you are talking about memory mapped OODBs). One caches on the client side to minimize DB access. BTW, note that in an OO application as soon as one instantiates objects in memory using attributes values from the DB, one is providing a sort of client-side cache. > > You are absolutely right that your solution introduces major data > integrity problem. That is what I have been trying to show. That is > why we should use a database, if possible. > > Persistence is problably the bottleneck for many kind of applications. > That is why databases using caching. In a well-tuned database 98% of > all calls, doesn't cause disk access. Persistence is /always/ the bottleneck when the solution needs to access the same data repeatedly. >> When solving complex problems the same data is quite often accessed and >> processed many times during the course of the solution. > > Is this the criteria for a "complex problem"? It is one criteria. It is a sufficient condition, though not a necessary condition. >> If one used the >> same accessing paradigm internally in the solution as the RDB uses the >> application would brought to its knees. > > There are scenarios, there mainstream SQL databases wouldn't perform > well. But the do excel in many application areas. The major problem > with current SQL databases, is the limited set of index types that is > used. B-trees is used as a one-size-fits-all solution. Obviously > better support for other index types is wanted. I can imagine > scenarios there foreign keys implemented as pointers would be a good > thing too. As a matter of fact, I think such databases exists, even if > I don't have time to find references to support this claim. SQL databases excel in CRUD/USER processing, which is what I keep saying. Conversely, they suck once one gets out of that realm. As an example, I play an MMORPG. The game employs an RDB, a star client/server model, and thin clients. That is a great architecture for airline reservations systems but it makes the MMORPG non-scalable. That's because the same data is constantly being accessed and updated by groups of clients involved in the same interactions (i.e., many times per second). The game provider is throwing hardware at the problem and doing handstands to balance loads but the lag just keeps getting worse as the number of players increased from 5K to 30K. I figure the game will simply die when the player count hits 50K because it will be unplayable. > >> So the cardinal rule of complex application >> development is to read the data once and write it once, no matter how >> many times it must be accessed in the solution. > > Read and write once from what, disk? Or RAM? The DBMS. Disk seek is the big problem but the table-level searches are still an important problem when the same data is accessed repeatedly in a single problem solution. (Or there are many possible relationships among the data or the data is constantly being updated.) It is also important to be able to convert identity into formats where one can use more efficient data structures, like arrays. (Note that does not necessarily mean pointers; consecutive integers work well too.) >>>>>> But the 'n' in >>>>>> O(log n) will usually be much smaller in the OO application because the >>>>>> collections are object-based rather than class-based. >>>>> Lets say you want to find all unpaid invoices. Why would the n be much >>>>> smaller in a OO solution? >>>> I said, "usually'. You are postulating a class-based search as a problem >>>> requirement. >>> You might think that my example is too extreme, but isn't it good to >>> use a method/tools that doesn't limit you to work on small amounts of >>> data? >> Who is limited to working on small amounts of data?!? > > You complained about my example, since it was a class-based search, > and not a search which could be solved with a limited number of > objects in a collection. So why don't show how you would find all > unpaind invoices? I've already point out that you are postulating a requirement that /requires/ a table level search. [I have also pointed out that even in that situation there are possible ways to avoid it, such as a collection dedicated to just unpaid invoices.] You are playing Topmind forensic games here: making a patently false assertion and then deflecting when called on it. >> Pointers are a red herring. They just provide a more efficient mechanism >> for object (tuple) identity than embedded identifiers. > > This in generally not true for data mangement. Pointer-based databases > hardly exists any more. Modern databases uses emedded identifiers. So what? As I have said repeatedly, RDBs provide generic data access and embedded identifiers are quite convenient for that since identity has been cast to data. The issue is solving specific problems efficiently in a computational environment where one has address manipulation, structures like arrays, and whatnot. > >> My main problem is with table-based searches. Table-based searches are a >> very general access mechanism that provides uniform access regardless of >> usage context (exactly what one wants in enterprise data storage). > > You seem to contine to claim, that SQL database doesn't perform well > enough. I think it is up to the reader to judge, if this is true or > not, considering all existing applications using SQL in many different > areas. That is not what I claim at all. Quite the contrary, I have said several times that RDBs provide an excellent paradigm for generic data storage and access. What I claim is that when solving specific problems in a computational environment like a computer there are more efficient paradigms for /manipulating/ data. > >> [The WHERE clause I used in an example is actually very rarely >> seen in OO applications; it is a kind of last resort when the >> requirements allow no more efficient alternative. And using a WHERE is >> very likely to get close scrutiny by OOA/D reviewers.] > > Since WHERE is not supported by OOPLs, this is obviously true. The first clause is not true. The OOPLs provide several mechanisms for implementing AAL WHERE clauses. If they didn't it would not be possible to provide a code generator for AAL. >>> Are producing invoices, "complex processing"? There are scenarios of >>> data processing that are not supported very well by existing index >>> types in current mainstream (SQL) databases. But the major part of all >>> processing done in common business applications, perform very well >>> using SQL databases, without having any custom data structures. What >>> about all COBOL applications out there with extremly simple data >>> structures, relying heavily on SQL statements to do the job. Doesn't >>> they perform well enough? >> In most situations producing invoices is pretty straight forward. Order >> + Lading Bill = Invoice. IOW, I would normally expect that to be >> CRUD/USER processing. That's why companies buy OTS Accounts Receivables >> packages rather than reinventing the wheel. > > I could continue to give you examples of common tasks from business > applications, and you would classify them as CRUD as well. I think it > is honest by you to make your disclaimer about OO solutions not being > appropriate in all scenarios. You are almost the only one at > comp.object, doing so. The problem is that knowing when something is > CRUD or not. You claim that 20% of business applications out there are > CRUD, I claim 80%. The point is that one can construct and Invoice entry from a Order entry and a Lading Bill entry by simply moving field values from one view to another. That sort of view conversion is quintessential CRUD/USER processing. That sort of view conversion was ubiquitous in core accounting packages that were the bulk of IT through the '60s. But those problems have all been solved and one can buy the solution OTS rather than reinventing the solutions. So most CRUD/USER processing today just supports ad hoc desktop analysis. The important IT problems today look a lot more like R-T/E or scientific programming as they struggle to deal with concurrency, cutting edge data warehousing algorithms, networking, catalog navigation, highly complex operations research algorithms, and a bunch of other problems that were not even feasible to solve in the '60s. As I indicated, 20% was a guess, but the proportion is certainly a whole lot smaller than it was in the '60s. > >>> My databases solves problems for my applications. Between >>> applications, data are exported and imported. >> So you live with the reuse problems and the need to keep revising >> schemas as new applications come online. > > The last decade, I have been working as an indepent software provider. > Usually I don't allow other applications to access databases for my > applications. Most software companies like SAP etc, doesn't allow > other applications to access their database either. Good work if you can get it. But where do you get the data in your database? From user data entry? Seems a pity existing data can't be used. >>>> When one creates a Data Model, it should be based on the overall problem >>>> domain rather than specific problems within the domain. If you do that, >>>> then the same schema should be reusable by all the core accounting >>>> applications. >>> One could also say that it only need to be one core accounting >>> application. The question is: How do you define the "overall problem", >>> and the "specific problem". You base almost all your argumentation on >>> such fuzzy terms, which makes everything derived from it rather fuzzy >>> too. >> I didn't say overall problem; my use of 'problem' was as an adjective. >> Problem domains are quite different things from problems. THe point was >> that Data models should be constructed based on how the overall business >> enviroment works, not how individual problems are solved. > > The words "overall" and "individual" are such fuzzy in this context, > that any argument based on them are rather pointless. Speaking of being pointless, we have been down this road several times before without resolution and I see no indication of it this time. We are just repeating ourselves. I think it is time to break it off. -- There is nothing wrong with me that could not be cured by a capful of Drano. H. S. Lahman hsl(a)pathfindermda.com Pathfinder Solutions http://www.pathfindermda.com blog: http://pathfinderpeople.blogs.com/hslahman "Model-Based Translation: The Next Step in Agile Development". Email info(a)pathfindermda.com for your copy. Pathfinder is hiring: http://www.pathfindermda.com/about_us/careers_pos3.php. (888)OOA-PATH |