From: Panu on 16 Oct 2007 03:23

H. S. Lahman wrote:
....
> When in doubt, think: encapsulation. The mechanisms for accessing stored
> data in flat files, an RDB, an OODB, or on clay tablets will be quite
> different. Do the applications that need to access stored data care
> about those details?

I find it interesting ... wouldn't it be better to encapsulate further, and only access 'objects' instead of flat data which needs to be parsed somehow according to some agreed-upon syntax?

If yes, it would seem that object-databases are the better solution. In practice they are not around much these days. But if we say "relational data better", then this would seem to be against the advice of encapsulation, no?

I'm truly interested in this question - not trying to argue either way:

*If encapsulation is good, why are relational databases so prevalent?*

-Panu Viljamaa
From: H. S. Lahman on 16 Oct 2007 12:36

Responding to Panu...

>> When in doubt, think: encapsulation. The mechanisms for accessing
>> stored data in flat files, an RDB, an OODB, or on clay tablets will be
>> quite different. Do the applications that need to access stored data
>> care about those details?
>
> I find it interesting ... wouldn't it be
> better to encapsulate further, and only
> access 'objects' instead of flat data which
> needs to be parsed somehow according to
> some agreed-upon syntax?

First, let me qualify that I am not talking about CRUD/USER processing where the primary problem being solved by an application is conversion between the DB and UI views of data.

Databases of any flavor exist to provide storage of data that is independent of how the data are used. IOW, the database provides persistence in a fashion that is reusable across broad classes of data usage. It also provides optimum efficiency of access when usage is arbitrary. To do that it must provide standardized access mechanisms like SQL. But the database's access mechanisms are optimized for the particular database storage paradigm. Thus SQL isn't very useful for accessing an OODB.

OTOH, applications solve very specific and unique problems for the customer. To do that efficiently the application often needs a customized view of that data that is different than the view in a particular persistence paradigm. In addition, the problem the application is solving really doesn't care which particular storage paradigm is used for storage. So the application needs to encapsulate the persistence paradigm and provide an interface to it that suits its specific solution needs.

Thus the application is going to abstract data objects that it needs to store/recover just as you suggest. But those data objects will be tailored to the particular application problem context. Similarly, the database is going to deal with data objects that are tailored to its particular paradigm.
So a mapping is needed to the storage view du jour. Thus it really doesn't matter what the storage paradigm is; a mapping still needs to be provided between the application and storage views.

As an obvious example, consider how relationships are managed in an OO application vs. an RDB. In the RDB they are instantiated at the table (class) level and one needs explicit embedded identity in the tuple. In an OO application relationships are instantiated at the object (tuple) level rather than the class level and identity is usually implicit in a memory address. Thus one /constructs/ objects and their relationships differently in an OO application than in an RDB. The result is that query-based processing and joins are very rare in OO applications.

> If yes, it would seem that object-databases
> are the better solution. In practice they
> are not around much these days. But if we
> say "relational data better", then this would
> seem to be against the advice of encapsulation,
> no?

One needs a different accessing paradigm for OODBs. That's because OODBs are optimized to deal with data where there are many complex relationships among data elements and those relationships need to be instantiated at the object level rather than the class (table) level. [It is no accident that the memory-mapped OODBs provide a literal mapping to the <OO> application structure. The price of that literal mapping is that the OODB access is ubiquitous in the code so that one cannot switch OODB vendors without massive shotgun refactoring.]

The real issue is that the database -- RDB or OODB -- provides an interface for generic data access. But that interface is necessarily optimized around the particular storage paradigm. Thus the database provides a quite abstract access mechanism in its interface, but that mechanism is limited to the particular storage paradigm (however common it may be).
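The contrast drawn above between relationships in an RDB and in an OO application can be made concrete with a small sketch. It is only an illustration under assumed names: the `customer` and `orders` tables and the `Customer` and `Order` classes are hypothetical and do not come from the thread. The RDB half carries identity explicitly in each tuple and navigates by a join; the OO half links individual objects by reference and navigates by following those references.

```python
import sqlite3
from dataclasses import dataclass, field

# RDB view: the relationship is defined once, at the table level, and every
# row carries explicit identity (customer_id) embedded in the tuple.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (
        id INTEGER PRIMARY KEY,
        customer_id INTEGER REFERENCES customer(id),
        item TEXT
    );
    INSERT INTO customer VALUES (1, 'Ada');
    INSERT INTO orders VALUES (10, 1, 'lamp'), (11, 1, 'desk');
""")

# Navigating the relationship means joining on the embedded keys.
rdb_items = [
    row[0]
    for row in conn.execute(
        "SELECT o.item FROM orders o "
        "JOIN customer c ON o.customer_id = c.id "
        "WHERE c.name = ? ORDER BY o.id",
        ("Ada",),
    )
]


# OO view: the relationship is instantiated per object, and identity is
# implicit in the object reference rather than stored as a key.
@dataclass
class Order:
    item: str


@dataclass
class Customer:
    name: str
    orders: list = field(default_factory=list)


ada = Customer("Ada")
ada.orders.append(Order("lamp"))
ada.orders.append(Order("desk"))

# Navigation follows references; no query or join is involved.
oo_items = [order.item for order in ada.orders]
```

Both halves end up with the same two items; what differs is how the link between a customer and its orders is constructed and traversed, which is the mismatch the mapping layer has to absorb.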
Think of it this way:

+-------------+                    +-----------------+
| Application |                    |    Database     |
|         +------+              +-------+            |
|         | Iin  |<-------------| Iout  |            |
|         +------+              +-------+            |
|             |                    |                 |
|         +------+              +-------+            |
|         | Iout |------------->| Iin   |            |
|         +------+              +-------+            |
|             |                    |                 |
+-------------+                    +-----------------+

The application has an input interface, Iin, that the rest of the world uses to talk to it. Similarly, the database has an input interface, Iin, that the rest of the world talks to, such as a SQL driver interface.

However, because the application doesn't care about specific persistence paradigms and has its own unique view of persistence, internally it talks to its own output interface when it needs to communicate with persistence. Similarly, the database will have its own internal view of data that it needs to convert to the more generic view the rest of the world wants to see (e.g., datasets), so it internally talks to an output interface, Iout. [That is trivial for an RDB since Iin is query-based and synchronous; in effect there is no Iout interface. But it can raise all sorts of interesting issues for virtual memory in a memory-mapped OODB.]

What the Iin and Iout interfaces provide is decoupling. They allow the implementations of the application and database engine to be completely independent of their context. The interfaces presented by Iin and Iout are always fixed. The "glue" that resolves syntactic mismatches between paradigms resides in the implementation of the Iout interface. It provides the view conversion from its interface to the relevant Iin interface of the service. (For a complex application Iout will tend to be encapsulated in a substitutable subsystem that is reusable across applications, depending on the persistence paradigm currently in favor.)

Bottom line: Iin/Iout are at different levels of abstraction and serve different masters for the application and the database.
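One way to picture the application side of the Iin/Iout arrangement is the sketch below. It is a minimal illustration, not anything proposed in the thread: `Reading`, `ReadingStore`, `SqlReadingStore`, `InMemoryReadingStore` and `record_and_report` are all hypothetical names. `ReadingStore` plays the role of the application's Iout, and each implementation holds the "glue" that maps that fixed interface onto the Iin of one particular storage service (SQL via Python's `sqlite3` in one case, a plain dict in the other).

```python
import sqlite3
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import Optional


@dataclass
class Reading:
    """Hypothetical application-level data object."""
    sensor: str
    value: float


class ReadingStore(ABC):
    """The application's Iout: persistence phrased in the application's terms."""

    @abstractmethod
    def save(self, reading: Reading) -> None: ...

    @abstractmethod
    def latest(self, sensor: str) -> Optional[Reading]: ...


class SqlReadingStore(ReadingStore):
    """Glue mapping Iout onto an RDB's Iin (SQL statements)."""

    def __init__(self, conn: sqlite3.Connection) -> None:
        self._conn = conn
        conn.execute(
            "CREATE TABLE IF NOT EXISTS reading ("
            "id INTEGER PRIMARY KEY AUTOINCREMENT, sensor TEXT, value REAL)"
        )

    def save(self, reading: Reading) -> None:
        self._conn.execute(
            "INSERT INTO reading (sensor, value) VALUES (?, ?)",
            (reading.sensor, reading.value),
        )

    def latest(self, sensor: str) -> Optional[Reading]:
        row = self._conn.execute(
            "SELECT sensor, value FROM reading "
            "WHERE sensor = ? ORDER BY id DESC LIMIT 1",
            (sensor,),
        ).fetchone()
        return Reading(*row) if row else None


class InMemoryReadingStore(ReadingStore):
    """The same Iout mapped onto a plain dict instead of a database."""

    def __init__(self) -> None:
        self._by_sensor: dict = {}

    def save(self, reading: Reading) -> None:
        self._by_sensor.setdefault(reading.sensor, []).append(reading)

    def latest(self, sensor: str) -> Optional[Reading]:
        readings = self._by_sensor.get(sensor)
        return readings[-1] if readings else None


def record_and_report(store: ReadingStore) -> float:
    """Application logic: it sees only Iout, never SQL or dicts."""
    store.save(Reading("t1", 20.5))
    store.save(Reading("t1", 21.0))
    return store.latest("t1").value
```

Calling `record_and_report(SqlReadingStore(sqlite3.connect(":memory:")))` and `record_and_report(InMemoryReadingStore())` exercises the same application code against two storage paradigms; only the glue class changes.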
> I'm truly interested in this question - not
> trying to argue either way:
>
> *If encapsulation is good, why are relational
> databases so prevalent?*

But RDBs /are/ encapsulated. That is exactly what SQL provides; a quite abstract interface to a particular storage paradigm. That allows RDBs to be plug & play across applications. It is just at a different level of abstraction than the application problem solution's view.

The answer to this question is more about /how/ the data is structured and used in the intended market. RDBs are ideally suited to read-many/write-few contexts where relationships among data are relatively simple. CRUD/USER processing is the quintessential example of where the RDB paradigm shines and it absolutely dominated IT through the '70s when RDBs came on the scene. Those criteria still dominate IT because the data is looked at a whole lot more than it is updated and relationships just aren't that complicated in IT. Since IT still dominates the software market, it is not a surprise that RDBs prevail.

[There is a chicken-and-egg issue. Are IT relationships simple because the IT world is coerced by the presence of RDBs left over from when CRUD/USER processing ruled? FWIW, I think so. I think it is because the IT problem domain naturally works that way because it is the easiest way to manage processes where a lot of people are involved. IOW, KISS rules for business processes.]

OODBs are ideally suited to situations where relationships tend to be quite complex at the tuple level and, to a lesser extent, where data is constantly being updated (more precisely, where data changes need to be synchronized among multiple clients simultaneously). Those sorts of problems are much less common. In fact, the only obvious examples that come quickly to mind are mapping software (e.g., MapQuest) and MMORPGs.

*************
There is nothing wrong with me that could
not be cured by a capful of Drano.

H. S. Lahman
hsl(a)pathfindermda.com
Pathfinder Solutions
http://www.pathfindermda.com
blog: http://pathfinderpeople.blogs.com/hslahman
"Model-Based Translation: The Next Step in Agile Development". Email
info(a)pathfindermda.com for your copy.
Pathfinder is hiring:
http://www.pathfindermda.com/about_us/careers_pos3.php.
(888)OOA-PATH
From: topmind on 16 Oct 2007 20:24

H. S. Lahman wrote:
> Responding to Panu...
[snip]
> OTOH, applications solve very specific and unique problems for the
> customer. To do that efficiently the application often needs a
> customized view of that data that is different than the view in a
> particular persistence paradigm.

Relational is perfectly capable of handling "local" app-specific views and/or copies of data. I will agree that the current crop of tools often does not make that very easy though. But I used to do it all the time back in the days before OOP hype killed the market for nimble table-oriented tools. Plus, on a small app or task level, OOP is usually overkill anyhow.

(In a nearby message I already griped about calling RDBMS mere "storage mechanisms", so I won't repeat that here.)

> But RDBs /are/ encapsulated. That is exactly what SQL provides; a quite
> abstract interface to a particular storage paradigm. That allows RDBs to
> be plug & play across applications. It is just at a different level of
> abstraction than the application problem solution's view.

I would not call it a "different level of abstraction". SQL is very high-level, in some ways even more high-level than OOP. Many OO proponents want to hide it away because they either don't like it or don't want to bother to learn it, and fall back on a 1960s-style pointer-by-pointer approach instead.

> The answer to this question is more about /how/ the data is structured
> and used in the intended market. RDBs are ideally suited to
> read-many/write-few contexts where relationships among data are
> relatively simple.

I've seen a lot of RDBMS where the relationships were far from simple. (True, they probably could have been cleaned up a bit, but the businesses were inherently non-trivial.)
> OODBs are ideally suited to situations where relationships tend to be
> quite complex at the tuple level and, to a lesser extent, where data is
> constantly being updated (more precisely, where data changes need to be
> synchronized among multiple clients simultaneously). Those sorts of
> problems are much less common. In fact, the only obvious examples that
> come quickly to mind are mapping software (e.g., MapQuest) and MMORPGs.

GIS (mapping) applications often use RDBMS. ESRI comes to mind.

> H. S. Lahman

-T-
From: Phlip on 17 Oct 2007 17:39
Panu wrote:

> *If encapsulation is good, why are relational
> databases so prevalent?*

Because one good way to process tables of data is with declarative statements. SQL typically allows you to declare the results you want, encapsulating their mechanism.

The DSLs (Domain Specific Languages) that wrap SQL are often also declarative. And they are still encapsulating, and still based in an OO language.

--
Phlip
http://www.oreilly.com/catalog/9780596510657/
^ assert_xpath
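The point about declaring results rather than mechanism can be shown with a short sketch using Python's standard `sqlite3` module. The `sale` table and its rows are made up for illustration and are not from the thread. The first query states only the result wanted and leaves the grouping and scanning strategy to the engine; the second computes the same totals by hand, spelling out the mechanism in application code.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sale (region TEXT, amount INTEGER);
    INSERT INTO sale VALUES ('north', 10), ('south', 5), ('north', 7);
""")

# Declarative: describe the result; how rows are scanned, grouped and
# summed is hidden behind the SQL interface.
declared = conn.execute(
    "SELECT region, SUM(amount) FROM sale GROUP BY region ORDER BY region"
).fetchall()

# Imperative: the application walks the rows and does the grouping itself.
totals = {}
for region, amount in conn.execute("SELECT region, amount FROM sale"):
    totals[region] = totals.get(region, 0) + amount
by_hand = sorted(totals.items())
```

Both `declared` and `by_hand` hold the same per-region totals. The declarative form is the one where the mechanism stays encapsulated, which is the sense in which SQL itself already hides implementation details.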