From: Mikito Harakiri on

Hasta wrote:
> In article <1137447794.122780.151500(a)f14g2000cwb.googlegroups.com>,
> mikharakiri_nospaum(a)yahoo.com says...
> > In general, I agree with Bruce -- there is no objective measure for
> > program complexity. What is the complexity measure of
> > 100000110000000000
> >
>
> Well, there is an objective measure of the complexity of
> 100000110000000000. It's the length of the smallest
> program able to generate that string.
>
> Browse for chaitin-kolmogorov complexity/randomness.
> A fascinating subject :-)

This is exactly what I had in mind (although I wanted to emphasize the
Martin-Loef criterion of randomness). Therefore, what is the length of
that program in the earlier example?

From Wikipedia: "More formally, the complexity of a string is the
length of the string's shortest description in some fixed description
language. The sensitivity of complexity relative to the choice of
description language is discussed below."

Excuse me, but this is not a very practical suggestion. For finite
objects there is no mathematically sound way to establish that

100000110000000000

is more complex than

100000000000000000

Again, this is what the earlier message "no *objective* measure for
program complexity" was saying.
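
A computable stand-in people sometimes use is compressed size, which upper-bounds description length in one fixed (and arbitrary) description language. A rough Python sketch, with zlib chosen purely for illustration:

```python
import zlib

# Compressed size is a crude, computable upper bound on description length.
# The catch: zlib is just one arbitrary "description language". Another
# compressor could rank these two strings differently, which is exactly
# the objectivity problem for short finite strings.
for s in ("100000110000000000", "100000000000000000"):
    print(s, "->", len(zlib.compress(s.encode())), "bytes compressed")
```

For strings this short the compressor's fixed overhead dominates, so the comparison tells you essentially nothing, which rather supports the point.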

From: Oliver Wong on

"Mikito Harakiri" <mikharakiri_nospaum(a)yahoo.com> wrote in message
news:1137447794.122780.151500(a)f14g2000cwb.googlegroups.com...
> Patrick May wrote:
>> "topmind" <topmind(a)technologist.com> writes:
>> > Again, I have yet to see an objective or even semi-objective way to
>> > measure "complexity".
>>
>> I suggest you Google for "software complexity" and you'll find
>> several million links. Starting from a page like
>> http://yunus.hun.edu.tr/~sencer/complexity.html will give you pointers to
>> other research if you are genuinely interested in learning.
>
> Hmm. I tried to find "software complexity" in wikipedia and failed.
> Apparently this topic (and the link you supplied) is a typical example
> of junk science.
>
> In general, I agree with Bruce -- there is no objective measure for
> program complexity. What is the complexity measure of
> 100000110000000000
> for example?

    You might want to look up entropy (in the context of information, not
thermodynamics). If you like Wikipedia, see
http://en.wikipedia.org/wiki/Information_entropy
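
A quick sketch of what that entropy looks like for the string in question (Python, empirical symbol distribution, purely illustrative):

```python
from collections import Counter
from math import log2

def bit_entropy(s: str) -> float:
    """Shannon entropy, in bits per symbol, of the empirical distribution."""
    n = len(s)
    return -sum(c / n * log2(c / n) for c in Counter(s).values())

# "100000110000000000" has three 1s out of 18 symbols, so its empirical
# entropy is well below the 1 bit/symbol of a fair coin.
print(bit_entropy("100000110000000000"))
```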

But I suspect this has very little to do with "Software Complexity" as
Patrick May means it.

> Speaking of *finite* objects, it is a basic fact that one
> TM can model another, therefore, you have to [arbitrarily] choose some
> reference TM. It is an interesting [and nontrivial] fact that there is a
> way to establish complexity metrics for infinite objects, though.
>

    I'm not sure what the "basic" fact that one Turing Machine can model
another Turing Machine has to do with software complexity or information
entropy. It seems rather off-topic here.

What does it mean for an object to be "infinite" in this context, and
what does it mean for an infinite object to be "complex" in this context?

- Oliver


From: Mikito Harakiri on

Oliver Wong wrote:
> "Mikito Harakiri" <mikharakiri_nospaum(a)yahoo.com> wrote in message
> > In general, I agree with Bruce -- there is no objective measure for
> > program complexity. What is the complexity measure of
> > 100000110000000000
> > for example?
>
> You might want to lookup Entropy (in the context of information, not
> thermodynamics). If you like Wikipedia, see
> http://en.wikipedia.org/wiki/Information_entropy
>
> But I suspect this has very little to do with "Software Complexity" as
> Patrick May means it.

You have to realize that the whole idea of defining random sequences
arose from mathematicians being unsatisfied with the foundations of
probability theory. It was Kolmogorov who established probability
theory as merely applied measure theory. Unfortunately, this
development had very little practical impact, as it failed to define
what a random sequence is. Kolmogorov continued his quest, but failed to
give a satisfactory definition of a random sequence. It was Martin-Loef
who succeeded in completing Kolmogorov's program in 1966. Gregory
Chaitin then developed the theory further, emphasizing the computational
complexity side.

Entropy, while being just a statistical measure, suffers from the same
foundational problems that plagued probability theory. You can't define
a statistical measure on a sample consisting of just a single object.

Perhaps I should explain why the concept of a "random" sequence is
important in the context of this complexity discussion. It is
considered more challenging to generate a "random" sequence than a
"nonrandom" one, hence intuitively random sequences are more complex
than non-random ones. I believe Knuth has an introductory exposition of
random sequence generation in volume 2.
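
The connection runs both ways, incidentally: the generators Knuth analyzes there are deterministic, so their output looks statistically random while having tiny descriptive complexity. A sketch of a linear congruential generator of that kind (the constants are a common published choice, used here only as an example):

```python
# A linear congruential generator: x_{n+1} = (a*x_n + c) mod m.
# The entire infinite output stream is described by (a, c, m, seed)
# plus this short loop -- i.e., it has very low descriptive complexity
# despite passing simple statistical tests.
def lcg(seed, a=1664525, c=1013904223, m=2**32):
    x = seed
    while True:
        x = (a * x + c) % m
        yield x

g = lcg(seed=42)
bits = [next(g) % 2 for _ in range(18)]  # 18 pseudo-random-looking bits
print(bits)
```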

> > Speaking of *finite* objects, it is a basic fact that one
> > TM can model another, therefore, you have to [arbitrarily] choose some
> > reference TM. It is an interesting [and nontrivial] fact that there is a
> > way to establish complexity metrics for infinite objects, though.
> >
>
> I'm not sure what the "basic" fact that one Turing Machine can model
> another Turing Machine has to do with software complexity or information
> entropy. Seems rather offtopic here.

http://en.wikipedia.org/wiki/Kolmogorov_complexity

The first theorem in the basic results uses the concept of a universal
TM.
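
Roughly, that invariance theorem says the choice of reference machine shifts complexity by at most an additive constant (standard statement, paraphrased here):

```latex
% For any two universal machines U_1, U_2 there is a constant c
% (depending on the machines but not on x) such that, for all strings x:
\left| K_{U_1}(x) - K_{U_2}(x) \right| \le c
```

So for long strings the arbitrariness of the reference TM washes out; for short strings like the 18-symbol examples above, it can swamp the comparison entirely.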

> What does it mean for an object to be "infinite" in this context, and
> what does it mean for an infinite object to be "complex" in this context?

An infinite sequence of 0s and 1s versus a finite one. Again, the
concept of a random infinite sequence is quite counterintuitive. You can
prefix a random sequence with a million 1s, and it would still be a
random sequence. Once again, there is no way to define what a random
*finite* sequence is. In layman's terms, if you go to Vegas and the
roulette produces a sequence of 10 zeros in a row, you can suspect that
the roulette is defective, but there is no mathematical foundation that
would support your belief.
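
To put a number on that intuition (a side calculation, assuming a single-zero European wheel with 37 pockets):

```python
# Probability of ten zeros in a row on a fair 37-pocket wheel...
p_ten_zeros = (1 / 37) ** 10
# ...but ANY particular 10-spin sequence is exactly as unlikely, so
# improbability alone cannot brand one finite outcome "non-random".
p_some_other_fixed_sequence = (1 / 37) ** 10
print(p_ten_zeros, p_ten_zeros == p_some_other_fixed_sequence)
```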

From: topmind on

Hasta wrote:
> In article <1137447794.122780.151500(a)f14g2000cwb.googlegroups.com>,
> mikharakiri_nospaum(a)yahoo.com says...
> > In general, I agree with Bruce -- there is no objective measure for
> > program complexity. What is the complexity measure of
> > 100000110000000000
> >
>
> Well, there is an objective measure of the complexity of
> 100000110000000000. It's the length of the smallest
> program able to generate that string.

That is more or less our old friend the "lines of code" metric.

However, code-size metrics can be sticky to measure. For example,
shouldn't long lines be counted as more than short lines? Should long
names be penalized? Does a parenthesis count as much as a variable
name? If not, how much? 1/3? If we don't count it, then somebody could
write a parenthesis-oriented language and win it all. (Kind of like
that BrainF*ck language.)
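
One way to dodge some of those questions is to count tokens instead of lines, so a long dense line weighs more than a short one and a parenthesis gets a definite (if arbitrary) weight of one token. A Python sketch, using its own tokenizer purely as an example:

```python
import io
import tokenize

# Layout-only tokens that shouldn't count toward "size".
SKIP = {tokenize.NEWLINE, tokenize.NL, tokenize.INDENT,
        tokenize.DEDENT, tokenize.ENDMARKER, tokenize.COMMENT}

def token_count(src: str) -> int:
    """Count code tokens rather than raw lines of Python source."""
    toks = tokenize.generate_tokens(io.StringIO(src).readline)
    return sum(1 for t in toks if t.type not in SKIP)

# "x = (a + b) * c" is one line but nine tokens.
print(token_count("x = (a + b) * c\n"))
```

This still doesn't settle whether a parenthesis should count the same as a variable name, of course; it just makes the arbitrary choice explicit.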

Plus, it may be nearly impossible to prove that a given candidate is
the shortest *possible*. We can only compare what is presented.

However, I don't think P. May is interested in *any* kind of code-size
metric. But I have no idea what he has in mind as an alternative.

>
> Browse for chaitin-kolmogorov complexity/randomness.
> A fascinating subject :-)

-T-

From: topmind on
Patrick May wrote:
> "topmind" <topmind(a)technologist.com> writes:
> > > > > SQL does support abstraction in the sense that it abstracts
> > > > > the RDB implementation. However, here I was referring to
> > > > > problem space abstraction.
> > > >
> > > > This depends on how one defines "problem space abstraction".
> > >
> > > "Problem space abstraction" typically refers to a concept
> > > from the domain for which the software system is being developed,
> > > e.g. Customer, Sales Order, Network Element, Product, etc. This
> > > is distinguished from "solution space abstractions" such as
> > > tables, rows, columns, keys, pointers, functions, and so on. This
> > > isn't a point of contention among experienced software developers.
> >
> > But how are tables less close to the domain than classes, methods,
> > and attributes?
>
> The ability to model behavior as well as data makes general
> purpose languages better able to model the problem domain than is
> SQL.


If you design right, you can *shift* much behavior to being data and DB
operations instead. SQL is close to being Turing-complete. Thus, it
only needs minor help from procedural code to implement any algorithm.
(I don't propose we make it or other relational languages TC, by the
way.) Also, whether this is practical or not is another issue. Some
things are best done by the DB and some best done in app code. They are
two paradigms that fit together like Yin and Yang.

Plus, OO is usually crappy at modeling behavior, at least in the biz
domain. OO is only nice when things split up into nice hierarchical
taxonomies. Most things don't in reality, so what is left is a mess.

Things like the Visitor Pattern have objectively more repetition than
multi-dispatching in tables. This is not an opinion but a fact.
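
For what it's worth, a minimal sketch of the table-driven alternative (hypothetical shapes and operations, Python purely for illustration). A new (type, operation) combination is one table row, where the Visitor pattern would add an accept/visit method pair:

```python
import math

# Hypothetical example: operations on shapes dispatched through a table
# keyed on (kind, operation) instead of a Visitor class hierarchy.
def area_circle(s):     return math.pi * s["r"] ** 2
def area_square(s):     return s["side"] ** 2
def describe_circle(s): return f"circle of radius {s['r']}"

DISPATCH = {
    ("circle", "area"):     area_circle,
    ("square", "area"):     area_square,
    ("circle", "describe"): describe_circle,
    # a new (kind, op) combination is one more row, not a new class
}

def apply(shape, op):
    return DISPATCH[(shape["kind"], op)](shape)

print(apply({"kind": "circle", "r": 2.0}, "area"))
```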


>
> > (I agree that some of the behavior side is not something to be done
> > on the DB, but that is simply a partitioning of specialties issue.)
>
> No, it's a qualitative difference.

What is wrong with partitioning that way? I agree it makes a few things
harder, but most things simpler, because things that are good at data
are not good at behavior and vice versa. Attribute management via
relational is fairly clean and consistent. Attribute management via OO
is a goto-like spaghetti mess, with every designer doing it differently,
modeling imaginary sh8t in their diverse heads.

>
> > > > Repetative SET/GET syndrome is an example of this poor pattern
> > > > factoring.
> > >
> > > Proliferation of get/set methods is a code smell. Immutable
> > > objects are to be preferred.
> >
> > This is not the consensus in the OO community.
>
>     Yes, it is. Josh Bloch recommends immutability explicitly in
> "Effective Java" and gives solid reasons for his position.
> Proliferation of getters and setters violates encapsulation, one of
> the defining characteristics of object technology. Some research will
> show you that OO designs focus on behavior, not state. You should
> also check out the Law of Demeter and similar guidelines that provide
> further evidence that excessive use of accessors and mutators is not
> good OO form.

Almost all of these have a fair amount of disagreement among OO
proponents. Check out c2.com.

>
> > I would note that a lot of the issues you mentioned, such as
> > performance, scalability, resiliency, and recoverability can be
> > obtained by purchasing a "big-iron" RDBMS such as Oracle or DB2. The
> > configuration and management of those issues is then almost a
> > commodity skill and not as tied to the domain as a roll-your-own
> > solution would be (which OO'ers tend to do).
>
> It is statements like this that strongly suggest that you have
> never developed a large, complex system.


No, because I break them up into pieces so that they don't grow to be
one big fat EXE. The Big-EXE methodology has a high failure rate.


> The vast majority of
> businesses that need systems of this complexity have legacy software
> consisting of a number of COTS applications and custom components,
> none of which were designed to work with each other. These have been
> selected or developed for good business reasons and cannot be
> aggregated and run on a single piece of kit, no matter how large.

Agreed, but what does the existence of legacy apps have to do with your
scaling claim? Integration with legacy apps is just something we all
have to face.

>
> Even if it were possible to go down the mainframe route, in many
> cases it would not make business sense.

Mainframe route?

> Big iron is expensive to buy,
> maintain, and upgrade. Distributed systems running on relatively
> inexpensive hardware can provide a more cost-effective solution.


Centralized server shops use the same hardware as decentralized ones
these days. It is a matter of where the servers are put, not how big
the individual boxes are or even how many there are.


I did not mean to suggest that everything should be centralized. It
depends on the kind of business. If the vast majority of
operations/tasks needed are per location, then regional partitioning
works well. If not, then a more centralized approach is needed. For
example, airline reservation and scheduling systems would make a poor
candidate to partition by location because of the interconnectedness of
flights. However, individual stores in a big franchise can operate
mostly independently.


>
> > "Security" is mostly just massive ACL tables.
>
> That is profoundly . . . naive. I strongly urge you to read
> everything you can find by Bruce Schneier, join the cryptography
> mailing list run by Perry Metzger, and not say another word about
> security until you understand why your statement is so deeply
> embarrassing to you. For a quick, very small taste of why ACL tables
> don't even begin to scratch the surface of the problem, read
> http://www.isi.edu/gost/brian/security/kerberos.html.

I see no mention of ACLs there. If you have a specific case of ACLs
crashing and burning, post it here and let's take a look at it. (Note
that there are a lot of variations of ACLs, so a flaw in one kind is
not necessarily a general flaw in ACL concepts.)

Again, use evidence instead of patronizing insults. It is a bad habit
of yours.

>
> > > > You are participating in Domain Bigotry here.
> > >
> > > No, he is simply stating the obvious: CRUD/USER applications
> > > are not particularly complex, especially when compared with other
> > > software systems.
> >
> > Again, I have yet to see an objective or even semi-objective way to
> > measure "complexity".
>
> I suggest you Google for "software complexity" and you'll find
> several million links. Starting from a page like
> http://yunus.hun.edu.tr/~sencer/complexity.html will give you pointers to
> other research if you are genuinely interested in learning.

That site is down right now. They must be using Kerberos :-)

>
> > Again, the basic concepts of CRUD are easier to learn than most
> > other domains or problems, but that does not mean that the
> > implementation and maintenance of related apps is simple.
>
> CRUD applications are, however, not particularly complex as
> software systems go. Your claims otherwise indicate a lack of
> experience with anything else.

Again, please use evidence to prove me wrong instead of patronizing
insults. It is a bad habit of yours.

Propose a way to measure "complexity", and then apply it to CRUD apps.
That is how you make a point. Anecdotes and private personal opinions
mean very little here. They are a dime a dozen.

>
> > For lack of a better metric, I propose lines of code (LOC) for now
> > as our metric for complexity.
>
> This is yet another suggestion that shows you don't know much
> about the topic you're discussing. Lines of code is not a good
> measurement of anything. Do some research.

Present some evidence. It is not my job to make YOUR points for you. If
your evidence is "over there", then go "over there", grab it, and bring
it back.


>
> > Can you make an argument for something being lots of LOC without
> > being "complex" (beyond learning the domain)? If not, then you are
> > probably stuck using LOC here.
>
> Some of the largest programs I've seen, in terms of lines of
> code, are for generating reports. Getting just the right information
> from a large data set and laying it out precisely as requested can be
> time consuming, but it's not particularly challenging.

Well, these are rather non-specific words you are using. Perhaps such
tasks bore you, but boredom as a measure of complexity is fraught with
problems.

>
> Object technology is not immune. J2EE in general, and EJBs in
> particular, require a great deal of code to provide functionality that
> could be provided far more efficiently.
>
> On the other hand, there are some delightfully complex software
> systems that consist of only a few hundred lines of code. Functional
> languages seem especially good for this. See one of Peter Norvig's
> books for a few examples.

Most FP demonstrations of such are "toy" or "lab" examples. The
concepts that keep them simple in a lab environment don't seem to scale
to the real world where more factors and messy little conditions are
involved. I have asked for FP demonstrations of practical code
simplification, and FP fans could not do it. They flunked the
challenge. This is why FP has not made much headway into the practical
world. Outside of lab toys it is not demonstrably better.

However, FP is wandering off topic. Let's try to stay on track.

>
> Lines of code is a useless metric.

Limited perhaps, but not useless. Especially since we have *nothing
else* besides your boredom as a ruler so far.

Perhaps you are a computer genius, but you seem to be having trouble
turning your alleged genius into words.

If you are going to use "complexity" as a metric to make your case, you
better find ways to measure it objectively. Otherwise it is just an
opinionfest. Read up on the scientific process.

>
> Sincerely,
>
> Patrick
>

-T-
