From: Michael J. Mahon on
mdj wrote:
> Michael J. Mahon wrote:

>>BTW, I independently invented FOR i OVER <subrange type> DO ... OD
>>prior to ADA, so I know there are ways to simplify things--I just don't
>>think it's likely that anyone will be able to eliminate the possibility
>>of errors!
>
>
> these days Java also has the rather cute for(reference:collection) {}
> statement as well, which is very nice, though it lacks the ability to
> handle subtypes (grumble)

I had FOR <index> OVER <subrange or subrange type>...
and FOR <index> IN <set value>... [Pascal-like sets].
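
(For anyone who wants to play with the idea in a current language, here
is a rough C approximation of those two loop forms -- the enum, the
bit-mask "set", and all the names are made up purely for illustration.)

#include <stdio.h>

/* Rough C approximations of FOR..OVER a subrange and FOR..IN a set.
   The enum, the bit-mask "set", and all names are illustrative only. */

enum color { RED, ORANGE, YELLOW, GREEN, BLUE, VIOLET };

#define SINGLETON(c) (1u << (c))          /* set containing only c */
#define MEMBER(c, s) (((s) >> (c)) & 1u)  /* is c IN s ?           */

int main(void)
{
    /* FOR i OVER ORANGE..GREEN DO ... OD */
    for (enum color i = ORANGE; i <= GREEN; i++)
        printf("over subrange: %d\n", i);

    /* FOR i IN {RED, BLUE} DO ... OD -- a Pascal-like set as a bit mask */
    unsigned set = SINGLETON(RED) | SINGLETON(BLUE);
    for (enum color i = RED; i <= VIOLET; i++)
        if (MEMBER(i, set))
            printf("in set: %d\n", i);

    return 0;
}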

>>Overwriting data is a classic problem, and is usually pretty easily
>>found if it's literally a bad conditional or an off-by-1 error, since
>>the error is always misbehaving.
>>
>>(Overwriting code is another matter, but code pages should be protected
>>from self-modification except in unusual cases.)
>
>
> A big problem is how to avoid execution of code within data pages, when
> it's been put there by a malicious perpetrator exploiting a flaw. Code
> overwrite is easy: lock the code pages against writes. Can't easily
> lock the data pages, though :-)

Nonsense. Virtually all architectures today have page attributes that
specify whether anything on a page can be *executed*, as well as whether
it can be written. Some architectures provide for these attributes
to be dependent on the current privilege level of the processor.

Most data pages are marked "writeable, not executable" and most code
pages are marked "non writeable, executable" (at least if the processor
is running at user process privilege level).

Any violation results in an immediate hardware trap.
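
A minimal POSIX sketch of the same thing, as seen from user code -- the
mmap/mprotect calls are the portable face of those page attributes; the
RET opcode and the sizes are just illustrative:

/* Map a page writeable but not executable, so any attempt to execute
   from it takes an immediate hardware trap (SIGSEGV on most systems). */
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

int main(void)
{
    long pagesize = sysconf(_SC_PAGESIZE);

    /* "writeable, not executable" -- a typical data page */
    unsigned char *data = mmap(NULL, pagesize, PROT_READ | PROT_WRITE,
                               MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (data == MAP_FAILED) { perror("mmap"); return 1; }

    memset(data, 0xC3, pagesize);          /* fill with x86 RET opcodes */

    /* ((void (*)(void))data)();              executing this would trap */

    /* "non-writeable, executable" -- how a code page is typically marked */
    if (mprotect(data, pagesize, PROT_READ | PROT_EXEC) != 0)
        perror("mprotect");

    /* data[0] = 0;                           writing now would trap    */

    munmap(data, pagesize);
    return 0;
}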


>>You should take some solace from the fact that probably the most
>>frequently used "higher level language" is Excel, and it is completely
>>pointer safe. ;-) The only mistakes you can make in Excel are the
>>ones that will bankrupt your company! (Get my point?)
>
>
> No it isn't! Excel spreadsheets routinely refer to a cell that hasn't
> been defined. This brings a whole new meaning to segmentation
> violation. Don't get me started about the number of "databases" that
> exist in spreadsheets :-)
>
> There's no restriction whatsoever in Excel code over what part of the
> "address space" you reference in an expression. You can even reference
> a sheet that doesn't exist (a genuine "bus error" :) )

But remember, all cells are implicitly given a value of zero. And
there are no addresses exposed for people to do arithmetic on. ;-)

All the errors that you can make in Excel are just like programming
errors that you can make in any language. For example, what happens
if you store all your data in an array? Are individual elements
maintained as "initialized" or "uninitialized", or are they all just
set to some (hopefully) benign value?
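
In C, for instance, the answer depends entirely on where the array
lives. A small illustration, with made-up names:

/* Static and calloc'd arrays start at a benign zero; automatic and
   malloc'd arrays start as whatever garbage happened to be there. */
#include <stdio.h>
#include <stdlib.h>

static double ledger_static[8];          /* zero-initialized by the language */

int main(void)
{
    double ledger_auto[8];               /* indeterminate until written      */
    double *ledger_heap   = malloc(8 * sizeof *ledger_heap);   /* indeterminate */
    double *ledger_zeroed = calloc(8, sizeof *ledger_zeroed);  /* all zeros     */

    printf("static[0] = %g, calloc[0] = %g\n",
           ledger_static[0], ledger_zeroed[0]);

    /* Reading ledger_auto[0] or ledger_heap[0] here would be undefined
       behavior -- nothing marks them as "uninitialized". */
    (void)ledger_auto;

    free(ledger_heap);
    free(ledger_zeroed);
    return 0;
}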

My point was that there are more direct ways that human error can
lead to the failure of a business (or what-have-you) than can be
protected against in a programming language!


>>Hmmm. I expect that a letter to the editor of any trade rag would
>>reach a much larger audience...
>
>
> I think they gave up publishing such things years ago, unfortunately.

Then what do all those legions of programmers read? Or do they all
have so little influence over the spending of money that no one cares
to advertise to them?


>>I've generally been a bottom-up thinker (which I guess is like being
>>a geometer vs. an arithmetician ;-), so I've often approached questions
>>of performance with a blunt instrument.
>
>
> I'm fairly similar in this regard - many, it seems, approach it by
> guesswork: look over source code, spot the 'slow' bit, fix it, get a 2% gain,
> repeat ;-)
>
>
>>For example, after implementing a compiler and getting it mostly
>>working, I became curious about whether it had any significant hot
>>spots that I could squash to get significant performance improvements.
>>
>>I went to the console of the (mainframe) machine as it was compiling
>>itself, and manually hit the stop key a couple of dozen times, each
>>time writing down the program counter address.
>>
>>If a part of the program was using more than 25% of the time, I should
>>have seen its PC value represented in my sample about 6 times, plus or
>>minus the square root of 6, or between 4 and 8 times. If no range
>>of addresses was sampled that often, then there was no procedure in
>>the compiler taking at least 25% of the time.
>>
>>I found a spot that was taking about a third of the time, and another
>>taking about a fifth, so I looked at them and sped them up by about
>>a factor of 4 or 5 each.
>>
>>Less than an hour after taking my PC samples, the compiler was running
>>almost twice as fast.
>>
>>Not rocket science, but thinking across several levels of abstraction--
>>and all I needed was a piece of paper, a pencil, and a load map for the
>>compiler!
>>
>>(BTW, this method can be applied to *many* performance issues, but it
>>helped that wall clock time was identical to process time in this case.)
>
>
> Most development environments today provide profiling tools that make
> it simple to spot such hotspots. The tricky areas now are usually where
> programs are I/O bound.

Clock time includes I/O waits. In fact, the 20% routine was a codefile
output routine that originally had only one buffer. I added three page
buffers for it to manage LRU and it sped up by a factor of ten.
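
The buffering trick itself is only a page of code. Here is a hedged C
sketch of the idea -- the page size, buffer count, and file name are all
just illustrative:

/* A few in-memory page buffers in front of a file, with the least-
   recently-used buffer flushed and recycled on a miss. */
#include <stdio.h>
#include <string.h>

#define PAGE_SIZE 512
#define NBUF      3

struct pagebuf {
    long          page_no;      /* which file page, -1 if empty */
    long          stamp;        /* last-use time, for LRU       */
    int           dirty;
    unsigned char data[PAGE_SIZE];
};

static struct pagebuf bufs[NBUF];
static long ticks;

static void flush_buf(FILE *f, struct pagebuf *b)
{
    if (b->page_no >= 0 && b->dirty) {
        fseek(f, b->page_no * (long)PAGE_SIZE, SEEK_SET);
        fwrite(b->data, 1, PAGE_SIZE, f);
        b->dirty = 0;
    }
}

/* Return a buffer holding file page page_no: a hit costs no I/O,
   a miss evicts the least recently used buffer. */
static struct pagebuf *get_page(FILE *f, long page_no)
{
    struct pagebuf *lru = &bufs[0];
    for (int i = 0; i < NBUF; i++) {
        if (bufs[i].page_no == page_no) {
            bufs[i].stamp = ++ticks;
            return &bufs[i];
        }
        if (bufs[i].stamp < lru->stamp)
            lru = &bufs[i];
    }
    flush_buf(f, lru);
    lru->page_no = page_no;
    lru->stamp   = ++ticks;
    lru->dirty   = 0;
    memset(lru->data, 0, PAGE_SIZE);
    fseek(f, page_no * (long)PAGE_SIZE, SEEK_SET);
    fread(lru->data, 1, PAGE_SIZE, f);   /* page may not exist on disk yet */
    return lru;
}

int main(void)
{
    FILE *f = fopen("codefile.out", "w+b");
    if (!f) return 1;
    for (int i = 0; i < NBUF; i++) bufs[i].page_no = -1;

    /* Writes that revisit recent pages now land in memory, not on disk. */
    for (long n = 0; n < 1000; n++) {
        struct pagebuf *b = get_page(f, n % 4);
        b->data[n % PAGE_SIZE] = (unsigned char)n;
        b->dirty = 1;
    }
    for (int i = 0; i < NBUF; i++) flush_buf(f, &bufs[i]);
    fclose(f);
    return 0;
}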


> This is why I advocate tools and languages that allow you to observe
> complexity. Nobody can understand complex systems in their entirety.

I'd like to see some of those tools... Tools that span abstraction
levels are quite interesting, and mind-blowing for many.

>>>> >>I only note that there are successful patterns which are *not* made
>>>>
>>>>
>>>>>>of OO concepts, as well. Perhaps some of the most useful ones...
>>>>>
>>>>>
>>>>>Indeed many of the patterns are also applicable to non-OO designs. OO
>>>>>at the end of the day is just another level of abstraction above
>>>>>structures and functions, allowing convenient pairing of related
>>>>>structures and functions, resulting in more manageable code. Going back
>>>>>to languages that lack it is very 'constraining' :-)
>>>>
>>>>I have always maintained this pairing, even in assembly language.
>>>>
>>>>It's nice when a language supports something you know is a good idea,
>>>>but you can generally do it even without explicit support.
>>>
>>>
>>>True. It's very hard though when you drop down 2 or 3 levels of
>>>abstraction to recreate them well, as you're ultimately implementing
>>>poor man's tools as you do so.
>>
>>I have not often found it so.
>>
>>Most often, I find that I am exploiting a pattern of behavior that
>>spans multiple levels of abstraction, and often couples code and data
>>that are surprisingly distant in conceptual space.
>>
>>In most cases like this, no one looking at the high level code would
>>ever have imagined that there was any such relationship to be exploited.
>>
>>The world of behavior is much stranger and more wonderful than our
>>abstract models suggest--that's what makes them "abstract".
>>
>>There is enlightenment here, for those who would find it.
>
>
> The interesting elements of behavior only manifest when it's possible
> to observe them, and observations of complex system behaviors are
> difficult with current toolsets.

Observations of system behaviors, where interesting behaviors span time
scales of picoseconds to hours, are extremely difficult. I have not
seen any "great" tools, nor even any effort to create them.

What I do see are little tiny steps, disconnected from one another,
and generally used under protest by people who don't see the value.

It is evident that the folks building new levels of abstraction have
never felt any need to construct stairways to connect up with the
lower level(s).

If it were easy to visualize the behavior of a program, then programmers
would not be so universally in the dark about how their programs
actually behave!

Have things improved greatly in the last few years? When I made
profiling tools available, I found them virtually unused! Apparently
programmers felt they already understood what their code was doing,
so measuring it would be a waste of time (or a nasty surprise ;-),
so there was no motivation to do it.
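
And the barrier to entry is low. The PC-sampling method I described
above can be automated in a few lines; here is a minimal POSIX sketch,
where invented "phases" stand in for the interrupted PC addresses you
would record on a real target:

/* A SIGPROF timer fires every 10 ms of CPU time and tallies which
   phase of the program it interrupted.  The phases and the busy-work
   are made up; a real profiler would record the interrupted address. */
#include <signal.h>
#include <stdio.h>
#include <string.h>
#include <sys/time.h>

enum phase { PARSING, CODEGEN, OUTPUT, NPHASES };

static volatile sig_atomic_t current_phase;
static volatile long samples[NPHASES];

static void on_tick(int sig)
{
    (void)sig;
    samples[current_phase]++;        /* one "stop and look at the PC" */
}

static void burn(long n)             /* stand-in for real work */
{
    for (volatile long i = 0; i < n; i++)
        ;
}

int main(void)
{
    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_tick;
    sigemptyset(&sa.sa_mask);
    sigaction(SIGPROF, &sa, NULL);

    struct itimerval it = { { 0, 10000 }, { 0, 10000 } };  /* 10 ms ticks */
    setitimer(ITIMER_PROF, &it, NULL);

    current_phase = PARSING;  burn(100000000L);
    current_phase = CODEGEN;  burn(300000000L);   /* the "hot spot" */
    current_phase = OUTPUT;   burn(100000000L);

    for (int p = 0; p < NPHASES; p++)
        printf("phase %d: %ld samples\n", p, (long)samples[p]);
    return 0;
}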


>>Glad to hear it. When I last looked seriously (2000), I was arguing
>>that the CLR should try to do this *very* well.
>
>
> There's a lot of incremental improvement going on in this field.
> Building the optimisers is taking a long time, but the ones we have now
> are already very impressive, and very hard to beat in any reasonable
> time frame 'by hand'. I imagine it won't be too long before they're
> impossible to beat for any relatively complex program.
>
>
>>>Indeed. There seems to be some disagreement about what must be
>>>sacrificed to achieve these levels of efficiency though.
>>
>>Actually, almost *nothing* needs to be sacrificed if the optimization
>>is based on run-time truth.
>>
>>All you have to do is generate a predicate that ensures that the
>>preconditions you have measured continue to exist prior to executing
>>the optimized code. If that predicate fails, then it's time to fall
>>back to the original code and do some more measurement to see if a
>>new, perhaps more general, behavioral pattern has emerged.
>>
>>Compiler technology is based on theorem proving in a fairly weak
>>axiom system. You want to strengthen the axioms to allow more
>>theorems to be proved (optimizations to be done) statically. But
>>it is not necessary to limit yourself to static knowledge, as you
>>know. You can replace "proof" with "probable inference" on the
>>basis of observed behavior, and then generate code to verify that
>>this inference is still valid--or escape to the old "interpretive"
>>approach if it is not. This is a much more robust system design
>>than one based on static proof rules, and, in fact, it largely
>>obsoletes the static approaches.
>>
>>Of course, separate compilation of applications as *hundreds* of
>>separate modules already almost totally invalidates any hope of
>>strong axioms holding across all modules. Better to just rely
>>on dynamic inference anyway.
>
>
> This is precisely why I argue so strongly against using C/C++, you lose
> any ability to do this kind of optimisation when you have no runtime,
> and no runtime typing.

No, you can "observe" machine code, too! It is pretty easy, and
languages like C and C++ can gain tremendously from it.

All you need to do adaptive optimization is a program and its
behavior, and they *all* have lots of that. ;-)

You don't need any *algebra* of types, since the only types that
matter are the machine data types. It is only at the level of
actual machine code and actual machine addresses that it is
possible to observe "practical" reference disambiguation and
optimize for it, while putting in a guard against its failure.
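
The shape of that generated code is easy to picture. A hedged C sketch,
with all names invented: a cheap guard verifies the "no aliasing"
inference, the specialized path runs if it holds, and the original
general code remains as the fallback.

#include <stddef.h>

/* Original, fully general code: must assume dst and src may overlap. */
static void scale_general(double *dst, const double *src, double k, size_t n)
{
    for (size_t i = 0; i < n; i++)
        dst[i] = src[i] * k;
}

void scale(double *dst, const double *src, double k, size_t n)
{
    /* Guard: check the "references don't overlap" precondition that was
       observed at runtime.  (The usual practical overlap test.) */
    if (dst + n <= src || src + n <= dst) {
        /* Specialized path: safe to unroll and reorder aggressively. */
        size_t i = 0;
        for (; i + 4 <= n; i += 4) {
            dst[i]     = src[i]     * k;
            dst[i + 1] = src[i + 1] * k;
            dst[i + 2] = src[i + 2] * k;
            dst[i + 3] = src[i + 3] * k;
        }
        for (; i < n; i++)
            dst[i] = src[i] * k;
    } else {
        scale_general(dst, src, k, n);   /* guard failed: old behavior */
    }
}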

>>>It's probably a combination of obstinacy and caution. On today's
>>>hardware, you can do some pretty amazing numerics on GPU's rather than
>>>CPU's, which have orders of magnitude more computational power than
>>>CPU's with regards to numerical calculations. Getting it *right* for
>>>numerics is hard, and given the rate of change in the hardware arena
>>>thus far, it may prove to be more sensible to have left it until the
>>>dust settles a little.
>>
>>Whoops--cop out. There is essentially universal agreement on IEEE
>>FP, with the only holdouts being...LANGUAGE DESIGNERS! It's at least
>>a decade past time to get with the program!
>
>
> A decade sounds about right :-) As I noted, these changes can be made,
> and should be. Let's just hope it does eventually happen.
>
>
>>And don't worry about the GPUs and DSPs, not only are they moving to
>>IEEE FP as well, but their computations are practically *never* well
>>described in a conventional high level language, despite lots of
>>publicity to the contrary.
>
>
> I think things will be much easier once there's universal adoption.

We are well past the point where a language must run on all extant
machines to be a winner.

It is appropriate to leave behind the machine baggage that obstructs
doing what needs to be done. The programmers for that machine will
struggle along for a while, but so what? At least the 95% majority
will move ahead on more modern platforms.

(As an aside, Intel was the *first* company to adopt IEEE FP, back
when it was on a coprocessor--and they did it in full generality,
but pretty slowly.)

>>Any 2-bit (!) DSP coder can run rings around your favorite high level
>>language on any algorithm whose performance matters. This will
>>continue to be true as long as the DSP/GPU actually has lots of
>>various kinds of parallelism--and for that matter, it also applies
>>to the various multimedia extensions (SIMD) in popular desktop
>>processors, as well. The only known "solution" to this problem
>>is providing libraries of carefully hand-tuned code for applications
>>to call--and it changes radically with each generation.
>
>
> There's actually a great absence of such libraries though.

Which, I suppose, speaks to the need for humans who understand caches
and superscalar, out-of-order scheduling to code them--or another of
those infernal genetic algorithms. ;-)

>>>It's important to note one obvious area of parallelism that many fail
>>>to consider: garbage collection. In older environments where you're
>>>forced to collect garbage yourself, you end up with your collection
>>>being executed sequentially with your other code. While it's arguable
>>>that this is more efficient in terms of number of cycles than collector
>>>based approaches, you lose out on the obvious ability to collect
>>>garbage concurrently, which on today's hardware is a big inefficiency.
>>
>>Actually, in older environments, there *is* no garbage collection!
>
>
> I mean you must do it manually :-)

Well, that's true (and I almost said so), but that is not the
usual understanding of "garbage collection".

>>Garbage collection arises as a necessity only when programmers are
>>released from the obligation to return resources they are no longer
>>using.

There, you see, I said it!

>>Garbage collection generally means traceable data structures, and
>>therefore disciplined data structures. However, don't overlook the
>>"catch all" discipline for garbage--simply allocating a chunk of
>>resource and then reclaiming everything that wasn't "registered
>>as persistent" at end-of-job.

And this "chunky" garbage collection is relatively inexpensive
and quite effective for most programming chores.

(After all, the prime directive is to not generate (much) garbage if
it can be avoided. ;-)
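
The whole "allocate a chunk and reclaim everything at end-of-job"
discipline is only a few lines of C. A sketch, with the arena size and
all names purely illustrative:

/* Bump-pointer allocation out of one arena, and a single reset at
   end-of-job instead of individual frees. */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

struct arena {
    unsigned char *base;
    size_t         size;
    size_t         used;
};

static int arena_init(struct arena *a, size_t size)
{
    a->base = malloc(size);
    a->size = size;
    a->used = 0;
    return a->base != NULL;
}

/* Allocate from the arena; no matching free exists or is needed. */
static void *arena_alloc(struct arena *a, size_t n)
{
    n = (n + 15) & ~(size_t)15;            /* keep allocations aligned */
    if (a->used + n > a->size) return NULL;
    void *p = a->base + a->used;
    a->used += n;
    return p;
}

/* "End of job": everything not registered as persistent goes at once. */
static void arena_reset(struct arena *a) { a->used = 0; }

int main(void)
{
    struct arena job;
    if (!arena_init(&job, 1 << 20)) return 1;

    for (int i = 0; i < 1000; i++) {
        char *tmp = arena_alloc(&job, 64);   /* per-item scratch data */
        if (tmp) snprintf(tmp, 64, "record %d", i);
    }

    arena_reset(&job);                        /* blind, cheap reclamation */
    free(job.base);
    return 0;
}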

>>In any case, concurrent garbage collection is possible in any
>>disciplined environment, at some level of time/space granularity
>>for which the discipline applies.
>>
>>
>>>There's also the issue around algorithmic complexity going up with
>>>inline collection (try implementing a fast B+ tree that doesn't leak)
>>>so there's another big payoff to a more abstract approach.
>>
>>Actually, there are a lot of good reasons not to prefer a
>>concurrent garbage collector, if it can be avoided.
>
>
> Other than not having plenty of free concurrent processing power
> available, what are they?

I was about to say their lack of provable correctness combined with
their ability to wreak havoc across all levels of abstraction. But
then I realized that they almost have a "provable lack of incorrectness"
in a highly concurrent, hierarchically connected multiprocessor.

Garbage collection, because it typically cuts across several layers
of abstraction, exhibits a criminal lack of locality. So one must
use a hierarchy of concurrent local garbage collectors...is this
considered a solved problem?

> Any non-trivial program creates data then needs to dispose of it. What
> purpose does it serve for that program to waste sequential running time
> doing so?

One could as easily, and more coherently, say that part of doing any
job is cleaning up after it. And such local "cleaning up" is a much
more efficient process than anything more global--think cache misses,
page faults, network traffic...

Why on earth would you expect that the "cleanup" phase of a program
was a good part to run in parallel with the rest of it, rather than
looking for a more structural division into concurrent parts?

Garbage collection originally came into being because of a programming
paradigm which made multiple, uncountable references to various data
objects, then re-assigned references, leaving data sometimes abandoned
without any explicit knowledge of that fact--"garbage" data.

While this paradigm is occasionally useful, for most programming it
is simply a mark of laziness and failure to keep track of the scope
or lifetime of data structures. Some languages define semantics (of
strings, for example) which naturally result in garbage creation.
But even then, the garbage can be restricted to the string pool(s).

Most "garbage" is either avoidable, and can be prevented, or is
negligible, and can wait to be blindly recovered at the end of a
task. Building in a pervasive notion of a background garbage
collection activity is, in many cases, surrendering to a lack of
discipline in managing data structures. (And a lack of tools to
verify that they are properly managed.)


>>No, they treated multidimensional objects as vectors of vectors of...
>>The workaround was to declare a one-dimensional vector of numbers, say,
>>and take responsibility for the multi-dimensional mapping yourself--
>>a much less "automatic" approach. And pointers were not necessary,
>>since all the arithmetic could be index arithmetic.
>
>
> The main point for me is that the language allows a portable definition
> of multidimensional structures. Whether or not a particular
> implementation compiles it one way or another is not really an issue.
>
> Once we rely on techniques that bind us to architectures we lose. It's
> easy to avoid them, and we should do so.

So you think that address spaces are in danger of becoming obsolete?
Or that addresses will not be operated on by arithmetic operations?
Trust me, we'll all be long gone before that changes!



>>It's bloat. ;-)
>>
>>More seriously, the bloat is undeniable, and is, in large part, the
>>reason that running much code is I/O bound--just loading the code!
>>
>>In the days before we could have a gigabyte of memory, the bloat caused
>>lots of virtual memory I/O, which is the worst kind of I/O bound.
>>
>>I think that the time required to boot a machine is an interesting
>>measure of performance. Somehow, we continue to lose ground here
>>(and I do realize that much of that is I/O time). Still, disk I/O
>>is many times faster than it was a decade ago...
>
>
> Yet still the speed deficit between I/O and computation grows...

But somehow that processing advantage has not translated into the
need to do less I/O!

>>>I think a certain degree of bloat is unavoidable, hopefully it evolves
>>>out of systems over time.
>>
>>That would be a new evolutionary trend. ;-)
>
>
> LOL, true. I mean that much of it can be pruned off when it proves
> extraneous, much like calluses on a toe or foot. In fact, that seems a
> most appropriate analogy when discussing most of the UI frameworks I've
> seen :-)

But one man's callus is another man's foot. ;-)

Frameworks only get pruned by discipline--and someone's ox always
gets gored in the process. That's why, in a market-driven world, the
universe is always expanding, not contracting. Until the "stressor"
lands like a ton of bricks...

>>Organisms usually get simpler only when some extreme stressor hits their
>>environment. Hand-cranked machines? Boot times exceeding attention
>>spans? ;-)
>
>
> Well, such things are on their way :-)

Yes, if Negroponte gets his way. ;-)

Already, many people keep their machines powered off, and don't use
them for many tasks in passing because they take too long to boot
(and shut down). There is a huge productivity loss in the extended
time required to start up a modern machine.

(I keep mine on all the time--so I trade off electricity use against
twiddling my thumbs--that and the fact that my machine is the house's
print server. ;-)


>>>Indeed. I believe though that minimising these decisions through
>>>keeping interfaces highly abstract is the best way to provide space
>>>down the track for optimisation. Generally speaking, it isn't that hard
>>>to get it right, but getting everyone to agree is oft problematic :-)
>>
>>Hence the need for objective models, and the expertise to appreciate
>>their results.
>
>
> If only there was time to analyse them. Can I say more, better tools
> one more time ? ;)

But most of the tools we need are programs that we need to write, and
many of them are analogous to, and therefore specialized to, the problem
that we are currently solving. It is the unwillingness to write this
"scaffold" code (I call it that because it is essential to the efficient
construction of the product, but is not a shipping part of the product)
that holds us back.

Dijkstra used to make the analogy between a program and an iceberg:
like an iceberg, the visible code of a program is only a small part
of the documentation and code that supports the activity of creating
the program. Yet, we often find that these supporting tasks are left
to the end of a project, or are left undone entirely.

Most engineering disciplines have a solid understanding of the value
and necessity of thorough planning and "tooling" to do a project.
Making proper drawings, forms, and scaffolds is as critical to the
project's success as a good design or good materials.

Only in software construction are we so informal (perhaps even sloppy)
that we jump right in and start deciding things before the problem is
even well-defined, we code modules before considering how we will test
them, and we attempt to integrate systems from modules of radically
differing robustness and pedigree. It's no wonder that we are so
often surrounded by piles of collapsed rubble!

You keep saying "tools" and I keep saying "discipline".

I have observed that a disciplined programmer will find or construct
good tools as a natural part of his discipline.

I see little corresponding indication that a programmer provided with
good tools will develop good discipline. I hope I'm wrong...

-michael

Parallel computing for 8-bit Apple II's!
Home page: http://members.aol.com/MJMahon/

"The wastebasket is our most important design
tool--and it is seriously underused."
From: Michael J. Mahon on
mdj wrote:
> Michael J. Mahon wrote:
>
>
>>>It's probably a combination of obstinacy and caution. On today's
>>>hardware, you can do some pretty amazing numerics on GPU's rather than
>>>CPU's, which have orders of magnitude more computational power than
>>>CPU's with regards to numerical calculations. Getting it *right* for
>>>numerics is hard, and given the rate of change in the hardware arena
>>>thus far, it may prove to be more sensible to have left it until the
>>>dust settles a little.
>>
>>Whoops--cop out. There is essentially universal agreement on IEEE
>>FP, with the only holdouts being...LANGUAGE DESIGNERS! It's at least
>>a decade past time to get with the program!
>
>
> Here are the changes that actually were made to the Language
> Specification for version 1.2.
>
> http://java.sun.com/docs/books/jls/strictfp-changes.pdf
>
> In essence, these changes relax the rules on floating point arithmetic
> for intermediate calculations, meaning you can take advantage of the
> peculiarities of particular fp engine implementations (the Intel 80 bit
> implementation being the primary example cited) at the cost of
> non-reproducibility of results across Java implementations, which
> nobody seems to care about.
>
> There's a strictfp keyword that exists to enforce the original
> reproducible semantics for cases where it's required, which, it would
> seem, is almost never; I've not seen any code actually use it.

I expect that strictfp will become the most commonly unimplemented
modifier in the language. ;-)

I note, however, that in violation of the spirit of the IEEE FP
standard, the exception traps were not added--thus preventing the
skilled use of the traps to obtain better numeric results, or their
unskilled use to detect anomalies.
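
Even outside Java, the flag side of IEEE FP is reachable from C99's
<fenv.h>; here is a minimal sketch of that "unskilled use to detect
anomalies". (The trap-enable side is not standard C -- feenableexcept
is a GNU extension -- so only the portable flag interface is shown.)

#include <fenv.h>
#include <stdio.h>

#pragma STDC FENV_ACCESS ON

int main(void)
{
    feclearexcept(FE_ALL_EXCEPT);

    volatile double big  = 1e308;
    volatile double zero = 0.0;
    volatile double overflow = big * big;    /* raises FE_OVERFLOW, gives +inf */
    volatile double invalid  = zero / zero;  /* raises FE_INVALID,  gives NaN  */

    if (fetestexcept(FE_OVERFLOW))
        printf("overflow occurred (result %g)\n", overflow);
    if (fetestexcept(FE_INVALID))
        printf("invalid operation occurred (result %g)\n", invalid);

    return 0;
}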

> This edition of the platform was released in 1999, so it seems the
> major issue went away with those changes.

That's probably why I missed it...starting in 1998, I had other fish
to fry.

> What remains to be done is an extension to handle richer numerics, like
> complex numbers.

Ah, yes--and simple (non-precedence changing) operator overloading.

-michael

Parallel computing for 8-bit Apple II's!
Home page: http://members.aol.com/MJMahon/

"The wastebasket is our most important design
tool--and it is seriously underused."
From: Michael J. Mahon on
mdj wrote:
> Michael J. Mahon wrote:

>>>I'm awaiting a TimeMaster HO which has a rather nice programmable
>>>interrupt controller. Much more flexible than relying on the 60Hz VBL I
>>>use at the moment for experimentation.
>>
>>Actually, 60Hz is plenty fast enough to find almost anything of
>>real interest on a slow machine. And it is infrequent enough that
>>you can actually execute some code without it being intrusive.
>
>
> True. Actually I've found 60Hz to be a good slicing interval on the
> Apple II, perhaps more so when the machine is accelerated than not. I'm
> looking forward though to having a programmable timer source in the
> machine - something I've been missing for a long time.

I've found any card with a 6522 a great source of programmable timers
and interrupts. It's pretty easy to hook up the IRQ line if the card
hasn't done so.

Just running two counters as a 1MHz counter provides a great running
cycle counter for sampling at interesting places.

>>>It's good fun exploring these ideas on smaller environments that have
>>>nice predictable behaviors, but you already know this :-)
>>
>>...and I *love* it! Of course, I love it even more when I fail to
>>predict a behavior. ;-)
>
>
> And it's very enlightening just how frequently this occurs, even on
> machines that are very humble. There is so much to be learned.

Hear, hear.

>>Yes, I always provided a way for processes to handle interrupts
>>directed to them. (Most language designers hated the idea.)
>
>
> It's hard enough getting OS designers to acknowledge the issue :-(

Most OS designers think of application programmers as wimps,
and the *first* set of wimps are the language folks. ;-(

> Most language designers are fundamentally opposed to concurrency
> concerns entering the language. This is a folly, and a real impediment
> to having modern systems provide realtime scheduling :-(

Yep--that's been my experience, too.

In most languages, concurrency further weakens the already weak
axiom system defined by type and scope rules. The fact that it
adds tremendous power as it does so is apparently lost on them.

>>That is the very simple approach which I believe the encoders themselves
>>should do after taking a "census" of the system they are running on.
>
>
> Yeah - it's simple enough to get a poor man's version through a simple
> wrapper script, but you're right encoders need to do this. Bring on
> parallelism support in languages says I !
>
>
>>Quantum computing evaluates all possible computations simultaneously,
>>so the "selection" of the answer tends to be done at the end. ;-)
>>
>>NP-complete problems become solvable because a huge combinatoric
>>space of possible solutions is explored "in superposition" in the
>>time required to explore a single solution. Then you have to "read
>>out" the final state(s) to discover the solution(s).
>
>
> Parallelism provides a reduction in time required to compute many
> combinatorial problems too.

But only (at best) in proportion to the degree of parallelism.

Quantum mechanics is using the whole universe to get the answer. ;-)

>>I'm reminded of "the answer to the ultimate question". At some point,
>>you realize that you really want the ultimate question, too. ;-)
>
>
> Indeed! The answer is arbitrary without the question :-)
>
>
>>>>What needs to be done is for trained people with an *engineering*
>>>>mindset to sit down and squarely face the problem of multi-level
>>>>concurrency.
>>>>
>>>>I expect the solutions will roll out in layers of successive refinement,
>>>>but we haven't yet even chosen to directly address the problem.
>>>
>>>
>>>It won't be long now. We can't wait much longer for companies to
>>>engineer processors with massively faster sequential processing speeds
>>>before realising that they can't :-)
>>
>>Yes, I think that has dawned on them as they whip their design teams
>>harder while watching their stock stagnate and then fall...
>
>
> I think it has dawned on them the stock price is falling, but beyond
> that, the actual measures of such failure tend to evade being
> addressed. :-(

I think it's more a problem of not knowing what to do. The "machine"
of the semiconductor industry has been tuned up to the rhythm of Moore's
"Law"--actually a simple economic prediction that if you can increase
the number of transistors on a chip fast enough, you can create enough
business to finance the work required to maintain the density increases.

Now that physics is interfering with the rate of improvement, and making
the transistors less ideal, business is down, and what was a virtuous
cycle is turning vicious.

Like the proverbial frog, the industry is taking a long time to figure
out that their business model needs to change fundamentally if they are
to survive. And that includes the PC market, where the average life
of a system has increased from about 2.5 years to about 4 years, causing
a massive decrease in the effective size of the (saturated) market.

We're now in a *serious* buyers' market--with the real price of
computers dropping like a rock as their effective performance has
effectively stagnated.

Now that the stage is set, it will be interesting to see what happens!

>>>You're absolutely right, but there's a degree of dependency between the
>>>two that needs to be addressed, and part of that is engineering out old
>>>notational forms which inhibit the progress of parellel system design.
>>
>>Again, I would say that very little of the current linguistic goals have
>>more than incidental relevance to parallelism. It's a plain case of
>>"looking where there's light instead of where they lost it".
>
>
> Indeed the things I'd like to change are fairly little things too, yet
> the degree of defiance that one faces when suggesting it is staggering
> :-)

So if you're going to make a change, make it a *big* one--it won't get
any worse reaction than a small one, and maybe *less*! It's certainly
a better average return on your investment. ;-)

>>If an alien intelligence is watching, a little box on page 11,325 of
>>their weekly report must be devoted to a betting pool on how long it
>>will take us to figure out that we were working on the wrong problem.
>>(Just in *this* area--there are lots of boxes on other pages! ;-)
>
>
> And the one who wins the pool probably made his guess based on looking
> at their own history ;-)

Nah--the only thing we learn from history is that no one ever learns
anything from history. ;-) (And the Barber of Seville has a beard. ;-)

-michael

Parallel computing for 8-bit Apple II's!
Home page: http://members.aol.com/MJMahon/

"The wastebasket is our most important design
tool--and it is seriously underused."
From: aiiadict on
Michael J. Mahon wrote:
> the only thing we learn from history is that no one ever learns
>anything from history. ;-)

So in school, we are taught that we can never learn! (in
history courses!)

What a strange world we live in.

Rich

From: mdj on
Michael J. Mahon wrote:

> I had FOR <index> OVER <subrange or subrange type>...
> and FOR <index> IN <set value>... [Pascal-like sets].

Both are quite good approaches. Ada had both.

> >>Overwriting data is a classic problem, and is usually pretty easily
> >>found if it's literally a bad conditional or an off-by-1 error, since
> >>the error is always misbehaving.
> >>
> >>(Overwriting code is another matter, but code pages should be protected
> >>from self-modification except in unusual cases.)
> >
> >
> > A big problem is how to avoid execution of code within data pages, when
> > it's been put there by a malicious perpetrator exploiting a flaw. Code
> > overwrite is easy: lock the code pages against writes. Can't easily
> > lock the data pages, though :-)
>
> Nonsense. Virtually all architectures today have page attributes that
> specify whether anything on a page can be *executed*, as well as whether
> it can be written. Some architectures provide for these attributes
> to be dependent on the current privilege level of the processor.
>
> Most data pages are marked "writeable, not executable" and most code
> pages are marked "non writeable, executable" (at least if the processor
> is running at user process privilege level).
>
> Any violation results in an immediate hardware trap.

Yet we still suffer security issues like buffer overrun exploits; I
don't think this approach is all that effective on present systems, as
you can't get every code page secured.

> >>You should take some solace from the fact that probably the most
> >>frequently used "higher level language" is Excel, and it is completely
> >>pointer safe. ;-) The only mistakes you can make in Excel are the
> >>ones that will bankrupt your company! (Get my point?)
> >
> >
> > No it isn't! Excel spreadsheets routinely refer to a cell that hasn't
> > been defined. This brings a whole new meaning to segmentation
> > violation. Don't get me started about the number of "databases" that
> > exist in spreadsheets :-)
> >
> > There's no restriction whatsoever in Excel code over what part of the
> > "address space" you reference in an expression. You can even reference
> > a sheet that doesn't exist (a genuine "bus error" :) )

> My point was that there are more direct ways that human error can
> lead to the failure of a business (or what-have-you) than can be
> protected against in a programming language!

Indeed. I just want to improve the quality of expression of ideas such
that the code is more reusable over time.

> Then what do all those legions of programmers read? Or do they all
> have so little influence over the spending of money that no one cares
> to advertise to them?

Web sites, blogs, etc. A big issue is that a majority of the larger
sites are vendor-sponsored, so there's a lot of "don't bite the hand
that feeds" going on.

> > Most development environments today provide profiling tools that make
> > it simple to spot such hotspots. The tricky areas now are usually where
> > programs are I/O bound.
>
> Clock time includes I/O waits. In fact, the 20% routine was a codefile
> output routine that originally had only one buffer. I added three page
> buffers for it to manage LRU and it sped up by a factor of ten.

This is probably one of the biggest areas of performance enhancement we
could explore with well-designed tools that show up the bottlenecks
caused by over-utilisation of I/O resources.

I had a case some time back where a database system was performing
extremely slowly while running on some pretty impressive hardware.
It turned out the system was issuing asynchronous I/O calls in 8 KB
chunks at such a rate that it exceeded the transactional throughput of
the fibre card it was using for I/O, without getting anywhere near its
bandwidth.

The problem was solvable by tuning a parameter or two, but such a thing
should be easier to find than it was (I used an interposing technique to
sniff the read/write calls to the OS).
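
For the curious, the interposing trick is only a page of code on a
Linux-style system. A hedged sketch -- the build line and names are
assumed for illustration, not taken from the actual case:

/* iosniff.c -- a preloaded shared object that wraps write(2) and
   tallies calls and bytes, so "lots of tiny writes" shows up directly.
   Assumes a Linux/glibc-style dynamic linker.  Build and run roughly as:
       cc -shared -fPIC -o iosniff.so iosniff.c -ldl
       LD_PRELOAD=./iosniff.so ./your_program
*/
#define _GNU_SOURCE
#include <dlfcn.h>
#include <stdio.h>
#include <unistd.h>

static ssize_t (*real_write)(int, const void *, size_t);
static unsigned long calls, bytes;

ssize_t write(int fd, const void *buf, size_t count)
{
    if (!real_write)                 /* look up the real write() once */
        real_write = (ssize_t (*)(int, const void *, size_t))
                         dlsym(RTLD_NEXT, "write");
    calls++;
    bytes += count;
    return real_write(fd, buf, count);
}

__attribute__((destructor))
static void report(void)             /* print totals at program exit */
{
    fprintf(stderr, "write(): %lu calls, %lu bytes (avg %lu bytes/call)\n",
            calls, bytes, calls ? bytes / calls : 0);
}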

> > This is why I advocate tools and languages that allow you to observe
> > complexity. Nobody can understand complex systems in their entirety.
>
> I'd like to see some of those tools... Tools that span abstraction
> levels are quite interesting, and mind-blowing for many.

Yes, but nobody's putting in the effort required to make them happen.

> > The interesting elements of behavior only manifest when it's possible
> > to observe them, and observations of complex system behaviors are
> > difficult with current toolsets.
>
> Observations of system behaviors, where interesting behaviors span time
> scales of picoseconds to hours, are extremely difficult. I have not
> seen any "great" tools, nor even any effort to create them.
>
> What I do see are little tiny steps, disconnected from one another,
> and generally used under protest by people who don't see the value.
>
> It is evident that the folks building new levels of abstraction have
> never felt any need to construct stairways to connect up with the
> lower level(s).
>
> If it were easy to visualize the behavior of a program, then programmers
> would not be so universally in the dark about how their programs
> actually behave!
>
> Have things improved greatly in the last few years? When I made
> profiling tools available, I found them virtually unused! Apparently
> programmers felt they already understood what their code was doing,
> so measuring it would be a waste of time (or a nasty surprise ;-),
> so there was no motivation to do it.

Herein lies the problem. There are some very nice profiling tools these
days, and profiling is just the beginning.

It takes a pretty big shift in thinking, apparently, to get people
using the tools. Once they start, though, they start wishing for better
ones...

> > This is precisely why I argue so strongly against using C/C++, you lose
> > any ability to do this kind of optimisation when you have no runtime,
> > and no runtime typing.
>
> No, you can "observe" machine code, too! It is pretty easy, and
> languages like C and C++ can gain tremendously from it.
>
> All you need to do adaptive optimization is a program and its
> behavior, and they *all* have lots of that. ;-)
>
> You don't need any *algebra* of types, since the only types that
> matter are the machine data types. It is only at the level of
> actual machine code and actual machine addresses that it is
> possible to observe "practical" reference disambiguation and
> optimize for it, while putting in a guard against its failure.

It is significantly trickier to glean this level of measurement given
the architecture of current OSes, though, whereas a VM approach greatly
simplifies it.

> > I think things will be much easier once there's universal adoption.
>
> We are well past the point where a language must run on all extant
> machines to be a winner.
>
> It is appropriate to leave behind the machine baggage that obstructs
> doing what needs to be done. The programmers for that machine will
> struggle along for a while, but so what? At least the 95% majority
> will move ahead on more modern platforms.
>
> (As an aside, Intel was the *first* company to adopt IEEE FP, back
> when it was on a coprocessor--and they did it in full generality,
> but pretty slowly.)

Yep, and x86 is still one of the only architectures with a complete
implementation.

> >
> > There's actually a great absence of such libraries though.
>
> Which, I suppose, speaks to the need for humans who understand caches
> and superscalar, out-of-order scheduling to code them--or another of
> those infernal genetic algorithms. ;-)

You certainly need a lot fewer of those humans, and often they can be
the same people who helped design the hardware in the first place, so
this isn't really a problem.

> >
> > I mean you must do it manually :-)
>
> Well, that's true (and I almost said so), but that is not the
> usual understanding of "garbage collection".
>
> >>Garbage collection arises as a necessity only when programmers are
> >>released from the obligation to return resources they are no longer
> >>using.
>
> There, you see, I said it!
>
> >>Garbage collection generally means traceable data structures, and
> >>therefore disciplined data structures. However, don't overlook the
> >>"catch all" discipline for garbage--simply allocating a chunk of
> >>resource and then reclaiming everything that wasn't "registered
> >>as persistent" at end-of-job.
>
> And this "chunky" garbage collection is relatively inexpensive
> and quite effective for most programming chores.
>
> (After all, the prime directive is to not generate (much) garbage if
> it can be avoided. ;-)

I'm actually convinced that on current and future architectures
releasing programmers from the need to clean up after themselves (so to
speak) is a major benefit. It's one less thing to be concerned about,
and the net result is (usually) better execution time for a given
program.

Only in cases where resource consumption is extremely high is there a
need to invoke manual collection strategies.

> > Other than not having plenty of free concurrent processing power
> > available, what are they?
>
> I was about to say their lack of provable correctness combined with
> their ability to wreak havoc across all levels of abstraction. But
> then I realized that they almost have a "provable lack of incorrectness"
> in a highly concurrent, hierarchically connected multiprocessor.
>
> Garbage collection, because it typically cuts across several layers
> of abstraction, exhibits a criminal lack of locality. So one must
> use a hierarchy of concurrent local garbage collectors...is this
> considered a solved problem?

Not yet - but it's getting there slowly :-)

> > Any non-trivial program creates data then needs to dispose of it. What
> > purpose does it serve for that program to waste sequential running time
> > doing so?
>
> One could as easily, and more coherently, say that part of doing any
> job is cleaning up after it. And such local "cleaning up" is a much
> more efficient process than anything more global--think cache misses,
> page faults, network traffic...
>
> Why on earth would you expect that the "cleanup" phase of a program
> was a good part to run in parallel with the rest of it, rather than
> looking for a more structural division into concurrent parts?
>
> Garbage collection originally came into being because of a programming
> paradigm which made multiple, uncountable references to various data
> objects, then re-assigned references, leaving data sometimes abandoned
> without any explicit knowledge of that fact--"garbage" data.
>
> While this paradigm is occasionally useful, for most programming it
> is simply a mark of laziness and failure to keep track of the scope
> or lifetime of data structures. Some languages define semantics (of
> strings, for example) which naturally result in garbage creation.
> But even then, the garbage can be restricted to the string pool(s).
>
> Most "garbage" is either avoidable, and can be prevented, or is
> negligible, and can wait to be blindly recovered at the end of a
> task. Building in a pervasive notion of a background garbage
> collection activity is, in many cases, surrendering to a lack of
> discipline in managing data structures. (And a lack of tools to
> verify that they are properly managed.)

Quite strongly disagree with this point. Any reasonably large program
will generate 'garbage', and indeed it can be prevented by cleaning it
up. I just don't think it's an efficient use of either human time
(which is limited) or machine time (which is always expanding) to put
responsibility for a relatively simple computation task on the human.

You gain nothing by doing it yourself, and potentially lose out by
giving up the ability to handle it concurrently.

> > Once we rely on techniques that bind us to architectures we lose. It's
> > easy to avoid them, and we should do so.
>
> So you think that address spaces are in danger of becoming obsolete?
> Or that addresses will not be operated on by arithmetic operations?
> Trust me, we'll all be long gone before that changes!

Indeed, but applications that are bound to single address spaces by
virtue of exploiting their nature in code unnecessarily will also be
long gone. And we're recreating them still, far beyond the point where
it's sensible to continue.

> > Yet still the speed deficit between I/O and computation grows...
>
> But somehow that processing advantage has not translated into the
> need to do less I/O!

Indeed it's caused the need for more I/O. There's potential light at
the end of the tunnel though - the possibility of treating secondary
storage as if it's a persistent addressable medium is not far away,
which can result in many positive shifts in the computing paradigm.
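
A minimal POSIX sketch of what I mean -- a file on secondary storage
mapped in and treated as ordinary addressable memory (the file name and
record layout are invented):

#include <fcntl.h>
#include <stdio.h>
#include <sys/mman.h>
#include <unistd.h>

struct counter { long runs; };

int main(void)
{
    int fd = open("state.dat", O_RDWR | O_CREAT, 0644);
    if (fd < 0) { perror("open"); return 1; }
    if (ftruncate(fd, sizeof(struct counter)) != 0) { perror("ftruncate"); return 1; }

    struct counter *c = mmap(NULL, sizeof *c, PROT_READ | PROT_WRITE,
                             MAP_SHARED, fd, 0);
    if (c == MAP_FAILED) { perror("mmap"); return 1; }

    c->runs++;                        /* an ordinary memory write...     */
    printf("this is run number %ld\n", c->runs);

    msync(c, sizeof *c, MS_SYNC);     /* ...made persistent on the disk  */
    munmap(c, sizeof *c);
    close(fd);
    return 0;
}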

> But one man's callus is another man's foot. ;-)
>
> Frameworks only get pruned by discipline--and someone's ox always
> gets gored in the process. That's why, in a market-driven world, the
> universe is always expanding, not contracting. Until the "stressor"
> lands like a ton of bricks...

We agree on that. It's unfortunate that the stressors always end up
being so extreme, and are dealt with reactively.

> >>Organisms usually get simpler only when some extreme stressor hits their
> >>environment. Hand-cranked machines? Boot times exceeding attention
> >>spans? ;-)
> >
> >
> > Well, such things are on their way :-)
>
> Yes, if Negroponte gets his way. ;-)
>
> Already, many people keep their machines powered off, and don't use
> them for many tasks in passing because they take too long to boot
> (and shut down). There is a huge productivity loss in the extended
> time required to start up a modern machine.
>
> (I keep mine on all the time--so I trade off electricity use against
> twiddling my thumbs--that and the fact that my machine is the house's
> print server. ;-)

I do the same, it consumes little power in standby mode anyway, and the
time gained back by not waiting for it is a big plus.

I frequently have to defend myself against those who claim it is a
waste of power, when that remark is usually a thinly disguised "that
noise is keeping me awake". :)

> >>>Indeed. I believe though that minimising these decisions through
> >>>keeping interfaces highly abstract is the best way to provide space
> >>>down the track for optimisation. Generally speaking, it isn't that hard
> >>>to get it right, but getting everyone to agree is oft problematic :-)
> >>
> >>Hence the need for objective models, and the expertise to appreciate
> >>their results.
> >
> >
> > If only there was time to analyse them. Can I say more, better tools
> > one more time ? ;)
>
> But most of the tools we need are programs that we need to write, and
> many of them are analogous to, and therefore specialized to, the problem
> that we are currently solving. It is the unwillingness to write this
> "scaffold" code (I call it that because it is essential to the efficient
> construction of the product, but is not a shipping part of the product)
> that holds us back.
>
> Dijkstra used to make the analogy between a program and an iceberg:
> like an iceberg, the visible code of a program is only a small part
> of the documentation and code that supports the activity of creating
> the program. Yet, we often find that these supporting tasks are left
> to the end of a project, or are left undone entirely.
>
> Most engineering disciplines have a solid understanding of the value
> and necessity of thorough planning and "tooling" to do a project.
> Making proper drawings, forms, and scaffolds is as critical to the
> project's success as a good design or good materials.
>
> Only in software construction are we so informal (perhaps even sloppy)
> that we jump right in and start deciding things before the problem is
> even well-defined, we code modules before considering how we will test
> them, and we attempt to integrate systems from modules of radically
> differing robustness and pedigree. It's no wonder that we are so
> often surrounded by piles of collapsed rubble!
>
> You keep saying "tools" and I keep saying "discipline".
>
> I have observed that a disciplined programmer will find or construct
> good tools as a natural part of his discipline.

Indeed this is true.

> I see little corresponding indication that a programmer provided with
> good tools will develop good discipline. I hope I'm wrong...

You don't learn from the tools, but you can learn from information, or
demonstrations of things that aren't quite right. It's not the most
efficient approach to constantly learn only from one's own mistakes.

I mean, by the logic of applied discipline, we could all program in
assembly and construct massive macro libraries, but we don't because
it's a waste of time - we can use high level languages with some built
in discipline to save ourselves effort.

I think there's an awfully long way to go before we're doing that even
remotely to the potential of the technique.

Matt
