The Perfect Computer - 36 bits? [Computer Architecture]

Prev: Searching for the PDP-3
Next: superscalar and superpipelined

From: Anne & Lynn Wheeler on 19 Mar 2007 11:26

jmfbahciv(a)aol.com writes:
> In the few cases where it was impossible to avoid, we gave
> the customers three major monitor releases to adjust. Each
> monitor release cycle was about two years, calendar time.

vm370 had around a year release cycle ... however there were monthly
"PLC" service ... which included accumulative source maintenance
.... and periodically there would be an especially large PLC ... which
might include new feature/function.

when i did the resource manager ... i got saddled with the whole
product operations ... having to personally handle all aspects of what
would be normally considered activity of a development, test and
product ship group.

one of the things that had been done leading up to release of resource
manager was a lot of automated benchmarking ... that included both
stress testing as well as lot of calibration and validation of the
dynamic adaptive resource controls. the final cycle had approx. 2000
such tests that took 3 months elapsed time to run
http://www.garlic.com/~lynn/subtopic.html#benchmark

initially they wanted me to also do a monthly PLC for the resource
manager that was in sync with the base product PLC. I refused,
claiming that I could hardly even perform necessary regression and
validation tests (besides doing everything else) on a monthly
cycle. The final compromise was a three month PLC cycle for the
resource manager.

shortly after graduating and joining the science center ... as new man
in the group ... i got saddled for six months with the cp67
"development plan" (this was before there was the enormous growth in
the development group and physical split from the science center). It
was a five year plan, included growing from 15 people to several
hundred people and individual line-items with typical resource
requirements from 1-3 person months up to 30-40 person months. This
plan had the individual line-items arranged into product "releases" at
approx. six month intervals (including morph from cp67 to
vm370). There were lots of interdependencies, ramp-up with bringing
new people on board, different items might have pre-requisites and/or
co-requisites involving other items (affecting how things were
packaged into releases)... eventually hitting several hundred items
(along with interdependencies and time-lines).

However, there was this really "fun" session once a week. Some
corporate hdqtrs people claimed that to be a real "development" group
you had to have this five year plan with regular reviews ... which
they instituted with weekly conference calls. Some of the faction
seemed to be part of making cp67 (and vm370) go away. It seemed to be
that they thought they could bury a 15 person group with what it meant to
be a "real" development group. The weekly conference calls seemed to
be filled with minutiae and trivia ... what would happen to the plan
if some trivial line-item was shifted from one release to another
.... or if the personnel ramp-up was trivially changed, how might
various line-items and releases be affected (this was all before any
project plan software existed). After a couple weeks of this
.... rather than laboriously re-arranging the hundreds of items to take
into account their various trivia ... I got to the point where I could
do it nearly in real-time ... answer their questions on the fly
.... rather than spending the whole week preparing the responses (for
the next conference call) to the trivia questions (that had been
raised in the previous call).

The basic plan was never actually revised in response to the trivia
questions ... it was just a matter of being able to come up with the
responses to the trivia "what-ifs".

From: Stephen Fuld on 19 Mar 2007 12:45

jmfbahciv(a)aol.com wrote:

snip

I'll try one more time. It is difficult with this "low bandwidth"
connection to get some of these points across as we seem to be talking
about different things sometimes and hard to keep on a set of agreed
basics of what we are talking about.

>> The hardware has to know whether to do it or not.
>
> Of course. Each _new_ CPU design will learn the new way.
> The goal, in this extensible design discussion, is to
> design an instruction format where a shorter field length
> will still work when the length has to be increased by n bits
> to accommodate larger numbers.

OK. But apparently the PDP 10 wasn't designed that way.

>> For example, the CPU
>> reads the first 36 bits. It has to know whether those bits represent an
>> "old style", or "original" PDP10 instruction or the start of a "new
>> style", extended instruction, in which case it needs to read the next
>> say 36 bits to get the rest of the address. So the question becomes
>> *How does the CPU know which thing to do?*
>
> The OS knows. You don't boot up a monitor built for a KA-10
> on a KL-10. The OS booted on the CPU has been built using
> a MACRO (or whatever machine instruction generator) that knows
> about the new instructions.

But not only the OS, but the hardware has to know. This is "below" the
level of the OS. Remember the CPU reads some number of bits from memory
and considers them to be an instruction to execute. It has to know how
many bits constitute that instruction. For the PDP 10, it was trivial -
always 36. It had no way to say "This instruction performs the same
function as a particular 36 bit instruction but is really X bits longer
to accommodate a larger address field." You could use new, previously
reserved op codes for some of these, but it seems there were more
instructions with 18-bit address fields that needed extending than there
were available new op codes. Hence the problem.

>>
>> If all of the bits in the original instruction are already defined to
>> mean something else, then you can't use one of them or you would break
>> the programs that used them in the old way.
>
> Right. One of our rules that was never to be broken was to never
> ever redefine fields or values to mean something else. If we
> needed a new field, we added it. We did not use an "old" field
> or value. If you ever see our documentation, you will see
> fields and values that were labelled "Reserved" or "Customer
> Reserved". "Reserved" conveyed to the customer that we may
> use the "blank" fields or values in the future; therefore a
> smart customer will stay away from using them. We promised
> to never use the Customer Reserved.

Sure. Most people did that. But if you look at the definition of the
ISA, as opposed to a software document, you will find precious few
reserved fields in the instruction format.

>> You could add some sort of
>> mode switch, to tell the CPU that it is now in "double word instruction
>> mode", and then you need some way of setting that mode ( a new op code,
>> or a previously unused bit in an instruction settable mode register, etc.)
>
> But you are running hardware that doesn't have to know the old.

It does in order to be able to run existing executables that all have
the 36 bit instruction format.

> If there is a case where it does, invoking the old creates a fault
> which the OS can then trap and dispatch to a software substitute.
> That's how we handled KA-floating point instructions on KLs.
> You don't care if it "takes longer". AAMOF, it is mandatory
> you don't care.

You can do that for situations where it will happen rarely (such as the
floating point example you gave), but if a new machine only handles the
new, longer instructions and faults to the OS for every 36 bit
instruction, the performance of old programs would be so slow as to be
unacceptable. The emulation would take say 10s of instructions to
emulate each old, 36 bit instruction, so even if the new CPU were twice
the speed of the old one, it would run old programs five times slower
than the old hardware.
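
The arithmetic behind that estimate can be sketched in a few lines of Python. The function name and the specific figures (10 host instructions per emulated instruction, a new CPU twice as fast) are only the illustrative numbers from the paragraph above, with 10 taken as the low end of "10s of instructions"; real emulation costs would vary.

```python
def emulated_slowdown(host_instructions_per_old_instruction: float,
                      new_cpu_speedup: float) -> float:
    """Rough factor by which an old program slows down when every old
    instruction traps to software emulation on a faster CPU."""
    return host_instructions_per_old_instruction / new_cpu_speedup


# Assumed figures from the paragraph above, not measurements.
slowdown = emulated_slowdown(10, 2)
print(slowdown)  # 5.0, i.e. five times slower than the old hardware
```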

>> The reason that the COBOL and Fortran examples you gave worked, are that
>> the knowledge of whether to read the rest of the record is logically
>> coded in the program.
>
> No. The reason was that there was a law everybody obeyed in that
> an end of record character meant end of record. So it didn't matter
> what the user program asked for as long as the compiler knew
> to advance to the first character after the end-of-record
> character.

Ahhhh! We progress. Now think about the instruction stream compared to
the "record stream" of a file. In the instruction stream, there is no
"end of record" character at the end of each instruction. So think
about how difficult it would be to do what you say without such characters.
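
A minimal Python sketch of why the record case works: the reader takes only the fields it knows about and then skips to the delimiter, so extra trailing fields are harmless. The newline delimiter, the comma field separator and the sample data are all assumptions made for the illustration; the thread does not say what the actual end-of-record pattern was.

```python
END_OF_RECORD = "\n"   # assumed delimiter, for illustration only
FIELD_SEPARATOR = ","  # assumed field separator, for illustration only


def read_first_fields(stream: str, wanted: int) -> list[list[str]]:
    """Read the first `wanted` fields of every record, ignoring the rest.

    The end-of-record marker tells the reader where the next record
    starts, no matter how many fields the writer appended.
    """
    records = []
    for record in stream.split(END_OF_RECORD):
        if not record:
            continue  # skip the empty piece after the final delimiter
        records.append(record.split(FIELD_SEPARATOR)[:wanted])
    return records


old_format = "alice,10\nbob,20\n"
new_format = "alice,10,extra1\nbob,20,extra2,extra3\n"

# An "old" reader that only knows about two fields gets the same answer
# from both formats.
print(read_first_fields(old_format, 2))
print(read_first_fields(new_format, 2))
```

An instruction stream has nothing playing the role of END_OF_RECORD, which is the whole difficulty.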

>> Some programs read all the record, other read
>> less, but each program knew what to do. The CPU doesn't "know" since it
>> must be prepared to handle both types.
>
> Not at all. Some opcodes will not need special handling. Those
> opcodes that do need special handling can be trapped and let
> software do the setup as a substitute for the hardware.

But if the CPU doesn't know how many bits constitute an instruction (due
to the lack of "end of record" characters), then the CPU can't even tell
what the op-code is (i.e. which bits of the stream constitute the
op-code) to know how to handle the instruction.

> That's why it's important to design extensibility in the machine
> instruction format first.

Sure. But again, I thought we were talking about why the PDP10 couldn't
scale to larger address spaces. In that case, we have an existing
design that didn't have that kind of extensibility.

> But that [starting out with a clean sheet] is what this thread
> is all about!
>> The problem is taking an existing instruction set, in this case the
>> PDP10, and extending in a way that was never anticipated.
>
> No,no,no. Savard gave an overview of his new thingie. Morten
> said that, with the exceptions of a couple of things, the PDP-10
> was spec'ed. That's why I am and can use the PDP-10 as a basis for
> my posts. Otherwise, I wouldn't have peeped in this thread.

Perhaps we are at cross purposes here. There is no doubt that one could
create an architecture where you could extend addressing range arbitrarily.
I thought we were talking about why the PDP 10 couldn't easily be extended
in that way and thus why it was limited and that might have contributed to
either its demise or its lack of applicability to problems that wanted a
larger address space.

>> claimed in this thread is that the difficulty of doing that was a major
>> factor in limiting the PDP 10's growth. While I have no personal
>> experience to know whether that claim is true, it certainly seems
>> reasonable.
>>
>>>> Presumably, you can't easily add a few bits to the length of the basic
>>>> instruction, as that would break existing programs.
>>> I'm asking why not?

Because, without the "end of record" character mechanism, or some
equivalent, the CPU has no way of knowing whether it is executing the
old, 36 bit instructions or the new, longer ones. Without knowing how
long an instruction is, it can't know where the next one starts.
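
To make that concrete, here is a small Python sketch. The same four words of memory split into different "instructions" depending on which length the decoder assumes, and nothing in the words themselves says which split is right. The word values are arbitrary 36-bit numbers chosen for the example, and the two-word format is hypothetical.

```python
def split_instructions(words: list[int], words_per_instruction: int) -> list[tuple[int, ...]]:
    """Group a flat list of memory words into fixed-size instructions."""
    return [tuple(words[i:i + words_per_instruction])
            for i in range(0, len(words), words_per_instruction)]


# Four arbitrary 36-bit words (12 octal digits each), illustration only.
memory = [0o200000000100, 0o202000000101, 0o254000000200, 0o000000000000]

old_view = split_instructions(memory, 1)  # "old" one-word instructions
new_view = split_instructions(memory, 2)  # hypothetical two-word instructions

print(len(old_view))  # 4 instructions
print(len(new_view))  # 2 instructions
```

Under the one-word reading the second instruction starts at word 1; under the two-word reading word 1 is the tail of the first instruction.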

>> Because then you have to have some way of telling the CPU whether it is
>> executing the "old" instructions or the "new" ones, without changing how
>> it treats the old ones.
>
> I would define the "new" instructions with a new opcode.
> For instance opcode 101 would be the class of instructions that
> did full word moves. Opcode 1011 is a move full 36-bit word from memory
> into register. Opcode 1012 is the other way. Two decades
> later the shiny new CPU designer wants a "move two full 36-bit words
> into zbunkregister" (new hardware component breakthrough). So he
> defines that opcode to be 10150. (50 MOVE flavored instructions
> had been defined over the two decades.)

But note that you require more bits to hold "10150" than "1011". So you
need a longer instruction to hold those extra bits and, absent a
mechanism to know whether you are executing the shorter or longer
instructions, you have a problem. :-(
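
A quick Python check of the bit counts, assuming the opcode numbers in the example are octal (the usual PDP-10 notation; the post does not say). The helper name is made up for the illustration. For comparison, the PDP-10 opcode field is 9 bits wide.

```python
PDP10_OPCODE_FIELD_BITS = 9  # width of the opcode field in a PDP-10 instruction


def bits_needed(opcode_digits: str, base: int = 8) -> int:
    """Minimum number of bits needed to hold the given opcode value.

    `base` defaults to octal, which is an assumption about the example.
    """
    return int(opcode_digits, base).bit_length()


for opcode in ("1011", "10150"):
    needed = bits_needed(opcode)
    print(opcode, needed, needed <= PDP10_OPCODE_FIELD_BITS)
```

Either value needs more than 9 bits, and the longer opcode needs more bits again than the shorter one, so the field has to grow each time.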

>>> I don't know if it's online anymore but my USAGE file specification
>>> gave instructions about how any customer site could extend
>>> the USAGE records. As long as all new fields were added to the
>>> end of each record, the "old" code would read "new" formatted records
>>> with no error.
>>>
>>> So why can't you do the same thing with a new machine instruction
>>> format definition?
>> Presumably (and again, I don't know PDP 10 stuff), there was some
>> mechanism for the software to know where the next record started.
>
> It was an agreed standard that a particular character pattern
> would always be the end of record pattern.
>
>> So if
>> the old software read the first N bytes or words of the record, when it
>> went to read the second record,
>
> Yes.
>
>> it or some software library it used
>> "knew" where to start that second read (a record length coded somewhere,
>> an end-of-record character, etc.) That is the information that is
>> missing in the PDP 10 ISA, so the CPU doesn't know where the next
>> instruction starts.
>
> Right. And that's what I think needs to be designed. That's the
> extensibility piece.
>
> If a machine instruction format can be designed so that any of its
> fields and values can be expanded, the biz wouldn't have to
> go through a "need more bits and have to rewrite the world to
> get them" paradigm.
>
> I've never heard of hardware including extensibility in its design;
> they left it as an exercise for the software to sort out.

Many architectures allowed for expandability, at least in some
dimensions. But there is a cost of doing that! The reserved op-codes
you mentioned are a form of extensibility and are cheap, so most
architectures support them. Supporting co-processors is another form of
extensibility that is relatively cheap and so many systems support them.

Allowing instructions of a variable number of bits (to support increased
addressability) used to be expensive (in terms of the hardware required
to decode them and the extra time (hence reduced performance) they also
require). That is why the PDP 10 had fixed length instructions, and
(part of) why the 8086 and the VAX (variable length instructions) were
slower than the early RISC CPUs which had fixed length instructions.
This is much less expensive now, so a hypothetical new design might very
well use variable length instructions.

--
- Stephen Fuld
(e-mail address disguised to prevent spam)

From: Rich Alderson on 19 Mar 2007 15:31

jmfbahciv(a)aol.com writes:

> In article <86odmq22xt.fsf(a)brain.hack.org>,
> Michael Widerkrantz <mc(a)hack.org> wrote:
>> jmfbahciv(a)aol.com writes:

>>> Then, as with TOPS-10, DEC essentially canned VMS.

>> Are you aware that HP still delivers OpenVMS systems?

> Yes. I'm aware of it.

>> HP also still
>> sells[1] and supports Alpha systems although they have moved on to
>> IA64 (sometimes known as the Itanic in comp.arch). The largest server
>> is the Integrity Superdome with 64 processors (128 cores, 32 cores
>> supported under OpenVMS) and 2 Terabytes RAM in a single server.

>> OpenVMS pages at HP:

>> http://www.hp.com/go/openvms/

>> [1] Until April 27, 2007. So buy now!

> Q.E.D.

I'm not so sure what's so Q.E.D. about this, in the context of "I'm not talking
about hardware, I'm talking about systems." OpenVMS on Itanic^Hum apparently
works just fine, according to some folks in comp.os.vms.

Of course, there are those in c.o.v who feel about Alpha the way we feel about
the PDP-10, but they have a fallback that LSG customers did not. (If you need
OpenVMS cycles, you can hold your nose and move to the new hardware.)

--
Rich Alderson | /"\ ASCII ribbon |
news(a)alderson.users.panix.com | \ / campaign against |
"You get what anybody gets. You get a lifetime." | x HTML mail and |
--Death, of the Endless | / \ postings |

From: Rich Alderson on 19 Mar 2007 16:30

jmfbahciv(a)aol.com writes:

> The -10 already had "segmentation" in its philosophy. You didn't run any
> code without GETSEGing. So I don't understand how you expected an
> enhancement to make segment boundaries invisible.

STOP!

IRRELEVANCY ALERT!

NOT THE THING BEING DISCUSSED AT ALL!

<ahem> Now that I have your attention, please allow me to discourse a little on
the two things merged like creatures in a 1950's monster film.

The Tops-10 operating system has a notion, originally based in the hardware of
the first PDP-10 (the 1967 KA10 processor) but divorced from that in all later
processors with the same ISA, of high and low "segments" of core^Wmemory. It
is possible to give the two segments different hardware-level protection, and
to change the start address of the high segment (usually 400000, that is to
say, half of the address space). As usually written, a user program has a
sharable "high segment" and a non-sharable "low segment"; HLL runtimes, for
example, are written to be "high segments".

(The OS, of course, maintains a table of physical starting addresses for each
segment of each job, and relocates as is usual in any timesharing architecture,
but this is not relevant to the programmer-level view of the situation.)

Sometimes code is too large to fit in the 18-bit address space, and overlays
are necessary (as in any small-memory computer system). The Tops-10 innovation
is to give explicit runtime control of what "segment" may be overlaid next to
the programmer (as opposed to the OS-mediated overlays familiar to programmers
on, for example, IBM 360 systems). This is done with the system call GETSEG,
and that is what Barb is thinking of in her statement above.

Note that neither Tops-20 nor ITS, PDP-10 operating systems which do not share
code with Tops-10, has this style of overlay handling, so it is not required
even on the PDP-10.

[Note to Barb:

What is meant by "segmented addressing" in the posts by others than yourself
in this thread is an entirely unrelated notion, more similar to the relocation
registers in the KA10 that allowed the development of the Tops-10 "segments"
feature. It is best known to most younger programmers from the early Intel
x86 family of processors: There is a hardware register into which the
programmer or the OS puts a segment address, and from that point on until the
register is changed, all addresses in the running program are modified by the
automatic addition of the segment address (left shifted IIRC 4 bits, to
extend the base 16-bit address space to 20 bits). --rma
]
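
The early-x86 address calculation described in the note can be sketched in Python as follows. The function name is made up for the example; the 4-bit shift and the 20-bit result are as described above, and the mask models the wraparound of an 8086-style 20-bit address bus.

```python
ADDRESS_MASK = 0xFFFFF  # 20-bit physical address space


def physical_address(segment: int, offset: int) -> int:
    """Combine a 16-bit segment register value and a 16-bit offset
    into a 20-bit physical address (early x86 style)."""
    return (((segment & 0xFFFF) << 4) + (offset & 0xFFFF)) & ADDRESS_MASK


print(hex(physical_address(0x1234, 0x0010)))  # 0x12350
print(hex(physical_address(0xFFFF, 0x0010)))  # 0x0, wraps past the 20-bit limit
```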

Extended addressing on the KL10 and later processors is not *precisely* a
segmented memory architecture in the sense of the early x86's, but it shares
enough commonality of problems that it can be lumped in with them, even if we
call them "sections" instead of "segments".

In PDP-10 extended addresses, the 18-bit address space is extended by 12 bits
(on the KL10, only 7 implemented in hardware; on the Toad-1, the full 12), with
the processor state machine (loosely, "the microcode") taking cognizance of
whether instructions are being executed in "section 0" or a non-zero section.
Certain instructions are relegated to "local-section-only" status, meaning they
are restricted to any particular 18-bit subset of the global address space in
which they are used: Increments from 777777 wrap to 0 rather than going to
1000000 (which would be the next section start), and so on.
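
As a rough model of the above (a Python sketch, not a description of the actual microcode), a 30-bit extended address can be treated as a 12-bit section number on top of an 18-bit in-section address. The function names are invented for the illustration; the "global" increment is included only to show the contrast with the local-section-only behaviour.

```python
SECTION_BITS = 12
LOCAL_BITS = 18
LOCAL_MASK = (1 << LOCAL_BITS) - 1                     # 0o777777
GLOBAL_MASK = (1 << (SECTION_BITS + LOCAL_BITS)) - 1   # 30-bit address


def make_address(section: int, local: int) -> int:
    """Build a 30-bit address from a section number and an 18-bit address."""
    return ((section << LOCAL_BITS) | (local & LOCAL_MASK)) & GLOBAL_MASK


def increment_local(address: int) -> int:
    """Local-section-only increment: 777777 wraps to 0 in the same section."""
    section = address >> LOCAL_BITS
    return make_address(section, (address + 1) & LOCAL_MASK)


def increment_global(address: int) -> int:
    """Increment across the whole 30-bit space: 777777 carries into the
    next section."""
    return (address + 1) & GLOBAL_MASK


top_of_section_2 = make_address(2, 0o777777)
print(oct(increment_local(top_of_section_2)))   # 0o2000000, still section 2
print(oct(increment_global(top_of_section_2)))  # 0o3000000, start of section 3
```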

Now that we have that straight...

--
Rich Alderson | /"\ ASCII ribbon |
news(a)alderson.users.panix.com | \ / campaign against |
"You get what anybody gets. You get a lifetime." | x HTML mail and |
--Death, of the Endless | / \ postings |

From: jmfbahciv on 20 Mar 2007 08:51

In article <m3y7lv3n0a.fsf(a)garlic.com>,
Anne & Lynn Wheeler <lynn(a)garlic.com> wrote:
>jmfbahciv(a)aol.com writes:
>> All of this cancelling stuff...when did this happen? I don't want
>> a year; I want a context of the state of the economy at the time.
>> (I don't know a better way to write the question).
>
>re:
>http://www.garlic.com/~lynn/2007f.html#25 The Perfect Computer - 36bits?
>
>early '76 ... the issue on cancelling vm370 and moving the resources
>to getting mvs/xa out the door ... wasn't particularly tied to the
>general economy ... it was that the corporation had been so focused on
>turning out FS (as a 360/370 replacement) ... that the 360/370 product
>pipeline was bare.

Work pressures were very much tied to the economy back then.
1976 was just before corporate asked all salaried people to
work 48 instead of the 40 hours.* I assume that a lot of project
planning decisions were changed because of not being able to
hire help.
<snip>

>one of the final nails in the FS coffin was report by the Houston
>Science Center ... something about if you took ACP/TPF running on
>370/195 and moved it to equivalent FS machine (made out of fastest
>hardware then available) ... the thruput would approx. be that of
>running ACP/TPF on 370/145 (i.e. something like 20 to 30 times
>slower).
>
>a few other references to FS ... which i've previously posted, can be
>found here:
>http://www.ecole.org/Crisis_and_change_1995_1.htm
>
>from above:
>
>IBM tried to react by launching a major project called the 'Future
>System' (FS) in the early 1970's. The idea was to get so far ahead
>that the competition would never be able to keep up, and to have such
>a high level of integration that it would be impossible for
>competitors to follow a compatible niche strategy. However, the
>project failed because the objectives were too ambitious for the
>available technology. Many of the ideas that were developed were
>nevertheless adapted for later generations. Once IBM had acknowledged
>this failure, it launched its 'box strategy', which called for
>competitiveness with all the different types of compatible
>sub-systems. But this proved to be difficult because of IBM's cost
>structure and its R&D spending, and the strategy only resulted in a
>partial narrowing of the price gap between IBM and its rivals.

That difficulty is still something I need to learn about. I
don't seem to be able to leap to the appropriate conclusions
when cost structures and R&D spending are mentioned. In my
day of working, that was what managers were supposed to deal
with.

>
>.... snip ...
>
>i.e. focusing on high-level of integration as countermeasure to the
>clone controllers & device business ... as previously mentioned, I've
>been blamed for helping start (project worked on as undergraduate in
>the 60s).
>http://www.garlic.com/~lynn/subtopic.html#360pcm
>
>The decision to shutdown the burlington mall group and move everybody
>to POK had been made, but it appeared that the people weren't going to
>be told until just before they had to be moved ... leaving people with
>little or no time to make the decision and/or make alternative plans.

That decision was just plain stupid. People would disagree at
great volume, but, if given enough time to plan, would go along
with a decision.
>
>Then there was a "leak" ... and a subsequent "witch hunt" attempting
>to identify the source of the leak ...

Sounds like what HP just went through.

<snip>

/BAH
