From: Terje Mathisen "terje.mathisen at on
Robert Myers wrote:
> Servers not only have different workloads, they use different
> operating systems, and I'll take a wild guess that almost any server
> OS can take advantage of intelligent I/O better than Desktop Windows,
> which, I speculate, could take advantage of it hardly at all without a
> serious rewrite.

Right.

I haven't written any Win* server code in a _long_ time.

Linux/*BSD, however, is quite another matter, and in this area OS support
happens _very_ quickly indeed.

E.g., I happen to know that at least one research CPU runs only Linux,
not Win*, simply because it was easy to tweak the OS source to support
it, while getting MS to do the same for Win* is pretty much impossible.

Terje

--
- <Terje.Mathisen at tmsw.no>
"almost all programming can be viewed as an exercise in caching"
From: George Neuner on
On Fri, 28 May 2010 10:42:46 +0200, Terje Mathisen
<terje.mathisen at tmsw.no> wrote:


>On the Apple II, the design point was to use as little HW as the Woz
>could get away with, including the (in)famous sw diskette interface.
>
>On the first PC we had a very similar situation, up to and including the
>choice of the 8088 instead of the 8086 in order to get fewer and cheaper
>(8 vs 16 bits!) interface/memory chips.

That's an interesting spin ... there were more chips on the original
PC's MDA video board than in the entire Apple II.

The PC designers were concerned about cost - no doubt - but they were
hardly averse to a high chip count: they avoided expensive chips by
using lots of cheap ones. Wozniak would use an expensive chip if it
reduced overall hardware complexity. Very different design attitudes.

George
From: Andrew Reilly on
On Fri, 28 May 2010 10:42:46 +0200, Terje Mathisen wrote:

> On the Apple II, the design point was to use as little HW as the Woz
> could get away with, including the (in)famous sw diskette interface.

Don't forget the ZX-80 along that spectrum: it spent cycles running
user programs only during the video vertical retrace, because the rest
of the time the CPU was busy copying the video RAM out to the video
output jack (it was only black and white, so no DAC to speak of...)

I thought it gross and icky at the time, but it has a certain minimalist
elegance now: also puts the users and their code in their place. :-)

Cheers,

--
Andrew
From: Morten Reistad on
In article <htmoal$u5$1(a)usenet01.boi.hp.com>,
FredK <fred.nospam(a)dec.com> wrote:
>
>"Morten Reistad" <first(a)last.name> wrote in message
>news:rpo2d7-ho3.ln1(a)laptop.reistad.name...
>> In article <htjt4h$pi0$1(a)usenet01.boi.hp.com>,
>> FredK <fred.nospam(a)dec.com> wrote:
>>>
>>><nmm1(a)cam.ac.uk> wrote in message
>>>news:htjphb$edj$1(a)smaug.linux.pwf.cam.ac.uk...
>>>> In article
>>>> <7b6474ee-81dd-4ab7-be1d-756a544ed515(a)u7g2000vbq.googlegroups.com>,
>>>> Robert Myers <rbmyersusa(a)gmail.com> wrote:
>>
>>>You are on a PDP11 and you want to have IO. Propose the alternative to
>>>interrupts that provides low latency servicing of the device. Today you
>>>can
>>>create elaborate IO cards with offload engines, but ultimately you need to
>>>talk to the actual computer system which is otherwise engaged in general
>>>computing.
>>
>> If that PDP11 has a good number of processors, dedicating one of them
>> to handle low-level I/O, and building a FIFO with hardware-provided
>> atomic reads and writes (not _that_ hard to do) plus a simple
>> block-on-read, should solve that.
>>
>
>The DEC PDP-11 was a single-processor minicomputer from the '70s (there was
>a dual-CPU 11/74 IIRC). On these systems it was not feasible from a
>cost/complexity viewpoint to implement "IO channel" processors. Just as it
>wasn't reasonable for those systems/CPUs that ultimately resulted in the
>x86.

The PDP11/74 had 4 KB11-C processors. Or KB11-CM, if you are a DEC god.

With a modern PDP11 implementation it would be natural to fill the
die with processors and cache, since the relevant applications and
operating systems handle that pretty well; the PDP11 has some hurdles
regarding pipelining, but handles caching very well.

Implementing a lightning fast FIFO for such a PDP11 should be pretty simple.
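
A minimal sketch of what I mean, using C11 atomics to stand in for the
hardware-provided atomic reads and writes (all names are invented for
the example, and sched_yield() stands in for a true hardware
block-on-read):

#include <stdatomic.h>
#include <stdint.h>
#include <sched.h>              /* sched_yield(), POSIX */

#define FIFO_SLOTS 256          /* power of two, so wrap is a cheap mask */

struct io_fifo {
    _Atomic uint32_t head;      /* written only by the I/O processor */
    _Atomic uint32_t tail;      /* written only by the consumer      */
    uint32_t         slot[FIFO_SLOTS];
};

/* Producer: the dedicated I/O CPU pushes a completed request. */
static int fifo_push(struct io_fifo *f, uint32_t v)
{
    uint32_t h = atomic_load_explicit(&f->head, memory_order_relaxed);
    uint32_t t = atomic_load_explicit(&f->tail, memory_order_acquire);
    if (h - t == FIFO_SLOTS)
        return 0;               /* full: caller retries or drops */
    f->slot[h & (FIFO_SLOTS - 1)] = v;
    atomic_store_explicit(&f->head, h + 1, memory_order_release);
    return 1;
}

/* Consumer: "simple block on read" -- wait until data arrives,
   no interrupt involved. */
static uint32_t fifo_pop(struct io_fifo *f)
{
    uint32_t t = atomic_load_explicit(&f->tail, memory_order_relaxed);
    while (atomic_load_explicit(&f->head, memory_order_acquire) == t)
        sched_yield();          /* real hardware would stall the read */
    uint32_t v = f->slot[t & (FIFO_SLOTS - 1)];
    atomic_store_explicit(&f->tail, t + 1, memory_order_release);
    return v;
}

With one producer and one consumer this needs no locks at all; each
index has a single writer, which is exactly what makes the hardware
version cheap.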

>>>On many (most? all?) of today's SMP systems you can generally (depending
>>>on the OS I suppose) direct IO interrupts to specific CPUs and have CPUs
>>>that may get no interrupts (or only limited interrupts like a clock
>>>interrupt).
>>
>> You would still need to signal other CPUs, but that signal does not
>> have to be a very precise interrupt. That CPU can easily handle a
>> few instructions more before responding. It could e.g. easily run its
>> pipeline dry first.
>>
>
>For an IO interrupt, the other CPUs don't need to be informed about the
>interrupt itself. As to when the CPU is ready to handle an interrupt,
>that is, I imagine, CPU-architecture-specific (running pipelines "dry",
>etc.). I believe the general gripe about interrupts (and by extension
>"faults" in general, which in many cases are handled using the same
>mechanism) is the overhead of saving and subsequently restoring the
>current context.

So why do that when a pretty simple and effective alternative is in
place?
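
To make "not a very precise interrupt" concrete, here is a toy
doorbell in C11 (every name here is invented for the illustration):
the sender just sets a per-CPU flag, and the target CPU notices it at
an instruction boundary of its own choosing, so nothing beyond normal
call state ever needs saving.

#include <stdatomic.h>
#include <stdbool.h>

#define MAX_CPUS 8

static _Atomic bool doorbell[MAX_CPUS];

/* Placeholders for the real work. */
static void handle_pending_io(int cpu) { (void)cpu; /* drain the FIFO */ }
static void do_some_work(int cpu)      { (void)cpu; /* normal compute */ }

/* Sender: one cheap store, no inter-processor interrupt. */
void signal_cpu(int cpu)
{
    atomic_store_explicit(&doorbell[cpu], true, memory_order_release);
}

/* Target: polls once per iteration of its main loop, so it may run
   "a few instructions more" before responding. */
void cpu_main_loop(int self)
{
    for (;;) {
        if (atomic_exchange_explicit(&doorbell[self], false,
                                     memory_order_acquire))
            handle_pending_io(self);
        do_some_work(self);
    }
}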

>
>> With lots and lots of almost-as-fast processors we need to rethink a
>> lot what we do.

-- mrr


From: FredK on

"Morten Reistad" <first(a)last.name> wrote in message
news:1o2dd7-4b.ln1(a)laptop.reistad.name...
> In article <htmoal$u5$1(a)usenet01.boi.hp.com>,
> FredK <fred.nospam(a)dec.com> wrote:
>>
>>"Morten Reistad" <first(a)last.name> wrote in message
>>news:rpo2d7-ho3.ln1(a)laptop.reistad.name...
>>> In article <htjt4h$pi0$1(a)usenet01.boi.hp.com>,
>>> FredK <fred.nospam(a)dec.com> wrote:
>>>>
>>>><nmm1(a)cam.ac.uk> wrote in message
>>>>news:htjphb$edj$1(a)smaug.linux.pwf.cam.ac.uk...
>>>>> In article
>>>>> <7b6474ee-81dd-4ab7-be1d-756a544ed515(a)u7g2000vbq.googlegroups.com>,
>>>>> Robert Myers <rbmyersusa(a)gmail.com> wrote:
>>>
>>>>You are on a PDP11 and you want to have IO. Propose the alternative to
>>>>interrupts that provides low latency servicing of the device. Today you
>>>>can
>>>>create elaborate IO cards with offload engines, but ultimately you need
>>>>to
>>>>talk to the actual computer system which is otherwise engaged in general
>>>>computing.
>>>
>>> If that PDP11 has a good number of processors, dedicating one of them
>>> to handle low-level I/O, and building a FIFO with hardware-provided
>>> atomic reads and writes (not _that_ hard to do) plus a simple
>>> block-on-read, should solve that.
>>>
>>
>>The DEC PDP-11 was a single-processor minicomputer from the '70s (there
>>was a dual-CPU 11/74 IIRC). On these systems it was not feasible from a
>>cost/complexity viewpoint to implement "IO channel" processors. Just as
>>it wasn't reasonable for those systems/CPUs that ultimately resulted in
>>the x86.
>
> The PDP11/74 had 4 KB11-C processors. Or KB11-CM, if you are a DEC god.
>

Not a "DEC god", but still a DECie :-) As I wrote the side note on the
11/74 (since in general the PDP-11 was uniprocessor) I couldn't remember if
the 74 was 2 or 4 processors. Bit rot in the old brain.

> With a modern PDP11 implementation it would be natural to fill the
> die with processors and cache, since the relevant applications and
> operating systems handle that pretty well; the PDP11 has some hurdles
> regarding pipelining, but handles caching very well.
>
> Implementing a lightning fast FIFO for such a PDP11 should be pretty
> simple.
>

Now there is an idea, a "modern PDP11 implementation" - perhaps this
should move to folklore :-)

>>>>On many (most? all?) of today's SMP systems you can generally (depending
>>>>on the OS I suppose) direct IO interrupts to specific CPUs and have CPUs
>>>>that may get no interrupts (or only limited interrupts like a clock
>>>>interrupt).
>>>
>>> You would still need to signal other CPUs, but that signal does not
>>> have to be a very precise interrupt. That CPU can easily handle a
>>> few instructions more before responding. It could e.g. easily run its
>>> pipeline dry first.
>>>
>>
>>For an IO interrupt, the other CPUs don't need to be informed about the
>>interrupt itself. As to when the CPU is ready to handle an interrupt,
>>that is, I imagine, CPU-architecture-specific (running pipelines "dry",
>>etc.). I believe the general gripe about interrupts (and by extension
>>"faults" in general, which in many cases are handled using the same
>>mechanism) is the overhead of saving and subsequently restoring the
>>current context.
>
> So why do that when a pretty simple and effective alternative is in
> place?
>

I don't understand the statement, or perhaps it wasn't directed to me. The
IO interrupt mechanism is conceptually a simple and effective mechanism for
IO interrupts. Some current architectures (like Itanium) have made
interrupt handling more painful. But I'm just a lowly OS and driver
developer, and so far the CPU developers haven't asked me if I'd like
interrupt handling simplified :-). IO interrupts have become fundamental to
the design of many, if not all, OSes. The typical scheduler, for example
(which you can argue is the core of the OS), relies on a clock interrupt to
schedule competing threads of execution without relying on applications to
give up control themselves. Most OSes can schedule interrupts across CPUs
and have CPUs that take no IO interrupts aside from the clock, directed
interrupts, and faults.
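
To be concrete about that clock dependence, here is a toy version in C
(invented names, not any real OS): the tick handler only decrements the
running thread's quantum and raises a flag, and the actual switch
happens on the way out of the interrupt.

#include <stdatomic.h>
#include <stdbool.h>

#define QUANTUM_TICKS 10        /* time slice, in clock ticks */

struct thread {
    int quantum;                /* ticks left in this slice */
};

static struct thread *current;     /* thread running on this CPU */
static _Atomic bool need_resched;  /* checked at interrupt exit  */

/* Called from the clock ISR.  This is the only thing that lets the
   OS take the CPU back from a thread that never yields. */
void clock_tick(void)
{
    if (current && --current->quantum <= 0) {
        current->quantum = QUANTUM_TICKS;
        atomic_store(&need_resched, true);  /* switch at ISR exit */
    }
}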

I'm not quite sure how the OP asking a question on driver interrupts vs.
polling got to this point... but I'm all ears to hear ideas on "interrupt
free" designs that don't involve systems with hundreds/thousands of
CPUs/threads. Today the direction for IO is minimizing interrupts: PCIe
MSI interrupts, smart cards with "offload engines", distributed interrupts.

I think the question is less "interrupt free" than fewer and cheaper
interrupts.
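
By "fewer and cheaper" I mean schemes along these lines (a sketch with
invented names, not any real driver API): take one interrupt per burst,
mask the source, then poll until the queue drains, so a busy device
costs one interrupt per burst instead of one per packet.

#include <stdbool.h>

#define POLL_BUDGET 64          /* max units of work per poll pass */

/* Placeholders for the real device and scheduler hooks. */
static bool dev_has_work(void)    { return false; }
static void dev_process_one(void) { }
static void dev_irq_mask(void)    { }
static void dev_irq_unmask(void)  { }
static void schedule_poll(void)   { /* queue dev_poll() to run later */ }

/* Interrupt handler: do no real work, just switch to polled mode. */
void dev_isr(void)
{
    dev_irq_mask();             /* no more interrupts for now */
    schedule_poll();            /* defer the work to dev_poll() */
}

/* Poll loop, run outside interrupt context. */
void dev_poll(void)
{
    int done = 0;
    while (dev_has_work() && done < POLL_BUDGET) {
        dev_process_one();
        done++;
    }
    if (done < POLL_BUDGET)
        dev_irq_unmask();       /* device is quiet: back to interrupts */
    else
        schedule_poll();        /* still busy: keep polling */
}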