From: Morten Reistad on
In article <htobl4$lv1$1(a)usenet01.boi.hp.com>,
FredK <fred.nospam(a)dec.com> wrote:
>
>"Terje Mathisen" <"terje.mathisen at tmsw.no"> wrote in message
>news:89c4d7-5go.ln1(a)ntp.tmsw.no...
>> nmm1(a)cam.ac.uk wrote:
>>> In article<htmoal$u5$1(a)usenet01.boi.hp.com>,
>>> FredK<fred.nospam(a)dec.com> wrote:
>>>>
>
>snip
>
>>
>> With proper non-blocking queue handling, those working cores can run flat
>> out with no interrupts as long as there is any work at all to be done,
>> then go to sleep.
>>
>> Using an interrupt from an IO core to get out of sleep and start
>> processing again is a good idea from a power efficiency viewpoint.
>>
>
>The question being - how fast can you bring the CPU out of its "sleep"
>state, and how do you schedule servicing of the non-blocking queues without
>dedicating one or more cores strictly to handling them. The clock interrupt
>for example is typically the mechanism used for scheduling multiple
>processes competing for CPU time.

In this "USMP" (UnSymmetric MultiProcessing) Terje describes, let
the (many) "small" processors handle I/O, clocks, graphics rendering, etc.
If they are ISA-compatible with the (few) fast processors you can even
keep running on a few of the "small" ones if your CPU requirement is not
too big, and the system can power down a lot of processors when idle.

The interrupts sent to the "large" processors will then be mostly
"attention" interrupts, either to schedule a new process, wake up,
etc.
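
A minimal sketch of the drain-then-sleep pattern discussed above, not code
from anyone in this thread. A Python thread stands in for a "large" core and
a threading.Event stands in for the "attention" interrupt; the names Worker,
submit and attention are made up for illustration, and the doubling step is
placeholder work.

```python
import threading
from collections import deque


class Worker:
    """Stands in for a 'large' core: drains its queue, then sleeps."""

    def __init__(self):
        self.queue = deque()                # append/popleft are atomic in CPython
        self.attention = threading.Event()  # stands in for the attention interrupt
        self.results = []
        self.wakeups = 0
        self._stop = False

    def submit(self, item):
        """Called by an 'I/O core': enqueue first, then raise the interrupt."""
        self.queue.append(item)
        self.attention.set()

    def stop(self):
        self._stop = True
        self.attention.set()

    def run(self):
        while True:
            self.attention.wait()   # asleep until "interrupted"
            # Clear before draining, so a submit() that arrives during the
            # drain leaves the event set and cannot be lost.
            self.attention.clear()
            self.wakeups += 1
            while True:             # run flat out while there is work
                try:
                    item = self.queue.popleft()
                except IndexError:
                    break           # queue empty: go back to sleep
                self.results.append(item * 2)   # placeholder "real" work
            if self._stop and not self.queue:
                return
```

The ordering is the point: the producer enqueues before signalling and the
consumer clears the signal before draining, so the worst case is a spurious
wakeup rather than a missed one. A burst of submissions usually costs far
fewer wakeups than items.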

Since Linux already sends processors to sleep and wakes them with an
interrupt, the wakeup speed problem has been solved. This was a small issue
around the 80486 clones and Linux 1.2.something; it hasn't surfaced in a decade.

My laptop has been to sleep more than 10k times since I started the
previous paragraph.

We should also address the worst bits of the von Neumann bottleneck.
Having queue support in hardware, handling a few kilobytes of data at a
time, would be a huge help.
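
One possible reading of "queue support in hardware", sketched in software and
labelled as an assumption: a small fixed-size single-producer/single-consumer
ring in which each index is written by only one side. The class name Ring and
its methods are hypothetical. Python hides the memory-ordering problem that a
real implementation would have to solve with barriers or hardware help.

```python
class Ring:
    """Bounded single-producer/single-consumer queue (illustrative only)."""

    def __init__(self, capacity):
        self.buf = [None] * capacity
        self.head = 0   # next slot to read; written only by the consumer
        self.tail = 0   # next slot to write; written only by the producer

    def try_put(self, item):
        """Producer side. Returns False instead of blocking when full."""
        if self.tail - self.head == len(self.buf):
            return False
        self.buf[self.tail % len(self.buf)] = item
        self.tail += 1  # publish only after the slot has been filled
        return True

    def try_get(self):
        """Consumer side. Returns (False, None) instead of blocking when empty."""
        if self.head == self.tail:
            return (False, None)
        item = self.buf[self.head % len(self.buf)]
        self.head += 1  # release the slot only after it has been read
        return (True, item)
```

Neither side ever waits on the other; a full or empty ring is reported to the
caller, who decides whether to retry, do other work, or sleep.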

I cannot see that these ideas would require a lot of hardware to
implement.

-- mrr


From: Morten Reistad on
In article <009ddc0e-0446-48f2-985a-5a06f12e07f7(a)k31g2000vbu.googlegroups.com>,
Robert Myers <rbmyersusa(a)gmail.com> wrote:
>On May 28, 4:42 am, Terje Mathisen <"terje.mathisen at tmsw.no">
>wrote:
>
>> When/if we finally get lots of cores, some of which are really
>> low-power, in-order, with very fast context switching, then it makes
>> even more sense to allocate all IO processing to such cores and let the
>> big/power-hungry/OoO cores do the "real" processing.
>
>But it would likely take Microsoft to make such a step of any value in
>the desktop/notebook space, no?
>
>Servers not only have different workloads, they use different
>operating systems, and I'll take a wild guess that almost any server
>OS can take advantage of intelligent I/O better than Desktop Windows,
>which, I speculate, could take advantage of it hardly at all without a
>serious rewrite.

For the I/O handling we would probably have to make a hypervisor, to
be an "OS for operating systems", where the hypervisor presents services
to the OS, just like graphics processors do today. Windows does not
have to know about all the CPUs at all. Nor will Linux, for that matter.
Or you can have them coexist on the same machine.

-- mrr


From: Robert Myers on
On May 31, 1:18 pm, Morten Reistad <fi...(a)last.name> wrote:
> In article <009ddc0e-0446-48f2-985a-5a06f12e0...(a)k31g2000vbu.googlegroups..com>,
> Robert Myers  <rbmyers...(a)gmail.com> wrote:
>
> >On May 28, 4:42 am, Terje Mathisen <"terje.mathisen at tmsw.no">
> >wrote:
>
> >> When/if we finally get lots of cores, some of which are really
> >> low-power, in-order, with very fast context switching, then it makes
> >> even more sense to allocate all IO processing to such cores and let the
> >> big/power-hungry/OoO cores do the "real" processing.
>
> >But it would likely take Microsoft to make such a step of any value in
> >the desktop/notebook space, no?
>
> >Servers not only have different workloads, they use different
> >operating systems, and I'll take a wild guess that almost any server
> >OS can take advantage of intelligent I/O better than Desktop Windows,
> >which, I speculate, could take advantage of it hardly at all without a
> >serious rewrite.
>
> For the I/O handling we would probably have to make a hypervisor, to
> be an "os for operating systems". Where the hypervisor presents services
> to the OS, just like graphics processors do today. Windows does not
> have to know about all the cpus at all. Nor will Linux, for that matter.
> Or you can have them coexist on the same machine.
>
Ok. Thanks.

Robert.

From: FredK on

"Morten Reistad" <first(a)last.name> wrote in message
news:rf7dd7-r02.ln1(a)laptop.reistad.name...
> In article <htobl4$lv1$1(a)usenet01.boi.hp.com>,
> FredK <fred.nospam(a)dec.com> wrote:
>>
>>"Terje Mathisen" <"terje.mathisen at tmsw.no"> wrote in message
>>news:89c4d7-5go.ln1(a)ntp.tmsw.no...
>>> nmm1(a)cam.ac.uk wrote:
>>>> In article<htmoal$u5$1(a)usenet01.boi.hp.com>,
>>>> FredK<fred.nospam(a)dec.com> wrote:
>>>>>
>>
>>snip
>>
>>>
>>> With proper non-blocking queue handling, those working cores can run
>>> flat
>>> out with no interrupts as long as there is any work at all to be done,
>>> then go to sleep.
>>>
>>> Using an interrupt from an IO core to get out of sleep and start
>>> processing again is a good idea from a power efficiency viewpoint.
>>>
>>
>>The question being - how fast can you bring the CPU out of its "sleep"
>>state, and how do you schedule servicing of the non-blocking queues
>>without
>>dedicating one or more cores strictly to handling them. The clock
>>interrupt
>>for example is typically the mechanism used for scheduling multiple
>>processes competing for CPU time.
>
> In this "USMP" (UnSymmetric MultiProcessing) Terje describes, let
> the (many) "small" processors handle I/O; clocks, graphics rendering etc.
> If they are ISA-compatible with the (few) fast processors you can even
> keep running on a few of the "small" ones if your cpu-requirement is not
> too big, and the system can power down a lot of processors when idle.
>
> The interrupts sent to the "large" processors will then be mostly
> "attention" interrupts, either to schedule a new process, wake up,
> etc.
>

Why not "Asymmetric" SMP? Something sounds funny about "UnSymmetric".

What is large vs small? With cores apparently becoming "cheap" why
differentiate or build variations? Isn't it easier to stamp out many of the
same kind?

Why the need for a new paradigm? If the cores are all identical, then
simply take N out of the scheduling domain of executable user threads and
direct interrupts to only those CPUs. Now your applications are only
interrupted by the clock and faults.
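
A rough sketch of that partitioning, with assumptions flagged: the helper
names partition_cpus and irq_mask are made up, the choice of the lowest
numbered CPUs for I/O is arbitrary, and the mask format (hex, in
comma-separated 32-bit groups) is the one Linux uses for IRQ affinity files,
which nobody above specified. The code only computes the sets and masks; it
does not touch the system.

```python
def partition_cpus(n_cpus, n_io):
    """Split CPU ids into an interrupt-handling set and an application set."""
    if not 0 < n_io < n_cpus:
        raise ValueError("need at least one CPU on each side")
    io_cpus = set(range(n_io))            # arbitrary choice: lowest ids take IRQs
    app_cpus = set(range(n_io, n_cpus))   # the rest run user threads
    return io_cpus, app_cpus


def irq_mask(cpus):
    """Format a CPU set as a hex bitmask in comma-separated 32-bit groups."""
    mask = 0
    for cpu in cpus:
        mask |= 1 << cpu
    groups = []
    while True:
        groups.append(f"{mask & 0xFFFFFFFF:08x}")
        mask >>= 32
        if mask == 0:
            break
    return ",".join(reversed(groups))     # most significant group first
```

On an 8-CPU Linux box with two I/O cores, irq_mask(io_cpus) gives "00000003",
which an administrator could write to /proc/irq/<n>/smp_affinity, while
os.sched_setaffinity(pid, app_cpus) would confine a process to the other six.
Both of those steps are Linux-specific and need appropriate privileges.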

> Since Linux sends processors to sleep and wakes them with an interrupt
> already the wakeup speed has been solved. This was a small issue around
> the 80486 clones and Linux 1.2.something; hasn't surfaced in a decade.
>
> My laptop has been to sleep more than 10k times since I started the
> previous paragraph.
>

Well, in some C-state. Most OSes will enter a light sleep state in the idle
loop.

> We should also address the worst bits of the von Neumann bottleneck.
> Having queue support in hardware, handling a few k of data in each
> instant would be a huge help.
>

You need to explain this one to me more fully.

> I cannot see that these ideas would require a lot of hardware to
> implement.
>

Or none at all.


From: FredK on

"Morten Reistad" <first(a)last.name> wrote in message
news:bk7dd7-r02.ln1(a)laptop.reistad.name...
> In article
> <009ddc0e-0446-48f2-985a-5a06f12e07f7(a)k31g2000vbu.googlegroups.com>,
> Robert Myers <rbmyersusa(a)gmail.com> wrote:
>>On May 28, 4:42 am, Terje Mathisen <"terje.mathisen at tmsw.no">
>>wrote:
>>
>>> When/if we finally get lots of cores, some of which are really
>>> low-power, in-order, with very fast context switching, then it makes
>>> even more sense to allocate all IO processing to such cores and let the
>>> big/power-hungry/OoO cores do the "real" processing.
>>
>>But it would likely take Microsoft to make such a step of any value in
>>the desktop/notebook space, no?
>>
>>Servers not only have different workloads, they use different
>>operating systems, and I'll take a wild guess that almost any server
>>OS can take advantage of intelligent I/O better than Desktop Windows,
>>which, I speculate, could take advantage of it hardly at all without a
>>serious rewrite.
>
> For the I/O handling we would probably have to make a hypervisor, to
> be an "os for operating systems". Where the hypervisor presents services
> to the OS, just like graphics processors do today. Windows does not
> have to know about all the cpus at all. Nor will Linux, for that matter.
> Or you can have them coexist on the same machine.
>

I hate hypervisors. Yet another scheduling and abstraction layer to make
things slower and less responsive.

The core of the Windows kernel (NT) and its IO subsystem are pretty much as
"modern" as it gets. The issues with "intelligent IO" really have more to do
with the software stack that it is trying to plug into - like TCP/IP - and
little to do with "interrupts".

It's the old "wheel of reincarnation": where do you push what functionality,
and at what cost? I've seen server designs on the boards for decades with
all sorts of smart IO and high speed fabrics, etc, etc.