From: Thomas Womack
In article <efgu43$dv3$1(a)news-int.gatech.edu>,
Gabriel Loh <my-last-name(a)cc.gatech.edu> wrote:
>
>> Where do you think the point of diminishing returns might
>> be?
>
>I haven't seen the topic come up in the parallel threads, but one
>question that's really interesting to think about (at least to me) is
>how in the world are you going to deliver power to all of these cores?

>BTW, did anyone see if Intel mentioned the total power consumption for
>their 80-core demo system? The wattage for the 80-core processor has to
>be astronomical, or the power budget per core has to be anemic

1.5 watts is an extravagant power budget for something like an ARM
core, and those cores were less than four square millimetres each, with
about a quarter of the area taken up by the router.

An ARM Cortex-A8 is 750MHz, three square millimetres in 65nm and
375mW; the ARM11 MPCore is 620MHz, 2.54 square millimetres in 90nm
(with 32KB of cache) and 300mW. I'm slightly surprised that nobody's
made a load-of-ARMs chip even as a proof of concept.

Tom
From: Thomas Womack
In article <UJCdnbZAYsZoW4bYnZ2dnUVZ_tmdnZ2d(a)metrocastcablevision.com>,
Bill Todd <billtodd(a)metrocast.net> wrote:
>Terje Mathisen wrote:

>> That 80-core Intel demo chip has a vertically mounted SRAM chip as well,
>> providing 20 MB (afair) directly to each core.

I suspect it has 20MB in total, i.e. 256KB per core; 20MB per core
would mean 1.6GB of SRAM on one chip, which is not remotely feasible
with current fabrication processes, whilst 20MB on 300mm^2 is twice
the density of Montecito, so just about right for 65nm.

>Well, since IIRC the processing cores are running at a princely 1.91 MHz
>(allegedly not a typo) I'm not sure how truly impressive that demo's
>performance would be: perhaps better to wait for the real thing in
>around 5 years' time.

That was a different demo, with a stack of boards plugging into a
Socket 7 (Pentium) motherboard and an FPGA running something capable
of enough of the x86 instruction set to run Windows XP; I believe it
was the HDL code for what Intel intend to use as the 'mini x86_64 core'.

Tom
From: Terje Mathisen
Joe Seigh wrote:
> Terje Mathisen wrote:
>> You still need some way to handle async inter-core communication!
>> I.e. I believe that you really don't have any choice here, except to
>> make most of your cores interruptible.
>
> Async presumes processors are a scarce commodity and you want to have them
> do other work while they're waiting for something to be done. That goes
> away if you have unlimited numbers of processors.

Unlimited? Yeah, in that case you can do a lot of stuff in new ways.

The problem that I can't see any easy way around is when you want to
do A, which depends on input from both B and C, which might occur in
any order.

It seems like the absolute minimum is to have a wait_for_any(...). The
alternative is to run around in a tight loop polling both B and C to
check if they have any data available, something which could cost you a
lot of memory bandwidth.
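
In pthreads terms, the minimal shape is something like the sketch
below (wait_for_any(), post() and the flag layout are names I've just
made up to show the idea, not any real API):

/* One flag word guarded by a mutex/condvar; B and C post their
   results, A blocks until at least one has arrived. */
#include <pthread.h>

#define FROM_B 1u
#define FROM_C 2u

static pthread_mutex_t lock  = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  ready = PTHREAD_COND_INITIALIZER;
static unsigned pending;         /* bitmask of inputs that have arrived */

void post(unsigned who)          /* producer side: B or C calls this */
{
    pthread_mutex_lock(&lock);
    pending |= who;
    pthread_cond_signal(&ready);
    pthread_mutex_unlock(&lock);
}

unsigned wait_for_any(void)      /* consumer side: A sleeps, no polling */
{
    unsigned got;
    pthread_mutex_lock(&lock);
    while (pending == 0)
        pthread_cond_wait(&ready, &lock);
    got = pending;
    pending = 0;
    pthread_mutex_unlock(&lock);
    return got;                  /* FROM_B, FROM_C, or both */
}

A gets descheduled inside pthread_cond_wait() and touches no memory at
all until B or C signals, which is exactly the bandwidth the tight
polling loop would have burned.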

Terje

--
- <Terje.Mathisen(a)hda.hydro.com>
"almost all programming can be viewed as an exercise in caching"
From: Jon Forrest
Felger Carbon wrote:

> A very few power users - say, 3 to 5 in the world - will be able to use lots
> and lots of cores. The vast majority of the public will not run more than
> one task at a time, which at this time means only one core.

I don't think this is true, at least not for *nix systems running
X-Windows. Think about what happens when you type a character in
an xterm window, or in any other window that echoes keystrokes.
Both the xterm and the X server have to run at the same time.
Of course, they don't have to run very long at the same time,
but it's a good example of how dual cores help.

If I remember correctly, in the early days of X, slow single
processor systems with limited memory resulted in noticeable
latencies for this reason.

Jon Forrest
From: Joe Seigh
Terje Mathisen wrote:
> Joe Seigh wrote:
>
>> Terje Mathisen wrote:
>>
>>> You still need some way to handle async inter-core communication!
>>> I.e. I believe that you really don't have any choice here, except to
>>> make most of your cores interruptible.
>>
>>
>> Async presumes processors are a scarce commodity and you want to have them
>> do other work while they're waiting for something to be done. That goes
>> away if you have unlimited numbers of processors.
>
>
> Unlimited? Yeah, in that case you can do a lot of stuff in new ways.
>
> The problem that I can't see any easy way around is when you want to
> do A, which depends on input from both B and C, which might occur in
> any order.
>
> It seems like the absolute minimum is to have a wait_for_any(...). The
> alternative is to run around in a tight loop polling both B and C to
> check if they have any data available, something which could cost you a
> lot of memory bandwidth.
>
Maybe. There are things like bus snooping which don't consume memory
bandwidth. It doesn't matter, though: the IPC mechanism might end up
looking totally different from anything we know today. The processor
manufacturers will have to solve it by the time they get up to
hundreds of cores.
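
For what it's worth, the snooping flavour already exists in embryo:
MONITOR/MWAIT on newer x86 parts lets a core park until another core
writes the cache line it is watching, so the coherence snoop replaces
the poll loop. A sketch only (the instructions are ring-0 on current
silicon, and "mailbox" is a name I made up):

#include <pmmintrin.h>           /* SSE3 MONITOR/MWAIT intrinsics */

volatile unsigned mailbox;       /* some other core writes this */

void park_until_poked(void)
{
    while (mailbox == 0) {
        _mm_monitor((void const *)&mailbox, 0, 0); /* arm the monitor */
        if (mailbox == 0)        /* re-check to close the race window */
            _mm_mwait(0, 0);     /* doze until the line is snooped */
    }
}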


--
Joe Seigh

When you get lemons, you make lemonade.
When you get hardware, you make software.