From: ranjit_mathews@yahoo.com on

Niels Jørgen Kruse wrote:
> ranjit_mathews(a)yahoo.com <ranjit_mathews(a)yahoo.com> wrote:
>
> > Niels Jørgen Kruse wrote:
> > > An EEtimes article
> > > <http://www.eetimes.com/showArticle.jhtml?articleID=193105767>,
> > > has the following information, not previously public (AFAIK):
> > >
> > > 1) L2 cache is 8 MB total (for the die presumably)
> >
> > That's what it was on the high end NorthStar Power3 based servers
> > nearly a decade back, although not on die.
>
> Northstar != POWER3.

Ah, so? What pre-POWER3 processor ran at 266MHz?

The POWER3 processor has also been designed to span several advances in
CMOS technology, allowing it to more than double its initial product
frequency of 200 MHz over its product lifetime. To date, POWER3
processors have been shipped in RS/6000 products at frequencies ranging
up to 450 MHz.
http://www.research.ibm.com/journal/rd/446/oconnell.html

> The POWER6 has 32 MB of offdie cache too
> (80GB/second bandwidth).

32MB off-die cache has been offered since the beginning of the decade.
It would be more interesting to know what the latency will be. On
high-end POWER4 servers, L3 was 100 cycles away from the processor
(e.g., 100 ns for a 1 GHz processor).
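
To make the cycles-to-time arithmetic explicit, here is a minimal sketch. The function name is made up for illustration, and it assumes the cycle count stays fixed as the clock changes, which real off-die caches generally don't guarantee.

```python
def latency_ns(cycles: float, clock_ghz: float) -> float:
    """Convert a latency in core cycles to nanoseconds.

    At f GHz one cycle lasts 1/f ns, so the latency is cycles / f.
    """
    if clock_ghz <= 0:
        raise ValueError("clock_ghz must be positive")
    return cycles / clock_ghz


# The figure quoted above: 100 cycles at 1 GHz.
print(latency_ns(100, 1.0))
# Hypothetical: the same 100 cycles at a 2 GHz clock.
print(latency_ns(100, 2.0))
```

If the absolute latency in ns stays roughly constant instead, the cycle count grows in proportion to the clock, which is why the cycle figure alone doesn't settle the question.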

> Since the amount is a reduction from POWER5 and
> IBM has discontinued their eDRAM in 65 nm, I guess this must be SRAM.
>
> A lot of slides from the Microprocessor Forum can be seen at
> <http://www.tecchannel.de/news/themen/technologie/450386/>. Those who
> can read German can get something out of the text too :-)
>
> It was a surprise to see the dispatch bandwidth increased to 7
> instructions per clock (from 5 in the POWER5). It was a common
> expectation that POWER6 had to be narrower to achieve the high clock.

I figured it would be something like 8. That was before I knew it would
have high clocks. Be that as it may, how many instructions can be
retired per clock? (7 issues per clock doesn't necessarily imply 7
retires per clock.)

> Mvh./Regards, Niels Jørgen Kruse, Vanløse, Denmark

From: ranjit_mathews@yahoo.com on

Niels Jørgen Kruse wrote:
> It was a surprise to see the dispatch bandwidth increased to 7
> instructions per clock (from 5 in the POWER5).

Come to think of it, the 5 was retires, not dispatches. I think 4
intops, 4 fpops and 2 branches could be dispatched per clock. Of
course, because of the lower retire rate, such a dispatch rate couldn't
be sustained indefinitely. Hmm, does a prefetch count as an instruction
in terms of cutting into the limit of how many instructions can be
issued per clock?
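
A toy model of why dispatch can't outrun retire for long. The widths are the numbers I recalled above (4 + 4 + 2 = 10 dispatched, 5 retired); the window size of 100 in-flight instructions is purely a made-up value, and this is in no way a model of the real POWER5 pipeline.

```python
def simulate(dispatch_width: int, retire_width: int,
             window_size: int, cycles: int) -> tuple[float, float]:
    """Return (dispatched per cycle, retired per cycle) for a toy core.

    Each cycle retires up to retire_width in-flight instructions, then
    dispatches up to dispatch_width new ones, limited by the free slots
    in a window of window_size in-flight instructions.
    """
    in_flight = 0
    dispatched = 0
    retired = 0
    for _ in range(cycles):
        r = min(retire_width, in_flight)
        in_flight -= r
        retired += r
        d = min(dispatch_width, window_size - in_flight)
        in_flight += d
        dispatched += d
    return dispatched / cycles, retired / cycles


# Hypothetical parameters: 10-wide dispatch, 5-wide retire, 100-entry window.
d_rate, r_rate = simulate(10, 5, 100, 1000)
print(f"dispatch/clock = {d_rate:.3f}, retire/clock = {r_rate:.3f}")
```

Once the window fills, dispatch is throttled to the retire width, so over a long run both rates converge on the smaller of the two.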

From: ranjit_mathews@yahoo.com on

Niels Jørgen Kruse wrote:
> Del Cecchi <cecchinospam(a)us.ibm.com> wrote:
>
> > Niels Jørgen Kruse wrote:
> > > Northstar != POWER3. The POWER6 has 32 MB of offdie cache too
> > > (80GB/second bandwidth). Since the amount is a reduction from POWER5 and
> > > IBM has discontinued their eDRAM in 65 nm, I guess this must be SRAM.
> >
> > eDRAM still appears to be available in 65nm bulk (CU65HP), don't know
> > about in SOI

It was called RLDRAM, AFAIK.
>
> OK, I don't remember exactly where I got that notion. 32MB SRAM is
> certainly possible in 65nm though.

It would be larger than 32MB RLDRAM at 130nm, so whether it would be
practicable might depend on how many cores the high end system will
have and how many cores share an L3. The p690 had 16 32MB L3s; does the
p595 have 32 L3 caches?

From: David Kanter on

ranjit_mathews(a)yahoo.com wrote:
> Niels Jørgen Kruse wrote:
> > Del Cecchi <cecchinospam(a)us.ibm.com> wrote:
> >
> > > Niels Jørgen Kruse wrote:
> > > eDRAM still appears to be available in 65nm bulk (CU65HP), don't know
> > > about in SOI
>
> It was called RLDRAM, AFAIK.

RLDRAM is entirely different than eDRAM.

> > OK, I don't remember exactly where I got that notion. 32MB SRAM is
> > certainly possible in 65nm though.
>
> It would be larger than 32MB RLDRAM at 130nm, so whether it would be
> practicable might depend on how many cores the high end system will
> have and how many cores share an L3. The p690 had 16 32MB L3s; does the
> p595 have 32 L3 caches?

Hint: There's no RLDRAM involved.

DK

From: ranjit_mathews@yahoo.com on

David Kanter wrote:
> ranjit_mathews(a)yahoo.com wrote:
> > Niels Jørgen Kruse wrote:
> > > Del Cecchi <cecchinospam(a)us.ibm.com> wrote:
> > >
> > > > Niels Jørgen Kruse wrote:
> > > > eDRAM still appears to be available in 65nm bulk (CU65HP), don't know
> > > > about in SOI
> >
> > It was called RLDRAM, AFAIK.
>
> RLDRAM is entirely different than eDRAM.

I once read a piece that said it was RLDRAM. If it wasn't, then the
piece was wrong.

> > > OK, I don't remember exactly where I got that notion. 32MB SRAM is
> > > certainly possible in 65nm though.
> >
> > It would be larger than 32MB RLDRAM at 130nm, so whether it would be
> > practicable might depend on how many cores the high end system will
> > have and how many cores share an L3. The p690 had 16 32MB L3s; does the
> > p595 have 32 L3 caches?
>
> Hint: There's no RLDRAM involved.
>
> DK