From: Bill Todd on 28 Jan 2007 07:49

Anton Ertl wrote:
> "kroger(a)princeton.edu" <kroger(a)princeton.edu> writes:
>> Though folks here might have a good handle on this. I've seen
>> conflicting reports that AMD's chip due this summer will exceed
>> current Intel chips. Are they talking about better performance CPU for
>> CPU, or just the aggregate performance of four cores?
>
> The stuff I have seen seemed to talk about aggregate performance
> (SpecFP-rate, and some TCP benchmark).
>
> BTW, I wonder why AMD is not pulling the same trick as Intel to get a
> quad-core in the mean-time: put two dual-cores in one package.

Possibly because it wouldn't be worth the effort (and any potential risk
of souring people on '4-core' AMD products before the real ones
appeared) - just to close a window that's scheduled to be only 6 months
wide. Intel, by contrast, plans to continue its multi-chip 'quad-core'
products through the next (45 nm.) generation, which makes its own
efforts in that area much more amortizable (as well as possibly being
easier to mate to its existing bus-oriented architecture, in contrast to
the asymmetry that David already mentioned for an Opteron implementation
of that ilk).

Finally, AMD arguably just doesn't need it: they've already got systems
that scale up as far as their current HT performance can take them (to 8
or 16 cores, depending upon the nature of the workload), and creating
such pseudo-quad-core beasts might not increase the total usable core
count at all (just reduce the socket count while increasing heat
dissipation challenges).

Intel scored a major win with Core2Duo, but Core2Quad (or whatever
they're calling it today) seems largely fanboy service (how many games
can actually *use* more than two cores to real advantage?). As Intel's
chipsets increase in bandwidth their 'quad cores' may become more
genuinely useful - just as Barcelona will when the next HT generation
debuts.

....

> Should not be more effort than the Athlon FX-72 nonsense,

I haven't looked at all closely, but had the impression that the FX-7x
setups were nothing more than normal two-socket Opteron systems - in
which case that effort was virtually nil.

- bill
From: Bill Todd on 29 Jan 2007 11:02

Quadibloc wrote:
> Bill Todd wrote:
>> Finally, AMD arguably just doesn't need it: they've already got systems
>> that scale up as far as their current HT performance can take them (to 8
>> or 16 cores, depending upon the nature of the workload), and creating
>> such pseudo-quad-core beasts might not increase the total usable core
>> count at all (just reduce the socket count while increasing heat
>> dissipation challenges).
>
> That's not an argument - given Microsoft's licensing policies.

Correction: it's not an argument for people who run multi-threaded
Microsoft products on servers with more than two cores (I guess there
are some who do, but as the core count increases the number of
Microsoft-based servers plummets), and even then is mitigated by the
fact that for multiple reasons a top-of-the-line quad-core package
generates less (in some cases *far* less) than twice the performance of
two top-of-the-line dual-core packages.

- bill
From: Terje Mathisen on 12 Feb 2007 13:53

Morten Reistad wrote:
> One example is codec transforms, mostly audio. These work in
> 20 millisecond samples with from 12 to 160 bytes per sample.
> The transformation code is in the low hundreds of K in size.
> Some must be performed in sequence, others do not have this
> restriction.
>
> It gets interesting when you are to transform thousands of streams.
>
> The data for several thousand streams plus the code will fit in
> cache.

That is nice. :-)

> Likewise, VPN streams are cpu-burners, even with well thought
> out stuff like AES.

Well, it shouldn't be (a cpu-burner)!

A 1996-era 200 Mhz PentiumPro could handle AES encryption/decryption for
a 100 Mbit/s full duplex link, which means that _very_ few current
servers need a single full core to handle all the available bandwidth
for VPN traffic. Multiple Gbit/s streams?

Terje

--
- <Terje.Mathisen(a)hda.hydro.com>
"almost all programming can be viewed as an exercise in caching"
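
A back-of-the-envelope reading of the figure in the post above, as a minimal sketch: if a 200 MHz core has to keep up with a 100 Mbit/s link in both directions, how many clock cycles does it get per byte? The function name and the assumption that both directions are fully saturated are illustrative and not from the post; protocol overhead is ignored.

```python
def cycles_per_byte_budget(clock_hz: float, link_bits_per_s: float,
                           full_duplex: bool = True) -> float:
    """Clock cycles available per byte if one core must keep up with the link.

    Assumption (not from the post): both directions are saturated when
    full_duplex is True, and only raw link bytes are counted.
    """
    directions = 2 if full_duplex else 1
    bytes_per_s = link_bits_per_s * directions / 8
    return clock_hz / bytes_per_s


# 200 MHz core, 100 Mbit/s full duplex, the figures quoted in the post.
print(cycles_per_byte_budget(200e6, 100e6))  # 8.0
```

Read this way, the claim amounts to roughly 8 cycles per byte of traffic on that machine. The same function can be applied to other clock rates and link speeds, with whatever figures the reader chooses to assume.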
From: Terje Mathisen on 13 Feb 2007 06:05

Nick Maclaren wrote:
> In article <a66rqe.uro.ln(a)via.reistad.name>,
> Morten Reistad <first(a)last.name> writes:
> |> The data for a frame varies from 320 bytes in slin, to 160 in
> |> u/a law, to 80 in g726/adpcm down to 33 bytes in gsm and even
> |> less in g729. the "class 2" codecs needs the last frame and
> |> some digested information available. 500 bytes for the worst case
> |> transcoding data has room to spare.
>
> From the above, I would suggest employing someone like Terje (or even me,

Morten can't employ me, but I'd still love to have a quick look at what
his code is doing. From all his informed writings I really doubt it can
be nearly as bad as it seems.

I.e. there _must_ be some really good reasons for why it is taking a
long time. One such reason is encodings similar to CABAC
(BluRay/HD-DVD) which likes to generate several mostly-unpredictable
branches per _bit_ of decoded data.

Morten, my inbox is waiting!

> though I doubt I am available!) to look at the code, rather than spending
> time parallelising. 41 milliseconds to convert 160 bytes, especially as
> it is much less on the slower systems, smacks of seriously sub-optimal
> code.

Terje

--
- <Terje.Mathisen(a)hda.hydro.com>
"almost all programming can be viewed as an exercise in caching"
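
The frame sizes and the 500-byte per-stream figure quoted in the post above can be turned into a rough working-set estimate. This is only a sketch: the stream count (4000) and the code size (300 KiB) are hypothetical stand-ins for "thousands of streams" and "low hundreds of K", and the function names are made up for illustration.

```python
# Frame sizes per 20 ms frame, in bytes, as quoted in the thread.
FRAME_BYTES = {"slin": 320, "ulaw/alaw": 160, "g726/adpcm": 80, "gsm": 33}

STATE_BYTES_PER_STREAM = 500   # worst-case transcoding data, per the quote
FRAME_INTERVAL_US = 20_000     # one frame every 20 ms


def working_set_bytes(streams: int, code_bytes: int) -> int:
    """Per-stream state for all streams plus the codec code itself."""
    return streams * STATE_BYTES_PER_STREAM + code_bytes


def per_frame_budget_us(streams: int) -> float:
    """Microseconds one core may spend per frame to serve every stream in real time."""
    return FRAME_INTERVAL_US / streams


# Hypothetical load: 4000 streams, 300 KiB of codec code.
total = working_set_bytes(4000, 300 * 1024)
print(total, "bytes")                      # 2307200 bytes
print(per_frame_budget_us(4000), "us per frame")  # 5.0 us per frame
```

Under those assumed numbers the working set is a little over 2 MB, and a single core would have about 5 microseconds per frame to keep every stream running in real time.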
From: Morten Reistad on 21 Feb 2007 05:55

In article <9mp6a4-jaq.ln1(a)osl016lin.hda.hydro.com>,
Terje Mathisen <terje.mathisen(a)hda.hydro.com> wrote:
>Morten Reistad wrote:
>> One example is codec transforms, mostly audio. These work in
>> 20 millisecond samples with from 12 to 160 bytes per sample.
>> The transformation code is in the low hundreds of K in size.
>> Some must be performed in sequence, others do not have this
>> restriction.
>>
>> It gets interesting when you are to transform thousands of streams.
>>
>> The data for several thousand streams plus the code will fit in
>> cache.
>
>That is nice. :-)
>
>> Likewise, VPN streams are cpu-burners, even with well thought
>> out stuff like AES.
>
>Well, it shouldn't be (a cpu-burner)!
>
>A 1996-era 200 Mhz PentiumPro could handle AES encryption/decryption for
>a 100 Mbit/s full duplex link, which means that _very_ few current
>servers need a single full core to handle all the available bandwidth
>for VPN traffic. Multiple Gbit/s streams?

And small packets. VoIP tends to generate streams of 100 pps per call,
as calls are quantized in 20 ms intervals.

-- mrr
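
The 100 pps figure follows from the 20 ms framing if each call is counted in both directions (50 packets per second each way). That two-direction reading is an assumption here, as are the function names and the 1000-call example. The sketch also shows how little payload those packets carry, which is the point about small packets.

```python
def packets_per_second(calls: int, frame_ms: int = 20, directions: int = 2) -> int:
    """Packets per second for a number of calls, one packet per frame per direction."""
    return calls * directions * 1000 // frame_ms


def payload_bits_per_second(calls: int, frame_bytes: int) -> int:
    """Codec payload only; packet headers are deliberately ignored."""
    return packets_per_second(calls) * frame_bytes * 8


print(packets_per_second(1))             # 100
print(packets_per_second(1000))          # 100000
print(payload_bits_per_second(1, 160))   # 128000 (u/a-law, both directions)
print(payload_bits_per_second(1, 33))    # 26400 (gsm, both directions)
```

With 33-byte gsm frames a call moves only about 26 kbit/s of payload but still costs 100 packets per second, so per-packet overhead rather than raw byte throughput is likely to dominate the CPU cost.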