From: Betov on
o//annabee <fack(a)szmyggenpv.com> wrote in news:op.s57rx1o1ce7g4q(a)bonus:

> you cannot lie in asm

:]]]

This is surely the reason why this nerd wrote HLA.

:]]]

Betov.

< http://rosam.org >


From: Phil Carmody on
"ldb" <ldb_nospam(a)hotmail.com> writes:
> However he also says they use "simd intrinsics". To me, using
> intrinsics is essentially assembly programming.

Dan Bernstein's written a C-like pseudo-assembly language that
understands the kinds of operations in most processors' SIMD
instruction sets, such that you can program in what appears to
be an HLL, and yet where possible you get a 1-1 mapping of
operations onto instructions. (He has a pretty nifty register
allocator to this end. And a scheduler too.) If not, then it
turns them into good old-fashioned blocks of SISD. Unfortunately,
he's not exactly released this assembler yet, but has shown a
few sample outputs that demonstrate it does what it says on the
tin.
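Bernstein hasn't released the tool, but the kind of 1-1 mapping Phil
describes is easy to illustrate with ordinary C intrinsics (a sketch of
the general idea, not Bernstein's actual language): each `_mm_*` call
below compiles to a single SSE instruction, so the code reads like an
HLL but assembles almost literally.

```cpp
#include <xmmintrin.h>  // SSE intrinsics

// Computes dst = a*b + c over four floats at once.
// Each intrinsic maps to one SSE instruction:
//   _mm_loadu_ps  -> movups
//   _mm_mul_ps    -> mulps
//   _mm_add_ps    -> addps
//   _mm_storeu_ps -> movups
void madd4(float *dst, const float *a, const float *b, const float *c)
{
    __m128 va = _mm_loadu_ps(a);
    __m128 vb = _mm_loadu_ps(b);
    __m128 vc = _mm_loadu_ps(c);
    _mm_storeu_ps(dst, _mm_add_ps(_mm_mul_ps(va, vb), vc));
}
```

On a machine without SSE, a compiler (or a translator like the one
described above) can fall back to four scalar operations instead.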

http://cr.yp.to/

Phil
--
What is it: is man only a blunder of God, or God only a blunder of man?
-- Friedrich Nietzsche (1844-1900), Twilight of the Idols
From: Dragontamer on

ldb wrote:
> However he also says they use "simd intrinsics". To me, using
> intrinsics is essentially assembly programming. Doing your for-loop in
> C++, and having the meat being all SIMD instrinics seems like a logical
> way to proceed. You aren't going to get much performance increase by
> switching the entire loop into assembly.

I agree, although there were a few libraries that basically had a
"vector" class with operations like multiply(vector_1, vector_2). I
only remember the demo code for that library, and it's been a while.
So some abstraction can take place while keeping everything just as
fast, with a great improvement in readability and reusability.
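A minimal sketch of the kind of wrapper described above (the class and
function names are hypothetical, not from any particular library): the
abstraction costs nothing, because each operation still compiles down
to a single SSE instruction.

```cpp
#include <xmmintrin.h>  // SSE intrinsics

// Hypothetical "vector" class; multiply(v1, v2) is one mulps under the hood.
struct vec4 {
    __m128 v;
    vec4(float a, float b, float c, float d) : v(_mm_set_ps(d, c, b, a)) {}
    explicit vec4(__m128 x) : v(x) {}
    float operator[](int i) const {  // scalar read-back, for inspection only
        float tmp[4];
        _mm_storeu_ps(tmp, v);
        return tmp[i];
    }
};

inline vec4 multiply(vec4 a, vec4 b) { return vec4(_mm_mul_ps(a.v, b.v)); }
inline vec4 add(vec4 a, vec4 b)      { return vec4(_mm_add_ps(a.v, b.v)); }
```

With everything inlined, multiply(vec4(...), vec4(...)) leaves no trace
of the class at all in the generated code.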

Although if he means SIMD intrinsics as in messing with prefetch and
things like that :) that is most definitely just assembly language.

> I think his main point (from skimming the talk) is that 80% of the
> computation time is "embarrassingly" parallel, and that we need a
> language that can encapsulate that more simply. Really, it seems, we
> are headed to a better GPU language that handles SIMD operations more
> efficiently. I think that's the main point.

I think parallel-based languages should come out either way: as
dual-core CPUs become more popular, who knows how soon quad-core
becomes popular, and then oct-core, and then 128-core :-p

--Dragontamer

From: randyhyde@earthlink.net on

Dragontamer wrote:

> I think parallel-based languages should come out either way: as
> dual-core CPUs become more popular, who knows how soon quad-core
> becomes popular, and then oct-core, and then 128-core :-p

SMPs tend to reach the point of diminishing returns around 16
processors. At that point, cache and bus coherency problems tend to
cause too many delays. The few systems that have a large number of
processors tend to have *very* expensive busses and don't support the
kind of fine-grained multiprocessing you get with SMPs (if you want to
call that "fine-grained").

That's not to say that parallel languages won't be able to take
advantage of such systems. Quite the contrary, they may make such
systems more accessible. But I doubt you'll see typical PCs or
workstations exceeding 16 processors anytime soon (or even anytime far
away). At least, not in a configuration where a single application can
easily migrate between the processors during execution.
Cheers,
Randy Hyde

From: randyhyde@earthlink.net on

Phil Carmody wrote:
> "ldb" <ldb_nospam(a)hotmail.com> writes:
> > However he also says they use "simd intrinsics". To me, using
> > intrinsics is essentially assembly programming.
>
> Dan Bernstein's written a C-like pseudo-assembly language that
> understands the kinds of operations in most processors' SIMD
> instruction sets, such that you can program in what appears to
> be an HLL, and yet where possible you get a 1-1 mapping of
> operations onto instructions. (He has a pretty nifty register
> allocator to this end. And a scheduler too.) If not, then it
> turns them into good old-fashioned blocks of SISD. Unfortunately,
> he's not exactly released this assembler yet, but has shown a
> few sample outputs that demonstrate it does what it says on the
> tin.
>
> http://cr.yp.to/
>


Hi Phil,
I couldn't find the reference. Which of the links on this page contains
the sample code?
Cheers,
Randy Hyde