From: Ketil Malde on

Hi,

I can't seem to find any (post-Montecito) numbers on IA64 and
performance on 32-bit code. Does anybody know about relevant
benchmarks or even approximate performance numbers?

-k
--
If I haven't seen further, it is by standing in the footprints of giants
From: Rick Jones on
Ketil Malde <ketil+news(a)ii.uib.no> wrote:
> I can't seem to find any (post-Montecito) numbers on IA64 and
> performance on 32-bit code. Does anybody know about relevant
> benchmarks or even approximate performance numbers?

IIRC there are at least three architectures for which there are
emulators on IA64:

*) "x86" under Linux
*) PA-RISC under HP-UX
*) SPARC under something from Fujitsu

And since it is often a point of confusion (you said "emulated" in the
subject but just "performance on 32-bit code" in the body): there is
also native Itanium 32-bit application support in HP-UX.

Hopefully that will help you narrow your search.

rick jones
--
portable adj, code that compiles under more than one compiler
these opinions are mine, all mine; HP might not want them anyway... :)
feel free to post, OR email to rick.jones2 in hp.com but NOT BOTH...
From: Ketil Malde on
Rick Jones <rick.jones2(a)hp.com> writes:

>> I can't seem to find any (post-Montecito) numbers on IA64 and
>> performance on 32-bit code. Does anybody know about relevant
>> benchmarks or even approximate performance numbers?

> IIRC there are at least three architectures for which there are
> emulators on IA64:

> *) "x86" under Linux

(And under Windows, surely?)

Anyway, I apologize for not making myself clear enough. The problem
is that I have a program compiled for x86 (on a P4), and a potential
user of this program with an IA64 box. One particular test runs in
about ten seconds on my machine, but a similar test takes about thirty
minutes on his.

I don't have access to IA64 hardware, so I was wondering if that kind
of slowdown was to be expected, or what - and perhaps also if there is
something to be done - code to be avoided, etc.

I'm not sure exactly which incarnation of IA64 he is using, but a
factor of 180 vs x86 seems rather worse than I'd expect, even for an
older chip.

And no, I still can't find anything through search engines.

-k
--
If I haven't seen further, it is by standing in the footprints of giants
From: robertwessel2@yahoo.com on

Ketil Malde wrote:
> Anyway, I apologize for not making myself clear enough. The problem
> is that I have a program compiled for x86 (on a P4), and a potential
> user of this program with an IA64 box. One particular test runs in
> about ten seconds on my machine, but a similar test takes about thirty
> minutes on his.
>
> I don't have access to IA64 hardware, so I was wondering if that kind
> of slowdown was to be expected, or what - and perhaps also if there is
> something to be done - code to be avoided, etc.
>
> I'm not sure exactly which incarnation of IA64 he is using, but a
> factor of 180 vs x86 seems rather worse than I'd expect, even for an
> older chip.
>
> And no, I still can't find anything through search engines.


For current Itaniums running with the binary translator (not using the
older hardware x86 support), it's rare to see x86 code executing slower
than a P4 at half the Itanium 2's clock speed. IOW, a 1.6GHz Itanium 2
should run most x86 programs at least as fast as an 800MHz P4 would, and
even that estimate is generally pessimistic.

A 180-fold penalty is *not* common. Self-modifying code, dynamically
generated code, and/or code and data intermixed on a page might lead to
that.

If the end user is running an Itanium 1, you could probably expect
~100MHz P4 equivalent performance with the hardware emulation.
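
To put rough numbers on the rules of thumb above, here is a
back-of-the-envelope sketch. It assumes runtime scales inversely with
clock speed, which is crude, and the 3000 MHz clock for the P4 that
produced the ten-second run is purely a placeholder, since the actual
clock was not given. The function names are made up for illustration.

```python
def slowdown_factor(reference_seconds, observed_seconds):
    """How many times slower the observed run is than the reference run."""
    return observed_seconds / reference_seconds


def equivalent_p4_mhz(itanium2_mhz):
    """Rule of thumb from this post: translated x86 code on an Itanium 2
    rarely runs slower than a P4 at half the Itanium 2's clock."""
    return itanium2_mhz / 2


def estimated_seconds(reference_seconds, reference_p4_mhz, equivalent_mhz):
    """Scale a P4 runtime to a slower 'P4-equivalent' clock.
    Assumes runtime is inversely proportional to clock speed."""
    return reference_seconds * reference_p4_mhz / equivalent_mhz


if __name__ == "__main__":
    REFERENCE_P4_MHZ = 3000  # assumption: the real P4 clock was not stated

    # Reported: about ten seconds on the P4, about thirty minutes on the IA64 box.
    print("observed slowdown:", slowdown_factor(10, 30 * 60))

    # 1.6GHz Itanium 2 with the binary translator.
    print("Itanium 2 estimate (s):",
          estimated_seconds(10, REFERENCE_P4_MHZ, equivalent_p4_mhz(1600)))

    # Itanium 1 with hardware emulation, taken as ~100MHz P4 equivalent.
    print("Itanium 1 estimate (s):",
          estimated_seconds(10, REFERENCE_P4_MHZ, 100))
```

Under those assumptions neither estimate comes anywhere near thirty
minutes, which is why something pathological in the code looks more
likely than plain emulation overhead.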

From: Spoon on
Ketil Malde wrote:

> Anyway, I apologize for not making myself clear enough. The problem
> is that I have a program compiled for x86 (on a P4), and a potential
> user of this program with an IA64 box. One particular test runs in
> about ten seconds on my machine, but a similar test takes about thirty
> minutes on his.
>
> I don't have access to IA64 hardware, so I was wondering if that kind
> of slowdown was to be expected, or what - and perhaps also if there is
> something to be done - code to be avoided, etc.
>
> I'm not sure exactly which incarnation of IA64 he is using, but a
> factor of 180 vs x86 seems rather worse than I'd expect, even for an
> older chip.
>
> And no, I still can't find anything through search engines.

Hello Ketil,

As far as I understand, Intel calls it IA-32 Execution Layer.

They published a technical whitepaper in 2003 titled

IA-32 Execution Layer: a two-phase dynamic translator designed
to support IA-32 applications on Itanium-based systems

Have you already found that?

( http://www.intel.com/cd/ids/developer/asmo-na/eng/93086.htm )

They compare execution of native IA-64 binaries vs translated IA-32
binaries. mcf runs faster as a translated IA-32 binary! :-)

They also compare runtimes on a 1.6 GHz Xeon (NetBurst) vs IA-32
translation on a 1.5 GHz Itanium 2 (Madison AFAIK).

Hope this helps.