From: Skybuck Flying on
Yeah baby !

Skybuck's Universal Code must have impressed AMD LOL.

They added a LZCNT instruction.

Leading Zero Count.

Nice !

Might come in handy in the future !

My point:

If an Intel CPU does not have the LZCNT instruction then they can forget
about me buying their fricking chip.

I'll definitely go for an AMD chip... just for this single instruction, yeah
baby ! ;)

Just because they added this instruction I am going to do some research into
whether it allows new algorithms and fast methods for all kinds of stuff.

Of course my lips will be sealed for now because this is secret research, but
some fruits might come out of it ! ;) :D

Quite inspiring.

Bye,
Skybuck.


From: alex on
Skybuck Flying wrote:
> Yeah baby !
>
> Skybuck's Universal Code must have impressed AMD LOL.
>
> They added a LZCNT instruction.
>
> Leading Zero Count.
>
> Nice !
>
> Might come in handy in the future !
>
> My point:
>
> If an Intel CPU does not have the LZCNT instruction then they can forget
> about me buying their fricking chip.
>
> I'll definitely go for an AMD chip... just for this single instruction, yeah
> baby ! ;)
>
> Just because they added this instruction I am going to do some research into
> whether it allows new algorithms and fast methods for all kinds of stuff.
>
> Of course my lips will be sealed for now because this is secret research, but
> some fruits might come out of it ! ;) :D
>
> Quite inspiring.
>
> Bye,
> Skybuck.
>
>

Hopefully there are ROFL and LOL instructions for both 32- and 64-bit
platforms too.
From: robertwessel2 on
On Nov 20, 11:39 pm, "Skybuck Flying" <s...(a)hotmail.com> wrote:
> Yeah baby !
>
> Skybuck's Universal Code must have impressed AMD LOL.
>
> They added a LZCNT instruction.
>
> Leading Zero Count.
>
> Nice !
>
> Might come in handy in the future !
>
> My point:
>
> If an Intel CPU does not have the LZCNT instruction then they can forget
> about me buying their fricking chip.
>
> I'll definitely go for an AMD chip... just for this single instruction, yeah
> baby ! ;)
>
> Just because they added this instruction I am going to do some research into
> whether it allows new algorithms and fast methods for all kinds of stuff.
>
> Of course my lips will be sealed for now because this is secret research, but
> some fruits might come out of it ! ;) :D


As usual, if you had done even the slightest amount of research you
would have discovered that BSR has been in the ISA since the 386.
LZCNT is a somewhat improved version. The new instruction fixes a
design problem in that the old instruction does not deal with an
all-zero input in a particularly useful way (it only sets a condition
code), whereas the new instruction sets the result to 32 (or 64). The
use of the condition code is problematic for high-speed implementations
on heavily pipelined/out-of-order processors.
From: Skybuck Flying on

<robertwessel2(a)yahoo.com> wrote in message
news:3bac3f75-697b-4f5d-a835-89cf7375ba1f(a)j20g2000hsi.googlegroups.com...
> As usual, if you had done even the slightest amount of research you
> would have discovered that BSR has been in the ISA since the 386.
> LZCNT is a somewhat improved version. The new instruction fixes a
> design problem in that the old instruction does not deal with an
> all-zero input in a particularly useful way (it only sets a condition
> code), whereas the new instruction sets the result to 32 (or 64). The
> use of the condition code is problematic for high-speed implementations
> on heavily pipelined/out-of-order processors.

BSR returns the bit index.
If no one bit is found, the result is undefined.

I am not sure how LZCNT is supposed to work.

But I guess it returns a "count", which is different than a bit index.

So there are differences.

Try converting a BSR to a LZCNT, I'd like to see you do it...

How many instructions is it going to take you to mimic LZCNT with BSR + extra
instructions ? Hmmmmm ? ;)

Bye,
Skybuck.


From: robertwessel2 on
On Nov 21, 5:29 pm, "Skybuck Flying" <s...(a)hotmail.com> wrote:
> BSR returns the bit index.
> If no one bit is found, the result is undefined.
>
> I am not sure how LZCNT is supposed to work.
>
> But I guess it returns a "count", which is different than a bit index.
>
> So there are differences.
>
> Try converting a BSR to a LZCNT, I'd like to see you do it...
>
> How many instructions is it going to take you to mimic LZCNT with BSR + extra
> instructions ? Hmmmmm ? ;)


If you're "not sure how LZCNT is supposed to work," how can you
seriously ask any of these questions?

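First, the count and index values are trivially related: for a 32-bit
operand, count = 31 - index. IOW, the "count" of leading zeros in
0x40000000 (assuming a 32-bit value) is 1, and the index of the highest
one bit, which is what BSR returns, is 30.

As to equivalence:


lzcnt eax,ebx


bsr eax,ebx
jnz notzero
mov eax,63    ; zero input: 63 xor 31 = 32
notzero:
xor eax,31    ; eax = 31 - index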
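
For anyone who would rather see it in C, here is a rough sketch of getting a leading-zero count out of a BSR-style bit index, assuming 32-bit operands. The names bsr32_model and lzcnt32_from_bsr are made up for illustration, and the bit scan is a plain loop standing in for the real instruction, so treat it as a model of the semantics rather than something to benchmark.

```c
#include <stdint.h>

/* Hypothetical model of BSR on a 32-bit operand: stores the index of the
   highest set bit in *index and returns 1, or returns 0 for a zero input
   (the real instruction reports that case through the zero flag). */
static int bsr32_model(uint32_t x, uint32_t *index)
{
    if (x == 0)
        return 0;
    uint32_t i = 31;
    while ((x & (UINT32_C(1) << i)) == 0)
        i--;
    *index = i;
    return 1;
}

/* Leading-zero count built on the BSR model: 31 - index, or 32 for zero. */
static uint32_t lzcnt32_from_bsr(uint32_t x)
{
    uint32_t index;
    if (!bsr32_model(x, &index))
        return 32;               /* zero input: report the operand size */
    return 31 - index;           /* same as index ^ 31 for index in 0..31 */
}
```

With this model, an input of 0x40000000 gives a bit index of 30 and a count of 1, and an input of 0 gives a count of 32.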


Like I said, bsr handles the zero input case in a less than completely
useful way (and the way the target register is handled in that case
introduces an unnecessary dependency). OTOH, if you're searching for
the first set bit in a larger structure, it doesn't matter, since
you'd use the zero flag to loop.
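
To make the larger-structure case concrete, something like the following C sketch would do. It assumes the structure is an array of 32-bit words with the last word being the most significant. The name find_highest_set_bit is hypothetical, and the inner loop again stands in for a single BSR.

```c
#include <stddef.h>
#include <stdint.h>

/* Returns the index of the highest set bit in words[0..n-1], where
   words[n-1] is the most significant word, or -1 if every word is zero. */
static long find_highest_set_bit(const uint32_t *words, size_t n)
{
    for (size_t w = n; w-- > 0; ) {
        uint32_t x = words[w];
        if (x == 0)
            continue;            /* plays the role of looping on BSR's zero flag */
        uint32_t i = 31;
        while ((x & (UINT32_C(1) << i)) == 0)
            i--;                 /* bit scan standing in for BSR */
        return (long)(w * 32 + i);
    }
    return -1;
}
```

The zero-word case never needs a count at all here, which is why the awkward zero handling of BSR costs nothing in this kind of loop.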

Lzcnt also sets the carry and zero flags differently. Lzcnt sets the
carry flag if the input is zero (which is what bsr does to the zero
flag), and sets the zero flag to reflect the returned value. In both
cases the other flags are mostly undefined.
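
A small C model of the LZCNT flag behaviour just described, assuming a 32-bit operand. The struct and function names are invented for this sketch, and only the carry and zero flags are modelled since the rest are mostly undefined anyway.

```c
#include <stdint.h>

/* Hypothetical result record: the count plus the two flags discussed above. */
struct lzcnt_result {
    uint32_t value;  /* number of leading zero bits, 0..32 */
    int cf;          /* carry flag: 1 if the input was zero */
    int zf;          /* zero flag: 1 if the returned count is zero */
};

static struct lzcnt_result lzcnt32_model(uint32_t x)
{
    struct lzcnt_result r;
    r.value = 0;
    /* Walk down from bit 31 until a set bit is found or the mask runs out. */
    for (uint32_t mask = UINT32_C(0x80000000);
         mask != 0 && (x & mask) == 0;
         mask >>= 1)
        r.value++;
    r.cf = (x == 0);
    r.zf = (r.value == 0);
    return r;
}
```

So a zero input comes back as value 32 with the carry flag set, while an input with bit 31 set comes back as value 0 with the zero flag set.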

So when are you going to learn how to read documentation? Hmmmmmm?