From: Terje Mathisen on
Skybuck Flying wrote:
> Still though, your version Terja is not even partial... since even then it
My name is Terje, not Terja
> fails horribly hihihihi lol.

I guess I could compile & verify the throw-away code I wrote for you,
but I really don't care enough to do that. However, you are indeed right
that my initial version had some problems, i.e. the mask
generation/usage was buggy.

void WriteLongwordBits( unsigned long Value, unsigned int BitCount,
unsigned long int *DestAddress, unsigned int DestBitIndex )
{
/* Normalize bit index */
DestAddress += DestBitIndex >> 5;
DestBitIndex &= 31;

/* Index of last inserted bit: */
unsigned int lastbit = DestBitIndex + BitCount - 1;

/* Generate a mask of 0 bits below the first inserted bit */
unsigned int bottom_mask = ((unsigned int) -1) << DestBitIndex;

/* There must be 0 to 31 unmodified bits after the last inserted: */
unsigned int top_mask = ((unsigned int) -1) >> (31-(lastbit&31));

/* One or two affected words? */
if (lastbit >= 32) {
DestAddress[0] = (DestAddress[0] & ~bottom_mask) |
(Value << DestBitIndex);
/* Since we modify two words, DestBitIndex MUST be > 0! */
DestAddress[1] = (DestAddress[1] & ~top_mask) |
(Value >> (32-DestBitIndex));
}
else { /* It all fits in a single word! */
DestAddress[0] = (DestAddress[0] & ~(bottom_mask & top_mask)) |
(Value << DestBitIndex);
}
}

Do you like this version better?

If will have a runtime of about 10-12 cycles when correctly predicted to
not straddle a word boundary, while a predicted cross-over adds another
2-3 cycles.

Mispredicts take a long time of course. :-(

It is actually quite hard to generate branchless 32-bit code without
using 64-bit temporary values, which then require function calls to
handle the shift operations, and the SHRD/SHLD opcode is quite slow as well.

Terje
--
- <Terje.Mathisen(a)hda.hydro.com>
"almost all programming can be viewed as an exercise in caching"
From: Terje Mathisen on
Skybuck Flying wrote:
> Your new version is better, but still has a serious shortcoming:
>
> the DestBitIndex should be able to have any range.
>
> Your method assumes parameter DestBitIndex has range 0 to 31, this is too
> restricted.

Huh?

The very first operation I do is to mask/update the base pointer and the
bit index, specifically to handle index values above 31

> Your method writes garbage bits from the value as well...
>
> Suppose the value is:
>
> LSB ------------------------------> MSB
> | |
> 1111 1111 1111 0000 0000 0000 0000 0000
>
> Suppose the bit count is: 3


This is simply bogus, you should never try to write a value which would
not fit inside the target field, or if you have to do so, then you
should have a separate function which masks the input value before
calling the base/fast function.

BTW, you even translated my normalization code to Pascal, and still
didn't read it. You're amazing!

> // Normalize bit index
> longword(DestAddress) := longword(DestAddress) + (DestBitIndex shr 5);
> DestBitIndex := DestBitIndex and 31;

Terje
--
- <Terje.Mathisen(a)hda.hydro.com>
"almost all programming can be viewed as an exercise in caching"
From: Skybuck Flying on
The specification for parameter DestAddress is an untyped pointer.

Your code does not follow this specification.

Change your code to:

void WriteLongwordBits( unsigned long Value, unsigned int BitCount,
void *DestAddress, unsigned int DestBitIndex )

Then try again.

Bye,
Skybuck.


From: Skybuck Flying on

"Terje Mathisen" <terje.mathisen(a)hda.hydro.com> wrote in message
news:DqKdnRwYdexXxrzVnZ2dnUVZ8qSnnZ2d(a)giganews.com...
> Skybuck Flying wrote:
>> Your new version is better, but still has a serious shortcoming:
>>
>> the DestBitIndex should be able to have any range.
>>
>> Your method assumes parameter DestBitIndex has range 0 to 31, this is too
>> restricted.
>
> Huh?
>
> The very first operation I do is to mask/update the base pointer and the
> bit index, specifically to handle index values above 31

1. Your code is using compiler magic.

2. Your code is dangerous, it is assuming a certain operator precedence.

3. Your code is not following specifications, your code is instead using a
typed pointer.

If you change your code to follow specifications, the problems and danger
will become apperent to you.

Why did you change the specification ?

Is your weakness showing ?

Is the C language weakness showing ?

Are you unable to work with untyped pointers ? (void pointers)

>> Your method writes garbage bits from the value as well...
>>
>> Suppose the value is:
>>
>> LSB ------------------------------> MSB
>> | |
>> 1111 1111 1111 0000 0000 0000 0000 0000
>>
>> Suppose the bit count is: 3
>
>
> This is simply bogus, you should never try to write a value which would
> not fit inside the target field, or if you have to do so, then you should
> have a separate function which masks the input value before calling the
> base/fast function.

Your routine was supposed to do that. Read the specification !

What do you think the BitCount parameter is for ?

The whole concept of this specification is to take away difficulties and
make it easy to simply write/copy some bits from somewhere to somewhere at
some bit destination.

So far you have consistently failed to meet expectations of the
specification :)

I shall cut you some slack, and fix your bugged Delphi version by using
Delphi's Compiler Magic as well:

However I will not fix your horrible C code, it would be nice to see if you
managed to fix it yourself for a change ;) :)

Your method falls under classification A2B1.

Given your track record I would not be surprised if there are still some
bugs in it somewhere :)

For now I will assume it is correct and it may enter the contest for now...

procedure WriteLongwordBitsTerje2Fixed
(
Value : longword;
BitCount : longword;
DestAddress : pointer;
DestBitIndex : longword
);
var
vLastBit : longword;
vBottomMask : longword;
vTopMask : longword;
begin
// Normalize bit index
Inc( Plongword(DestAddress), DestBitIndex shr 5 ); // compiler magic at
work, I don't like it but ok.
DestBitIndex := DestBitIndex and 31;

// Index of last inserted bit:
vLastBit := DestBitIndex + BitCount - 1;

// Generate a mask of 0 bits below the first inserted bit */
vBottomMask := ( longword(-1) ) shl DestBitIndex;

// There must be 0 to 31 unmodified bits after the last inserted: */
vTopMask := ( longword(-1) ) shr (31-(vLastBit and 31));

// One or two affected words? */
if (vLastBit >= 32) then
begin
Plongword(DestAddress)^ := (Plongword(DestAddress)^ and not vBottomMask)
or (Value shl DestBitIndex);

Longword(DestAddress) := Longword(DestAddress) + 4;

// Since we modify two words, DestBitIndex MUST be > 0! */
Plongword(DestAddress)^ := (Plongword(DestAddress)^ and not vTopMask) or
(Value shr (32-DestBitIndex));
end else
begin
// It all fits in a single word !
Plongword(DestAddress)^ := (Plongword(DestAddress)^ and not (vBottomMask
and vTopMask)) or (Value shl DestBitIndex); // topmask incorrect.
end;
end;

Bye,
Skybuck.