From: Skybuck Flying on
Well,

For my video compression purposes... writing 26 to 31 bits might be enough.

So writing 32 bits maybe not necessary.

So this could circumvent the shr 32 problem all together.

Hehe.

Nice.

Though it's always handy to have a complete/full version available.

Still though, your version Terja is not even partial... since even then it
fails horribly hihihihi lol.

Bye,
Skybuck.


From: Terje Mathisen on
Skybuck Flying wrote:
> Still though, your version Terja is not even partial... since even then it
My name is Terje, not Terja
> fails horribly hihihihi lol.

I guess I could compile & verify the throw-away code I wrote for you,
but I really don't care enough to do that. However, you are indeed right
that my initial version had some problems, i.e. the mask
generation/usage was buggy.

void WriteLongwordBits( unsigned long Value, unsigned int BitCount,
unsigned long int *DestAddress, unsigned int DestBitIndex )
{
/* Normalize bit index */
DestAddress += DestBitIndex >> 5;
DestBitIndex &= 31;

/* Index of last inserted bit: */
unsigned int lastbit = DestBitIndex + BitCount - 1;

/* Generate a mask of 0 bits below the first inserted bit */
unsigned int bottom_mask = ((unsigned int) -1) << DestBitIndex;

/* There must be 0 to 31 unmodified bits after the last inserted: */
unsigned int top_mask = ((unsigned int) -1) >> (31-(lastbit&31));

/* One or two affected words? */
if (lastbit >= 32) {
DestAddress[0] = (DestAddress[0] & ~bottom_mask) |
(Value << DestBitIndex);
/* Since we modify two words, DestBitIndex MUST be > 0! */
DestAddress[1] = (DestAddress[1] & ~top_mask) |
(Value >> (32-DestBitIndex));
}
else { /* It all fits in a single word! */
DestAddress[0] = (DestAddress[0] & ~(bottom_mask & top_mask)) |
(Value << DestBitIndex);
}
}

Do you like this version better?

If will have a runtime of about 10-12 cycles when correctly predicted to
not straddle a word boundary, while a predicted cross-over adds another
2-3 cycles.

Mispredicts take a long time of course. :-(

It is actually quite hard to generate branchless 32-bit code without
using 64-bit temporary values, which then require function calls to
handle the shift operations, and the SHRD/SHLD opcode is quite slow as well.

Terje
--
- <Terje.Mathisen(a)hda.hydro.com>
"almost all programming can be viewed as an exercise in caching"
From: Terje Mathisen on
Skybuck Flying wrote:
> Your new version is better, but still has a serious shortcoming:
>
> the DestBitIndex should be able to have any range.
>
> Your method assumes parameter DestBitIndex has range 0 to 31, this is too
> restricted.

Huh?

The very first operation I do is to mask/update the base pointer and the
bit index, specifically to handle index values above 31

> Your method writes garbage bits from the value as well...
>
> Suppose the value is:
>
> LSB ------------------------------> MSB
> | |
> 1111 1111 1111 0000 0000 0000 0000 0000
>
> Suppose the bit count is: 3


This is simply bogus, you should never try to write a value which would
not fit inside the target field, or if you have to do so, then you
should have a separate function which masks the input value before
calling the base/fast function.

BTW, you even translated my normalization code to Pascal, and still
didn't read it. You're amazing!

> // Normalize bit index
> longword(DestAddress) := longword(DestAddress) + (DestBitIndex shr 5);
> DestBitIndex := DestBitIndex and 31;

Terje
--
- <Terje.Mathisen(a)hda.hydro.com>
"almost all programming can be viewed as an exercise in caching"
From: Skybuck Flying on
The specification for parameter DestAddress is an untyped pointer.

Your code does not follow this specification.

Change your code to:

void WriteLongwordBits( unsigned long Value, unsigned int BitCount,
void *DestAddress, unsigned int DestBitIndex )

Then try again.

Bye,
Skybuck.


From: Skybuck Flying on

"Terje Mathisen" <terje.mathisen(a)hda.hydro.com> wrote in message
news:DqKdnRwYdexXxrzVnZ2dnUVZ8qSnnZ2d(a)giganews.com...
> Skybuck Flying wrote:
>> Your new version is better, but still has a serious shortcoming:
>>
>> the DestBitIndex should be able to have any range.
>>
>> Your method assumes parameter DestBitIndex has range 0 to 31, this is too
>> restricted.
>
> Huh?
>
> The very first operation I do is to mask/update the base pointer and the
> bit index, specifically to handle index values above 31

1. Your code is using compiler magic.

2. Your code is dangerous, it is assuming a certain operator precedence.

3. Your code is not following specifications, your code is instead using a
typed pointer.

If you change your code to follow specifications, the problems and danger
will become apperent to you.

Why did you change the specification ?

Is your weakness showing ?

Is the C language weakness showing ?

Are you unable to work with untyped pointers ? (void pointers)

>> Your method writes garbage bits from the value as well...
>>
>> Suppose the value is:
>>
>> LSB ------------------------------> MSB
>> | |
>> 1111 1111 1111 0000 0000 0000 0000 0000
>>
>> Suppose the bit count is: 3
>
>
> This is simply bogus, you should never try to write a value which would
> not fit inside the target field, or if you have to do so, then you should
> have a separate function which masks the input value before calling the
> base/fast function.

Your routine was supposed to do that. Read the specification !

What do you think the BitCount parameter is for ?

The whole concept of this specification is to take away difficulties and
make it easy to simply write/copy some bits from somewhere to somewhere at
some bit destination.

So far you have consistently failed to meet expectations of the
specification :)

I shall cut you some slack, and fix your bugged Delphi version by using
Delphi's Compiler Magic as well:

However I will not fix your horrible C code, it would be nice to see if you
managed to fix it yourself for a change ;) :)

Your method falls under classification A2B1.

Given your track record I would not be surprised if there are still some
bugs in it somewhere :)

For now I will assume it is correct and it may enter the contest for now...

procedure WriteLongwordBitsTerje2Fixed
(
Value : longword;
BitCount : longword;
DestAddress : pointer;
DestBitIndex : longword
);
var
vLastBit : longword;
vBottomMask : longword;
vTopMask : longword;
begin
// Normalize bit index
Inc( Plongword(DestAddress), DestBitIndex shr 5 ); // compiler magic at
work, I don't like it but ok.
DestBitIndex := DestBitIndex and 31;

// Index of last inserted bit:
vLastBit := DestBitIndex + BitCount - 1;

// Generate a mask of 0 bits below the first inserted bit */
vBottomMask := ( longword(-1) ) shl DestBitIndex;

// There must be 0 to 31 unmodified bits after the last inserted: */
vTopMask := ( longword(-1) ) shr (31-(vLastBit and 31));

// One or two affected words? */
if (vLastBit >= 32) then
begin
Plongword(DestAddress)^ := (Plongword(DestAddress)^ and not vBottomMask)
or (Value shl DestBitIndex);

Longword(DestAddress) := Longword(DestAddress) + 4;

// Since we modify two words, DestBitIndex MUST be > 0! */
Plongword(DestAddress)^ := (Plongword(DestAddress)^ and not vTopMask) or
(Value shr (32-DestBitIndex));
end else
begin
// It all fits in a single word !
Plongword(DestAddress)^ := (Plongword(DestAddress)^ and not (vBottomMask
and vTopMask)) or (Value shl DestBitIndex); // topmask incorrect.
end;
end;

Bye,
Skybuck.