From: Skybuck Flying on
Hello,

Skybuck's Super Fast WriteBits Algorithm (32 bits):

Assumptions:

Buffer is cleared.
Trailing garbage bits in buffer not a problem.
Buffer is large enough (each write touches 5 bytes, starting at the first destination byte).

Value bits:

Initial Value pointer
|
|        Secondary Value pointer
|        |
|        |
01234567 89012345 67890123 45678901 ????????
                                    |
                                    |
                                    Dangerous overflow, but not really if
                                    it's solved like so:

                                    (Value shr 8) shl ShiftValue;

// problem solved, no pointer needed.
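
To see why the extra shr 8 matters, here is a minimal sketch that compares the direct shift with the shifted-down one. It is not from the original post: the test value $F00000FF and the shift of 3 are arbitrary picks, and it assumes a plain Delphi console program.

```pascal
program ShiftOverflowDemo;
{$APPTYPE CONSOLE}

uses
  SysUtils;

var
  Value, ShiftValue, Direct, ViaShr8: Longword;
begin
  Value := $F00000FF;   // arbitrary test value with its top bits set
  ShiftValue := 3;      // any value of (DestBitIndex and 7), so 0..7

  // Shifting all 32 bits left pushes the top ShiftValue bits out of the longword.
  Direct := Value shl ShiftValue;

  // Dropping the low byte first leaves 24 bits, so a shift of up to 7
  // still fits in 32 bits and nothing is lost.
  ViaShr8 := (Value shr 8) shl ShiftValue;

  Writeln(IntToHex(Direct, 8));   // 800007F8: the top three bits are gone
  Writeln(IntToHex(ViaShr8, 8));  // 07800000: bits 8..31 of Value, all kept
end.
```

The second result is what gets or-ed in one byte further on, which is how the bits lost by the first shift still end up in the buffer.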

Buffer Bits:

x = taken
0 = free

xxx00000 00000000 00000000 00000000

Mission: copy value bits like so:

DestAddress + (DestBitIndex shr 3)
|
|  (DestBitIndex and 7) = (shift value)
|  |
xxx01234 56789012 34567890 12345678 90100000
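
As a quick illustration of the split between byte address and shift value, here is a tiny sketch using a hypothetical DestBitIndex of 19 (the diagram itself uses 3, which gives byte offset 0 and shift value 3):

```pascal
program BitIndexSplitDemo;
{$APPTYPE CONSOLE}

var
  DestBitIndex: Longword;
begin
  DestBitIndex := 19;  // hypothetical position: 19 bits already taken

  Writeln('byte offset: ', DestBitIndex shr 3);  // 19 div 8 = 2
  Writeln('shift value: ', DestBitIndex and 7);  // 19 mod 8 = 3
end.
```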

Plan:

Initialization:

DestAddress := DestAddress + (DestBitIndex shr 3);

DestBitIndex := DestBitIndex and 7;

Step 1 (longword access):

DestAddress^ := DestAddress^ or (Value shl DestBitIndex);

Step 2 (longword access, one byte further on):

(DestAddress+1)^ := (DestAddress+1)^ or ( (Value shr 8) shl DestBitIndex);

Done.
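
The plan can be tried out on a small cleared buffer. This is only a sketch under a few assumptions that are not in the original post: a little-endian x86 target that allows unaligned longword access, a made-up test value of $89ABCDEF, and a byte array indexed directly instead of pointer arithmetic.

```pascal
program WriteBitsPlanDemo;
{$APPTYPE CONSOLE}

uses
  SysUtils;

var
  Buffer: array[0..7] of Byte;   // cleared, with room for the fifth byte
  Value, DestBitIndex, ShiftValue: Longword;
  ByteIndex, I: Integer;
begin
  FillChar(Buffer, SizeOf(Buffer), 0);
  Buffer[0] := $07;              // three bits already taken (xxx in the diagram)

  Value := $89ABCDEF;            // made-up test value
  DestBitIndex := 3;

  // Initialization: split the bit index into a byte index and a shift value.
  ByteIndex := DestBitIndex shr 3;
  ShiftValue := DestBitIndex and 7;

  // Step 1: or the value in at the first destination byte (bytes 0 to 3).
  PLongword(@Buffer[ByteIndex])^ :=
    PLongword(@Buffer[ByteIndex])^ or (Value shl ShiftValue);

  // Step 2: or bits 8..31 in one byte further on (bytes 1 to 4),
  // which puts the bits lost in step 1 into the fifth byte.
  PLongword(@Buffer[ByteIndex + 1])^ :=
    PLongword(@Buffer[ByteIndex + 1])^ or ((Value shr 8) shl ShiftValue);

  for I := 0 to 4 do
    Write(IntToHex(Buffer[I], 2), ' ');
  Writeln;                       // 7F 6F 5E 4D 04
end.
```

Read as one little-endian number, those five bytes are $044D5E6F7F, which is the value shifted left by 3 with the three taken bits below it.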

Delphi 2007 implementation:

// 19 instructions when not inlined.
// 14 instructions when inlined.
// better algorithm and code, doesn't have the double garbage problem of
// previous version :)
// (the longword casts on DestAddress assume 32-bit pointers)
procedure WriteLongwordBitsA1B1v4( Value : longword; DestAddress : pointer;
  DestBitIndex : longword ); inline;
begin
  // calculate first destination byte
  longword(DestAddress) := longword(DestAddress) + (DestBitIndex shr 3); // div 8

  // calculate shift value, DestBitIndex will now function as the shift value.
  DestBitIndex := DestBitIndex and 7; // mod 8

  // or destination with byte 0 to 3
  Plongword(DestAddress)^ := Plongword(DestAddress)^ or (Value shl DestBitIndex);

  // increment destination by 1 byte.
  longword(DestAddress) := longword(DestAddress) + 1;

  // or (this next) destination with byte 1 to 4
  Plongword(DestAddress)^ := Plongword(DestAddress)^ or ( (Value shr 8) shl DestBitIndex);
end;

// Generated assembler for non-inlined routine:

Project1.dpr.2082: begin
00409058 53 push ebx
00409059 56 push esi
0040905A 8BD9 mov ebx,ecx
Project1.dpr.2084: longword(DestAddress) := longword(DestAddress) + (DestBitIndex shr 3); // div 8
0040905C 8BCB mov ecx,ebx
0040905E C1E903 shr ecx,$03
00409061 01CA add edx,ecx
Project1.dpr.2087: DestBitIndex := DestBitIndex and 7; // mod 8
00409063 83E307 and ebx,$07
Project1.dpr.2090: Plongword(DestAddress)^ := Plongword(DestAddress)^ or (Value shl DestBitIndex);
00409066 8BCB mov ecx,ebx
00409068 8BF0 mov esi,eax
0040906A D3E6 shl esi,cl
0040906C 0932 or [edx],esi
Project1.dpr.2093: longword(DestAddress) := longword(DestAddress) + 1;
0040906E 42 inc edx
Project1.dpr.2096: Plongword(DestAddress)^ := Plongword(DestAddress)^ or ( (Value shr 8) shl DestBitIndex);
0040906F 8BCB mov ecx,ebx
00409071 C1E808 shr eax,$08
00409074 D3E0 shl eax,cl
00409076 0902 or [edx],eax
Project1.dpr.2097: end;
00409078 5E pop esi
00409079 5B pop ebx
0040907A C3 ret

// Generated assembler for inlined-routine:

Project1.dpr.2172: WriteLongwordBitsA1B1v4( Value, @Buffer, BitPointer );
0040A208 8BC7 mov eax,edi
0040A20A 8BCA mov ecx,edx
0040A20C C1E903 shr ecx,$03
0040A20F 01C8 add eax,ecx
0040A211 83E207 and edx,$07
0040A214 8BCA mov ecx,edx
0040A216 8BF3 mov esi,ebx
0040A218 D3E6 shl esi,cl
0040A21A 0930 or [eax],esi
0040A21C 40 inc eax
0040A21D 8BCA mov ecx,edx
0040A21F C1EB08 shr ebx,$08
0040A222 D3E3 shl ebx,cl
0040A224 0918 or [eax],ebx


Nice.

Bye,
Skybuck.


From: Skybuck Flying on
Hmmm,

I just realized something.

Because these kinds of methods allow garbage bits to end up in the buffer,
and because they write with or's, they stop working as soon as garbage bits
are present at the destination.

The buffer will then contain incorrect values, because the garbage bits get
or-ed together with the new bits the next time the routine is called/used.

However, the methods can still safely be used for sequential writing if one
makes sure the value itself does not contain any garbage bits (every bit
above the bits that are meant to be written must be zero).
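
A small sketch of the effect, under assumptions that are not in the post: two 4-bit fields written back to back, a made-up helper named OrBits that repeats the two or steps on a byte array, and a little-endian x86 target that allows unaligned longword access.

```pascal
program GarbageBitsDemo;
{$APPTYPE CONSOLE}

uses
  SysUtils;

// The two or steps, restated on a byte array so the demo is self-contained.
// OrBits is a name made up for this example.
procedure OrBits(Value: Longword; var Buffer: array of Byte; BitIndex: Longword);
var
  ByteIndex: Integer;
  ShiftValue: Longword;
begin
  ByteIndex := BitIndex shr 3;
  ShiftValue := BitIndex and 7;
  PLongword(@Buffer[ByteIndex])^ :=
    PLongword(@Buffer[ByteIndex])^ or (Value shl ShiftValue);
  PLongword(@Buffer[ByteIndex + 1])^ :=
    PLongword(@Buffer[ByteIndex + 1])^ or ((Value shr 8) shl ShiftValue);
end;

var
  Dirty, Clean: array[0..15] of Byte;
begin
  FillChar(Dirty, SizeOf(Dirty), 0);
  FillChar(Clean, SizeOf(Clean), 0);

  // The first field is meant to be $A, but the value carries garbage above bit 3.
  OrBits($FA, Dirty, 0);
  OrBits($05, Dirty, 4);          // wants $5 in bits 4..7, gets or-ed with the garbage $F

  // Same writes with the garbage masked off before the call.
  OrBits($FA and $0F, Clean, 0);
  OrBits($05, Clean, 4);

  Writeln(IntToHex(Dirty[0], 2)); // FA: second field reads back as $F
  Writeln(IntToHex(Clean[0], 2)); // 5A: both fields intact
end.
```

Masking with (1 shl BitCount) - 1 before the call is one way to guarantee a clean value. For a full 32-bit value there is nothing to mask.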

Good thing to know ;)

Bye,
Skybuck :)