From: Skybuck Flying on
Decoding of Skybuck's Universal Code 6 can be accelerated by two new
instructions.

The instructions would be called:

ScanMarkerForward (SMF)
ScanMarkerBackward (SMB)

Two different versions are conceivable:

Version 1 uses a BitPosition to indicate where to start scanning, which
removes the need to mask unwanted bits to zero first.
(Version 2 uses a MaximumBitCount to indicate a default value for when no
marker/one bit is found.)
Version 1 is preferred because it is probably faster.
Therefore only version 1 is discussed below:

Input operands would be:

<SourceData>, <DestinationCounter>, <BitPosition>

Output operands would be:

<MarkerEnded>, <DestinationCounter>

Operation would be as follows:

Initialize <MarkerEnded> to zero.

Scan the <SourceData>.

For forward:

Start at <BitPosition> in the <SourceData>.

For backward:

Start at <BitPosition-1> in the <SourceData>.

Proceed in the direction indicated by the instruction's name.

Increment <DestinationCounter> by 1 for each bit scanned.
(Optimization: an internal counter could be used and added to the
destination when done.)

Stop scanning if scanned bit is 1.

Set <MarkerEnded> to 1 if scanned bit is 1.

Stop scanning if end of <SourceData> is reached.

Description:

Scan <SourceData> starting at <BitPosition> (<BitPosition-1> for backward)
and proceed in the direction indicated by the instruction's
name/code/encoding.
Stop when a 1 is encountered or when the end of the register is reached.
Increment <DestinationCounter> for each bit scanned, including the
terminating 1.
Set <MarkerEnded> to one if a 1 was encountered, otherwise to zero.
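The operation described above can be modelled in software. What follows is
only a minimal sketch in C for the 32 bit case, not a definitive
implementation: the names scan_result, scan_marker_forward and
scan_marker_backward are made up for illustration, bit 0 is taken to be the
LSB, and BitPosition is assumed to stay within 0 to 32.

```c
#include <stdint.h>

/* Hypothetical result type: both output operands of one scan. */
typedef struct {
    uint32_t marker_ended;  /* 1 if a 1 bit stopped the scan, else 0 */
    uint32_t counter;       /* DestinationCounter after the scan */
} scan_result;

/* Models ScanMarkerForward: scans bits bit_position..31 (bit 0 = LSB).
   Assumes 0 <= bit_position <= 32. */
static scan_result scan_marker_forward(uint32_t source, uint32_t counter,
                                       uint32_t bit_position)
{
    scan_result r = { 0, counter };
    for (uint32_t i = bit_position; i < 32; i++) {
        r.counter++;                   /* every scanned bit is counted */
        if ((source >> i) & 1u) {      /* a 1 bit ends the marker */
            r.marker_ended = 1;
            break;
        }
    }
    return r;
}

/* Models ScanMarkerBackward: scans bits bit_position-1 down to 0.
   Assumes 0 <= bit_position <= 32. */
static scan_result scan_marker_backward(uint32_t source, uint32_t counter,
                                        uint32_t bit_position)
{
    scan_result r = { 0, counter };
    for (uint32_t i = bit_position; i > 0; i--) {
        r.counter++;                   /* every scanned bit is counted */
        if ((source >> (i - 1)) & 1u) {
            r.marker_ended = 1;
            break;
        }
    }
    return r;
}
```

With a BitPosition of 32 for forward scanning, or 0 for backward scanning,
no bit is scanned: the counter is left unchanged and MarkerEnded stays zero.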

32 bit register example for ScanMarkerBackward:

0         1         2         3
01234567890123456789012345678901
xxxxxxxxxxxxxxxxxCCCMMMxxxxxxxxx
|                              |
LSB                          MSB


0         1         2         3
01234567890123456789012345678901
xxxxxxxxxxxxxxxxxCCC100xxxxxxxxx
|                      ^       |
LSB                    |     MSB
                       |
                       |
                       ^
For backwards scanning BitPosition is one beyond the bits to be scanned.
In the example BitPosition would be 23.

Bits eligible for scanning are at bit positions 22 down to 0.

The scan would stop at bit position 20.


0         1         2         3
01234567890123456789012345678901
xxxxxxxxxxxxxxxxxCCC100xxxxxxxxx
|                   |          |
LSB                 |        MSB
                    ^
                    1 encountered

Output:

MarkerEnded is set to 1/true.
DestinationCounter is incremented by 3.
(Marker is 100: 3 bits)


One last example for ScanMarkerForward:


0         1         2         3
01234567890123456789012345678901
xxxxxxxMMMMCCCCxxxxxxxxxxxxxxxxx
|      ^                       |
LSB    |                     MSB
       |
       ^
       BitPosition is 7

Scanning starts at 7


0         1         2         3
01234567890123456789012345678901
xxxxxxx0001CCCCxxxxxxxxxxxxxxxxx
|                              |
LSB                          MSB

Output is:
DestinationCounter is incremented by 4.
MarkerEnded is set to 1/true.
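Both worked examples can be cross-checked against the bit-scan operations
that exist today. The sketch below is an approximation under stated
assumptions, not the proposed instructions themselves: it relies on the
GCC/Clang builtins __builtin_ctz and __builtin_clz (compiler-specific, and
undefined for a zero argument), the names smf_count and smb_count are
hypothetical, and BitPosition is assumed to stay within 0 to 32. The extra
shift that discards the unwanted bits is the step that version 1 would fold
into the instruction.

```c
#include <stdint.h>

/* Returns the number of bits a forward scan would count, starting at
   bit_position (bit 0 = LSB). Assumes 0 <= bit_position <= 32. */
static uint32_t smf_count(uint32_t source, uint32_t bit_position,
                          uint32_t *marker_ended)
{
    /* Drop the bits below bit_position; a shift by 32 is undefined in C. */
    uint32_t rest = (bit_position < 32) ? (source >> bit_position) : 0;
    if (rest == 0) {
        *marker_ended = 0;
        return 32 - bit_position;      /* ran off the end, no 1 found */
    }
    *marker_ended = 1;
    return (uint32_t)__builtin_ctz(rest) + 1;  /* zeros plus the 1 bit */
}

/* Returns the number of bits a backward scan would count, starting at
   bit_position - 1 and moving towards bit 0. */
static uint32_t smb_count(uint32_t source, uint32_t bit_position,
                          uint32_t *marker_ended)
{
    /* Move bits 0..bit_position-1 to the top, dropping the bits above. */
    uint32_t rest = (bit_position > 0) ? (source << (32 - bit_position)) : 0;
    if (rest == 0) {
        *marker_ended = 0;
        return bit_position;           /* reached bit 0, no 1 found */
    }
    *marker_ended = 1;
    return (uint32_t)__builtin_clz(rest) + 1;  /* zeros plus the 1 bit */
}
```

For the backward example (bit 20 set, bits 21 and 22 clear, BitPosition 23)
smb_count gives 3, and for the forward example (bit 10 set, bits 7 to 9
clear, BitPosition 7) smf_count gives 4, matching the outputs listed above.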

Further remarks:

Input operand <SourceData> could be overwritten with output operand
<MarkerEnded>. (This is preferred.)

Alternatively, <MarkerEnded> could be implemented as a flag.

(
If it is implemented as a flag, then:
The carry flag is preferred because it can be added branchlessly with adc
(add with carry).
Alternatively, the zero flag could be used for consistency; it can be
consumed with cmovz/cmovnz (conditional move if zero/not zero).
)

Desirable ranges:

For 8 bit version, BitPosition range can go from 0 to 8.
For 16 bit version, BitPosition range can go from 0 to 16.
For 32 bit version, BitPosition range can go from 0 to 32.
For 64 bit version, BitPosition range can go from 0 to 64.

Not all versions have to be implemented; a 32 bit version would suffice.

(If it is undesirable from a hardware point of view to allow a BitPosition
one beyond the register bit size for SMB, the specification can be changed
to exclude this feature and simply start from BitPosition for backward
scanning. The programmer would then have to use an extra instruction to
subtract 1 to get the SMB behaviour intended in this original
specification.)

The implementation should be as fast as possible. A latency of 1 or 2 clock
cycles is desirable; even 5 would still be an improvement.

Example of a real world situation could be:

smb eax, edx, [ecx]

Input:

eax = <SourceData in register>
edx = <BitPosition in register>
ecx = <pointer in register to DestinationCounter (in memory)>

Output (preferred implementation):
eax = <MarkerEnded>
[ecx] = <DestinationCounter incremented by marker bit count>

Output (alternative implementation):
eax = <MarkerBitCount>
[ecx] = <DestinationCounter incremented by marker bit count>
ZF = <MarkerEnded>
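One plausible reading of why the counter is incremented in memory rather
than overwritten, and why <MarkerEnded> is reported at all, is that a marker
may continue into the next 32 bit word. The sketch below illustrates that
reading only; the post does not spell out such a loop. The function smf here
is a software stand-in for the proposed instruction, and marker_length is a
hypothetical helper.

```c
#include <stddef.h>
#include <stdint.h>

/* Software stand-in for the proposed smf (preferred implementation):
   returns MarkerEnded and increments *counter for each bit scanned.
   Assumes 0 <= bit_position <= 32. */
static uint32_t smf(uint32_t source, uint32_t bit_position, uint32_t *counter)
{
    for (uint32_t i = bit_position; i < 32; i++) {
        (*counter)++;
        if ((source >> i) & 1u)
            return 1;
    }
    return 0;
}

/* Measures a marker that starts at bit `start` of words[0] and may run on
   into later words. Returns 1 if the marker ended within word_count words;
   *length receives the number of bits scanned either way. */
static uint32_t marker_length(const uint32_t *words, size_t word_count,
                              uint32_t start, uint32_t *length)
{
    uint32_t pos = start;
    *length = 0;
    for (size_t w = 0; w < word_count; w++) {
        if (smf(words[w], pos, length))
            return 1;   /* a 1 bit closed the marker */
        pos = 0;        /* later words are scanned from their first bit */
    }
    return 0;           /* ran out of data before the marker ended */
}
```

For example, with words 0x3FFFFFFF and 0x00000002 and a start position of
30, the first call scans bits 30 and 31 without finding a 1, and the second
call finds the 1 at bit 1 of the next word, for a total length of 4.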

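Example of a real world Pascal implementation:

// With the register calling convention the parameters arrive as listed
// above: ParaLongword in eax, ParaBitPosition in edx and a pointer to
// ParaBitCounter in ecx. The boolean result is returned in eax/al.
// smb and smf are the proposed instructions; no current assembler
// accepts them.
function ScanMarkerBackward( ParaLongword : longword;
                             ParaBitPosition : longword;
                             var ParaBitCounter : longword ) : boolean; register;
asm
  smb eax, edx, [ecx]
end;

function ScanMarkerForward( ParaLongword : longword;
                            ParaBitPosition : longword;
                            var ParaBitCounter : longword ) : boolean; register;
asm
  smf eax, edx, [ecx]
end;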

Bye,
Skybuck.