Prev: OctaOS
Next: DIV overflow
From: Guga on
On Mar 28, 12:02 am, Herbert Kleebauer <k...(a)unibwm.de> wrote:
> Guga wrote:
>
> > The conversion I got from:
> > 1699504104824251512520704 = 0167E2__04D586300__00000000
>
> > that is the same as on the site.
>
> > But.. when i input this:
> > 7458784146511699504104824251512520704 = my resultant code is
> > 059C8273__06B2C5BFC__0F549DDB__0E000000
>
> > that is different from the site. The site gives:
>
> > 7458784146511699504104824251512520704 =
> > 059C8273__06B2C5C00__00000000__00000000
>
> > So.. either mine is more precise, or his is correct.
>
> Increment your input number by one and try again.
> The hex value should then also be one higher.



Ok..my last test for tonight

7458784146511699504104824251512520705 =
059C8273__06B2C5BFC__0F549DDB__0E000001

Yes.. it is also incrementing

Now.. Guga --> Bed , hehehe

Best Regards,

Guga
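
Results like these can be checked without a big-number calculator by using a language with arbitrary-precision integers. The following is only a minimal Python sketch, not code from the thread; the helper names to_dwords, from_dwords and format_dwords are made up for illustration. It splits a decimal value into 32-bit dwords, most significant first, so they can be compared with the hex groups posted above.

```python
def to_dwords(value, count):
    """Split a non-negative integer into `count` 32-bit dwords, most significant first."""
    if value < 0 or value >> (32 * count):
        raise ValueError("value does not fit in the requested number of dwords")
    return [(value >> (32 * i)) & 0xFFFFFFFF for i in reversed(range(count))]


def from_dwords(dwords):
    """Rebuild the integer from dwords given most significant first."""
    value = 0
    for dword in dwords:
        value = (value << 32) | (dword & 0xFFFFFFFF)
    return value


def format_dwords(dwords):
    """Render each dword as eight hex digits, separated by spaces."""
    return " ".join("%08X" % dword for dword in dwords)


FIRST = 1699504104824251512520704
SECOND = 7458784146511699504104824251512520704
SITE_RESULT = [0x059C8273, 0x6B2C5C00, 0x00000000, 0x00000000]

print(format_dwords(to_dwords(FIRST, 3)))
print(format_dwords(to_dwords(SECOND, 4)))
print(format_dwords(to_dwords(SECOND + 1, 4)))
print("site result is exact:", from_dwords(SITE_RESULT) == SECOND)
```

A side note on the disputed value: the 37-digit input equals 745878414651 * 10^25 plus the 25-digit number from the first test. That 25-digit number is an odd multiple of 2^40, and 745878414651 * 10^25 is an odd multiple of 2^25, so the sum has exactly 25 trailing zero bits. An exact conversion therefore cannot end in two all-zero dwords, whereas a low dword of 0E000000 (7 * 2^25) is consistent with it.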

From: /o//annabee on
On Wed, 28 Mar 2007 09:54:46 +0200, Guga <GugaGTG(a)gmail.com> wrote:

> On Mar 27, 11:46 pm, /\\\\o//\\annabee <Wanna...(a)thewannabee.org>
> wrote:
>> On Wed, 28 Mar 2007 09:44:35 +0200, /\\o//\annabee
>> <Wanna...(a)thewannabee.org> wrote:
>>
>> > On Wed, 28 Mar 2007 09:12:09 +0200, Guga <Guga...(a)gmail.com> wrote:
>>
>> >> Ok, guys
>>
>> >> I think I finally got it working. It seems that the function can
>> >> work with any size that is a multiple of 32 bits. So it seems to be
>> >> working on 32 bits, 64 bits, 96 bits, 128 bits and so on (well.. I'm
>> >> not sure yet that it is really working.. I have no way to check some
>> >> test number)
>>
>> > Use Calc. It can do those numbers.
>>
>> Whoops. No, it seems it can't. Sorry. I assumed...
>
> Calc is limited to 64-bit conversion.

Yes, it seems it is. I always assumed it could work
correctly on really long numbers, using the toggle between
dec and hex. I am a bit surprised, to say the least.

> Tomorrow I'll check to see if the result is OK.
>
> I can't try a binary conversion now to check whether it is correct,
> because I'm awfully tired. It's 5:00 AM here :):):)
>
> So... tomorrow I'll see the results.

Cool. Good night again to you in Brazil.

> Best Regards,
>
> Guga
From: Guga on
On Mar 27, 10:02 am, Frank Kotler <fbkot...(a)verizon.net> wrote:
> Guga wrote:
>
> ...
>
> > "With my New Improved (mis)Understanding, this looks a lot easier. I'm
> > not sure what you'd do with a tbyte/tword integer, but if you can
> > multiply a tword by ten, you've got it made. Might be worth developing
> > an "arbitrary size" multiply, rather than specifically tword. Lemme
> > think on this a little..."
>
> > Tks frank :)
>
> > an arbitrary size ?
>
> > How ?
>
> Like Herbert showed you - or look at my "translation" to Nasm if his
> syntax baffles you. (It baffles *me*! I disassemble the thing with
> ndisasm and work from there!)
>
> > I first thought of making loops inside alldecmul to get the
> > remainders, but then I got stuck because I didn't know how to get
> > the remainders for the tword and put them in registers other than
> > edx:eax.
>
> > I mean.. if converting to 80 bits, 3 registers can hold the
> > value,
>
> Yeah... two and a half registers, if we want to be stingy -
> cx:edx:eax... or use a segreg for the odd two bytes :)
>
> But what's the point? What are you going to do with this 80-bit integer
> once you've got it? Strange size for an integer - but used for
> extended-precision floats, which is what threw me off.
>
> > 128 bits, it may be 4 registers (or putting the generated data
> > on a structure with the proper size)
>
> You won't want to tie up four registers (or 2-1/2) for long, so you'll
> be wanting to store results in memory in any case, I would imagine.
> Herbert's routine won't do a tword exactly, since "size" is specified in
> dwords, but the idea could be modified so it would, if that's what you
> really want. Seems an odd size to write a special routine for. 128 bits
> ("oword"? I think that's what RosAsm calls 'em) makes more sense - just
> set Herbert's "size" to 4...
>
> Best,
> Frank

Hi Frank

No.. I didn't use Herbert's code or Randal's code. It would take
me longer to translate them than to actually follow the logic behind all
of this.
I did a variation of my "alldecmul" function, and it seems to be
working. Tomorrow I'll do further testing to see how many bugs it
has (if it has any.. so far, in my tests, the results are OK,
but I'm too tired to check more carefully).

Best Regards,

Guga
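
The "arbitrary size" multiply by ten that Frank mentions can be illustrated with a small model. This is a hedged Python sketch, not Herbert's routine or Guga's alldecmul: the names mul10_add and ascii_to_limbs are hypothetical, and the number is assumed to be stored as a list of 32-bit dwords with the least significant dword first. Each step mirrors what a 32-bit MUL would do: the low half of the product stays in the dword and the high half is carried into the next one.

```python
MASK32 = 0xFFFFFFFF


def mul10_add(limbs, digit):
    """In place: limbs = limbs * 10 + digit. Returns the carry out of the top dword."""
    carry = digit
    for i in range(len(limbs)):
        product = limbs[i] * 10 + carry  # fits in 64 bits, like the edx:eax pair after MUL
        limbs[i] = product & MASK32      # low half stays in this dword
        carry = product >> 32            # high half moves on to the next dword
    return carry


def ascii_to_limbs(text, size):
    """Convert a decimal string to `size` dwords (least significant first)."""
    limbs = [0] * size
    for ch in text:
        if not "0" <= ch <= "9":
            raise ValueError("not a decimal digit: %r" % ch)
        if mul10_add(limbs, ord(ch) - ord("0")):
            raise OverflowError("number does not fit in %d dwords" % size)
    return limbs


def limbs_to_int(limbs):
    """Reference check: rebuild a Python integer from the dword list."""
    return sum(limb << (32 * i) for i, limb in enumerate(limbs))


text = "7458784146511699504104824251512520704"
limbs = ascii_to_limbs(text, 4)  # size is given in dwords: 4 dwords = 128 bits
print(limbs_to_int(limbs) == int(text))
```

Because the size is just the length of the list, the same loop covers 64, 96, 128 bits and so on; an 80-bit tword would need a 16-bit top limb, which this sketch does not model.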

From: Wolfgang Kern on
Hello Guga,
[..]

<quote Guga..>
Hi Wolfgang.. thanks for the reply.

Have a good party :):):)

I can remove the usage of stack frames. I'm just used to them, mainly
for readability. I distinguish one function from another better
when I see the first "Proc" at the beginning of a line.

About the checks for errors and limits on the ASCII string.. yes,
they are done in another routine. In the example I provided, the
conversion only runs once the string has already been checked.

"80 bit conversion could be done in three registers, but for 128 bit
I'm afraid you either need a few LOCALs or use SSE to speed it up. "

Yes.. this is what I was thinking. Using 3 registers for 80 bits, and for
128 bits using locals to compute the data and returning it in 4
registers (or returning it inside global data, like a structure).


"I can extract the method for fix-sized conversion and convert it into
readable ASM."

Thanks.. I'd appreciate it :)

But.. if you succeed in doing it for 128 bits.. is SSE really needed?
</quote>

The party ended somewhat heavily this morning :)

I found my old (KESYS1998) 80-bit conversion, which is short but
slow (later versions don't support this 'odd' IEEE-754 format anymore).
It first compressed the string to BCD, then used FBLD followed by
FISTP, and needed some rounding overhead.
But you asked for an 80-bit integer and not the 80-bit FPU format?
btw: where is this required?

For speed reasons I now use my tiny calculator routines, which
work with a DEC<->2^n LUT on 256-bit variables.
This table is quite long (78*9 entries of 32 bytes each, ~22 KB)
[at most 77.1 decimal digits can be represented with 256 unsigned bits;
only nine entries per decade in the table (a partial log-LUT)],
so it's also usable for many other calculations.

For the rarely used 512-bit values I use a shorter table
which contains just every 10th digit value but needs one line multiply.

In your 128/80-bit case the LUT would need 39*9*16 bytes [~5.5 KB]
and you can use it for 64-bit conversion as well.

This would then work like (somewhat fast):
___________________
LUT_CONV_ASCII2BIN:

XOR esi,esi               ;result goes to three regs for 80/96 bit
MOV ebx,esi               ; (si:edx:ebx, low dword in ebx)
MOV edx,esi               ;

MOV ecx str_len -1        ;this is the power of ten of the 1st digit (MSD)
L1:
MOV edi str_len -1        ;string index of the digit with power ecx:
SUB edi ecx               ; index 0 is the MSD, the last index is the LSD
MOVZX eax B$strptr+edi    ;we start with the MSD, just for fun?
SHL al,4                  ;mul by 16 (entry size) and get rid of 030
JZ L2>                    ;skip if 0

LEA edi D$ecx+ecx*8       ;mul digit's power by 9 (entries per decade)
SHL edi 4                 ;and by 16 (entry size)
ADD edi table_ptr         ;table offset for power

ADD edi,eax               ;table offset for digit
; digit*16 is 16..144, so table_ptr must point 16 bytes before the entry for 1
; a single LEA could combine the two ADD lines, if RosAsm accepts it

;now just add the table entry to your destination:
;ie 80 bit:
ADD ebx D$edi
ADC edx D$edi+4
ADC si W$edi+8
L2:
DEC ecx | JNS L1<         ;next digit, down to and including power 0
done:
RET
_____________

For 128-bit the story can be similar, and if you can avoid
the stack frames then ebp could be the 'missing fourth' register. ;)

If you need the code for table creation or just the table
I can mail it to you.
But it just contains the decimal values in binary form, starting
with 1..9, 10..90, and so on. So I'm sure you can do it as well.


And no, SSE is not really required, even though it may be faster than
a plain register/buffer line MUL solution.

__
wolfgang
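
The table-lookup idea described in this post can be modelled in a few lines. This is a rough Python sketch under stated assumptions, not Wolfgang's actual table or code: build_table and lut_ascii_to_int are hypothetical names, the table holds d * 10^p for d = 1..9 and p = 0..38 in the order 1..9, 10..90 and so on, and plain Python integers stand in for the 16-byte entries and the ADD/ADC chain.

```python
ENTRY_BYTES = 16       # 128-bit entries, as in the 39*9*16 estimate
DECADES = 39           # powers of ten 0..38
DIGITS_PER_DECADE = 9  # digits 1..9; a zero digit needs no entry

TABLE_BYTES = DECADES * DIGITS_PER_DECADE * ENTRY_BYTES


def build_table():
    """Return [1..9, 10..90, 100..900, ...] as integers, one row per decade."""
    return [digit * 10 ** power
            for power in range(DECADES)
            for digit in range(1, 10)]


def lut_ascii_to_int(text, table, bits=128):
    """Sum one table entry per non-zero digit; no multiply in the loop."""
    if len(text) > DECADES:
        raise OverflowError("too many digits for this table")
    total = 0
    top_power = len(text) - 1            # power of ten of the first digit (MSD)
    for index, ch in enumerate(text):
        digit = ord(ch) - ord("0")
        if not 0 <= digit <= 9:
            raise ValueError("not a decimal digit: %r" % ch)
        if digit == 0:
            continue                     # same as the JZ skip in the listing
        power = top_power - index
        total += table[power * DIGITS_PER_DECADE + digit - 1]
    if total >> bits:
        raise OverflowError("result does not fit in %d bits" % bits)
    return total


table = build_table()
print(TABLE_BYTES)
print(lut_ascii_to_int("7458784146511699504104824251512520704", table))
```

One detail the sketch makes visible: 2^128 is about 3.4 * 10^38, so the entries for digits 4..9 in the last decade do not fit in 16 bytes. A real table would have to leave them out or treat them as an overflow case; here the final range check catches it.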



From: Guga on
On Mar 28, 9:02 am, "Wolfgang Kern" <nowh...(a)never.at> wrote:
> Hello Guga,
> [..]


Hi Wolfgang.

Thanks for the reply.

Yesterday I succeeded in making the function work for every bit size
that is a multiple of 32, so it can work on 32, 64, 96, 128, 160, and so on.

The code is messy, so I'll clean it up today.

It seems to be generating accurate results, so I'll run further
tests and clean up the code before I post it.

Best Regards,

Guga
