Fast big integer division [ASM]

Prev: Computer nostalgia song
Next: If you don't immediately see your CLAX post...

From: H. Peter Anvin on 13 Apr 2010 00:11

On 04/12/2010 08:26 PM, Robert Redelmeier wrote:
> In alt.lang.asm Branimir Maksimovic <bmaxa(a)nospicedham.hotmail.com> wrote in part:
>> Problem is how to do integer division.
>
> Of course it is. The difficulty of factoring large number
> is the basis of public key crypto.
>
>> I have found newton-rafson method of interpolation, so you can
>> aproximate division of x1/x2 to x1*1/x2 but that requires floating
>> / fixed point math. Do someone knows method for fast big integer
>> division which does not requires floating / fixed point math?
>> Obvioux method is just to count how many times x2 fits x1 but
>> that's n^2 and is slow for really big ints.
>
> There is always the method of long division -- shift, compare
> & subtract. About n^1.7
>

Where do you get 1.7 from? It really seems O(n^2) to me, or perhaps
more accurately O(nm) with n and m being the lengths of the dividend and
divisor, respectively.

-hpa

From: Terje Mathisen "terje.mathisen at on 13 Apr 2010 00:59

Branimir Maksimovic wrote:
> Problem is how to do integer division. I have found newton-rafson
> method of interpolation, so you can aproximate division of x1/x2 to
> x1*1/x2 but that requires floating / fixed point math.

First of all, the canonical answer is to look at how the Gnu
Multi-Precision package does it!

The obvious method is also close to optimal:

Start with a guess for what y=1/x2 should be, then run log(n) iterations
of N-R to get n bits of precision.

> Do someone knows method for fast big integer division which does not
> requires floating / fixed point math?

The only halfway tricky part is that initial guess, the obvious method
is to approximate x2 with a double precision fp value (simply combine
the leading ~64 bits or so), then calculate the fp reciprocal and
convert back to.

The guess takes O(1) time, then the N-R stages work with twice as many
bits during each iteration, so the sum is simply twice the time of the
final stage, i.e. still O(log(n)).

The final multiplication by the calculated reciprocal is also O(log(n)).

Using this approach a division is only about 3 times slower than a
multiplication.

Terje
--
- <Terje.Mathisen at tmsw.no>
"almost all programming can be viewed as an exercise in caching"

From: Alexei A. Frounze on 13 Apr 2010 02:27

On Apr 12, 9:59 pm, Terje Mathisen <"terje.mathisen at
tmsw.no"@giganews.com> wrote:
> Branimir Maksimovic wrote:
> > Problem is how to do integer division. I have found newton-rafson
> > method of interpolation, so you can aproximate division of x1/x2 to
> > x1*1/x2 but that requires floating / fixed point math.
>
> First of all, the canonical answer is to look at how the Gnu
> Multi-Precision package does it!
>
> The obvious method is also close to optimal:
>
> Start with a guess for what y=1/x2 should be, then run log(n) iterations
> of N-R to get n bits of precision.
>
> > Do someone knows method for fast big integer division which does not
> > requires floating / fixed point math?
>
> The only halfway tricky part is that initial guess, the obvious method
> is to approximate x2 with a double precision fp value (simply combine
> the leading ~64 bits or so), then calculate the fp reciprocal and
> convert back to.
>
> The guess takes O(1) time, then the N-R stages work with twice as many
> bits during each iteration, so the sum is simply twice the time of the
> final stage, i.e. still O(log(n)).
>
> The final multiplication by the calculated reciprocal is also O(log(n)).
>
> Using this approach a division is only about 3 times slower than a
> multiplication.

And then, I guess, the resultant rounding for the integer case (is it
just 1 ULP if everything's done right?) may be corrected by
multiplying the quotient by the divisor and looking at the difference
between that product and the dividend.

Alex

From: Terje Mathisen "terje.mathisen at on 13 Apr 2010 03:59

Alexei A. Frounze wrote:
> On Apr 12, 9:59 pm, Terje Mathisen<"terje.mathisen at
>> The final multiplication by the calculated reciprocal is also O(log(n)).
>>
>> Using this approach a division is only about 3 times slower than a
>> multiplication.
>
> And then, I guess, the resultant rounding for the integer case (is it
> just 1 ULP if everything's done right?) may be corrected by
> multiplying the quotient by the divisor and looking at the difference
> between that product and the dividend.

Assuming the input numbers are exact integers with n bits, you only need
to calculate the reciprocal with n+2 bits or more, then round that up
(i.e. increment the bottom bit) before doing the multiplication.

The result will be slightly higher than the true result, but the error
must be less than an ulp (i.e. less than 1.0), so you can simply
truncate the final answer at the decimal point and it will be exact!

This is exactly the same algorithm which most compilers use these days
to replace constant division by a reciprocal multiplication and a shift.

Terje

--
- <Terje.Mathisen at tmsw.no>
"almost all programming can be viewed as an exercise in caching"

From: Robert Redelmeier on 13 Apr 2010 08:07

In alt.lang.asm H. Peter Anvin <hpa(a)zytor.com> wrote in part:
>> There is always the method of long division -- shift, compare
>> & subtract. About n^1.7
>>
>
> Where do you get 1.7 from? It really seems O(n^2) to me,
> or perhaps more accurately O(nm) with n and m being the
> lengths of the dividend and divisor, respectively.

On ~one-half the compares, you bail early (~half-way).
Maybe that should be n^1.5.

-- Robert

First | Prev | Next | Last
Pages: 1 2 3
Prev: Computer nostalgia song
Next: If you don't immediately see your CLAX post...