From: Wolfram Hinderer on
On 7 Jul., 19:32, Ethan Furman <et...(a)stoneleaf.us> wrote:
> Nobody wrote:
> > On Wed, 07 Jul 2010 15:08:07 +0200, Thomas Jollans wrote:
>
> >> you should never rely on a floating-point number to have exactly a
> >> certain value.
>
> > "Never" is an overstatement. There are situations where you can rely
> > upon a floating-point number having exactly a certain value.
>
> It's not much of an overstatement.  How many areas are there where you
> need the number
> 0.100000000000000005551115123125782702118158340454101562500000000000?
>
> If I'm looking for 0.1, I will *never* (except by accident ;) say
>
> if var == 0.1:
>
> it'll either be <= or >=.

The following is an implementation of a well-known algorithm.
Why would you want to replace the floating point comparisons? With
what?

(This is toy-code.)

#####
from random import random

def min_cost_path(cost_right, cost_up):
    """Return the minimal cost and its path in a rectangle,
    going up and right only."""
    cost = dict()
    size_x, size_y = max(cost_right)

    # compute minimal cost
    cost[0, 0] = 0.0
    for x in range(size_x):
        cost[x + 1, 0] = cost[x, 0] + cost_right[x, 0]
    for y in range(size_y):
        cost[0, y + 1] = cost[0, y] + cost_up[0, y]
        for x in range(size_x):
            cost[x + 1, y + 1] = min(cost[x, y + 1] + cost_right[x, y + 1],
                                     cost[x + 1, y] + cost_up[x + 1, y])

    # compute path (reversed)
    x = size_x
    y = size_y
    path = []
    while x != 0 or y != 0:     # "or", not "and": walk all the way back to (0, 0)
        if x == 0:
            y -= 1
            path.append("u")
        elif y == 0:
            x -= 1
            path.append("r")
        elif cost[x - 1, y] + cost_right[x - 1, y] == cost[x, y]:  # fp compare
            x -= 1
            path.append("r")
        elif cost[x, y - 1] + cost_up[x, y - 1] == cost[x, y]:  # fp compare
            y -= 1
            path.append("u")
        else:
            raise ValueError

    return cost[size_x, size_y], "".join(reversed(path))


if __name__ == "__main__":
    size = 100
    cost_right = dict(((x, y), random()) for x in range(size)
                      for y in range(size))
    cost_up = dict(((x, y), random()) for x in range(size)
                   for y in range(size))
    print min_cost_path(cost_right, cost_up)
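
(A minimal sketch of why the two "# fp compare" lines above can be relied
on: IEEE 754 arithmetic is deterministic, so redoing the exact same
addition on the exact same operands reproduces the stored value bit for
bit. The fragile case is comparing against a value that was produced by a
*different* computation.)

from random import random

a = random()
b = random()
total = a + b            # what the forward pass stores in cost[x, y]
print(total == a + b)    # True: same operands, same operation, same result

print(0.1 + 0.2 == 0.3)  # False: 0.3 is not the float that 0.1 + 0.2 produces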
From: Zooko O'Whielacronx on
I'm starting to think that one should use Decimals by default and
reserve floats for special cases.

This is somewhat analogous to the way that Python provides
arbitrarily-big integers by default and Python programmers only use
old-fashioned fixed-size integers for special cases, such as
interoperation with external systems or highly optimized pieces (in
numpy or in native extension modules, for example).

Floats are confusing. I've studied them more than once over the years
but I still can't really predict confidently what floats will do in
various situations.

And most of the time (in my experience) the inputs and outputs to your
system and the literals in your code are actually decimal, so
converting them to float immediately introduces a lossy data
conversion before you've even done any computation. Decimal doesn't
have that problem.
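
A small sketch of what I mean (the exact digits printed depend on the
Python version, but the comparisons do not):

from decimal import Decimal

# The decimal literal "0.1" survives the trip into a Decimal unchanged...
print(Decimal("0.1") + Decimal("0.1") + Decimal("0.1") == Decimal("0.3"))  # True

# ...whereas converting it to a binary float is already lossy, before any
# real computation has happened:
print(repr(0.1 + 0.1 + 0.1))        # 0.30000000000000004
print(0.1 + 0.1 + 0.1 == 0.3)       # False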

From now on I'll probably try to use Decimals exclusively in all my
new Python code and switch to floats only if I need to interoperate
with an external system that requires floats or I have some tight
inner loop that needs to be highly optimized.

Regards,

Zooko
From: David Cournapeau on
On Thu, Jul 8, 2010 at 5:41 AM, Zooko O'Whielacronx <zooko(a)zooko.com> wrote:
> I'm starting to think that one should use Decimals by default and
> reserve floats for special cases.
>
> This is somewhat analogous to the way that Python provides
> arbitrarily-big integers by default and Python programmers only use
> old-fashioned fixed-size integers for special cases, such as
> interoperation with external systems or highly optimized pieces (in
> numpy or in native extension modules, for example).

I don't think it is analogous at all. Arbitrary-precision integers have a
simple tradeoff: you are willing to lose performance and memory for
bigger integers. If you leave performance aside, there is no downside
that I know of to using big ints instead of "machine ints". Since you
are using Python, you have already bought into this kind of tradeoff anyway.
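
A tiny illustration, using only the standard library (the same point could
be made with numpy's fixed-width integer types):

import struct

# Python integers are arbitrary precision: this is computed exactly, the
# only cost being some extra time and memory.
print(2 ** 64 + 1)                       # 18446744073709551617

# A fixed-size signed 64-bit "machine int" simply cannot hold that value:
try:
    struct.pack("q", 2 ** 64 + 1)
except struct.error as e:
    print("does not fit in 64 bits: %s" % e)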

Decimal vs float is a different matter altogether: decimal has
downsides compared to float. First, there is the irreconcilable fact
that no matter how small your range is, it is impossible to represent
all (or even most) numbers exactly with finite memory - float and
decimal are two different solutions to this issue, with different
tradeoffs. Decimals are more "intuitive" than floats for numbers that
can be represented as decimals - but most numbers cannot be represented
as (finite) decimals.

Except for some special use cases, you cannot expect to represent real
numbers exactly. Once you accept that fact, you can make a decision
between decimal, fraction, float, or whatever format you see fit.

> And most of the time (in my experience) the inputs and outputs to your
> system and the literals in your code are actually decimal, so
> converting them to float immediately introduces a lossy data
> conversion before you've even done any computation. Decimal doesn't
> have that problem.

That's not true anymore once you start doing any computation, if by
decimal you mean finite decimal. And it will never be true once you
start using non-trivial computation (e.g. transcendental functions
like log, exp, etc.).
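
A quick sketch with the standard decimal module makes the point: exactness
is lost at the first inexact operation, whatever the radix:

from decimal import Decimal, getcontext

getcontext().prec = 28           # Decimal is still finite precision

third = Decimal(1) / Decimal(3)  # rounded to 28 digits: 0.3333333333333333333333333333
print(third * 3 == Decimal(1))   # False: the division was already inexact

root = Decimal(2).sqrt()         # likewise rounded to 28 significant digits
print(root * root == Decimal(2)) # False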

David
From: Zooko O'Whielacronx on
On Wed, Jul 7, 2010 at 10:04 PM, David Cournapeau <cournape(a)gmail.com> wrote:
>
> Decimal vs float is a different matter altogether: decimal has
> downsides compared to float. First, there is the irreconcilable fact
> that no matter how small your range is, it is impossible to represent
> all (or even most) numbers exactly with finite memory - float and
> decimal are two different solutions to this issue, with different
> tradeoffs. Decimals are more "intuitive" than floats for numbers that
> can be represented as decimals - but most numbers cannot be represented
> as (finite) decimals.

This is not a downside of decimal as compared to float, since most
numbers also cannot be represented as floats.

>> And most of the time (in my experience) the inputs and outputs to your
>> system and the literals in your code are actually decimal, so
>> converting them to float immediately introduces a lossy data
>> conversion before you've even done any computation. Decimal doesn't
>> have that problem.
>
> That's not true anymore once you start doing any computation, if by
> decimal you mean finite decimal.

I don't understand. I described two different problems: problem one is
that the inputs, outputs and literals of your program might be in a
different encoding (in my experience they have most commonly been in
decimal). Problem two is that your computations may be lossy. If the
inputs, outputs and literals of your program are decimal (as they have
been for most of my programs) then using decimal is better than using
float because of problem one. Neither has a strict advantage over the
other in terms of problem two.

(There is also problem zero, which is that floats are more confusing,
which is how this thread got started. Problem zero is probably the
most important problem for many cases.)

> And it will never be true once you
> start using non-trivial computation (e.g. transcendental functions
> like log, exp, etc.).

I'm sorry, what will never be true? Are you saying that decimals have
a disadvantage compared to floats? If so, what is their disadvantage?

(And do math libraries like http://code.google.com/p/dmath/ help ?)

Regards,

Zooko
From: Steven D'Aprano on
On Thu, 08 Jul 2010 06:04:33 +0200, David Cournapeau wrote:

> On Thu, Jul 8, 2010 at 5:41 AM, Zooko O'Whielacronx <zooko(a)zooko.com>
> wrote:
>> I'm starting to think that one should use Decimals by default and
>> reserve floats for special cases.
>>
>> This is somewhat analogous to the way that Python provides
>> arbitrarily-big integers by default and Python programmers only use
>> old-fashioned fixed-size integers for special cases, such as
>> interoperation with external systems or highly optimized pieces (in
>> numpy or in native extension modules, for example).
>
> I don't think it is analogous at all. Arbitrary-precision integers have a
> simple tradeoff: you are willing to lose performance and memory for
> bigger integers. If you leave performance aside, there is no downside
> that I know of to using big ints instead of "machine ints".

Well, sure, but that's like saying that if you leave performance aside,
there's no downside to Bubblesort instead of Quicksort.

However, I believe that in Python at least, the performance cost of
arbitrary-sized longs is quite minimal compared to the benefit, at least
for "reasonable" sized ints, and so the actual real-world cost of
unifying the int and long types is minimal.

On the other hand, until Decimal is re-written in C, it will always be
*significantly* slower than float.

$ python -m timeit "2.0/3.0"
1000000 loops, best of 3: 0.139 usec per loop
$ python -m timeit -s "from decimal import Decimal as D" "D(2)/D(3)"
1000 loops, best of 3: 549 usec per loop

That's more than three orders of magnitude difference in speed. That's HUGE, and
*alone* is enough to disqualify changing to Decimal as the default
floating point data type.

Perhaps in the future, if and when Decimal has a fast C implementation,
this can be re-thought.


> Since you are
> using Python, you have already bought into this kind of tradeoff anyway.
>
> Decimal vs float is a different matter altogether: decimal has downsides
> compared to float. First, there is the irreconcilable fact that no
> matter how small your range is, it is impossible to represent
> all (or even most) numbers exactly with finite memory - float and decimal
> are two different solutions to this issue, with different tradeoffs.

Yes, but how is this a downside *compared* to float? In what way does
Decimal have downsides that float doesn't? Neither can represent
arbitrary real numbers exactly, but if anything float is *worse* compared
to Decimal for two reasons:

* Python floats are fixed to a single number of bits, while the precision of
Decimals can be configured by the user;

* floats can represent sums of powers of two exactly, while Decimals can
represent sums of powers of ten exactly. Not only does that mean that any
number exactly representable as a float can also be exactly represented
as a Decimal, but Decimals can *additionally* represent exactly many
numbers of human interest that floats cannot.
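
Both points are easy to check with the decimal module (a quick sketch;
note that constructing a Decimal directly from a float needs Python 2.7 or
later, and the long digit string below is the exact binary value of the
float):

from decimal import Decimal, getcontext

# Point one: the working precision of Decimals is configurable.
getcontext().prec = 50    # 50 significant digits instead of the default 28

# Point two: every float value is a finite sum of powers of two, hence a
# finite decimal, so it converts to a Decimal exactly:
print(Decimal(0.5))       # 0.5
print(Decimal(0.1))       # 0.1000000000000000055511151231257827021181583404541015625

# The converse fails: the decimal number 0.1 has no exact float form.
print(Decimal("0.1") == Decimal(0.1))   # False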


> Decimals are more "intuitive" than floats for numbers that can be
> represented as decimals - but most numbers cannot be represented as
> (finite) decimals.

True, but any number that can't be, also can't be exactly represented as
a float either, so how does using float help?

[...]
>> And most of the time (in my experience) the inputs and outputs to your
>> system and the literals in your code are actually decimal, so
>> converting them to float immediately introduces a lossy data conversion
>> before you've even done any computation. Decimal doesn't have that
>> problem.
>
> That's not true anymore once you start doing any computation, if by
> decimal you mean finite decimal. And it will never be true once you
> start using non-trivial computation (e.g. transcendental functions like
> log, exp, etc.).

But none of those complications are *inputs*, and, again, floats suffer
from exactly the same problem.



--
Steven