From: Wolfram Hinderer on 7 Jul 2010 19:13

On 7 Jul., 19:32, Ethan Furman <et...(a)stoneleaf.us> wrote:
> Nobody wrote:
> > On Wed, 07 Jul 2010 15:08:07 +0200, Thomas Jollans wrote:
>
> >> you should never rely on a floating-point number to have exactly a
> >> certain value.
>
> > "Never" is an overstatement. There are situations where you can rely
> > upon a floating-point number having exactly a certain value.
>
> It's not much of an overstatement. How many areas are there where you
> need the number
> 0.100000000000000005551115123125782702118158340454101562500000000000?
>
> If I'm looking for 0.1, I will *never* (except by accident ;) say
>
> if var == 0.1:
>
> it'll either be <= or >=.

The following is an implementation of a well-known algorithm.
Why would you want to replace the floating-point comparisons? With what?
(This is toy code.)

#####
from random import random

def min_cost_path(cost_right, cost_up):
    """Return minimal cost and its path in a rectangle,
    going up and right only."""
    cost = dict()
    size_x, size_y = max(cost_right)

    # compute minimal cost
    cost[0, 0] = 0.0
    for x in range(size_x):
        cost[x + 1, 0] = cost[x, 0] + cost_right[x, 0]
    for y in range(size_y):
        cost[0, y + 1] = cost[0, y] + cost_up[0, y]
        for x in range(size_x):
            cost[x + 1, y + 1] = min(cost[x, y + 1] + cost_right[x, y + 1],
                                     cost[x + 1, y] + cost_up[x + 1, y])

    # compute path (reversed)
    x = size_x
    y = size_y
    path = []
    while x != 0 or y != 0:  # "or", not "and": walk back until we reach the origin
        if x == 0:
            y -= 1
            path.append("u")
        elif y == 0:
            x -= 1
            path.append("r")
        elif cost[x - 1, y] + cost_right[x - 1, y] == cost[x, y]:  # fp compare
            x -= 1
            path.append("r")
        elif cost[x, y - 1] + cost_up[x, y - 1] == cost[x, y]:  # fp compare
            y -= 1
            path.append("u")
        else:
            raise ValueError

    return cost[size_x, size_y], "".join(reversed(path))

if __name__ == "__main__":
    size = 100
    cost_right = dict(((x, y), random())
                      for x in range(size) for y in range(size))
    cost_up = dict(((x, y), random())
                   for x in range(size) for y in range(size))
    print min_cost_path(cost_right, cost_up)
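The comparisons above are in fact exact: IEEE-754 arithmetic is deterministic, and the backtracking step recomputes the very same sums, with the same operands in the same order, that the forward pass produced. A minimal sketch of the property being relied on (modern Python; the seed is arbitrary):

```python
from random import random, seed

# Floating-point arithmetic is deterministic: the same operation on the
# same operands always yields the same bit pattern. min_cost_path relies
# on this when its backtracking step recomputes forward-pass sums and
# compares them with ==.
seed(12345)  # arbitrary seed, for reproducibility
a, b, c = random(), random(), random()

forward = (a + b) + c   # computed once in the forward pass...
backward = (a + b) + c  # ...recomputed identically while backtracking
assert forward == backward  # exact equality, guaranteed

# The guarantee covers only *identical* expressions: regrouping the same
# operands may change the result in the last bit.
regrouped = a + (b + c)
assert abs(forward - regrouped) <= 1e-15
```

This is why the `# fp compare` lines are safe here even though comparing floats for equality is usually discouraged.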
From: Zooko O'Whielacronx on 7 Jul 2010 23:41

I'm starting to think that one should use Decimals by default and
reserve floats for special cases.

This is somewhat analogous to the way that Python provides
arbitrarily-big integers by default and Python programmers only use
old-fashioned fixed-size integers for special cases, such as
interoperation with external systems or highly optimized pieces (in
numpy or in native extension modules, for example).

Floats are confusing. I've studied them more than once over the years
but I still can't really predict confidently what floats will do in
various situations.

And most of the time (in my experience) the inputs and outputs to your
system and the literals in your code are actually decimal, so
converting them to float immediately introduces a lossy data
conversion before you've even done any computation. Decimal doesn't
have that problem.

From now on I'll probably try to use Decimals exclusively in all my
new Python code and switch to floats only if I need to interoperate
with an external system that requires floats or I have some tight
inner loop that needs to be highly optimized.

Regards,

Zooko
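The lossy-conversion point can be seen in a few lines with the standard decimal module (a minimal sketch, runnable on modern Python):

```python
from decimal import Decimal

# Decimal literals survive the trip into Decimal unchanged...
assert Decimal("0.1") + Decimal("0.2") == Decimal("0.3")

# ...but not the trip into binary float:
assert 0.1 + 0.2 != 0.3

# The conversion itself is the lossy step: the nearest float to 0.1 is a
# slightly different binary fraction, visible once converted back exactly.
assert Decimal(0.1) != Decimal("0.1")
```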
From: David Cournapeau on 8 Jul 2010 00:04

On Thu, Jul 8, 2010 at 5:41 AM, Zooko O'Whielacronx <zooko(a)zooko.com> wrote:
> I'm starting to think that one should use Decimals by default and
> reserve floats for special cases.
>
> This is somewhat analogous to the way that Python provides
> arbitrarily-big integers by default and Python programmers only use
> old-fashioned fixed-size integers for special cases, such as
> interoperation with external systems or highly optimized pieces (in
> numpy or in native extension modules, for example).

I don't think it is analogous at all. Arbitrary-size integers have a
simple tradeoff: you are willing to lose performance and memory for
bigger integers. If you leave performance aside, there is no downside
that I know of to using big ints instead of "machine ints". Since you
are using Python, you have already bought that kind of tradeoff anyway.

Decimal vs. float is a different matter altogether: Decimal has
downsides compared to float. First, there is the irreconcilable fact
that no matter how small your range is, it is impossible to represent
all (or even most) numbers exactly with finite memory - float and
Decimal are two different solutions to this issue, with different
tradeoffs. Decimals are more "intuitive" than floats for numbers that
can be represented as decimals - but most numbers cannot be
represented as (finite) decimals.

Except for some special use cases, you cannot expect to represent real
numbers exactly. Once you accept that fact, you can make a decision
between Decimal, Fraction, float or whatever format you see fit.

> And most of the time (in my experience) the inputs and outputs to your
> system and the literals in your code are actually decimal, so
> converting them to float immediately introduces a lossy data
> conversion before you've even done any computation. Decimal doesn't
> have that problem.

That's no longer true once you start doing any computation, if by
decimal you mean finite decimal. And it will never be true once you
start using non-trivial computations (i.e. transcendental functions
like log, exp, etc.).

David
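The limitation David describes is easy to observe with the standard decimal module (a small sketch; the precision values are just the module's default and an arbitrary larger one):

```python
from decimal import Decimal, getcontext

# Python's decimal module defaults to 28 significant digits.
getcontext().prec = 28

# 1/3 has no finite decimal expansion, so Decimal must round it,
# just as binary float must round 1/10:
third = Decimal(1) / Decimal(3)
assert third == Decimal("0.3333333333333333333333333333")
assert third * 3 != Decimal(1)  # the rounding error is immediately visible

# Raising the precision moves the problem; it cannot remove it:
getcontext().prec = 100
assert Decimal(1) / Decimal(3) * 3 != Decimal(1)
```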
From: Zooko O'Whielacronx on 8 Jul 2010 00:53

On Wed, Jul 7, 2010 at 10:04 PM, David Cournapeau <cournape(a)gmail.com> wrote:
>
> Decimal vs. float is a different matter altogether: Decimal has
> downsides compared to float. First, there is the irreconcilable fact
> that no matter how small your range is, it is impossible to represent
> all (or even most) numbers exactly with finite memory - float and
> Decimal are two different solutions to this issue, with different
> tradeoffs. Decimals are more "intuitive" than floats for numbers that
> can be represented as decimals - but most numbers cannot be
> represented as (finite) decimals.

This is not a downside of Decimal as compared to float, since most
numbers also cannot be represented as floats.

>> And most of the time (in my experience) the inputs and outputs to your
>> system and the literals in your code are actually decimal, so
>> converting them to float immediately introduces a lossy data
>> conversion before you've even done any computation. Decimal doesn't
>> have that problem.
>
> That's no longer true once you start doing any computation, if by
> decimal you mean finite decimal.

I don't understand. I described two different problems: problem one is
that the inputs, outputs and literals of your program might be in a
different encoding (in my experience they have most commonly been in
decimal). Problem two is that your computations may be lossy.

If the inputs, outputs and literals of your program are decimal (as
they have been for most of my programs) then using Decimal is better
than using float because of problem one. Neither has a strict
advantage over the other in terms of problem two.

(There is also problem zero, which is that floats are more confusing,
which is how this thread got started. Problem zero is probably the
most important problem for many cases.)

> And it will never be true once you start using non-trivial
> computations (i.e. transcendental functions like log, exp, etc.).

I'm sorry, what will never be true? Are you saying that Decimals have
a disadvantage compared to floats? If so, what is their disadvantage?
(And do math libraries like http://code.google.com/p/dmath/ help?)

Regards,

Zooko
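The two problems Zooko distinguishes can be separated in code (a sketch; the price string is a hypothetical input, not from the thread):

```python
from decimal import Decimal

# Problem one (representation of inputs): a decimal string round-trips
# through Decimal unchanged, but float() already rounds it on the way in.
price = "19.99"  # hypothetical decimal input, e.g. read from a file
assert str(Decimal(price)) == "19.99"            # lossless
assert Decimal(float(price)) != Decimal(price)   # lossy before any math

# Problem two (lossy computation) affects both representations alike:
assert Decimal(1) / Decimal(3) * 3 != Decimal(1)  # decimal rounding
assert sum([0.1] * 10) != 1.0                     # binary rounding
```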
From: Steven D'Aprano on 8 Jul 2010 03:32
On Thu, 08 Jul 2010 06:04:33 +0200, David Cournapeau wrote:

> On Thu, Jul 8, 2010 at 5:41 AM, Zooko O'Whielacronx <zooko(a)zooko.com>
> wrote:
>> I'm starting to think that one should use Decimals by default and
>> reserve floats for special cases.
>>
>> This is somewhat analogous to the way that Python provides
>> arbitrarily-big integers by default and Python programmers only use
>> old-fashioned fixed-size integers for special cases, such as
>> interoperation with external systems or highly optimized pieces (in
>> numpy or in native extension modules, for example).
>
> I don't think it is analogous at all. Arbitrary-size integers have a
> simple tradeoff: you are willing to lose performance and memory for
> bigger integers. If you leave performance aside, there is no downside
> that I know of to using big ints instead of "machine ints".

Well, sure, but that's like saying that if you leave performance aside,
there's no downside to Bubblesort instead of Quicksort.

However, I believe that in Python at least, the performance cost of
arbitrary-sized longs is quite minimal compared to the benefit, at
least for "reasonable" sized ints, and so the actual real-world cost of
unifying the int and long types is minimal.

On the other hand, until Decimal is re-written in C, it will always be
*significantly* slower than float.

$ python -m timeit "2.0/3.0"
1000000 loops, best of 3: 0.139 usec per loop

$ python -m timeit -s "from decimal import Decimal as D" "D(2)/D(3)"
1000 loops, best of 3: 549 usec per loop

That's three orders of magnitude difference in speed. That's HUGE, and
*alone* is enough to disqualify changing to Decimal as the default
floating-point data type.

Perhaps in the future, if and when Decimal has a fast C implementation,
this can be re-thought.

> Since you are using Python, you have already bought that kind of
> tradeoff anyway.
>
> Decimal vs. float is a different matter altogether: Decimal has
> downsides compared to float. First, there is the irreconcilable fact
> that no matter how small your range is, it is impossible to represent
> all (or even most) numbers exactly with finite memory - float and
> Decimal are two different solutions to this issue, with different
> tradeoffs.

Yes, but how is this a downside *compared* to float? In what way does
Decimal have downsides that float doesn't? Neither can represent
arbitrary real numbers exactly, but if anything float is *worse*
compared to Decimal for two reasons:

* Python floats are fixed to a single number of bits, while the size of
  Decimals can be configured by the user;

* floats can represent sums of powers of two exactly, while Decimals
  can represent sums of powers of ten exactly. Not only does that mean
  that any number exactly representable as a float can also be exactly
  represented as a Decimal, but Decimals can *additionally* represent
  exactly many numbers of human interest that floats cannot.

> Decimals are more "intuitive" than floats for numbers that can be
> represented as decimals - but most numbers cannot be represented as
> (finite) decimals.

True, but any number that can't be, also can't be exactly represented
as a float either, so how does using float help?

[...]
>> And most of the time (in my experience) the inputs and outputs to your
>> system and the literals in your code are actually decimal, so
>> converting them to float immediately introduces a lossy data conversion
>> before you've even done any computation. Decimal doesn't have that
>> problem.
>
> That's no longer true once you start doing any computation, if by
> decimal you mean finite decimal. And it will never be true once you
> start using non-trivial computations (i.e. transcendental functions
> like log, exp, etc.).

But none of those complications are *inputs*, and, again, floats suffer
from exactly the same problem.

-- 
Steven
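Steven's subset argument, that every float is also exactly a Decimal but not vice versa, can be checked directly (a sketch using the standard decimal module on modern Python):

```python
from decimal import Decimal

# Every binary float is a finite sum of powers of two, so it also has an
# exact, if long, finite decimal expansion. Conversion float -> Decimal
# is therefore lossless:
exact = Decimal(0.1)          # the float nearest to 0.1, captured exactly
assert float(exact) == 0.1    # perfect round-trip
assert len(str(exact)) > 50   # but it takes some 55 significant digits

# The reverse embedding fails: the decimal value 0.1 has no exact float:
assert Decimal(float(Decimal("0.1"))) != Decimal("0.1")
```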