From: brady_fht on
Okay - I have changed the program slightly (Real declarations, added
parentheses, and eliminated one variable):

--------------------------------------------------
PROGRAM DPStudy
IMPLICIT NONE
REAL(KIND=KIND(0.D0)) :: FVF
REAL(KIND=KIND(0.D0)) :: THR
REAL(KIND=KIND(0.D0)) :: Dum1

FVF = 3.3D0
THR = 3.0D0

Dum1 = (FVF * THR) * DSQRT(THR)

1 FORMAT(A, ES28.22)
WRITE(*,1) 'Dum1 = ', Dum1
STOP
ENDPROGRAM DPStudy
--------------------------------------------------

Now I want to focus on only two scenarios:

--------------------------------------------------------
COMPILER:
gfortran -m32 -O0 main.f90
OUTPUT:
Dum1 = 1.7147302994931884256857E+01
--------------------------------------------------------

--------------------------------------------------------
COMPILER:
gfortran -m64 -O0 main.f90
OUTPUT:
Dum1 = 1.7147302994931880704144E+01
--------------------------------------------------------


So, neglecting optimization, the 32- and 64-bit versions of this program
yield different results. And the program now uses parentheses to
force the order of multiplication.
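Since Python floats are IEEE-754 binary64 (the same arithmetic SSE performs), the -m64 result can be reproduced by rounding after every operation; a quick sketch also shows why grouping matters at all:

```python
import math

# Round after every double operation, as SSE (and hence -m64) does.
fvf = 3.3
thr = 3.0
dum1 = (fvf * thr) * math.sqrt(thr)

# Matches the -m64 output above bit for bit.
print(f"{dum1:.22e}")

# Floating-point arithmetic is not associative: grouping changes which
# intermediates get rounded, hence possibly the last bits of the result.
print((0.1 + 0.2) + 0.3 == 0.1 + (0.2 + 0.3))  # False
```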

Also, here is the binary representation (all 64 bits) of each result
(1st line is 32 bit program result, and 2nd line is the 64 bit program
result):

0110101110101011100101000110010110101101101001001000110000000010
1010101110101011100101000110010110101101101001001000110000000010

So the only two bits that are different are the two least significant
bits.
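The two bit patterns can be inspected directly. A sketch using Python's struct module (printing most-significant bit first) confirms the two printed values are adjacent doubles, one unit in the last place apart:

```python
import struct

def double_bits(x: float) -> int:
    """Return the raw 64-bit pattern of an IEEE-754 double as an integer."""
    return struct.unpack("<Q", struct.pack("<d", x))[0]

a = float("1.7147302994931884256857e+01")  # -m32 (x87) result
b = float("1.7147302994931880704144e+01")  # -m64 (SSE) result

print(f"{double_bits(a):064b}")
print(f"{double_bits(b):064b}")

# The two doubles are adjacent: their integer bit patterns differ by
# exactly 1. Note that incrementing by 1 can flip two low bits
# (...01 -> ...10), which is why two bits differ above.
print(abs(double_bits(a) - double_bits(b)))  # 1
```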

I guess I am just confused as to how this number is being computed
differently. Maybe I'm missing something obvious here or maybe
somebody is already slapping me in the face with the answer, but I
can't wrap my head around it.

Thanks!
From: brady_fht on
On May 18, 11:41 am, Tobias Burnus <bur...(a)net-b.de> wrote:
> On 05/18/2010 05:39 PM, brady_fht wrote:
>
> > I need some help understanding floating point operations. I am seeing
> > a small difference in a program when it is compiled for 32 and 64 bit
> > machines. Here is the program:
>
> > Pretty simple. I am using gfortran 4.3.3 on Ubuntu 9.04. The following
> > shows the different compiler options I use and the resulting output
> > from the above program:
> > gfortran -m32 -O0 main.f90
> > gfortran -m64 -O0 main.f90
>
> By default GCC on x86 (ia32, i686, ...) uses the x87 floating-point
> coprocessor (-mfpmath=387), while on x86-64 (AMD64, Intel64, EM64T,
> ...) SSE is used by default. While SSE uses proper rounding, the x87
> coprocessor has excess precision, which is used for intermediate
> results. One way to force the rounding is to store the intermediate
> result in memory (as opposed to keeping it in the x87 registers); in
> GCC/gfortran you can use -ffloat-store to force this.*
>
> When you use -O1 (or higher), the compiler plugs in the values of the
> variables and calculates the result at compile time (with proper rounding).
>
> If you want to see what the compiler does, you can use the
> -fdump-tree-all option, which dumps a (simplified, C-syntax-like)
> internal representation. The "original" and "optimized" are probably
> most interesting.
>
> > For what it's worth - similar behavior was observed using the Intel
> > Fortran Compiler 11.1 for Windows.
>
> (Newer) Intel compilers default to SSE.
>
> * = GCC 4.5 and 4.6 also offer -fexcess-precision=, but I think this
> option is not yet supported by GCC's Fortran front end. Additionally,
> GCC 4.5 and 4.6 now have a compile-time configuration option that
> allows enabling SSE by default (--with-fpmath=sse).
> For details, see http://gcc.gnu.org/gcc-4.5/changes.html
>
>  * * *
>
> General remark: Don't look too closely at the last digits in
> floating-point calculations, and do not assume that changing the order
> of operations won't change the result. I would also suggest you read
> the Goldberg paper.
>
> a) Goldberg paper: http://www.google.de/search?q=Goldberg+What+Every+Computer+Scientist+...
>
> b) Monniaux paper: http://hal.archives-ouvertes.fr/hal-00128124
>
> c) Semantics of Floating Point Math in GCC: http://gcc.gnu.org/wiki/FloatingPointMath
>
> d) Note on the x87 coprocessor (for GCC): http://gcc.gnu.org/wiki/x87note
> http://gcc.gnu.org/wiki/Math_Optimization_Flags
>
> e) IEEE 754:2008 "IEEE Standard for Floating-Point Arithmetic":
> http://ieeexplore.ieee.org/servlet/opac?punumber=4610933
> http://www.validlab.com/754R/drafts/archive/2006-10-04.pdf (last
> publicly available draft I found)
>
> Tobias



Thanks Tobias! A lot of good info here that I will dive into. The
"-ffloat-store" flag did the trick and your explanation makes sense.
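The effect Tobias describes can be mimicked in Python (whose floats are IEEE doubles, like SSE): keep the intermediates in extra precision via the decimal module and round only once at the end. This is a rough model, not the exact x87 semantics (the real x87 rounds each step to 80 bits), but for this computation it lands on the same value:

```python
import math
from decimal import Decimal, getcontext

# Rounding after every double operation (what SSE, and hence -m64, does):
steps = (3.3 * 3.0) * math.sqrt(3.0)

# Keeping excess precision in the intermediates and rounding to double
# only once at the end -- a rough model of the x87's wider registers.
# Decimal(3.3) is the exact value of the double 3.3.
getcontext().prec = 40
once = float(Decimal(3.3) * Decimal(3) * Decimal(3).sqrt())

print(f"{steps:.22e}")  # the -m64 value
print(f"{once:.22e}")   # here this lands on the -m32 (x87) value
print(steps == once)    # False: the two strategies differ by one ulp
```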

Just FYI - I couldn't get the Intel compiler to give the same behavior
using the program I posted but I have a much larger program that was
exhibiting similar behavior and turning optimization on eliminated the
differences in results. I will have to look at it closer.
From: Tim Prince on
On 5/18/2010 10:49 AM, brady_fht wrote:
> On May 18, 11:41 am, Tobias Burnus<bur...(a)net-b.de> wrote:

>
> Just FYI - I couldn't get the Intel compiler to give the same behavior
> using the program I posted but I have a much larger program that was
> exhibiting similar behavior and turning optimization on eliminated the
> differences in results. I will have to look at it closer.
When you use ifort, by default, the x87 precision mode is set to 53-bit,
so you won't see as many differences between SSE2 and x87 double
precision as you would with gfortran. Single-precision x87
expressions are still promoted to double implicitly.
You may also be interested in the list of options which have to be set
to make ifort comply with standards:
http://software.intel.com/en-us/forums/showthread.php?t=74331&o=d&s=lr


--
Tim Prince
From: glen herrmannsfeldt on
brady_fht <brady.m.adams(a)gmail.com> wrote:
(snip)

> Thanks Tobias! A lot of good info here that I will dive into. The
> "-ffloat-store" flag did the trick and your explanation makes sense.

> Just FYI - I couldn't get the Intel compiler to give the same behavior
> using the program I posted but I have a much larger program that was
> exhibiting similar behavior and turning optimization on eliminated the
> differences in results. I will have to look at it closer.

Note, though, that if your program is sensitive to the difference
in results in the last bit of a product, then you should fix the
program.

If you are trying to compare different versions of a program,
then the comparison should be such that errors in the least
significant bits are tolerated.
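One way to put that advice into a test harness, sketched with math.ulp (Python 3.9+): accept results that agree to within a couple of units in the last place rather than demanding bit-exact equality.

```python
import math

def agree_to_ulps(a: float, b: float, n: int = 2) -> bool:
    """True if a and b differ by at most n units in the last place."""
    return abs(a - b) <= n * math.ulp(max(abs(a), abs(b)))

# The two results from this thread differ by one ulp, so they compare
# equal under an LSB-tolerant comparison but not under ==:
x87 = float("1.7147302994931884256857e+01")
sse = float("1.7147302994931880704144e+01")
print(x87 == sse)               # False
print(agree_to_ulps(x87, sse))  # True
```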

-- glen
From: robin on
"brady_fht" <brady.m.adams(a)gmail.com> wrote in message
news:ad8f6663-4c10-4c02-9aae-f8430808cfe3(a)a16g2000vbr.googlegroups.com...
| Okay - I have changed the program slightly (Real declarations, added
| parentheses, and eliminated one variable):

But you haven't changed DSQRT to SQRT.
Also, you didn't read in the variables.
Have you tried examining the bits of 3.3 in both cases?

(snip)