From: Ron Shepard on
In article <1jdujsc.utdmji1u3elbsN%nospam(a)see.signature>,
nospam(a)see.signature (Richard Maine) wrote:

> steve <kargls(a)comcast.net> wrote:
>
> > On Feb 13, 2:32 am, tho...(a)antispam.ham wrote:
> > > cubsfan <cubsfan...(a)gmail.com> writes:
>
> > > If so, then simply multiply by 4E16 and then subtract 2E16.
> >
> > The subtraction of two large numbers can lead to catastrophic
> > cancellation, which may lead to some undesired bias in the
> > final sequence. It might be better to multiply the result
> > of random_number by 2, then subtract 1, and then multiply
> > by 2e16.
>
> The subtraction of two small numbers can lead to the same kind of
> cancellation. Cancellation happens when you subtract two nearly equal
> numbers of any magnitude. It is independent of scale (at least with
> floating point, which these numbers are). The above-proposed change
> should make no difference in terms of cancellation.
>
> I would tend to do the subtraction before the scaling, but for a reason
> having nothing to do with cancellation. I would do it just because it
> keeps the scaling factor in one place in the equation instead of two.
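
In Fortran, the two orderings of the quoted example look like this (a
sketch; the values typically agree, the difference is only where the
scale factor appears):

   program scale_order
     implicit none
     real :: x, y1, y2
     call random_number(x)
     y1 = 4e16 * x - 2e16          ! scale factor shows up in two places
     y2 = 2e16 * (2.0 * x - 1.0)   ! subtract first; scale factor in one place
     print *, y1, y2
   end program scale_order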

After

call random_number(x)
y = 2.0 * x - 1.0

then y is in the range -1.0 <= y < 1.0. That is, the range is not
symmetric. Also, there are a lot of possible small values for x that all
map to y=-1.0 if the expression rounds to nearest. It has never been
clear to me if PRNGs are supposed to generate these small numbers or not.
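
A quick check of that rounding behavior, assuming default real is IEEE
single precision (the 2**(-26) threshold is my own arithmetic, not from
the thread):

   program round_demo
     implicit none
     real :: x, y
     ! every x at or below 2**(-26) makes 2.0*x - 1.0 round to exactly -1.0
     x = 2.0**(-26)
     y = 2.0 * x - 1.0
     print *, y == -1.0     ! T
     x = 2.0**(-24)
     y = 2.0 * x - 1.0
     print *, y == -1.0     ! F  (y is -1.0 + 2.0**(-23))
   end program round_demo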

For most applications I use random numbers for, this doesn't matter, so
I don't need to separate out the special cases. However, I've always
thought that things would have been much simpler if random numbers were
generated in the range 0.5 <= x < 1.0 so that they would all have the
same binary exponent. This does not eliminate problems associated with
subtracting two almost equal numbers (that is an intrinsic issue with
floating point), but it would eliminate the problems associated with the
large exponent range.

Are there well-known solutions to these problems? If you want to
generate PRNs that satisfy, say, -1.0 < y < 1.0, with a flat
distribution, and with all possible floating point values in that range
being generated, what is the best way?

$.02 -Ron Shepard
From: William Clodius on
Ron Shepard <ron-shepard(a)NOSPAM.comcast.net> wrote:

> <snip>
> After
>
> call random_number(x)
> y = 2.0 * x - 1.0
>
> then y is in the range -1.0 <= y < 1.0. That is, the range is not
> symmetric. Also, there are a lot of possible small values for x that all
> map to y=-1.0 if the expression rounds to nearest. It has never been
> clear to me if PRNGs are supposed to generate these small numbers or not.
>
> For most applications I use random numbers for, this doesn't matter, so
> I don't need to separate out the special cases. However, I've always
> thought that things would have been much simpler if random numbers were
> generated in the range 0.5 <= x < 1.0 so that they would all have the
> same binary exponent. This does not eliminate problems associated with
> subtracting two almost equal numbers (that is an intrinsic issue with
> floating point), but it would eliminate the problems associated with the
> large exponent range.
>
> Are there well-known solutions to these problems? If you want to
> generate PRNs that satisfy, say, -1.0 < y < 1.0, with a flat
> distribution, and with all possible floating point values in that range
> being generated, what is the best way?
>
> $.02 -Ron Shepard

Normally pseudorandom number generators initially generate a set of
integers that are uniformly distributed over a range 0...2**n-1. These
are then converted to a (near) uniform distribution over 0 <= x < 1 by
multiplying the integer by 2.**(-n) or its double precision equivalent.
This gives samples at the start of each interval. It can in many cases
be converted to a sample at the center of each interval by adding
2.**(-n-1) to each sample, provided you know the n of the integer
distribution. If that conversion is possible then the samples are over
0 < x < 1. Typically n is 16, 32 or 64, as these match natural integer
sizes for common processors, but sometimes other sizes are used.
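
A minimal sketch of that conversion, assuming n = 24 and an input
integer already uniform on 0...2**24-1 (the function name and the
choice of n are mine, for illustration):

   pure function int_to_unit(i) result(x)
     use iso_fortran_env, only: int32, real64
     integer(int32), intent(in) :: i   ! assumed uniform on 0 ... 2**24 - 1
     real(real64) :: x
     x = real(i, real64) * 2.0_real64**(-24)   ! start of interval: 0 <= x < 1
     x = x + 2.0_real64**(-25)                 ! midpoint offset:   0 <  x < 1
   end function int_to_unit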

This conversion assumes that 2**(-n) is larger than the relative
precision of the floating point type. It will not work for n=32 with a
single precision float or n=64 with a double precision float; in those
cases rounding will cause samples intended to be near 1 to land exactly
on 1. I have recently been thinking that n=48 would be useful for
double precision random number generators for that reason. An n=48
generator can be implemented by combining a high quality n=32 with a
high quality n=16 that are uncorrelated with one another.
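
A sketch of that combination, assuming i32 carries 32 uniform random
bits and i16 carries 16 in its low half, produced by two uncorrelated
generators (the packing below is my own illustration):

   pure function combine48(i32, i16) result(x)
     use iso_fortran_env, only: int32, int64, real64
     integer(int32), intent(in) :: i32   ! 32 uniform random bits (assumed)
     integer(int32), intent(in) :: i16   ! 16 uniform random bits (assumed)
     real(real64) :: x
     integer(int64) :: bits48
     ! pack the two words into one integer on 0 ... 2**48 - 1
     bits48 = ior(ishft(ibits(int(i32, int64), 0, 32), 16), &
                  ibits(int(i16, int64), 0, 16))
     ! midpoint samples: 0 < x < 1, and 2**(-48) is safely above
     ! the relative precision of a double
     x = (real(bits48, real64) + 0.5_real64) * 2.0_real64**(-48)
   end function combine48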

--
Bill Clodius
los the lost and net the pet to email
From: Louis Krupp on
Ron Shepard wrote:
<snip>
> Are there well-known solutions to these problems? If you want to
> generate PRNs that satisfy, say, -1.0 < y < 1.0, with a flat
> distribution, and with all possible floating point values in that range
> being generated, what is the best way?

To quote E P Chandler in another part of this thread, if your random
number generator returns values in [0, 1):

.... draw a second random number, U2. If U2 < 0.5 then U1=-U1.

The only problem I can see is that zero will turn up twice as often as
it should, since U1 = 0 gives zero whether or not it is negated.
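
With the intrinsic random_number supplying the [0, 1) values, that
scheme might look like this (a sketch):

   program sign_flip
     implicit none
     real :: u1, u2
     call random_number(u1)        ! u1 in [0,1)
     call random_number(u2)
     if (u2 < 0.5) u1 = -u1        ! u1 now on (-1,1); zero twice as likely
     print *, u1
   end program sign_flip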

I suppose you could do this, as you mentioned:

call random_number(x)
y = 2.0 * x - 1.0

and toss out y values of -1. If, as you also mentioned, you want to
restrict the random number range to 0.5 <= y < 1.0, you could throw away
values that are outside that range.
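
A sketch of that rejection loop for the y = -1.0 case:

   program reject_minus_one
     implicit none
     real :: x, y
     do
        call random_number(x)
        y = 2.0 * x - 1.0
        if (y > -1.0) exit         ! toss out the lone y = -1.0 value
     end do
     print *, y
   end program reject_minus_one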

Louis
From: glen herrmannsfeldt on
Ron Shepard <ron-shepard(a)nospam.comcast.net> wrote:
(snip)

> call random_number(x)
> y = 2.0 * x - 1.0

> then y is in the range -1.0 <= y < 1.0. That is, the range is not
> symmetric. Also, there are a lot of possible small values for x that all
> map to y=-1.0 if the expression rounds to nearest. It has never been
> clear to me if PRNGs are supposed to generate these small numbers or not.

If it does generate them, those bits will be lost in the subtraction.
The ones I have seen in detail take a fixed number of bits, add
an exponent, and then normalize it. In that case, no extra bits
will be generated.
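
That fixed-bits-plus-exponent construction can be sketched with bit
twiddling, assuming IEEE double format and some integer word raw
holding uniform random bits (the names are mine, for illustration):

   pure function bits_to_unit(raw) result(x)
     use iso_fortran_env, only: int64, real64
     integer(int64), intent(in) :: raw   ! uniform random bits (assumed)
     real(real64) :: x
     integer(int64) :: word
     ! sign 0, biased exponent 1023, 52 random fraction bits: a value in [1,2)
     word = ior(ishft(1023_int64, 52), ibits(raw, 0, 52))
     x = transfer(word, x) - 1.0_real64  ! normalize to 0.0 <= x < 1.0
   end function bits_to_unit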

People like to mention the asymmetry in the range, but I am not
convinced that it matters. With 0.0<=x<1.0 the mean is 0.5 and the
standard deviation, if I figured it right, is 1/sqrt(12), or about
0.29. The standard deviation of the mean, then, is 1/sqrt(12*N)
averaging N values. For a statistically significant deviation from the
mean, the bias from the asymmetry, which is 2**(-(n+1)) for an n-bit
generator, has to be greater than 1/sqrt(12*N). I suppose if the RNG
only supplies 16 random bits then you could notice, but 24 bits in
single precision will require N to be more than about 2**46.
With 53 bits in double precision N should be more than about 2**104.

> For most applications I use random numbers for, this doesn't matter, so
> I don't need to separate out the special cases. However, I've always
> thought that things would have been much simpler if random numbers were
> generated in the range 0.5 <= x < 1.0 so that they would all have the
> same binary exponent. This does not eliminate problems associated with
> subtracting two almost equal numbers (that is an intrinsic issue with
> floating point), but it would eliminate the problems associated with the
> large exponent range.

The standard requires very little, other than a uniform distribution
that is 0.0<=x<1.0.

> Are there well-known solutions to these problems? If you want to
> generate PRNs that satisfy, say, -1.0 < y < 1.0, with a flat
> distribution, and with all possible floating point values in that range
> being generated, what is the best way?

The chance of those small numbers being generated is extremely small.
Generate more than 1024 random bits, count the number of leading
zeros, take the next 53 bits, and add the appropriate exponent.

The probability of a uniform binary number having at least N leading
zeros is 2.**(-N). Not likely you will ever see those really small
values.
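
A sketch of that scheme; the xorshift64 word generator below is a
stand-in of my own (any source of uniform 64-bit integers would do),
and the rest follows the recipe above:

   module full_range_uniform
     use iso_fortran_env, only: int64, real64
     implicit none
     integer(int64) :: state = 88172645463325252_int64  ! any nonzero seed
   contains
     function rand_u64() result(r)   ! xorshift64: 64 uniform bits per call
       integer(int64) :: r
       state = ieor(state, ishft(state, 13))
       state = ieor(state, ishft(state, -7))
       state = ieor(state, ishft(state, 17))
       r = state
     end function rand_u64

     function random_full_range() result(x)
       real(real64) :: x
       integer(int64) :: bits
       integer :: e
       e = 0
       do                            ! count leading zero bits, drawing fresh
          bits = rand_u64()          ! words as needed; e leading zeros occur
          if (bits /= 0_int64) exit  ! with probability 2**(-e)
          e = e + 64
       end do
       e = e + leadz(bits)
       ! 52 fresh bits under the implicit leading 1 give a significand in
       ! [1,2), scaled by the exact power of two 2**(-e-1); every normal
       ! double in (0,1) can occur, with the weight a uniform draw gives it
       bits = ibits(rand_u64(), 0, 52)
       x = (1.0_real64 + real(bits, real64) * 2.0_real64**(-52)) &
           * 2.0_real64**(-(e + 1))
     end function random_full_range
   end module full_range_uniform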

-- glen