From: kanze
Allan W wrote:
> kanze wrote:
> > Allan W wrote:
> > > > Diego Martins wrote:
> > > > > double zero_one_rand()
> > > > > {
> > > > >     double r = static_cast<double>(rand()) / rand();
> > > > >     return r <= 1 ? r : 1/r;
> > > > > }

> [Snip]

> > > I was surprised. Other than the "Err" outputs, the results
> > > were pretty evenly distributed. The line graph of the results
> > > was practically a straight line. There was a *slight* bias
> > > towards either zero or one.

> > I'm not sure what you were measuring here. The line graph
> > of what results?

> I graphed the return values versus the number of times each
> value was seen. The graph was very nearly linear, meaning
> (for instance) that the value 0.35 was seen almost exactly the
> same number of times as the value 0.36.

In other words, not only was the graph linear, but it had a
slope of 0: a horizontal line, in short.
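
For concreteness, here's the sort of bucket test I ran (a
reconstruction from memory, not the exact code): it bins the
result of Diego's zero_one_rand() into tenths, with an eleventh
bucket catching anything that maps to exactly 1.0.

#include <cstdlib>
#include <iostream>

double zero_one_rand()
{
    double r = static_cast<double>(rand()) / rand() ;
    return r <= 1 ? r : 1 / r ;
}

int main()
{
    int count[ 11 ] = { 0 } ;   // count[ 10 ] catches r == 1.0
    for ( int i = 0 ; i < 100000 ; ++ i ) {
        int bucket = (int)( 10.0 * zero_one_rand() ) ;
        if ( bucket < 0 || bucket > 10 ) {
            // out of range, e.g. NaN from 0/0 -- possibly the
            // source of the "Err" outputs Allan mentions
            std::cout << "Err\n" ;
        } else {
            ++ count[ bucket ] ;
        }
    }
    for ( int i = 0 ; i < 11 ; ++ i ) {
        std::cout << i << ": " << count[ i ] << '\n' ;
    }
    return 0 ;
}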

> > My tests with the actual random number generator on my
> > system shows that (int)(10.0 * r) is occasionally >= 1; for
> > 100000 trials, I get between 1 and 4 in count[10] in my test
> > above.

> I'm surprised! That's not supposed to EVER happen, is it?

Well, as I wrote it above, it should actually happen about 90%
of the time:-). What I meant (and what apparently you
understood) is that r is occasionally >= 1, or rather, that
(int)(10.0 * r) is occasionally >= 10. For the record, the
code computing r was:

double n = (double)rand() ;
double d = (double)rand() ;
double r = n / d ;
if ( r >= 1.0 ) {
    r = 1.0 / r ;
}

In this case, r will be exactly 1.0 (and thus >= 1.0) if the
two calls to rand() return the same value; this will happen
occasionally if the random generator is a good one and has a
period longer than RAND_MAX. Given that RAND_MAX on my system
is defined as 32767, it would be a pretty poor generator if
this never happened.
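
Just to put a number on it: assuming rand() is roughly uniform,
the chance that two independent calls return the same value is
1/32768, so in 100000 trials you'd expect about 3 exact ties,
which is consistent with the 1 to 4 hits in count[10] that I
saw. A throwaway test:

#include <cstdlib>
#include <iostream>

int main()
{
    int ties = 0 ;
    for ( int i = 0 ; i < 100000 ; ++ i ) {
        if ( rand() == rand() ) {   // probability 1/(RAND_MAX+1)
            ++ ties ;
        }
    }
    std::cout << "ties: " << ties << '\n' ;  // expect about 3
    return 0 ;
}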

Another potential explanation would be rounding errors in the
floating point arithmetic. With RAND_MAX at 32767, I think the
values of n and d are small enough that rounding can never push
n / d with n < d up to 1.0, but when the number of bits needed
to represent RAND_MAX becomes close to the number of bits of
precision in a double, it cannot be excluded.
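
To illustrate that second case (a contrived example, not
something rand() can produce on my system): once the integers
need more than the 53 bits of a double's mantissa, the
int-to-double conversion itself rounds, and n / d can compare
equal to 1.0 even though n < d.

#include <iostream>

int main()
{
    double d = 18014398509481984.0 ;    // 2^54, exactly representable
    double n = 18014398509481983.0 ;    // 2^54 - 1 rounds up to 2^54
    std::cout << ( n / d == 1.0 ? "n/d == 1.0" : "n/d < 1.0" )
              << '\n' ;                 // prints "n/d == 1.0"
    return 0 ;
}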

> > But what I want isn't experimental data. I'm more
> > interested in actual analysis. I think my first test,
> > above, comes the closest to simulating a full cycle, and it
> > makes me very suspicious.

> But I think I did a full cycle too, and got pretty different
> results. I'll try to take a closer look when I have more
> time...

I suspect that the granularity of the results using short cycles
can create suspicious results, even if there is nothing wrong
with the underlying principle. That's why I did the extra
distribution check.

Keith Duggar's analysis has convinced me that my suspicions were
wrong. I'll admit that I've not yet had the time to analyse the
equations in the link posted by sstump. I think, however, that
part of the problem is that my version generates random values
over the interval [0,1], and not [0,1). In the first case,
intuitively at least, an average value of 0.5 seems correct; in
the second, of course, it isn't -- intuitively, I would expect
0.5 - G/2, where G is the granularity.
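
To make that intuition concrete: if the generator returns the N
equally spaced values 0, G, 2G, ..., (N-1)G, with N*G == 1 (the
interval [0,1) with granularity G), then the average is

    (1/N) * (0 + G + 2G + ... + (N-1)G) = G*(N-1)/2 = 0.5 - G/2

Adding the endpoint 1 as an (N+1)th equally likely value brings
the average back to exactly 0.5.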

But I'll admit that I'm no mathematician, so I could be
completely off concerning these last two paragraphs.

--
James Kanze GABI Software
Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34

