From: Ulrich Eckhardt on
Carlos Moreno wrote:
> Ulrich Eckhardt wrote:
[...]
>> A more sophisticated way is to first determine the largest multiple of
>> your range that is smaller or equal to RAND_MAX, call rand() until the
>> value is
>> in that range and then simply apply the modulo operator to the result.
>
> NO!!!!
>
> Do *NOT* use modulo for this sort of things -- in the case you
> describe, integer division works perfectly fine, and gives you
> a perfectly uniform distribution (i.e., it introduces no
> "disuniformities" in the distribution, with respect to the
> base RNG -- if rand() is perfectly uniform, then drawing
> numbers until you get one below the highest multiple of your
> range, then dividing by the right number would give you a
> perfectly uniform distribution in the given range)

I don't get you. I for sure did not say to use 'rand()%max_value', which
introduces these uneven distributions and is therefore wrong - although I
would like to amend that the nonuniformity is a function of max_value and
RAND_MAX, for small max_values this might be good enough.

>> The reason is easy:
>> Assume RAND_MAX=9, i.e. rand() gives you numbers from 0 to 9. If you need
>> numbers in the range 0 to 3, the following mapping applies:
>> 0 -> 0
>> 1 -> 1
>> 2 -> 2
>> 3 -> 3
>> 4 -> 0
>> 5 -> 1
>> 6 -> 2
>> 7 -> 3
>> 8 -> reroll
>> 9 -> reroll
>
> Good, except that it should be:
>
> 0 -> 0
> 1 -> 0
> 2 -> 1
> 3 -> 1
> 4 -> 2
> 5 -> 2
> 6 -> 3
> 7 -> 3
> 8 -> reroll
> 9 -> reroll

Assuming rand() is evenly distributed, both mappings are equivalent, or am I
missing something?

Uli


[ See http://www.gotw.ca/resources/clcm.htm for info about ]
[ comp.lang.c++.moderated. First time posters: Do this! ]

From: kevin.hall on
> It's not universally true, nor even as broadly true as many
> people seem to believe.

Ok, fine. But again, the C library implementation of rand() is
notorious for this behavior -- and that is what people in this thread
are recommending. Perhaps not all rand() implementations are bad --
but some certainly are.

Anyway, it seems that many people are fighting to defend their practice
of using modulo. Division uses higher order bits which should be the
most random in all good PRNGs. Modulo uses the lowest order bits which
is known to NOT be very random in some cases. So which is better
practice for general programming?

> No, it addresses a different, real, problem.

I should have been more clear. I understand that some slots in general
will get more hits than others. I was trying to address the problem
from a different perspective. In my example, I ended up with 8 valid
slots (just like Ulrich's example) and that is what I meant by "This is
the same as above".

Does this make sense?


[ See http://www.gotw.ca/resources/clcm.htm for info about ]
[ comp.lang.c++.moderated. First time posters: Do this! ]

From: kevin.hall on
What denotes a good random number generator? No PSEUDO-random number
generator is good compared to a TRUE random number generator. But the
C library rand() (which Ulrich Eckhardt and others in this thread have
recommended) is usually implemented as a linear congruential PRNG. And
these are notorious for having less random lower order bits.

What PRNG do you use? I use the mersenne-twister for anything that
really matters in my work. Though I believe the MT is less susceptible
to having lower order bits be less random, I haven't yet seen a study
that proves that. I would like to though -- seriously I would. If you
can point me to one or to a study that proves that the lower order bits
for the PRNGs that you use are not less random, please post a link.


[ See http://www.gotw.ca/resources/clcm.htm for info about ]
[ comp.lang.c++.moderated. First time posters: Do this! ]

From: Francis Glassborow on
In article <1139433651.899829.311980(a)g47g2000cwa.googlegroups.com>,
kevin.hall(a)motioneng.com writes
>Modulo is a poor operation in helping find a random number. The reason
>is that the lowest bits in a psuedo-random number are less random than
>the upper bits. It's much better to use division:

And that is completely irrelevant even if it were true (which it
generally isn't) unless you are dividing by a power of two. However this
is definitely wandering far from the subject matter for this newsgroup.

However I see so many requests for functionality to select at random one
element from a range that I begin to wonder if we should not have a
standard library function template that simply returns a random member
from a container. We could then apply it to an array of int when we
wanted to select an integer value from a small range.


--
Francis Glassborow ACCU
Author of 'You Can Do It!' see http://www.spellen.org/youcandoit
For project ideas and contributions: http://www.spellen.org/youcandoit/projects


[ See http://www.gotw.ca/resources/clcm.htm for info about ]
[ comp.lang.c++.moderated. First time posters: Do this! ]

From: Thomas Maeder on
Carlos Moreno <moreno_at_mochima_dot_com(a)mailinator.com> writes:

>> the following mapping applies:
>> 0 -> 0
>> 1 -> 1
>> 2 -> 2
>> 3 -> 3
>> 4 -> 0
>> 5 -> 1
>> 6 -> 2
>> 7 -> 3
>> 8 -> reroll
>> 9 -> reroll
>
> Good, except that it should be:
>
> 0 -> 0
> 1 -> 0
> 2 -> 1
> 3 -> 1
> 4 -> 2
> 5 -> 2
> 6 -> 3
> 7 -> 3
> 8 -> reroll
> 9 -> reroll

If the numbers 0 to 7 occur with equal probability, I don't see a
fundamental difference why your suggestion should be better.

Care to elaborate?


[ See http://www.gotw.ca/resources/clcm.htm for info about ]
[ comp.lang.c++.moderated. First time posters: Do this! ]