From: orz on
This thread is intended for discussion of statistical tests for random
number generators and software packages that include such tests.

Specifically, the software packages that I'm aware of:

TestU01, by Richard Simard and Pierre L'Ecuyer
http://www.iro.umontreal.ca/~simardr/testu01/tu01.html

RaBiGeTe, by Cristiano
http://www.webalice.it/cristiano.pi/rabigete/

Diehard, by George Marsaglia
http://www.stat.fsu.edu/pub/diehard/

ENT, by John Walker
http://www.fourmilab.ch/random/

Dieharder, by Robert G. Brown (rgb) and Dirk Eddelbuettel and David
Bauer
http://www.phy.duke.edu/~rgb/General/dieharder.php

PractRand, by myself, not yet released but I intend to release this
weekend.
From: orz on
On my list of software packages I forgot to include the NIST
package.

On Aug 13, 8:20 am, "Cristiano" <cristiaN...(a)gmail.com> wrote:
> orz wrote:
> > I have not been very impressed with Diehard or the NIST stuff or RaBiGeTe
> > or ENT.
>
> As I told you, I got little feedback for RaBiGeTe. What should I do to make
> RaBiGeTe more "impressive"?
>
> Cristiano

Well... things I'd like to see:
1. A clearly defined standard set of tests. Or several, with clear
and simple definitions of what each is specialized for; i.e. one set
optimized for finding bias efficiently on a per-time basis and another
for finding bias efficiently on a per-bit basis.

2. A list of how RaBiGeTe's standard batteries of tests perform on a
long list of different RNGs, and preferably a comparison to how those
RNGs perform on other batteries of tests. Grab the TestU01 paper and
take a look at pages 28 and 29 - there's a list there of roughly 100
different RNGs, with the number of subtests failed and the number of
subtests that gave suspicious results for each RNG on SmallCrush,
Crush, and BigCrush. Failure there is defined as p < 10^-10 (or p >
1-10^-10), suspicious as p < 10^-6 (or p > 1-10^-6). You can glance back
a few posts to see an informal, poorly formatted semi-equivalent for my
test suite. Hopefully your list of RNGs would include a great deal of
diversity... I'd like to see a spectrum of low- to high-quality RNGs
for each of: simple arithmetic/bitwise RNGs, simple multiplicative
RNGs, small, medium, and large cyclic buffer (Fibonacci-style) RNGs,
small and medium indirection-based RNGs, and complex / special RNGs.
Preferably including single-cycle reversible RNGs, multicycle
reversible RNGs, and multicycle irreversible RNGs in each category,
and maybe a few single-cycle irreversible RNGs.
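
To make those category names concrete, here are minimal toy sketches
(my own illustrations, not generators taken from any of the packages
above) of three of them: an arithmetic/bitwise generator (one of
Marsaglia's xorshift variants), a simple multiplicative one (a 64-bit
LCG with Knuth's MMIX constants), and a small cyclic-buffer
(Fibonacci-style) one with illustrative lags 3 and 7:

```c
#include <stdint.h>

/* arithmetic/bitwise: Marsaglia's xorshift32 (shift triple 13,17,5) */
static uint32_t xs_state = 2463534242u;
static uint32_t xorshift32(void) {
    uint32_t x = xs_state;
    x ^= x << 13;
    x ^= x >> 17;
    x ^= x << 5;
    return xs_state = x;
}

/* simple multiplicative: 64-bit LCG, returning the top 32 bits */
static uint64_t lcg_state = 1u;
static uint32_t lcg(void) {
    lcg_state = lcg_state * 6364136223846793005ULL
                          + 1442695040888963407ULL;
    return (uint32_t)(lcg_state >> 32);
}

/* small cyclic buffer (lagged Fibonacci style): x[n] = x[n-3] + x[n-7]
   mod 2^32, kept in a 7-word ring where lf_pos points at the oldest
   entry (x[n-7]) and is overwritten by the new value each step */
static uint32_t lf_buf[7] = {1, 2, 3, 4, 5, 6, 7};
static int lf_pos = 0;
static uint32_t lagged_fib(void) {
    uint32_t v = lf_buf[(lf_pos + 4) % 7] + lf_buf[lf_pos];
    lf_buf[lf_pos] = v;
    lf_pos = (lf_pos + 1) % 7;
    return v;
}
```

The lagged-Fibonacci one also shows why "small, medium, and large"
buffer sizes matter: with a 7-word state, bias is much easier to catch
than with the thousands of words some real generators carry.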

3. Hopefully the stuff documented in #2 above shows you in a good
light. TestU01, for instance, finds flaws in a variety of RNGs,
including some that are considered good quality by some folks like
MT19937, one version of Marsaglia's KISS, etc. Some of those bias
detections seem a bit brittle: I initially failed to reproduce the
MT19937 bias detection because I accidentally walked the buffer
backwards in my implementation of MT19937, and TestU01 couldn't find
flaws once that tiny change was made. Still, any bias detection in
MT19937 is pretty good. My own tests fail to find flaws in MT19937
(though they will if the tempering is removed from MT19937, or if the
GFSR size is reduced, and those bias detections are not brittle), but
they find flaws in everything else that TestU01 BigCrush does that
I've tested so far (though in some cases it takes up to twice as
long), and my stuff finds flaws in quite a bit of stuff that TestU01
misses, including some that my stuff finds quickly.
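
For reference, the tempering in question is MT19937's standard output
transform (the shifts and masks below are Matsumoto and Nishimura's
published parameters); removing it hands back the raw GFSR state
words, which is the weakened variant mentioned above:

```c
#include <stdint.h>

/* MT19937's output tempering: an invertible bit-mixing transform
   applied to each raw state word before it is returned. */
static uint32_t mt_temper(uint32_t y) {
    y ^= y >> 11;
    y ^= (y << 7)  & 0x9D2C5680u;
    y ^= (y << 15) & 0xEFC60000u;
    y ^= y >> 18;
    return y;
}
```

Because each step is an invertible xor-shift-mask, the whole transform
is a bijection on 32-bit words; it redistributes bits but adds no
entropy, which is why stripping it exposes the linear structure
underneath.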

4. The ability to pass random bits directly to the tests without the
overhead or limitations of writing them to disk first. You say that's
supported by RaBiGeTe, but I find no mention of it in the
documentation accompanying the latest release and I didn't see it
mentioned on the website. There is source code there, so I considered
adapting it to pipe my RNG output straight into your tests, but
there were problems with that:
4 A. I did not see documentation on the interfaces for calling your
tests or test suites directly. Maybe I should have looked harder?
4 B. When I glanced at your code, I came away with the impression
that it was GPLed. Software that must be linked with to be useful
does not mix well with viral licensing. On further look now, the
license picture appears not so bad.
4 C. If you're intending other people to link their code with yours,
perhaps you should be compiling to a library instead of, or in addition
to, an executable? That's what TestU01 does anyway, and what I'm doing
in the package I'm cleaning up for release (though not what I do in
the package I'm using for research).
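
For what it's worth, the calling convention I have in mind is roughly
what TestU01 does for external generators: the caller hands the
battery a function pointer that yields 32-bit words. Here is a
hypothetical minimal sketch (my own toy interface with a trivial
bit-counting "test" standing in for a battery - not RaBiGeTe's or
TestU01's actual API):

```c
#include <stdint.h>

/* Hypothetical test entry point: consume n 32-bit words from gen()
   and return the total count of 1 bits. A real battery would turn
   this into a p-value via the normal approximation to the binomial
   (expected n*16 ones, variance n*8). */
static uint64_t count_ones(uint32_t (*gen)(void), uint64_t n) {
    uint64_t ones = 0;
    for (uint64_t i = 0; i < n; i++) {
        uint32_t w = gen();
        while (w) {           /* per-word popcount */
            ones += w & 1u;
            w >>= 1;
        }
    }
    return ones;
}

/* Example generator plugged straight in, no disk I/O: xorshift32. */
static uint32_t state = 2463534242u;
static uint32_t xorshift32(void) {
    state ^= state << 13;
    state ^= state >> 17;
    state ^= state << 5;
    return state;
}
```

The point is just that the generator plugs in directly through the
function pointer, with no intermediate file; that's the overhead and
size limitation that writing to disk first imposes.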

5. A standard output that is easy to interpret even if you've never
seen it before. TestU01 does really well in that respect. RaBiGeTe
does okay, not great. My own stuff... not so well, though not nearly
as bad as some I've seen. Hopefully I'll get that improved soon.
From: orz on
For my own package, I consider my strengths relative to TestU01 to be:

1. Most of the RNGs incorporated are actually intended for real world
practical use, not research. Meaning both a nicer interface for
mundane software unrelated to RNG research, and a set of RNG
algorithms that seem more appropriate for real world usage - some are
significantly faster than any published RNG I can find of comparable
bias level, and have significantly lower bias than any published
RNG I can find of comparable speed. In other words, it works (or at
least is intended to work) as a random number generation library for
normal RNG users, not just for researchers.

2. Original tests. TestU01 seems to implement pretty much every test
that has ever appeared in prominent literature, which is pretty nice,
and use smart parameterizations of them in its main test batteries,
which is even nicer. But my test suite mostly focuses on original
tests, particularly ones that in my limited testing can distinguish a
wider variety of RNGs than commonly used tests, especially RNGs that
do well on other tests.

3. A clearer focus on testing binary data rather than floating point
data. This is both a strength and a weakness, but I consider TestU01's
decision to completely ignore the bottom bit of output in its best
supported test batteries to be just bizarre. (TestU01's Rabbit battery
tests all the bits, but Rabbit crashes for me if I give it a long
sequence of data. I think some of the other batteries test all bits as
well, but their descriptions left me with the impression that I should
mainly stick to SmallCrush/Crush/BigCrush.)
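
To illustrate the bottom-bit issue: once 32-bit words are consumed as
doubles in [0,1) and a test only resolves the top bits of that value
(30 bits in the sketch below - an illustrative figure), outputs that
differ only in their low bits become indistinguishable to the test:

```c
#include <stdint.h>

/* Map a 32-bit word to a double in [0,1), then quantize back to an
   integer, mimicking a battery that only resolves the top 30 bits.
   Both conversions are exact in double precision, so this reduces to
   x >> 2: the bottom two bits of the generator's output vanish. */
static uint32_t top30(uint32_t x) {
    double u = x / 4294967296.0;          /* [0,1), full 32-bit resolution */
    return (uint32_t)(u * 1073741824.0);  /* keep only the top 30 bits */
}
```

So a generator whose only flaw lives in its lowest output bit sails
through such a battery untouched, which is why testing the binary
stream directly matters.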
From: Dann Corbit on
In article <1206e4fe-074b-48bf-acbf-a44b30172cc6
@x24g2000pro.googlegroups.com>, cdhorz(a)gmail.com says...
>
> This thread is intended for discussion of statistical tests for random
> number generators and software packages that include such tests.
> [snip list of test packages]

NIST statistical test:
http://csrc.nist.gov/groups/ST/toolkit/rng/documentation_software.html