From: Richard Heathfield on
Jens Thoms Toerring wrote:
> In comp.unix.programmer Karthik Balaguru <karthikbalaguru79(a)gmail.com> wrote:
>> I came across the 'Infinite Monkey Theorem'.
>> http://en.wikipedia.org/wiki/Infinite_monkey_theorem
>
>> I wonder how can a monkey hitting keys at random on
>> a typewriter keyboard for an infinite amount of time will
>> almost surely type a given text, such as the complete
>> works of William Shakespeare ?
>
> What could be not in an infinite set?

The infinite set of even integers contains only half the integers. None
of the odd integers (of which there are infinitely many) is present in
the set.

The infinite set of prime integers contains almost none of the integers
- just an infinite handful that share an odd characteristic. Most of the
integers are absent from that infinite set.

The infinite set of integer powers of two contains practically no
integers at all - merely infinitely many. If the infinite set of
integers were an infinitely large barrel, the infinite set of integer
powers of two would be an infinitesimally small apple rolling around at
the bottom. (Admittedly it would be an infinitely large infinitesimally
small apple...)

The set of things that could not be in a given infinite set is
infinitely large. In fact, there are many such sets. Now, consider the
set of *all* such sets... Er, actually no, let's not go there...

> You will have not
> only the works of Shakespeare, but also all his works
> with all kinds of typos, readers digest versions etc.;-)

That isn't certain by any means. See above.

<snip>

> Well, instead of using a single monkey, giving it infinite
> time, you can use a large number of monkeys for a shorter
> time. Now, since the works of Shakespeare actually have been
> written (assming that Shakespeare was a kind of monkey and
> you don't instsit on the typewriter part), the theorem thus
> has been experimentally proven (as a possibly uninteded side
> effect of the mice having earth produced for finding "the"
> question).

I think we can agree that the works of Shakespeare have actually been
written. But you are assuming:

(a) that Shakespeare wrote them, which is apparently a matter of some
dispute;

(b) that Shakespeare was a kind of monkey, which is very unlikely to be
true. Both are primates, but then cars and bicycles are both vehicles;
that doesn't mean a bike is a kind of car or vice versa.

And in any case the theory is about duplicating the works of
Shakespeare, not originating them.

Let's try to quantify it a little, shall we?

We start by defining our terms and our tools. We will begin with a
simple monkey, eventually replacing him with a computer.

Let's assume that we have *one* monkey, then, hitting *one* typewriter
*once* per second, purely at random. The typewriter has *two* keys, 0
and 1. (When we install the MonkeyBrain XII later on, we can soup up the
speed a bit.)

Here is a Shakespeare sonnetette: 10010111. We want to know whether, if
we'd installed our monkey attadawnatime, he would have a reasonable
probability of having produced this sonnetette by now. Of course, we'd
like to be more general than that.

Attadawnatime, they say, was 14,000,000,000 years ago. That's about
441504000000000000 seconds, which we'll double (in case the scientists
got it wrong, which is not exactly unheard of), giving us a million
million million seconds to work with. Bags of time.

Now, the first seven times our monkey hits the keys, he can't possibly
produce the sonnetette. But the eighth time, he *may* have produced a
sonnetette. That is, if the bit length of the target text T is L, then
we have to produce at least L bits.

The probability of the first L bits producing T is 1/(2^L). More
specifically, for L=8 it's 1/256 = 0.00390625 = p (probability of
matching L bits of data against all L bits of text, in a given position
in the bitstream).

Now if we produce *more* bits, it gets a bit(sigh) more interesting.
Let's assume we produce 9 bits. We now have two stabs at the T cherry: a
match of T against the first 8 bits of the 9, and a match against the
last 8 bits of the 9. Any one match will do, so both matches have to
fail for the monkey to fail. We calculate this failure probability F by
raising (1-p) to the power of the number of match attempts:

F = (1-p)^2 = 0.99609375^2 = 0.9922027587890625

More generally, if we produce B bits (where B>=L):

F = (1 - (1/(2^L))^(B+1-L)

and this will take us B/R seconds, where R is our bit production rate
(bits per second). Clearly, the probability P of success is 1-F.

At this point, we have a program spec.

Inputs: L (the length of the test corpus, in bits)
R (bit production rate)
K (seconds sinceadawnatime - assume 10^18)

Calculation:
Attempts = ((R * K) + 1) - L
SingleFailure = 1.0 - (1.0 / pow(2.0, L))
F = pow(SingleFailure, Attempts)
P = 1.0 - F

The implementation is left as an exercise for the terminally bored reader.

--
Richard Heathfield <http://www.cpax.org.uk>
Email: -http://www. +rjh@
"Usenet is a strange place" - dmr 29 July 1999
Sig line vacant - apply within
From: Noob on
Richard Heathfield wrote:

> Let's assume that we have *one* monkey, then, hitting *one* typewriter
> *once* per second, purely at random. The typewriter has *two* keys, 0
> and 1. (When we install the MonkeyBrain XII later on, we can soup up the
> speed a bit.)
>
> Here is a Shakespeare sonnetette: 10010111. We want to know whether, if
> we'd installed our monkey attadawnatime, he would have a reasonable
> probability of having produced this sonnetette by now. Of course, we'd
> like to be more general than that.
>
> Attadawnatime, they say, was 14,000,000,000 years ago. That's about
> 441504000000000000 seconds, which we'll double (in case the scientists
> got it wrong, which is not exactly unheard of), giving us a million
> million million seconds to work with. Bags of time.
>
> Now, the first seven times our monkey hits the keys, he can't possibly
> produce the sonnetette. But the eighth time, he *may* have produced a
> sonnetette. That is, if the bit length of the target text T is L, then
> we have to produce at least L bits.
>
> The probability of the first L bits producing T is 1/(2^L). More
> specifically, for L=8 it's 1/256 = 0.00390625 = p (probability of
> matching L bits of data against all L bits of text, in a given position
> in the bitstream).
>
> Now if we produce *more* bits, it gets a bit(sigh) more interesting.
> Let's assume we produce 9 bits. We now have two stabs at the T cherry: a
> match of T against the first 8 bits of the 9, and a match against the
> last 8 bits of the 9. Any one match will do, so both matches have to
> fail for the monkey to fail. We calculate this failure probability F by
> raising (1-p) to the power of the number of match attempts:
>
> F = (1-p)^2 = 0.99609375^2 = 0.9922027587890625

I don't think so. The two attempts are not independent.
From: jellybean stonerfish on
On Mon, 08 Mar 2010 01:56:54 -0800, Nick Keighley wrote:

> shakespere didn't generate his plays by random means.

It was random that there even was a Shakespeare.
From: Richard Heathfield on
Noob wrote:
> Richard Heathfield wrote:

<snip>

>> We calculate this failure probability F by
>> raising (1-p) to the power of the number of match attempts:
>>
>> F = (1-p)^2 = 0.99609375^2 = 0.9922027587890625
>
> I don't think so. The two attempts are not independent.

It's a fair cop. Perhaps you could explain how to perform the
calculation correctly?

--
Richard Heathfield <http://www.cpax.org.uk>
Email: -http://www. +rjh@
"Usenet is a strange place" - dmr 29 July 1999
Sig line vacant - apply within
From: mike on
In article <yKWdnfd7BvIrFgjWnZ2dnUVZ8txi4p2d(a)bt.com>,
rjh(a)see.sig.invalid says...
> Noob wrote:
> > Richard Heathfield wrote:
>
> <snip>
>
> >> We calculate this failure probability F by
> >> raising (1-p) to the power of the number of match attempts:
> >>
> >> F = (1-p)^2 = 0.99609375^2 = 0.9922027587890625
> >
> > I don't think so. The two attempts are not independent.
>
> It's a fair cop. Perhaps you could explain how to perform the
> calculation correctly?
>
Unfortunately, the calculation depends on the specific text*. So for
your binary version of Hamlet (or was it Othello) there is no simple
answer.

* to prove this, imagine that the works of Shakespeare can be expressed
as a two character binary string (maybe the Condendsed Books version)
and the random monkey has typed three symbols. If the condensed string
is '00' then there is a 5/8 chance that it will not appear in a random
three character string, but if the string is '01' then there is a 1/2
chance it will not be found in a random three character string.

Mike