From: Richard Heathfield on 8 Mar 2010 05:59 Jens Thoms Toerring wrote:> In comp.unix.programmer Karthik Balaguru wrote: >> I came across the 'Infinite Monkey Theorem'. >> http://en.wikipedia.org/wiki/Infinite_monkey_theorem > >> I wonder how can a monkey hitting keys at random on >> a typewriter keyboard for an infinite amount of time will >> almost surely type a given text, such as the complete >> works of William Shakespeare ? > > What could be not in an infinite set? The infinite set of even integers contains only half the integers. None of the odd integers (of which there are infinitely many) is present in the set. The infinite set of prime integers contains almost none of the integers - just an infinite handful that share an odd characteristic. Most of the integers are absent from that infinite set. The infinite set of integer powers of two contains practically no integers at all - merely infinitely many. If the infinite set of integers were an infinitely large barrel, the infinite set of integer powers of two would be an infinitesimally small apple rolling around at the bottom. (Admittedly it would be an infinitely large infinitesimally small apple...) The set of things that could not be in a given infinite set is infinitely large. In fact, there are many such sets. Now, consider the set of *all* such sets... Er, actually no, let's not go there... > You will have not > only the works of Shakespeare, but also all his works > with all kinds of typos, readers digest versions etc.;-) That isn't certain by any means. See above. > Well, instead of using a single monkey, giving it infinite > time, you can use a large number of monkeys for a shorter > time. Now, since the works of Shakespeare actually have been > written (assming that Shakespeare was a kind of monkey and > you don't instsit on the typewriter part), the theorem thus > has been experimentally proven (as a possibly uninteded side > effect of the mice having earth produced for finding "the" > question). I think we can agree that the works of Shakespeare have actually been written. But you are assuming: (a) that Shakespeare wrote them, which is apparently a matter of some dispute; (b) that Shakespeare was a kind of monkey, which is very unlikely to be true. Both are primates, but then cars and bicycles are both vehicles; that doesn't mean a bike is a kind of car or vice versa. And in any case the theory is about duplicating the works of Shakespeare, not originating them. Let's try to quantify it a little, shall we? We start by defining our terms and our tools. We will begin with a simple monkey, eventually replacing him with a computer. Let's assume that we have *one* monkey, then, hitting *one* typewriter *once* per second, purely at random. The typewriter has *two* keys, 0 and 1. (When we install the MonkeyBrain XII later on, we can soup up the speed a bit.) Here is a Shakespeare sonnetette: 10010111. We want to know whether, if we'd installed our monkey attadawnatime, he would have a reasonable probability of having produced this sonnetette by now. Of course, we'd like to be more general than that. Attadawnatime, they say, was 14,000,000,000 years ago. That's about 441504000000000000 seconds, which we'll double (in case the scientists got it wrong, which is not exactly unheard of), giving us a million million million seconds to work with. Bags of time. Now, the first seven times our monkey hits the keys, he can't possibly produce the sonnetette. But the eighth time, he *may* have produced a sonnetette. That is, if the bit length of the target text T is L, then we have to produce at least L bits. The probability of the first L bits producing T is 1/(2^L). More specifically, for L=8 it's 1/256 = 0.00390625 = p (probability of matching L bits of data against all L bits of text, in a given position in the bitstream). Now if we produce *more* bits, it gets a bit(sigh) more interesting. Let's assume we produce 9 bits. We now have two stabs at the T cherry: a match of T against the first 8 bits of the 9, and a match against the last 8 bits of the 9. Any one match will do, so both matches have to fail for the monkey to fail. We calculate this failure probability F by raising (1-p) to the power of the number of match attempts: F = (1-p)^2 = 0.99609375^2 = 0.9922027587890625 More generally, if we produce B bits (where B>=L): F = (1 - (1/(2^L))^(B+1-L) and this will take us B/R seconds, where R is our bit production rate (bits per second). Clearly, the probability P of success is 1-F. At this point, we have a program spec. Inputs: L (the length of the test corpus, in bits) R (bit production rate) K (seconds sinceadawnatime - assume 10^18) Calculation: Attempts = ((R * K) + 1) - L SingleFailure = 1.0 - (1.0 / pow(2.0, L)) F = pow(SingleFailure, Attempts) P = 1.0 - F The implementation is left as an exercise for the terminally bored reader. -- Richard Heathfield Email: -http://www. +rjh@ "Usenet is a strange place" - dmr 29 July 1999 Sig line vacant - apply within From: Noob on 8 Mar 2010 07:35 Richard Heathfield wrote: > Let's assume that we have *one* monkey, then, hitting *one* typewriter > *once* per second, purely at random. The typewriter has *two* keys, 0 > and 1. (When we install the MonkeyBrain XII later on, we can soup up the > speed a bit.) > > Here is a Shakespeare sonnetette: 10010111. We want to know whether, if > we'd installed our monkey attadawnatime, he would have a reasonable > probability of having produced this sonnetette by now. Of course, we'd > like to be more general than that. > > Attadawnatime, they say, was 14,000,000,000 years ago. That's about > 441504000000000000 seconds, which we'll double (in case the scientists > got it wrong, which is not exactly unheard of), giving us a million > million million seconds to work with. Bags of time. > > Now, the first seven times our monkey hits the keys, he can't possibly > produce the sonnetette. But the eighth time, he *may* have produced a > sonnetette. That is, if the bit length of the target text T is L, then > we have to produce at least L bits. > > The probability of the first L bits producing T is 1/(2^L). More > specifically, for L=8 it's 1/256 = 0.00390625 = p (probability of > matching L bits of data against all L bits of text, in a given position > in the bitstream). > > Now if we produce *more* bits, it gets a bit(sigh) more interesting. > Let's assume we produce 9 bits. We now have two stabs at the T cherry: a > match of T against the first 8 bits of the 9, and a match against the > last 8 bits of the 9. Any one match will do, so both matches have to > fail for the monkey to fail. We calculate this failure probability F by > raising (1-p) to the power of the number of match attempts: > > F = (1-p)^2 = 0.99609375^2 = 0.9922027587890625 I don't think so. The two attempts are not independent. From: jellybean stonerfish on 8 Mar 2010 09:44 On Mon, 08 Mar 2010 01:56:54 -0800, Nick Keighley wrote: > shakespere didn't generate his plays by random means. It was random that there even was a Shakespeare. From: Richard Heathfield on 8 Mar 2010 18:44 Noob wrote:> Richard Heathfield wrote: >> We calculate this failure probability F by >> raising (1-p) to the power of the number of match attempts: >> >> F = (1-p)^2 = 0.99609375^2 = 0.9922027587890625 > > I don't think so. The two attempts are not independent. It's a fair cop. Perhaps you could explain how to perform the calculation correctly? -- Richard Heathfield Email: -http://www. +rjh@ "Usenet is a strange place" - dmr 29 July 1999 Sig line vacant - apply within From: mike on 9 Mar 2010 16:02 In article , rjh(a)see.sig.invalid says...> Noob wrote: > > Richard Heathfield wrote: > > > > >> We calculate this failure probability F by > >> raising (1-p) to the power of the number of match attempts: > >> > >> F = (1-p)^2 = 0.99609375^2 = 0.9922027587890625 > > > > I don't think so. The two attempts are not independent. > > It's a fair cop. Perhaps you could explain how to perform the > calculation correctly? > Unfortunately, the calculation depends on the specific text*. So for your binary version of Hamlet (or was it Othello) there is no simple answer. * to prove this, imagine that the works of Shakespeare can be expressed as a two character binary string (maybe the Condendsed Books version) and the random monkey has typed three symbols. If the condensed string is '00' then there is a 5/8 chance that it will not appear in a random three character string, but if the string is '01' then there is a 1/2 chance it will not be found in a random three character string. Mike