Around 1974 or so I attended a lecture by a Yale professor about monkeys: if you have a room full of monkeys, each banging on a typewriter, eventually you’ll get all of Shakespeare’s works. Using a random number generator to generate characters simulates this process with fewer bananas required.  We talked about this at lunch the other day.

Of course, not all letters occur with the same frequency. If the typewriter noticed that the monkey typed a “Q”, then it could force the next letter to be “U” with a particular probability, and similarly for other letter pairs. One can imagine using some existing text as a model from which to calculate the frequencies of letter pairs: how often is “t” followed by “h”? By “a”? One can also imagine that extending such 2-letter patterns to n-letter patterns would produce even better results.
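As a rough illustration of the letter-pair idea, here is a minimal Python sketch (not the VFP program from this post). The function name `follower_counts` and the sample sentence are made up for the example; any text could serve as the model.

```python
from collections import Counter, defaultdict

def follower_counts(text):
    """For each character, count how often each character follows it."""
    counts = defaultdict(Counter)
    for first, second in zip(text, text[1:]):
        counts[first][second] += 1
    return counts

# Made-up sample text, standing in for a real model text
sample = "the quick queen thanked the quiet thief"
counts = follower_counts(sample)

# How often is "t" followed by "h"? By "a"?
print(counts["t"]["h"], counts["t"]["a"])
# Which letters follow "q" in this sample, and how often?
print(dict(counts["q"]))
```

Dividing each count by the total for its first letter would turn these counts into the probabilities the typewriter could use.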

I wrote a program in C about 20 years ago to simulate the monkeys. It used a 27 by 27 by 27 array to store counts of 3-letter patterns (26 letters + 1 for “ ”). I wrote it again in VFP and found it surprisingly simple and short to write using a cursor rather than arrays. This also made it trivial to extend the pattern length well beyond 3 and to accommodate upper- and lowercase characters.
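To show why a keyed table scales better than a fixed array, here is a small Python sketch. It is only an analogy: `array_slot` mimics indexing a 27 by 27 by 27 array, and `count_patterns` uses a `Counter` where the VFP version uses a cursor. Both function names are hypothetical.

```python
from collections import Counter

ALPHABET = "abcdefghijklmnopqrstuvwxyz "  # 26 letters + 1 for the space

def array_slot(pattern):
    """Flat index of a 3-letter pattern in a 27 x 27 x 27 array."""
    i, j, k = (ALPHABET.index(c) for c in pattern)
    return (i * 27 + j) * 27 + k

def count_patterns(text, pat_len):
    """Keyed counting: one entry per pattern that actually occurs."""
    return Counter(text[i:i + pat_len] for i in range(len(text) - pat_len + 1))

print(array_slot("the"), "of", 27 ** 3, "slots")         # fixed-size table
print(27 ** 6, "slots needed for 6-letter patterns")      # grows as 27**n
print(len(count_patterns("the cat sat on the mat", 6)))   # entries actually stored
```

The array needs a slot for every possible pattern, while the keyed table only stores patterns that appear in the text, so longer patterns and a larger character set cost nothing up front.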

This program can be used on other kinds of text, such as program source code, or on other languages, such as German.

Try experimenting with different pattern lengths. It’s surprising how readable the output is.

WarAndPeace.Txt can be found here

Here’s a sample of output:

The Guards had already left Petersburg on the tenth of August
and her son who had remained in Moscow for his equipment was to join
them on the march to Radzivilov
  It was St Natalias day and the name day of two of the Rostovs the
mother and the youngest daughter both named Nataly Ever since the
morning carriages with six horses had been coming and going
continually bringing visitors to the Countess Rostovas big house
on the Povarskaya so well known to all Moscow The countess herself
and her handsome eldest daughter who was four
years older than her sister and behaved already like a grownup
person were Nicholas and Sonya the niece Sonya was a slender
little brunette with a tender look in her eyes which were veiled by
long lashes thick black plaits coiling twice round her head and a
tawny tint in her complexion and especially in the color of her
slender but graceful and muscular arms and neck By the grace of her
movements by the softness and flexibility of her small limbs and
by a certain coyness and reserve of manner she reminded one of a
pretty halfgrown kitten which promises to become a beautiful
little cat She evidently considered it proper to show an interest
in the general conversation by smiling but in spite of herself her
eyes under their thick long lashes watched her cousin who was going to
join the army with such passionate girlish adoration that her smile
could not for a single instant impose upon anyone and it was clear
that the kitten had settled down only to spring up with more energy
and again play with her cousin as soon as they too could like Natasha
and Boris escape from the drawing room He evidently tried to find
something to say but failed Boris on the contrary at once found
his footing and related quietly and humorously how he had know that
doll Mimi when she was still quite a young lady before her nose was
broken how she had aged during the five years he had known her and
how her head had cracked right across the skull Having said this he
glanced at Natasha She turned away from him and glanced at her

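CLEAR ALL
CLEAR

nPatLen=3    && pattern length: try other values
CREATE CURSOR letters (chrs c(nPatLen), cnt i)
INDEX ON chrs TAG chrs
fd=FOPEN("warandpeace.txt")
*fd=FOPEN("..\genmenu.prg")
?fd
IF fd<0
    ?"err opening file"
    RETURN
ENDIF
cPrior=""
nCnt=0
* Pass 1: count each nPatLen-character pattern in the first 100000 characters
DO WHILE !FEOF(fd) AND !CHRSAW() AND nCnt<100000
    ch=FREAD(fd,1)
    nCnt=nCnt+1
    nAsc=ASC(LOWER(ch))
    * Keep only letters, space, CR and LF
    IF (nAsc>96 AND nAsc<=122) OR nAsc=32 OR nAsc=13 OR nAsc=10
        IF LEN(cPrior)>=nPatLen
            * Slide the window one character and count the new pattern
            cPrior=SUBSTR(cPrior,2)+ch
            IF SEEK(cPrior)
                REPLACE cnt WITH cnt+1
            ELSE
                INSERT INTO letters VALUES (cPrior,1)
            ENDIF
        ELSE
            cPrior=cPrior+ch
        ENDIF
*       ?cPrior
    ENDIF
ENDDO

FCLOSE(fd)
LOCATE
BROWSE LAST NOWAIT
ACTIVATE WINDOW (PROGRAM())
RAND(1)    && seed the random number generator
RAND()
GO INT(RAND()*RECCOUNT()+1)    && start from a random pattern
cPrior=chrs

* Pass 2: emit characters until a key is pressed
DO WHILE !CHRSAW()
    cLast=SUBSTR(cPrior,2)    && the last nPatLen-1 characters emitted
*   cLast=RIGHT(cPrior,1)
    SEEK cLast    && partial-key match on the start of chrs
    SUM cnt WHILE chrs=cLast TO nCnt    && total count of patterns starting with cLast
    nr=INT(RAND()*nCnt+1)
    SEEK cLast
    * Walk the matching patterns until the running total reaches nr
    SCAN WHILE nr>0
        nr=nr-cnt
    ENDSCAN
    SKIP -1    && back up to the pattern that was chosen
    cNewLet=RIGHT(chrs,1)
    ??cNewLet
    cPrior=SUBSTR(cPrior,2)+cNewLet
ENDDO
INKEY()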
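
For readers without VFP, the same two passes can be sketched in Python. This is a loose rendering under stated assumptions, not a line-by-line port: a dictionary keyed by the previous `pat_len - 1` characters stands in for the indexed cursor, `random.choices` does the weighted pick that the SUM and SCAN loop performs, and the restart on a dead end is my own choice. The names `build_model` and `generate` are hypothetical.

```python
import random
from collections import Counter, defaultdict

# Same filter as the VFP code: letters (either case), space, CR and LF
ALLOWED = set("abcdefghijklmnopqrstuvwxyz \r\n")

def build_model(text, pat_len=3):
    """Map each (pat_len - 1)-character prefix to counts of the next character."""
    kept = [c for c in text if c.lower() in ALLOWED]
    model = defaultdict(Counter)
    for i in range(len(kept) - pat_len + 1):
        prefix = "".join(kept[i:i + pat_len - 1])
        model[prefix][kept[i + pat_len - 1]] += 1
    return model

def generate(model, length, rng):
    """Emit `length` characters after a randomly chosen starting prefix.

    Assumes the model is non-empty (the text was at least pat_len long).
    """
    prefix = rng.choice(sorted(model))
    out = list(prefix)
    for _ in range(length):
        followers = model.get(prefix)
        if not followers:
            # Dead end (assumption: restart from a random known prefix)
            prefix = rng.choice(sorted(model))
            followers = model[prefix]
        letters = list(followers)
        # Weighted pick: each follower is chosen in proportion to its count
        nxt = rng.choices(letters, weights=[followers[c] for c in letters])[0]
        out.append(nxt)
        prefix = (prefix + nxt)[1:]
    return "".join(out)

# Made-up sample text; a real run would read a file such as warandpeace.txt
sample = "the theme of the thesis is that there is nothing there"
print(generate(build_model(sample, 3), 60, random.Random(1)))
```

Raising `pat_len` makes the output follow the model text more closely, which is the experiment suggested above.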