Around 1974 I attended a lecture by a Yale professor about monkeys: if you have a room full of monkeys, each banging on a typewriter, eventually you’ll get all of Shakespeare’s works. Generating characters with a random number generator simulates this process with fewer bananas required. We talked about this at lunch the other day.
Of course, not all letters occur with the same frequency. If the typewriter noticed that the monkey typed a “Q”, it could force the next letter to be “U” with a particular probability, and similarly for other letter pairs. One can imagine using some existing text as a model from which to calculate the frequencies of letter pairs: how often is “t” followed by “h”? By “a”? One can also imagine that extending such 2-letter patterns to n-letter patterns would produce even better results.
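To make the letter-pair idea concrete, here is a minimal Python sketch (not the VFP program discussed in this post). The function name `follower_counts` and the sample sentence are made up for illustration; a real run would use a much longer model text.

```python
from collections import Counter, defaultdict

def follower_counts(text):
    """Map each character to a Counter of the characters that follow it."""
    counts = defaultdict(Counter)
    for first, second in zip(text, text[1:]):
        counts[first][second] += 1
    return counts

# Hypothetical stand-in for a model text.
sample = "that tall man has the hat"
counts = follower_counts(sample)

# How often is "t" followed by "h"? By "a"?
t_total = sum(counts["t"].values())
for nxt in ("h", "a"):
    print(f"t -> {nxt}: {counts['t'][nxt]} of {t_total}")
```

Dividing each count by the total for that first letter gives the probability the typewriter would use to pick the next letter.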
I wrote a program in C about 20 years ago to simulate the monkeys. It used a 27 by 27 by 27 array to store counts of 3-letter patterns (26 letters + 1 for the space, “ ”). I wrote it again in VFP, and found it surprisingly simple and short to write using a cursor rather than arrays. This also made it trivial to extend the pattern length to much larger than 3 and to accommodate upper and lower case characters.
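The 27 by 27 by 27 layout can be sketched roughly as follows. This is a Python illustration, not the original C program, which isn't shown here. The names `slot` and `count_triples` are hypothetical, and treating every non-letter as a space is an assumption made only for this sketch.

```python
ALPHABET = "abcdefghijklmnopqrstuvwxyz "  # 26 letters plus the space

def slot(ch):
    """Return 0..25 for a letter, 26 for a space.

    Assumption for this sketch: any other character also maps to 26.
    """
    ch = ch.lower()
    return ALPHABET.index(ch) if ch in ALPHABET else 26

def count_triples(text):
    """Count every 3-letter pattern in a 27 x 27 x 27 table."""
    counts = [[[0] * 27 for _ in range(27)] for _ in range(27)]
    for a, b, c in zip(text, text[1:], text[2:]):
        counts[slot(a)][slot(b)][slot(c)] += 1
    return counts

table = count_triples("the the")
print(table[slot("t")][slot("h")][slot("e")])  # occurrences of "the"
```

A fixed array like this holds 27 × 27 × 27 = 19,683 counters no matter how few patterns actually occur, and each extra letter of pattern length multiplies that by 27. A table keyed by the pattern itself stores only the patterns that were seen.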
This program can be used on other text, such as programs, or different languages, such as German.
Try experimenting with different pattern lengths.
It’s surprising how readable the output is.
WarAndPeace.Txt can be found here
Here’s a sample of output:
The Guards had already left Petersburg on the tenth of August
and her son who had remained in Moscow for his equipment was to join
them on the march to Radzivilov
It was St Natalias day and the name day of two of the Rostovs the
mother and the youngest daughter both named Nataly Ever since the
morning carriages with six horses had been coming and going
continually bringing visitors to the Countess Rostovas big house
on the Povarskaya so well known to all Moscow The countess herself
and her handsome eldest daughter who was four
years older than her sister and behaved already like a grownup
person were Nicholas and Sonya the niece Sonya was a slender
little brunette with a tender look in her eyes which were veiled by
long lashes thick black plaits coiling twice round her head and a
tawny tint in her complexion and especially in the color of her
slender but graceful and muscular arms and neck By the grace of her
movements by the softness and flexibility of her small limbs and
by a certain coyness and reserve of manner she reminded one of a
pretty halfgrown kitten which promises to become a beautiful
little cat She evidently considered it proper to show an interest
in the general conversation by smiling but in spite of herself her
eyes under their thick long lashes watched her cousin who was going to
join the army with such passionate girlish adoration that her smile
could not for a single instant impose upon anyone and it was clear
that the kitten had settled down only to spring up with more energy
and again play with her cousin as soon as they too could like Natasha
and Boris escape from the drawing room He evidently tried to find
something to say but failed Boris on the contrary at once found
his footing and related quietly and humorously how he had know that
doll Mimi when she was still quite a young lady before her nose was
broken how she had aged during the five years he had known her and
how her head had cracked right across the skull Having said this he
glanced at Natasha She turned away from him and glanced at her
CLEAR ALL
CLEAR
nPatLen=3
CREATE CURSOR letters (chrs c(nPatLen), cnt i)
INDEX ON chrs TAG chrs && the SEEK calls below need an active index order on chrs
fd=FOPEN("warandpeace.txt")
*fd=FOPEN("..\genmenu.prg")
?fd
IF fd <0
?"err opening file"
RETURN
ENDIF
cprior=""
nCnt=0
do while !FEOF(fd) and !CHRSAW() and nCnt<100000 && stop at end of file, on a keypress, or after 100000 characters
ch=FREAD(fd,1)
nCnt=nCnt+1
nAsc=ASC(LOWER(ch))
IF (nAsc>96 and nAsc<=122) or nAsc=32 or nAsc=13 or nAsc=10 && keep only letters, space, CR and LF
IF LEN(cprior) >= nPatLen
cprior=SUBSTR(cprior,2)+ch
IF SEEK(cprior)
REPLACE cnt with cnt+1
ELSE
INSERT into letters values (cprior,1)
ENDIF
ELSE
cprior=cprior+ch
ENDIF
* ?cprior
ENDIF
ENDDO
FCLOSE(fd)
LOCATE
BROWSE last nowa
ACTIVATE WINDOW (PROGRAM())
GO INT(
cPrior=chrs
do while !CHRSAW()
cLast=SUBSTR(cPrior,2)
* cLast=RIGHT(cPrior,1)
SEEK cLast
SUM cnt while chrs = cLast to nCnt && total count of all patterns that begin with cLast
nr=INT(
SEEK cLast
SCAN while nr >0
nr=nr-cnt
ENDSCAN
SKIP -1
cNewLet=RIGHT(chrs,1)
??cNewLet
cPrior=SUBSTR(cPrior,2)+cNewLet
ENDDO
INKEY()
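For readers without VFP, the two phases of the program (count every pattern, then repeatedly pick the next letter in proportion to those counts) can be sketched in Python. This is a loose illustration of the same idea, not a translation of the listing above. The names `build_counts`, `next_char` and `generate` are made up here, the whole text is assumed to fit in memory, and no characters are filtered out.

```python
import random
from collections import Counter

def build_counts(text, pat_len=3):
    """Count every pat_len-character pattern (like the cursor's chrs/cnt rows)."""
    return Counter(text[i:i + pat_len] for i in range(len(text) - pat_len + 1))

def next_char(counts, prior, rng):
    """Pick the letter that follows prior, weighted by the pattern counts."""
    tail = prior[1:]  # the last pat_len-1 characters
    candidates = sorted((p, n) for p, n in counts.items() if p.startswith(tail))
    if not candidates:
        return None  # dead end: this tail only occurred at the very end of the text
    total = sum(n for _, n in candidates)
    r = rng.randrange(total)  # 0 <= r < total
    for pattern, n in candidates:
        if r < n:
            return pattern[-1]
        r -= n

def generate(counts, length, seed=None):
    """Start from a random pattern and extend it one letter at a time."""
    rng = random.Random(seed)
    prior = rng.choice(sorted(counts))
    out = prior
    while len(out) < length:
        ch = next_char(counts, prior, rng)
        if ch is None:
            break
        out += ch
        prior = prior[1:] + ch
    return out

counts = build_counts("abcabcabcabc", pat_len=3)
print(generate(counts, 12, seed=0))
```

With a real model text, raising `pat_len` makes the output track the source more closely, which is one way to experiment with different pattern lengths.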