Computer reveals startling word patterns
Among the oft-derided Christian literalists, it is said that the Bible is the wholly inerrant Word of God, and that the Holy Spirit guided the mind and hand of its human authors. Orthodox Jews are even more extreme in their literalism: Among them, tradition holds not merely that every word of the Bible is inspired, but that every letter of the Torah (the Pentateuch, the first five books of the Hebrew Bible) was dictated directly by God to Moses in a precise and unerring sequence. So-called "higher criticism" and modern linguistic analyses have tended to undercut these claims, critiquing them with what is generally regarded as superior scientific method. Few of the methods used, however, meet the rigorous criteria of hard science and mathematical statistics.
In 1988 an obscure paper was published–in a prominent, rigorous, indeed premier, scientific journal–with results that may demolish the claims of the "higher" critics, and support, rather, the Orthodox Jewish contention as to the nature of the Torah. The paper, by Doron Witztum, Eliyahu Rips and Yoav Rosenberg of the Jerusalem College of Technology and the Hebrew University, is innocuously entitled "Equidistant Letter Sequences of the Book of Genesis" and was published in the eminent Journal of the Royal Statistical Society.1 It generated a brief flurry of public attention (and a wave of activity within Orthodox Jewish circles) but was ultimately lost from general view both because of its rather technical nature and because of the sheer outrageousness of its findings, which remain, however, unrefuted as far as I know.
The authors, mathematical statisticians, discovered words encoded into the Hebrew text that could not have been accidental–nor placed there by human hand.
After publication, the authors continued their work and found that some pairs of words were predictive–that is, they could not have been known to the supposedly human authors of the Hebrew text because they occurred long after the Bible was composed.
The authors submitted a subsequent paper to the refereed journal Statistical Science (such review journals generally represent the pinnacle of scientific publishing), where, not surprisingly, it met with considerable skepticism–but also with admirable scientific objectivity. The reviewers insisted on a somewhat larger-than-usual number of challenges and revisions, but in the end, they published it. In the words of Robert Kass, the journal editor:
Our referees were baffled: their prior beliefs made them think the Book of Genesis could not possibly contain meaningful references to modern day individuals, yet when the authors carried out additional analyses and checks the effect persisted. The paper is thus offered to Statistical Science readers as a challenging puzzle.
In August 1994, the paper was published in Statistical Science under the title "Equidistant Letter Sequences in the Book of Genesis."
I hasten to warn the reader that the results do not reveal any secret messages encoded in the Bible, but they do demonstrate certain sequences of letters forming words that cannot be the result of chance. The implications are for the reader to decide.
The authors of course worked from the Hebrew text, not a translation. They focused on the book of Genesis as transmitted in the Orthodox tradition (the textus receptus or Masoretic text, usually abbreviated MT). They regarded the text as simply a string of letters (without spaces), as the tradition claims–in spite of the many sophisticated, modern, apparently scientific arguments to the contrary. The tradition treats the letter-by-letter sequence as no less sacrosanct than the prima facie meaning and intent of the words suggested by the word sequences. The question they asked was this: Was there meaning cryptographically embedded in the text that went beyond the meaning of the words as written?
To explain what they were about, the researchers used this illustration: Consider a text that may either have meaning in a foreign language or be a meaningless sequence of letters. Not knowing the language, if it is one, makes it very difficult to decide between these two possibilities. The researchers go on to explain:
Suppose now that we are equipped with a very partial dictionary, which enables us to recognize a small portion of the words in the text: "hammer" here and "chair" there, and maybe even "umbrella" elsewhere. Can we now decide between the two possibilities?
Not yet. But suppose now that, aided with the partial dictionary, we can recognize in the text a pair of conceptually related words, like "hammer" and "anvil." We check to see if there is a tendency of their appearances in the text to be in "close proximity." If the text is meaningless, we do not expect to see such a tendency, since there is no reason for it to occur. Next, we widen our check; we may identify some other pairs of conceptually related words: like "chair" and "table," or "rain" and "umbrella." Thus we have a sample of such pairs, and we check the tendency of each pair to appear in close proximity in the text. If the text is meaningless, there is no reason to expect such a tendency. However, a strong tendency of such pairs to appear in close proximity indicates that the text might be meaningful.
This in effect is what the researchers have found embedded in the Hebrew text of the Torah–a whole series of meaningful word-pairs in close proximity, something that they demonstrate cannot have happened by chance. These words they found in close proximity are not simply the words of the text (as would be the case in the analogy above of an unknown potential language). They were rather words composed of letters selected at various equal skip distances, for example, every second or third or fourth letter. It was as though "behind" the surface meaning of the Hebrew there was a second, hidden level of embedded meaning.
The researchers were led to this phenomenon by an observation of a certain Rabbi Weissmandel in 1958. The rabbi noticed that by selecting sequences of equally spaced letters in the text, he could find certain words or phrases, such as, say, "hammer" and "anvil." He called these "equidistant letter sequences" or ELS for short. However, he had no way of determining if these occurrences were due merely to the enormous quantity of combinations of words and expressions that can be constructed by searching out such "arithmetic progressions" in the text.
The mathematicians and statisticians who formed the research team decided to study systematically the phenomenon that Rabbi Weissmandel had observed to see whether it should be explained purely on the basis of fortuitous combinations.
The researchers in effect set out the text of the Torah in what mathematicians call a two-dimensional array, which is simply all the letters in sequence (without spaces) with so many letters in each row, row after row. Laid out this way, the equally spaced letters of a word such as HAMMER might run vertically, horizontally or diagonally through the array.
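The two-dimensional layout can be illustrated with a minimal sketch. It uses a made-up English string rather than the Hebrew text, and the helper names (to_rows, read_column, grid_positions) are hypothetical, chosen only for this illustration. When the row width equals the skip distance, an equidistant sequence lines up as a column; a skip of 1 reads horizontally, and a skip of one more or one less than the row width runs diagonally until it wraps around the edge.

```python
def to_rows(text, width):
    """Drop non-letters, then cut the letter string into rows of `width` letters."""
    letters = "".join(ch for ch in text.upper() if ch.isalpha())
    return [letters[i:i + width] for i in range(0, len(letters), width)]


def read_column(rows, col):
    """Read one column from top to bottom, skipping rows too short to reach it."""
    return "".join(row[col] for row in rows if col < len(row))


def grid_positions(start, skip, length, width):
    """(row, column) of each letter of an equidistant sequence in the array."""
    return [divmod(start + i * skip, width) for i in range(length)]


# Toy text (an assumption, not from the source): HAMMER hidden at every 4th letter.
text = "HxxxAxxxMxxxMxxxExxxRxxx"
rows = to_rows(text, 4)          # row width equal to the skip distance
for row in rows:
    print(row)
print(read_column(rows, 0))      # the hidden word now reads straight down
print(grid_positions(0, 4, 6, 4))
```

Changing the row width changes only how the same letter string is displayed, not which letters are equally spaced, so the array is a viewing aid rather than part of the search itself.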
The researchers in fact tested the Hebrew words for "hammer" (patishe, PTYS) and "anvil" (sadan, SDN) in equidistant letter sequences. These are short words and on general probability grounds they may be expected to appear close to each other quite often in any text, as in fact they do in Genesis. But the researchers also found that even if they restricted the search for such appearances to only the minimum equal-length skip distance, such word-pairs still occurred much too frequently to be accounted for by chance. And other combinations also appear so often that it begins to look not so much like a random happening, but like something carefully embedded in the text. For example, the researchers also found the pair Zedekiah (a sixth century B.C.E. king of Judah), and Matanya, Zedekiah's original name (see 2 Kings 24:17); and the pair Hanukkah (the Jewish festival that commemorates the re-dedication of the Temple after it was recaptured from the Seleucids in the second century B.C.E.) and Hasmoneans (the family name of the leaders of the Jewish forces that managed to wrest the Temple from the Seleucid monarch Antiochus IV Epiphanes). Note that these names and events found encoded in the text of the Torah involved people who lived, and events that occurred, long after the Torah was composed, whether by a divine or human hand.
In their 1988 paper, the researchers selected, at random, 300 such Hebrew word-pairs with obviously related meanings, and looked for the words embedded in the text by treating the entire book of Genesis as a long cryptographic string. They would start at the beginning of the text and scan until they came to the first letter of the word, then look to see if the second letter could be found two letters away. If so, they then looked for the third letter two letters further on; if not, they stopped, searched for the next appearance of the first letter and repeated the process. They continued until they found an occurrence of the entire word spelled out at every second letter. If there was none, they performed the same procedure looking at every third letter instead.
In this fashion they searched first every other letter, then every third, and so on (including reverse order). When they found the first instance (that is, at the minimum skip distance) of the first word of the pair, they then searched until they found the first instance of the second word of the pair, also at minimum skip distance, and measured its proximity within the text-string to the first. They did this for all 300 word-pairs.
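The search procedure described above can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the researchers' actual program: the function names are hypothetical, the text is a made-up English string, and "proximity" is reduced here to the gap between the positions of the two first letters, whereas the article does not spell out the measure the authors used.

```python
def find_min_skip_els(text, word):
    """Return (skip, start) for the first occurrence of `word` at the smallest
    skip distance of 2 or more, or None if there is none.

    A negative skip means the word reads backward; `start` is always the
    position of the word's first letter.
    """
    n, k = len(text), len(word)
    if k < 2:
        return None
    for skip in range(2, (n - 1) // (k - 1) + 1):
        span = (k - 1) * skip              # distance from first to last letter
        for start in range(n - span):
            found = text[start:start + span + 1:skip]
            if found == word:
                return skip, start
            if found == word[::-1]:        # same letters, reverse order
                return -skip, start + span
    return None


def proximity(text, word_a, word_b):
    """Stand-in measure (an assumption): gap between the two starting positions."""
    a = find_min_skip_els(text, word_a)
    b = find_min_skip_els(text, word_b)
    if a is None or b is None:
        return None
    return abs(a[1] - b[1])


# Toy text (an assumption): HAMMER at every 2nd letter, ANVIL at every 3rd.
text = "HqAqMqMqEqRq" + "AzzNzzVzzIzzL"
print(find_min_skip_els(text, "HAMMER"))   # (2, 0)
print(find_min_skip_els(text, "ANVIL"))    # (3, 12)
print(proximity(text, "HAMMER", "ANVIL"))  # 12
```

Trying skip distances in increasing order means the first hit is automatically the minimum-skip occurrence, which is the only one the procedure keeps for each word.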
As a control, they performed the same search for the same pairs on numerous random scramblings of the Genesis text. The authors found that each of the 300 word-pairs was found in close proximity in the actual Genesis text, but not in randomized control texts.
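The logic of such a control can be sketched as a simple scrambling test. Everything here is an illustrative assumption: the function names, the toy English text, the forward-only search, the number of scramblings, and the crude distance (gap between starting positions, or the text length when a word is missing). The sketch only shows the comparison of a real text against shuffled copies of itself.

```python
import random


def min_skip_start(text, word):
    """Start of the first forward occurrence of `word` at the smallest skip >= 2."""
    n, k = len(text), len(word)
    if k < 2:
        return None
    for skip in range(2, (n - 1) // (k - 1) + 1):
        span = (k - 1) * skip
        for start in range(n - span):
            if text[start:start + span + 1:skip] == word:
                return start
    return None


def pair_distance(text, a, b):
    """Stand-in proximity: gap between starts; the text length if a word is absent."""
    sa, sb = min_skip_start(text, a), min_skip_start(text, b)
    if sa is None or sb is None:
        return len(text)
    return abs(sa - sb)


def scramble(text, rng):
    """Return the same letters in a random order."""
    letters = list(text)
    rng.shuffle(letters)
    return "".join(letters)


def control_fraction(text, pairs, trials=200, seed=0):
    """Fraction of scrambled texts whose total pair distance is no larger than
    that of the real text. A small fraction means scrambling destroys the effect."""
    rng = random.Random(seed)
    real = sum(pair_distance(text, a, b) for a, b in pairs)
    hits = 0
    for _ in range(trials):
        shuffled = scramble(text, rng)
        if sum(pair_distance(shuffled, a, b) for a, b in pairs) <= real:
            hits += 1
    return hits / trials


text = "HqAqMqMqEqRq" + "AzzNzzVzzIzzL"   # toy text, not from the source
print(control_fraction(text, [("HAMMER", "ANVIL")], trials=100, seed=1))
```

Scrambling keeps the letter frequencies of the original while destroying their order, so any proximity that survives in the scrambled copies is attributable to letter counts alone.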
The published results show that this finding was significant at a level of 1.8 × 10⁻¹⁷, that is, the odds of its occurring merely by chance are less than 1 in 50 quadrillion. (A quadrillion is one with 15 zeros after it.) A finding in most scientific journals is considered significant at chance levels of anything less than 1 in 20.
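The conversion from the reported significance level to "1 in so many" odds is a single division, shown here as a quick check. The variable names are chosen only for this illustration.

```python
p_value = 1.8e-17            # significance level quoted in the article
odds_against = 1 / p_value   # "1 in N" form of the same number
quadrillion = 10**15

print(f"about 1 in {odds_against / quadrillion:.1f} quadrillion")
# about 1 in 55.6 quadrillion

conventional = 1 / 20        # the usual journal threshold quoted in the article
print(p_value < conventional)  # True
```

The result, roughly 1 in 55.6 quadrillion, is consistent with the article's rounder statement of "less than 1 in 50 quadrillion."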
The capacity to embed so many, meaningfully related, randomly selected word-pairs in a body of text with a coherent surface meaning is stupendously beyond the intellectual capacity of any human being or group of people, however brilliant, and equally beyond the capacity of any conceivable computing device. Furthermore, that the word-pairs were randomly selected strongly suggests that all possible word-pairs are so embedded.
Following publication of this paper, a public statement was issued, signed by five mathematical scholars–two from Harvard, two from Hebrew University and one from Yale.2 "The present work," they said, "represents serious research carried out by serious investigators. Since the interpretation of the phenomenon in question is enigmatic and controversial, one may want to demand a level of statistical significance beyond what would be demanded for more routine conclusions... [T]he results obtained are sufficiently striking to deserve a wider audience and to encourage further study." The work was also critiqued and endorsed by Dr. Andrew Goldfinger, a senior research physicist at Johns Hopkins University in Baltimore, and by Harold Gans, an analyst with the U.S. Department of Defense.
According to Jewish tradition, the Torah contains all knowledge; therefore the codes embedded in the Torah also encompass information that transcends the limitation of time. The Vilna Gaon, the great 18th century Rabbi of Vilna, Lithuania, a child prodigy and one of the most brilliant men in Jewish history, wrote that "all that was, is, and will be unto the end of time is included in the Torah...and not merely in a general sense, but including the details of every species and of each person individually, and the most minute details of everything that happened to him from the day of his birth until his death."
Some may be reminded of the words of the Rabbi from Nazareth, seen in a different light: "I tell you the truth, until heaven and earth disappear, not the smallest letter, not the least stroke of a pen, will by any means disappear from the Torah [Law] until everything is accomplished" (Matthew 5:18).
For their second paper, published in 1994, the researchers went one step further, as if guided by the Vilna Gaon. Instead of looking for pairs of related words, they looked for pairs of words that were time-related from a period long after the Torah had been composed. They took the names of the 34 most prominent men (measured by the length of their biographies in a standard Hebrew reference book, the Encyclopedia of Great Men in Israel) from the ninth to the nineteenth centuries, and (using the standard Jewish way of abbreviating names) they paired each name with the date of the man's birth or death (Hebrew month and day). There was no way the author of the Torah could have known, at the time the Torah was composed, of the existence of these men, and certainly not the dates of their births or deaths–unless of course He was divine. Nevertheless, the researchers were able to demonstrate that the names and the dates of their birth or death were encoded into the text in close proximity; that is, using the minimum skip distance the names and the dates of birth or death were found embedded in the text of Genesis in significantly close proximity.
Understandably, skeptical scholars at the Statistical Science journal then asked the authors to repeat the test on another sample of the next 34 most prominent men. In this group, the dates of death for two of the men were not known, so the second test included only 32 men. The results were the same, however. In short, for all 66 men, their names and birth or death dates were found in close proximity.
The likelihood that this occurred by chance on the set of 32 names is less than 1 in 50,000; on both sets, less than 1 in 2,500,000,000. The first figure is reported in the article in Statistical Science. The second number is not included; as yet another measure of conservatism, the editors insisted that after having found a successful result on the first set, the authors repeat it on the second and report the results of the second set alone.
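The article does not say how the combined figure was obtained, but it matches what one gets by assuming that the two sets are independent and that each carries the same 1-in-50,000 bound. Under those assumptions (which are mine, not the source's), the arithmetic is a single multiplication:

```python
from fractions import Fraction

# Assumption: each set of names independently has a chance level of 1 in 50,000.
p_one_set = Fraction(1, 50_000)

# Independent events multiply.
p_both_sets = p_one_set * p_one_set

print(p_both_sets)               # 1/2500000000
print(p_both_sets.denominator)   # 2500000000
```

Exact fractions are used instead of floating-point numbers so that the printed denominator can be compared digit for digit with the figure in the text.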
It is also significant that the researchers tried to find the same phenomenon by using the Samaritan Pentateuch, which varies slightly from the traditional Jewish textus receptus. But the phenomenon was utterly lacking in letter-level variants of the Pentateuch, such as the Samaritan Pentateuch. Nor could it be found in other texts, sacred or otherwise. One of the reviewers had them try the same test on Tolstoy's War and Peace; so the researchers chose a section of the Hebrew translation that was the same length as Genesis, but the phenomenon did not appear in War and Peace. With respect to the other sacred texts, the phenomenon would not be expected because even the best manuscripts of the text vary; there is no letter-for-letter sacred text as there is for the Torah. Even the rest of the Hebrew Bible outside of the Torah lacks such a tradition; hence there are innumerable textual variants.
What are the implications of these findings?
The phenomenon cannot be attributed to anything within the known physical universe, human beings included. Moreover, the rigorous proof of the existence and validity of the phenomenon requires both high speed computation and only recently developed techniques of statistical analysis.
On the other hand, though statistically powerful, the phenomenon is a relatively weak one. "Proximity" is defined only statistically, and the phenomenon only makes itself apparent in the aggregation of many examples that on average show much greater proximity than would be expected.
It should also be noted that there is no way of extracting the encoded information without knowing it already. Because the information cannot be extracted in advance, the method cannot be used to foretell the future. (And of course the Torah itself forbids such practices.) The future long ago embedded in the Torah must become our past before it can be retrieved.
How has the paper been received? The authors note with disappointment, but not surprise, that responses so far have mostly fallen into two categories: a priori acceptance or a priori rejection. The former, by believers and enthusiasts (especially those without mathematical training), is indeed not surprising. But the latter is–or should be. Since to date no one has discovered a flaw in the authors' work, it is reasonable to ask of scientifically trained, a priori skeptics (who are certain these results must be a fluke), "What standard of proof would you accept as an indication that the phenomenon might be genuine?" The most frequent answer by far is "There is no standard. I will not believe it regardless."
One is reminded of the persistent (but after 80 years at last weakening) skepticism that greeted certain results in quantum mechanics research: for example, that what happens in every part of the universe instantaneously–or even backwards in time–influences, in measurable degree, what happens everywhere else. Should the "codes in the Torah" phenomenon remain undefeated, perhaps in the light of such astonishing findings in modern science it, too, will one day seem not so preposterous.
What then was the purpose of encoding this information into the text? Some would say it is the Author's signature. Is it His way of assuring us that at this particular, late moment–when our scientific, materialistic doubt has reached its apotheosis, when we have been driven to the brink of radical skepticism–that He is precisely who He said He is in that astonishing, radical core document of the Judeo-Christian tradition?