Chomsky and Generative Grammar: Science is built on a foundation of failed theories.

Noam Chomsky was originally a linguist. He became famous arguing for a theory of universal generative grammar. For Chomsky, language is a recursive structure, and grammatical features are universally distributed; language, for Chomsky, is a “self-extracting archive”, so to speak.

The problem with this theory, which is elegant and simple and thus scientifically attractive, is that it is empirically untrue. Beyond the existence of nouns and verbs there are many serious linguistic divergences which are inconsistent with Chomsky’s thesis and, in my opinion, invalidate it. No matter how attractive a theory is, however parsimonious, whatever explanatory power it offers, if it does not match up with the empirical facts then it must be modified or rejected.

Leave aside for a minute the idea that if language is a structure inherent to humanity, and unique to humanity, then knowledge is in some sense genetic. A twisted person could argue that Chomsky proffers a theory which is ultimately genetic, i.e. racist. Chomsky is hardly a racist, but if language is a biological function then it is one step from essentialism into tribalism, and that road ends with race supremacism. Also leave aside the “Jane Goodall” critique: we increasingly see examples of communication among various animal species, most famously Koko the gorilla, who signed. Leave aside also the idea that perhaps other animals communicate complex ideas using methods which evade our ability to perceive them, whether because they lie outside our frequency of hearing, use a radically different grammar, or have simply not been recognized by us as communicative. There are many bird songs we cannot replicate with our vocal cords, yet that does not mean we do not speak or sing.

For once I disagree with Aristotle; Aristotle believed that other animals do not think or communicate, but only feel. That too appears empirically disproven, at least among the most developed animals: some great apes do sign. However, even within an anthropocentric view which limits itself to human speech, the generative grammar thesis does not line up with the facts of real-world languages.

There are several examples of unique grammar structures found in some languages but not others. I address them briefly here to show why Chomsky’s theory, though elegant, is empirically incorrect.

Probably the easiest example for those who do not know several foreign languages is word order. While many languages use the subject-verb-object model for declarative statements and verb-subject-object for questions, word order otherwise differs between languages. Some languages, for example Mandarin, have fairly rigid word orders; a few have in theory no fixed word order (e.g., Latin); and most have a mix of rigid and flexible rules, English being one example. However, the mix of rules and flexibility varies from language to language, and anyone who speaks German knows the rather different rules of word order found in that language. German has a more rigid word order than English, and its ordering of words beyond SVO or VSO is fairly different from English, even though it is a declined language.
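The typological point above can be sketched in a few lines of code. The example languages below (Japanese as SOV, Welsh as VSO) are my own illustrations, not taken from the text, and the orders shown are broad tendencies rather than strict rules:

```python
# Toy illustration of basic word-order typology: the same three
# constituents arranged per the canonical order of each language.
WORD_ORDERS = {
    "English (SVO)": ["subject", "verb", "object"],
    "Japanese (SOV)": ["subject", "object", "verb"],
    "Welsh (VSO)": ["verb", "subject", "object"],
}

parts = {"subject": "the cat", "verb": "sees", "object": "the dog"}

def arrange(order):
    """Return the constituents in the given language's canonical order."""
    return " ".join(parts[slot] for slot in order)

for lang, order in WORD_ORDERS.items():
    print(f"{lang}: {arrange(order)}")
```

The point is not the English glosses themselves but that the slot ordering is a per-language parameter, not a universal constant.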

Beyond the fact of word order, which, as anyone who speaks both German and English can tell you, varies quite a bit even between closely related languages, there are other empirical facts which demonstrate that Chomsky’s thesis of language as universal and inherent is wrong. To my understanding, Chomsky is arguing that language is self-similar, i.e. recursive, and this is why he calls it generative grammar. Either reading of Chomsky, language as a universal recursive function or merely as universal, is empirically untrue.
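What “generative” and “recursive” mean here can be made concrete with a toy rewrite grammar. The rules below are invented for illustration and are not Chomsky’s own formalism in detail; the point is only that a handful of recursive rules can produce unboundedly many sentences:

```python
import random

# A minimal generative grammar sketch: nonterminals map to lists of
# possible expansions; "NP" recursively contains "VP", so sentences
# can nest without limit.
GRAMMAR = {
    "S":  [["NP", "VP"]],
    "NP": [["the", "N"], ["the", "N", "that", "VP"]],  # recursion lives here
    "VP": [["V", "NP"], ["V"]],
    "N":  [["dog"], ["linguist"]],
    "V":  [["sees"], ["sleeps"]],
}

def generate(symbol="S", depth=0, max_depth=6):
    """Expand a symbol by randomly choosing a rule; cap depth so it terminates."""
    if symbol not in GRAMMAR:
        return [symbol]  # terminal word
    rules = GRAMMAR[symbol]
    if depth >= max_depth:
        rules = [min(rules, key=len)]  # force the shortest rule to stop recursing
    out = []
    for part in random.choice(rules):
        out.extend(generate(part, depth + 1, max_depth))
    return out

print(" ".join(generate()))
```

Every run yields a different sentence such as “the dog that sleeps sees the linguist”; the finite rule set generates an infinite language, which is the intuition behind the “self-extracting archive” image above.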

Particles are the easiest example for an anglophone to see. Particles of this kind do not exist in English. They do in Russian, where the “li” particle indicates hypotheticals, i.e. the subjunctive: “be li” in Russian would be “if it be so” in English.

Russian is the only Indo-European language I know which has particles. Chinese has many particles. The particle “de” indicates possession. The particle “le” indicates the past tense. The particle “ge” indicates some specific instance, and seems close to the partitive/genitive distinction.

In English we use prepositions: “of”, “to”, “from”, etc. English prepositions have increasingly converged, so that old-school compounded prepositions such as “unto” and “hereunder” are no longer widely used and are even archaic, e.g. “Wherefore art thou Romeo?” While Indo-European languages use prepositions, Uralic languages such as Estonian use postpositions. In Estonian we would say, in effect, “The lamp is the table on.” This is another example of why there is not in fact a universal grammar beyond, perhaps, nouns and verbs.

As well as pre- and postpositions we can note the existence of declensions. English has only vestiges of declension, in its pronouns: “he”, “his”, “him”, “to him” are the nominative, genitive, accusative, and dative forms. Likewise who, whose, whom, to whom; she, her, her, to her. Declensions no longer exist in English articles and have largely vanished from Romance languages like French, even though Latin had more cases than Old English. Yet declensions still exist in modern German, which has four cases, and they are even more notable in Russian, with six cases. The richest languages for declensions seem to be Estonian and Finnish, with fourteen and fifteen cases respectively. Chinese in contrast has no cases.
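The contrast between English's vestigial case system and German's living one can be laid out side by side. The tabulation is my own (standard masculine-singular paradigms), not the author's:

```python
# English retains case only in pronouns; German marks it on articles too.
ENGLISH_PRONOUN = {  # third person masculine singular
    "nominative": "he",
    "genitive":   "his",
    "accusative": "him",
    "dative":     "him",   # English has merged accusative and dative ("to him")
}

GERMAN_ARTICLE = {  # definite article, masculine singular
    "nominative": "der",
    "genitive":   "des",
    "accusative": "den",
    "dative":     "dem",
}

for case in ENGLISH_PRONOUN:
    print(f"{case:11} {ENGLISH_PRONOUN[case]:4} {GERMAN_ARTICLE[case]}")
```

Even where two related languages share the same four-case inventory, English has collapsed two of the forms while German keeps all four distinct, which is the kind of divergence the paragraph above describes.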

Verb tense is also hardly universal. Chinese has no tense; time is indicated contextually, by time markers and by particles. English in contrast has remarkably complex tenses, and so does French. Estonian and Russian have only three tenses: past, present, and future. Russian, however, has perfective and imperfective verbs to indicate whether the action was done to completion or is ongoing, whereas the two verb forms in Estonian, which likewise has but three tenses, are used to indicate modality. Russian does not use the verb “to be” in the present tense; it is indicated contextually. Nor does Russian use auxiliary verbs to indicate tense.

These are the principal features of those languages which I have learned that differ sufficiently from each other that I am confident, as an empirical matter, that grammar is not universally uniform beyond nouns and verbs. At the same time, there are sufficient basic words in all the languages I have learned which appear too basic to be anything other than the consequence of a proto-world language. The Nostratic hypothesis can be well supported by comparing just English, German, French, Estonian, and Russian: one language from each of the European language families apart from the isolates.

While I do believe that there was once a proto-world language with but one grammar, this grammar evolved and split, and is not in fact an inherent biological function, universal not merely in its existence but also in its development. We can see the decline of declensions and the convergence of prepositions quite clearly in English. If language were generative and a biological function then language would not evolve, yet it does.

Chomsky’s view should be seen in the context of the times in which he wrote. Machine analysis of language was in its infancy. Foreign language acquisition was considerably more difficult. The generative grammar hypothesis may well have served a useful role in the development of machine processing of language for translation. Yet increasingly available historical and genetic evidence, through better archaeology and genome analysis, as well as the growing awareness that language is not a uniquely human construct, explain why for empirical reasons we must see Chomsky’s attempt to cast light from the “Tower of Babel” as but an effort. Chomsky, like Einstein, Keynes, and Kelsen, sought to develop a general theory. Like each of them, his general theory must be seen as having collapsed into a special theory. Generative grammar is at least a special theory applicable to the machine processing of language. Beyond that it is unfortunately empirically flawed, like the other great efforts at general theorizing in the 20th century.

Evolutionary language theory is the future. The behavioral theory of language has yet to be developed, and will provide key insights for data mining, machine intelligence, and AI user agents such as Cortana.