If vervet calls are indeed partly learned rather than entirely instinctive, one might expect vervet populations in different parts of Africa to have developed different ‘dialects’ for the same reason that different human populations have. That is, ‘word’ meanings and pronunciations would gradually change with time, but the changes would develop independently in different areas and would be transmitted by learning, leading first to different dialects and eventually to different languages. This prediction of dialect differences has yet to be tested for vervets, since all the detailed studies of their vocal communication to date have been made in one small area of Kenya. However, song dialects are well developed in some bird species whose young learn the locally correct song from adult birds that they hear around them as they grow up. In a North American songbird called the white-crowned sparrow, such dialect differences are so pronounced that experienced bird-watchers near San Francisco can pinpoint an individual sparrow’s home within ten miles.
*
So far, I have loosely applied human concepts such as ‘word’ and ‘language’ to vervet vocalizations. Let’s now compare human vocalizations and those of subhuman primates more closely. In particular, let’s ask ourselves three questions. Do vervet sounds really constitute ‘words’? How large are animals’ ‘vocabularies’? Do any animal vocalizations involve ‘grammar’ and merit the term ‘language’?
Firstly, on the question of words, it is clear at least that each vervet alarm call refers to a well-defined class of external dangers. That does not imply, of course, that a vervet’s ‘leopard call’ designates the same animals to a vervet as the word ‘leopard’ does to a professional zoologist – namely, members of a single animal species, defined as a collection of potentially interbreeding individuals. We already know that vervets give their leopard alarm in response not just to leopards but also to two other medium-sized cat species (caracals and servals). If the ‘leopard call’ is a word at all, it would not mean ‘leopard’ but instead ‘medium-sized cat that is likely to attack us, hunts in a similar way, and is best avoided by running up a tree’. However, many human words are used in a similar generic sense. For example, most of us other than ichthyologists and ardent fishermen apply the generic word ‘fish’ to any cold-blooded animal with fins and a backbone that swims in the water and might be worth eating.
Instead, the real question is whether the leopard call constitutes a word (‘medium-sized cat that … etc.’), a statement (‘there goes a medium-sized cat’), an exclamation (‘watch out for that medium-sized cat!’) or a proposition (‘let’s run up a tree or take other appropriate action to avoid that medium-sized cat’). At present it is not clear which of those functions the leopard call fills, or whether it fills a combination of them. Similarly, I was excited when my then one-year-old son Max said ‘juice’, which I proudly took to be one of his first words. To Max, though, the syllable ‘juice’ was not just his academically correct identification of an external referent with certain properties, but it also served as a proposition: ‘Give me some juice!’ Only at a later age did Max add more syllables, like ‘gimme juice’, to distinguish propositions from pure words. Vervets show no evidence of having reached that stage.
On the second question of extent of ‘vocabulary’, even the most advanced animals seem, on the basis of present knowledge, to be far behind us. The average human has a daily working vocabulary of around a thousand words; my compact desk dictionary claims to contain 142,000 words; but only ten calls have been distinguished even for vervets, the most intensively studied mammal. Animals and humans surely do differ in vocabulary size, yet the difference may not be as great as these numbers suggest. Remember how slow has been our progress in distinguishing vervet calls. Not until 1967 did anyone realize that these common animals had any calls with distinct meanings. The most experienced observers of vervets still cannot separate some of their calls without machine analysis, and even with machine analysis the distinctness of some of the suspected ten calls remains unproven. Obviously, vervets (and other animals) could have many other calls whose distinctness we have not yet recognized.
There is nothing surprising about our difficulties in distinguishing animal sounds, when one considers our difficulties in distinguishing human sounds. Children devote much of their time for the first several years of their lives to learning how to recognize and reproduce the distinctions in the utterances of adults around them. As adults, we continue to have difficulty distinguishing sounds in unfamiliar human languages. After four years of high-school French between the ages of twelve and sixteen, my problems with understanding spoken French are embarrassing compared to the abilities of any four-year-old French child. But French is easy compared to the Iyau language of New Guinea’s Lakes Plains, in which a single vowel may have eight different meanings depending on its pitch. A slight change in pitch converts the meaning of the Iyau word meaning ‘mother-in-law’ into ‘snake’. Naturally, it would be suicidal for an Iyau man to address his mother-in-law as ‘beloved snake’, and Iyau children learn infallibly to hear and reproduce pitch distinctions that for years confounded even a professional linguist devoted full-time to the study of the Iyau language. Given the problems we have ourselves with unfamiliar human languages, of course we must still be overlooking distinctions within the vervet vocabulary.
However, it is unlikely that any studies on vervets will reveal to us the limits attained by animal vocal communication, because those limits are probably reached by apes rather than by monkeys. While the sounds made by chimps and gorillas seem to our ears to be unsophisticated grunts and shrieks, so did the sounds made by vervet monkeys until they were studied carefully. Even unfamiliar human languages can sound like undifferentiated gibberish to us.
Unfortunately, vocal communication by wild chimps and other apes has never been studied by the methods applied to vervets, because of logistical problems. The width of a troop’s territory is typically less than 2,000 feet for vervets but is several miles for chimps, making it far harder to carry out playback experiments with video cameras and hidden loudspeakers. These logistical problems cannot be overcome by studying groups of apes caught in the wild and held captive in conveniently-sized zoo cages, because the captives generally constitute an artificial community of individuals caught at different African locations and thrown together in a cage. As I will discuss later in this chapter, humans originally speaking different languages, when captured at different African locations and thrown together as slaves, converse in only the crudest shadow of human language, virtually without any grammar. Similarly, captive apes taken from the wild must be virtually useless for studying the degree of sophistication of a vocal community of wild apes. The problem will remain unsolved until someone works out how to do for wild chimps what Cheney and Seyfarth have done for wild vervets.
Several groups of scientists have nevertheless spent years training captive gorillas, common chimps, and pygmy chimps to understand and use artificial languages based on plastic chips of different sizes and colours, or on hand signs similar to those used by deaf people, or on consoles, like a gigantic typewriter with each key bearing a different symbol. The animals have been reported to learn the meanings of up to several hundred symbols, and a pygmy chimp has recently been reported to understand (but not to utter) a good deal of spoken English. At the least, these studies of trained apes reveal that they possess the intellectual capabilities for mastering large vocabularies, raising the obvious question of whether they have evolved such vocabularies in the wild.
It is suspicious that wild gorilla troops may be seen sitting together for a long time, grunting back and forth in seemingly undifferentiated gibberish, until suddenly all the gorillas get up at the same time and head off in the same direction. One wonders whether there really was a transaction concealed within that gibberish. Because the anatomy of apes’ vocal tracts restricts their ability to produce the variety of vowels and consonants that we can, the vocabulary of wild apes is unlikely to be anywhere near as large as our own. Nevertheless, I would be surprised if wild chimp and gorilla vocabularies did not eclipse those reported for vervets and comprise dozens of ‘words’, possibly including names for individual animals. In this exciting field where new knowledge is being rapidly acquired, we should keep an open mind on the exact size of the vocabulary gap between apes and humans.
The last unanswered question concerns whether animal vocal communication involves anything that could be considered grammar or syntax. Humans do not only have vocabularies of thousands of words with different meanings. We also combine those words and vary their forms in ways prescribed by grammatical rules that determine the meaning of the word combinations. Grammar thereby allows us to construct a potentially infinite number of sentences from a finite number of words. To appreciate this point, consider the different meaning of the following two sentences, composed of the same words and endings but with different word order, which constitutes one set of the grammatical rules that specify sentence meaning in the English language:
‘Your hungry dog bit my old mother’s leg.’
or
‘My hungry mother bit your old dog’s leg.’
If human language did not involve grammatical rules, those two sentences would have exactly the same meaning. Most linguists would not dignify an animal’s system of vocal communication with the name of language, no matter how large its vocabulary, unless it also involved grammatical rules.
No hint of syntax has been discovered in the studies of vervets to date. Most of their grunts and alarm calls are single utterances. When a vervet gives a sequence of two or more utterances, all analysed cases have proved to consist of the same utterance repeated, as has also been the case when one vervet has been recorded responding to another vervet’s call. Capuchin monkeys and gibbons do have calls of several elements used only in certain combinations or sequences, but the meanings of these combinations remain to be deciphered (by us humans, that is).
I doubt that any student of primate vocalizations expects even wild chimps to have evolved a grammar remotely approaching the complexity of human grammar, complete with prepositions, verb tenses, and interrogative particles. However, it remains for the present an open question whether any animal has evolved syntax. The necessary studies on the wild animals most likely to use grammar – pygmy or common chimps – simply have not yet been attempted.
In short, while the gulf between animal and human vocal communication is surely large, scientists are rapidly gaining understanding of the causeway that evolved over that gulf from the animal side. Now let’s trace the bridge from the human side. We have already discovered complex animal ‘languages’; do any truly primitive human languages still exist?
*
To help us recognize what a primitive human language might sound like if there were any, let’s remind ourselves of the ways in which normal human language differs from vervet vocalizations. One difference is that of grammar. Humans, but not vervets, possess grammar, meaning the variations in word order, prefixes, suffixes, and changes in word roots (such as ‘they’, ‘them’, ‘their’) that modulate the sense of the roots. A second difference is that vervet vocalizations, if they constitute words at all, stand only for things that one can point to or act out. One could try to argue that vervet calls do include the equivalents of nouns (‘eagle’) and verbs or verb phrases (‘watch out for the eagle’). Our words clearly include both nouns and verbs that are distinct from each other, as well as adjectives. Those three parts of speech referring to specific objects, acts, or qualities are termed lexical items. But up to half of the words in typical human speech are purely grammatical items, with no referent that one can point to.
These grammatical words include our prepositions, conjunctions, articles, and auxiliary verbs (words like ‘can’, ‘may’, ‘do’, and ‘should’). It is much harder to understand how grammatical items could evolve than it is for lexical items. Given someone who understands no English, you can point to your nose to explain what that noun means. Apes might similarly come to agree on the meanings of grunts functioning as nouns, verbs, or adjectives. How, though, do you explain the meaning of ‘by’, ‘because’, ‘the’, and ‘did’ to someone who understands no English? How could apes have stumbled on such grammatical terms?
Yet another difference between human and vervet vocalizations is that ours possess a hierarchical structure, such that a modest number of items at each level creates a larger number of items at the next higher level. Our language uses many different syllables, all based on the same set of a few dozen sounds. We assemble those syllables into thousands of words. Those words are not merely strung haphazardly together but are organized into phrases, such as prepositional phrases. Those phrases in turn interlock to form a potentially infinite number of sentences. In contrast, vervet calls cannot be resolved into modular elements and lack even a single stage of hierarchical organization.
As children, we master all of this complex structure of human language without ever learning the explicit rules governing it. We are not forced to formulate the rules unless we study our own language in school or learn a foreign language from books. So complex is our language’s structure that many of the underlying rules currently postulated by professional linguists have been proposed only in recent decades. This gulf between human language and animal vocalizations explains why most linguists never discuss how human language might have evolved from animal precursors. They instead regard that question as unanswerable and therefore unworthy even of speculation.
*
The earliest written languages of 5,000 years ago were as complex as those of today. Human language must have achieved its modern complexity long before that. Can we at least recognize linguistic missing links by searching for primitive peoples with simple languages that might represent early stages of language evolution? After all, some tribes of hunter-gatherers retain stone tools as simple as those that characterized the whole world tens of thousands of years ago. Nineteenth-century travel books abound with tales of backward tribes who supposedly used only a few hundred words or who lacked articulated sounds, were reduced to saying ‘ugh’, and depended on gestures for their communications. That was Darwin’s first impression of the speech of the Indians in Tierra del Fuego. But all such tales proved to be pure myth. Darwin and other western travellers merely found it as hard to distinguish the unfamiliar sounds of non-western languages as non-westerners found English sounds, or as zoologists find the sounds of vervet monkeys.
Actually, it turns out that there is no correlation between linguistic and social complexity. Technologically primitive people do not speak primitive languages, as I discovered on my first day among the Foré people in the New Guinea highlands. Foré grammar proved deliciously complex, with postpositions similar to those of the Finnish language, dual as well as singular and plural forms similar to those of Slovenian, and verb tenses and phrase construction unlike any language I had encountered previously. I have already mentioned the eight vowel tones of New Guinea’s Iyau people, whose sound distinctions proved imperceptibly subtle to professional linguists for years. Nor could we reverse Darwin’s prejudice by claiming an inverse correlation between linguistic and social complexity, citing the advanced civilizations of China and England, whose languages are simple in the sense of having little or no word inflection (verb conjugations and noun declensions). French verbs are much more highly inflected than are modern English verbs (nous aimons, vous aimez, ils aiment, etc.), yet the French consider themselves the most highly civilized people.
Thus, while some peoples in the modern world retained primitive tools, none retained primitive languages. Furthermore, Cro-Magnon archaeological sites contain lots of preserved tools but no preserved words. The absence of such linguistic missing links deprives us of what might have been our best evidence about human language origins. We are forced to try more indirect approaches.
*
One indirect approach is to ask whether some people, deprived of the opportunity to hear any of our fully evolved, modern languages, ever spontaneously invented a primitive language. According to the Greek historian Herodotus, the Egyptian king Psammeticus intentionally carried out such an experiment in the hope of identifying the world’s oldest language. The king assigned two newborn infants to a solitary shepherd to rear in strict silence, with instructions to listen for their first words. The shepherd duly reported that both children, after mouthing nothing but meaningless babble until the age of two, ran up to him and began constantly repeating the word becos. Since that word meant ‘bread’ in the Phrygian language then spoken in central Turkey, Psammeticus supposedly conceded that the Phrygians were the most ancient people.
Unfortunately, Herodotus’s brief account of Psammeticus’s experiment fails to convince sceptics that it was carried out as rigorously as described. It illustrates why some scholars prefer to honour Herodotus as the Father of Lies, rather than as the Father of History. Certainly, solitary infants reared in social isolation, like the famous wolf boy of Aveyron, remain virtually speechless and do not invent or discover a language. However, a variant of the Psammeticus experiment has occurred dozens of times in the modern world. In this variant, whole populations of children heard adults around them speaking a grossly simplified and variable form of language, somewhat similar to that which normal children themselves speak at around the age of two years. The children proceeded unconsciously to evolve their own language, far advanced over vervet communication but simpler than normal human languages. The results were the new languages known as pidgins and creoles, which may provide us with models of two missing links in the evolution of normal human language.
My first experience of a creole was with the New Guinea lingua franca known either as Neo-Melanesian or pidgin English. (The latter name is a confusing misnomer, since Neo-Melanesian is not a pidgin but rather a creole derived from an advanced pidgin – I shall explain the difference later – and it is only one of many independently evolved languages equally misnamed as pidgin English.) Papua New Guinea boasts about 700 native languages within an area similar to that of Sweden, but no single one of those languages is spoken by more than three per cent of the population. Not surprisingly, a lingua franca was needed and it arose after the arrival of English-speaking traders and sailors in the early 1800s. Today, Neo-Melanesian serves in Papua New Guinea as the language not only of much conversation, but also of many schools, newspapers, radios, and parliamentary discussions. The advertisement in the appendix to this chapter gives a sense of this newly evolved language.