A Biological Dig for the Roots of Language

Once upon a time, there were very few human languages and perhaps only one, and if so, all of the 6,000 or so languages spoken round the world today must be descended from it.

If that family tree of human language could be reconstructed and its branching points dated, a wonderful new window would be opened onto the human past.

Yet in the view of many historical linguists, the chances of drawing up such a tree are virtually nil and those who suppose otherwise are chasing a tiresome delusion.

Languages change so fast, the linguists point out, that their genealogies can be traced back only a few thousand years at best before the signal dissolves completely into noise: witness how hard Chaucer is to read just 600 years later.

But the linguists' problem has recently attracted a new group of researchers who are more hopeful of success. They are biologists who have developed sophisticated mathematical tools for drawing up family trees of genes and species. Because the same problems crop up in both gene trees and language trees, the biologists are confident that their tools will work with languages, too.

The biologists' latest foray onto the linguists' turf is a reconstruction of the Indo-European family of languages by Dr. Russell D. Gray, an evolutionary biologist at the University of Auckland in New Zealand.

The family includes extinct languages like Hittite of ancient Turkey, and Tokharian, once spoken in Central Asia, as well as the Indian languages and Iranian in one major branch and all European languages except Basque in another.

Dr. Gray's results, published in November in Nature with his colleague Quentin Atkinson, have major implications, if correct, for archaeology as well as for linguistics. The shape of his tree is unsurprising — it arranges the Indo-European languages in much the same way as linguists do, using conventional methods of comparison. But the dates he puts on the tree are radically older.

Dr. Gray's calculations show that the ancestral tongue known as proto-Indo-European existed some 8,700 years ago (give or take 1,200 years), making it considerably older than linguists have assumed is likely.

The age of proto-Indo-European bears on a longstanding archaeological dispute. Some researchers, following the lead of Dr. Marija Gimbutas, who died in 1994, believe that the Indo-European languages were spread by warriors moving from their homeland in the Russian steppes, north of the Black and Caspian Seas, some time after 6,000 years ago.

A rival theory, proposed by Dr. Colin Renfrew of the University of Cambridge, holds that the Indo-Europeans were the first farmers who lived in ancient Turkey and that their language expanded not by conquest but with the spread of agriculture some 10,000 to 8,000 years ago.

Several linguists said Dr. Gray's tree was the right shape, but added that it told them nothing fresh, and that his dates were way off. "This method is not giving anything new," said Dr. Jay Jasanoff, a Harvard expert on Indo-European. As for the dates, Dr. Jasanoff said, "The numbers they have got seem extremely wrong to me."

Dr. Don Ringe, a linguist at the University of Pennsylvania who has taken a particular interest in computer modeling of language, said that Dr. Gray's approach was worth pursuing but that glottochronology, the traditional method of dating languages, had "failed to live up to its promise so often that convincing linguists there is anything there is an uphill battle."

In the biologists' camp, however, there is a feeling that the linguists do not yet fully understand how well the new techniques sidestep the pitfalls of the older method. The lack of novelty in Dr. Gray's tree of Indo-European languages is its best feature, biologists say, because it validates the method he used to construct it.

Most historical linguists know a few languages very well but less often consider the pattern of change affecting many languages, said Dr. Mark Pagel, an evolutionary biologist at the University of Reading.

"The field is being driven by people who are not confronted with the broad sweep of linguistic evolution and is being invaded by people like me who are only interested in the broad sweep," Dr. Pagel said.

Glottochronology was invented by the linguist Morris Swadesh in 1952. It is based on the compiling of a core list of 100 or 200 words that Swadesh believed were particularly resistant to change. Languages could then be compared on the basis of how many cognate words on a Swadesh list they shared.

Cognates are verbal cousins, like the Greek podos and the English foot, both descended from a common ancestor. The more cognates two languages share, the more recently they split apart. Swadesh and others then tried to quantify the method, deriving the date that two languages split from their percentage of shared cognates.
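The dating step described above can be sketched in a few lines. This is only an illustration of the classic constant-rate calculation, not anything taken from the article: the formula t = ln(c) / (2 ln(r)) and the retention rate of 0.86 per thousand years (a figure often quoted for the 100-word list) are assumptions here, and the function names are hypothetical.

```python
import math


def shared_cognate_fraction(cognate_flags):
    """Fraction of word-list slots in which two languages have cognate words.

    cognate_flags is a list of booleans, one per meaning on the list.
    """
    if not cognate_flags:
        raise ValueError("word list must not be empty")
    return sum(1 for flag in cognate_flags if flag) / len(cognate_flags)


def split_time_millennia(shared_fraction, retention_rate=0.86):
    """Estimate time since two languages split, in thousands of years.

    Assumes both languages lose core vocabulary at the same constant rate,
    which is exactly the assumption critics of glottochronology reject.
    retention_rate is the assumed fraction of the list a single language
    keeps per thousand years.
    """
    if not 0 < shared_fraction <= 1:
        raise ValueError("shared_fraction must be in (0, 1]")
    if not 0 < retention_rate < 1:
        raise ValueError("retention_rate must be in (0, 1)")
    return math.log(shared_fraction) / (2 * math.log(retention_rate))


if __name__ == "__main__":
    # Hypothetical comparison: 74 of 100 list words judged cognate.
    flags = [True] * 74 + [False] * 26
    c = shared_cognate_fraction(flags)
    print(f"shared fraction: {c:.2f}")
    print(f"estimated split: {split_time_millennia(c):.2f} thousand years ago")
```

The factor of 2 reflects that both languages have been changing independently since the split. Because the result depends entirely on the assumed rate, a small change in retention_rate shifts the date considerably.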

The method gave striking results, considering its simplicity, but not all of the findings were right. Glottochronology suffered from several problems. It assumed that languages changed at a constant rate, and it was vulnerable to unrecognized borrowings of words by one language from another, making them seem closer than they really were.

Because of these and other problems, many linguists have given up on glottochronology, showing more interest in an ingenious dating method known as linguistic paleontology.

The idea is to infer words for items in the material culture of an early language, and to correlate them with the appearance of such items in the archaeological record. Cognates for the word wheel exist in many branches of the Indo-European family tree, and linguists are confident that they can reconstruct the ancestral word in proto-Indo-European. It is, they say, "k'ek'los," the presumed forebear of words like "chakras," meaning wheel or circle in Sanskrit, "kuklos," meaning wheel or circle in Greek, as well as the English word "wheel."

The earliest wheels appear in the archaeological record around 5,500 years ago. So the proto-Indo-European language could not have started to split into its daughter tongues much before that date, some linguists argue. If the wheel was invented after the split, each language would have a different or borrowed word for it.

The dates on the earliest branches of Dr. Gray's tree are some 2,000 years earlier than the dates arrived at by linguistic paleontology.

"Since 'wheel' is shared by Tokharian, Greek, Sanskrit and Germanic," said Bill Darden, an expert on Indo-European linguistic history at the University of Chicago, "and there is no evidence for wheels before the fourth millennium B.C., then having Tokharian split off 7,900 years ago and Balto-Slavic at 6,500 years ago are way out of line."

Dr. Gray, however, defends his dates, and points out a flaw in the wheel argument. What the daughter languages of proto-Indo-European inherited, he says, was not necessarily the word for wheel but the word "k'el," meaning "to rotate," from which each language may independently have derived its word for wheel. If so, the speakers of proto-Indo-European could have lived long before the invention of the wheel.

His tree, Dr. Gray said, was derived with the methods used by biologists to avoid problems identical to those in glottochronology. Genes, like languages, do not mutate at a constant rate. And organisms, particularly bacteria, often borrow genes rather than inheriting them from a common ancestor. Biologists have also learned that trees of any great complexity cannot be drawn up by subjective methods. Mathematical methods are required, like having a computer generate all possible trees — a number that quickly runs way beyond the trillions — and then deciding statistically which class of trees is more probable than the rest.
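To give a feel for how quickly the number of possible trees grows, here is a small sketch that counts rooted, strictly branching trees for a given number of languages. It relies on the standard combinatorial result that n labeled tips admit (2n - 3)!! such trees; that formula and the function name are assumptions brought in for illustration, and the sketch says nothing about how Dr. Gray's software actually searches among trees.

```python
def rooted_tree_count(n_languages):
    """Number of distinct rooted binary trees with n labeled tips.

    Uses the double factorial (2n - 3)!! = 3 * 5 * 7 * ... * (2n - 3).
    """
    if n_languages < 2:
        raise ValueError("need at least two languages to form a tree")
    count = 1
    # Multiply the odd numbers from 3 up to and including 2n - 3.
    for odd in range(3, 2 * n_languages - 2, 2):
        count *= odd
    return count


if __name__ == "__main__":
    for n in (3, 5, 10, 20, 50):
        print(f"{n:>3} languages: {rooted_tree_count(n):,} possible trees")
```

With only 10 languages the count is already in the tens of millions, and by 20 it is far past the trillions, which is why the choice among trees has to be made statistically rather than by eye.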

Dr. Gray based his tree on the Dyen list, a set of Indo-European words judged by linguists to be cognates, and he anchored the tree to 14 known historical dates for splits between Indo-European languages.

Many of the Dyen list cognates are marked uncertain, so Dr. Gray was able to test whether omission of the doubtful cognates made any difference (it did not). He also tested many other possible assumptions, but none of them produced an age for proto-Indo-European anywhere near the date of 6,000 years ago favored by linguists.

"This is why our results should be taken seriously by both linguists and anyone else interested in the origin of the Indo-European languages," he wrote, in a recent reply to his critics.

"We haven't repeated the errors of glottochronology," Dr. Gray said in an interview. "What we are doing is adding value, since we can make inferences about time depths which can't be made reliably in other ways."

Dr. Gray said he had formed collaborations with linguists and hoped they would give his tree a warmer reception once his critics understood that he had not made the errors they cited.

"I think these methods are extremely promising," said Dr. April McMahon of the University of Sheffield and the president of the Linguistics Association of Great Britain, though she expressed concern about Dr. Gray's emphasis on dating language splits.

If the biologists' methods can date languages that existed 9,000 years ago, how much further back can they probe?

"Words exist that can in principle resolve 20,000-year-old linguistic relationships," Dr. Pagel of Reading wrote in a recent symposium volume, "Time Depth in Historical Linguistics," adding that "words that can resolve even deeper linguistic relationships are not out of the question."

Many linguists believe that once two languages have drifted so far apart that they share only 5 percent or so of their vocabulary, chance resemblances will overwhelm the true ones, setting a firm limit on how far back their ancestry can be traced.

"That's a mistaken reasoning which shows the linguists are relying on a model of evolution they trash when they see it written down," Dr. Pagel said.

He added that their argument assumed a constant rate of language change, the very point they know is wrong in glottochronology.

Geneticists believe modern humans may have left Africa as recently as 50,000 years ago, perhaps in a single migration with very small numbers. Reconstructing language of 20,000 years ago would be a big stride toward whatever tongue those first emigrants spoke. But Dr. Gray has no plans in that direction.

"It's hard enough to work out what happened 10,000 years ago, let alone 30,000 years ago," he said.

An article in Science Times on Tuesday about efforts to construct a genealogy of the world's languages referred incorrectly to the Indo-European family. While it includes most of the European languages, Basque is not the only exception.

The reporter made a very serious error. There are several non-Indo-European languages other than Basque spoken in Europe.

A partial list would include Albanian, 9 different Saami languages, Turkish, Hungarian, Estonian and Finnish. No doubt Arabic is spoken regularly among the recently immigrated Arabs and North Africans. London and Manchester, England, probably have at least one newspaper published in a Dravidian language.

Within the expected lifetime of a baby born today, though, it is entirely possible that all languages on Earth will collapse into a single dominant language.

Cognates for the word wheel exist in many branches of the Indo-European family tree, and linguists are confident that they can reconstruct the ancestral word in proto-Indo-European. It is, they say, "k'ek'los," the presumed forebear of words like "chakras," meaning wheel or circle in Sanskrit, "kuklos," meaning wheel or circle in Greek, as well as the English word "wheel."

Aramaic for "wheel" is galgal, and Hebrew is galgal/gilgal. I'd have to look into it, but these could also be cognates. The hard "g" sound and the "k" sound are very close linguistically. (Both consonants are what are called "gutturals," and thus are very interchangeable.)

Hebrew and Aramaic are both Semitic languages, and so to find cognates among the languages listed is all the more striking.

I will accept that we all probably trace to a single pair of humanoids, but I don't buy the idea that all languages trace to a common ancestor.

I suspect that as the illiterate population began to separate in search of game or an acceptable habitat, they then began to acquire a method of communication which evolved into a language. However, many languages die, and I suspect new languages begin which don't necessarily have a strong connection to any pre-existing language. Net, I suggest that there are at least several roots; otherwise, please explain the glaring differences in written languages of the West, the Mid East and Asia. Also, explain some of the languages of Africa which are totally dissimilar to any of the above.

There's a thought among anthropologists and archaeologists that the Saami (Laplanders) have lived on the Northwest European coastline (Northern Norway, Sweden, Finland and Russia) for as long as 35,000 years, having arrived BEFORE the last glacial advance.

They have several genetic adaptations to life in the far North (see Scandinavian porphyria, dwarfism, resistance to cholera, black plague, etc.) that would probably take longer than a mere 5 or 6 thousand years to develop and spread.

Authorities cite anywhere from 7 to 9 different full-blown Saami languages, all vaguely related to Finnish and other Uralic/Altaic languages. No doubt Turkish/Mongol words have infiltrated the Saami languages, as have modern English words, but the grammar is different.

If anyone wanted to make that leap into determining what language was used 20,000 years ago, he would be well advised to study Saami since it may be based on linguistic traditions 35,000 years old.

Finnish is definitely not anything like Swedish (and therefore not like Norwegian or Danish, either). Ethnically and linguistically the Finns are not "Scandinavian" like those other three.

The article mentions Finnish being in the same category as Hungarian, which I knew. I also have heard that these two are related to, of all things, Korean.

"Once upon a time, there were very few human languages and perhaps only one,"

Seems that I have read that all the people of Earth spoke one language. Where was it I read that, oh yeah, The Bible.

Galgal is obviously cognate with keklos. Seems to me, anyway. But it could be a borrowed word. In other words, something that Semitic languages borrowed from Indo-European or vice-versa.

If one accepts the Nostratic superfamily hypothesis (linking Proto-Indo-European to other families such as Semitic), then the galgal connection to keklos makes perfect sense. Of course, once a group had the wheel, I guess contact with neighboring language families would occur much more easily, so even if it's more than coincidence, it would be hard to be certain whether it's a borrowing or cognate. Nevertheless, that both words have similar meaning and both have velar stops, liquids, and apparent reduplication makes me agree with the cognate hypothesis you both made.

This result from a Google search reinforces the hypothesis; the following collection of roots includes "krikos" and "galgal."
http://www.angelfire.com/rant/tgpedersen/kr.html

Spinning wheels (operated by foot, or by turning a wheel) are very recent developments--roughly around medieval/Renaissance. Until that time, all thread was spun on a spindle, an item of engineering so simple that you can manage it with a stick stuck into a potato. Even a small item of clothing was very labor-intensive--where we get our term, "heirloom."

(Interested in textiles)

Been reading a little of Basque lately because of the ETA--such a lot of Z's and K's! Has a very Greco affect.

And I'm always interested in Finnish, since that's what Tolkien used as a model for Elvish.

"Spinning wheels (operated by foot, or by turning a wheel) are very recent developments--roughly around medieval/Renaissance."

I think they go back much further than that. Elizabeth Barber (a textile expert), in her book, The Mummies Of Urumchi, discusses the clothing of the mummies found in the Tarim Basin. Some of these Caucasian mummies date to 2,000 BC and have clothing that is comparable with the Scottish twills of today (patterns and weaving techniques).

The materials, styles and manufacturing techniques are exactly like those of the Celts at Hallstatt, Austria...which is a thousand years apart in time and 4,000 miles in distance. These early people to that region spoke the extinct Indo-European language, Tocharian.

The oldest paper ever found comes from this region and the language written on the paper is Tocharian. For further reading on this subject, go here:

The Curse Of The Red-Headed Mummy

Ethnically most of the Finns and all of the Estonians are from the same stock as all other Scandinavians. The Saami are "different", but they're not Asiatics either.

Finland, Estonia and Hungary were all conquered by Mongolian and Turkish people in the early Middle Ages under conditions which brought about a linguistic change.

The Saami are a remnant (80,000+ people) of what may have been Europe's first population of modern humans. After the Black Plague which killed 90% of the Norse people in Norway, the Coastal Saami were enticed to take up Norse farms (presumably so taxes could be paid). 10% or less of the Saami suffered from the Plague although they live in an area where the dominant lifeform is the rat (and other rodents, e.g. lemmings). Their language is quite ancient ~ maybe more than we realize.

The Uralic-Altaic languages are related. Whether the people speaking them are closely related is a good question. Even Japanese has a Uralic-Altaic component, as well as a Polynesian component, and something else which is probably not related to any other current language group. Maybe the Jomon language is the "third part".
