Comparing genes and linguistic sounds across human populations worldwide

Joint analysis of genomic, linguistic and geographic data leads to new insights into how they interact

- news | February 18, 2015

On January 20, 2015, PNAS published the paper “A comparison of worldwide phonemic and genetic variation in human populations,” which compares how genes and the sounds of languages vary among neighboring and distant human populations. We asked the authors of this research, Dr. Nicole Creanza, Prof. Marcus Feldman from Stanford University and Prof. Sohini Ramachandran, to comment on this work.

The Study

Human evolutionary history, including the out-of-Africa expansion and the subsequent peopling of the world, has left a strong signature on human genetic data. Do languages, which can change much more quickly than genes and are not necessarily inherited from one’s parents, exhibit similar traces of human demographic history?

To address this question, we synthesized large databases of globally distributed linguistic and genetic data for the first time, using statistical methods to examine the imprints of human population history on both DNA variation from 246 worldwide populations and inventories of phonemes — the minimal sound units that can distinguish meaning between words — from 2,082 languages.

First, we found that geographic distance was linked to both genetic and phonemic distance: on average, the closer two languages or two genetic samples were to one another geographically, the more similar they were, even when the languages compared were not in the same language family. This suggests that nearby languages may have borrowed sounds from one another even if they are not closely related. However, genes and languages were not geographically structured on the same scale: whereas the relationship between genes and geography was significant on a worldwide scale, the spatial structuring in languages was only detectable within a geographic distance of ∼10,000 km.
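To make the idea of comparing geographic and phonemic distance concrete, here is a minimal Python sketch. It is an illustration only, not the procedure used in the paper: the four languages, their coordinates and their phoneme inventories are invented, the Jaccard distance is one assumed way to compare inventories, and the label-shuffling permutation test is a generic stand-in for whatever statistics the study applied.

```python
import math
import random
from itertools import combinations


def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (*a, *b))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(min(1.0, math.sqrt(h)))


def jaccard_distance(p, q):
    """1 - (shared phonemes / all phonemes) for two inventories given as sets."""
    return 1 - len(p & q) / len(p | q)


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length lists."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def distance_correlation(coords, inventories, permutations=999, seed=0):
    """Correlate geographic and phonemic distance over all language pairs.

    Returns the observed correlation and a one-sided permutation p-value,
    obtained by shuffling which inventory sits at which location.
    """
    names = sorted(coords)
    pairs = list(combinations(names, 2))
    geo = [haversine_km(coords[a], coords[b]) for a, b in pairs]

    def phonemic(labels):
        return [jaccard_distance(inventories[labels[a]], inventories[labels[b]])
                for a, b in pairs]

    observed = pearson(geo, phonemic({n: n for n in names}))
    rng = random.Random(seed)
    hits = 0
    for _ in range(permutations):
        shuffled = names[:]
        rng.shuffle(shuffled)
        if pearson(geo, phonemic(dict(zip(names, shuffled)))) >= observed:
            hits += 1
    return observed, (hits + 1) / (permutations + 1)


# Hypothetical toy data: (latitude, longitude) and invented inventories.
coords = {"A": (0.0, 0.0), "B": (0.0, 5.0), "C": (0.0, 40.0), "D": (0.0, 45.0)}
inventories = {
    "A": {"p", "t", "k", "a", "i"},
    "B": {"p", "t", "k", "a", "u"},
    "C": {"b", "d", "g", "a", "i"},
    "D": {"b", "d", "g", "a", "u"},
}

r, p_value = distance_correlation(coords, inventories)
print(f"correlation = {r:.3f}, permutation p-value = {p_value:.3f}")
```

A positive correlation here means that languages located closer together tend to share more of their sounds. With only four invented languages the p-value carries little weight; the sketch is meant to show the shape of the calculation, not to reproduce any result.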

We then calculated the geographic direction with the strongest signal of population differentiation, which might be related to the direction of regional human migrations. With this analysis, genetic data and phonemic data predict similar axes of human geographic differentiation, further suggesting that there is a relationship between human dispersal and linguistic variation on a local scale.
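One simple way to picture a "direction of strongest differentiation" is to rotate an axis across a map and ask along which orientation geographic separation best predicts how different two populations are. The Python sketch below does this under loud assumptions: locations are treated as flat (x, y) coordinates in km, the populations and their trait values are invented, and the rotating-axis correlation is an illustrative technique that is not claimed to match the analysis in the paper.

```python
import math
from itertools import combinations


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length lists."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def best_axis(points, trait_distance, step_deg=1):
    """Find the axis orientation (0-179 degrees, measured from the x axis)
    along which projected separation correlates best with trait distance.

    points: dict mapping name -> (x_km, y_km), an assumed planar layout.
    trait_distance: function (name_a, name_b) -> non-negative float.
    """
    pairs = list(combinations(sorted(points), 2))
    trait = [trait_distance(a, b) for a, b in pairs]
    best_angle, best_r = None, -math.inf
    for angle in range(0, 180, step_deg):
        ux = math.cos(math.radians(angle))
        uy = math.sin(math.radians(angle))
        # Separation of each pair measured only along the candidate axis.
        projected = [abs((points[a][0] - points[b][0]) * ux
                         + (points[a][1] - points[b][1]) * uy)
                     for a, b in pairs]
        r = pearson(projected, trait)
        if r > best_r:
            best_angle, best_r = angle, r
    return best_angle, best_r


# Hypothetical 3x3 grid of populations, 100 km apart.
points = {f"P{i}{j}": (100.0 * i, 100.0 * j)
          for i in range(3) for j in range(3)}


def trait_distance(a, b):
    # Invented trait that changes only from west to east (along x).
    return abs(points[a][0] - points[b][0]) / 100.0


angle, r = best_axis(points, trait_distance)
print(f"axis of strongest differentiation: {angle} degrees (r = {r:.3f})")
```

Because the invented trait varies only along the x direction, the search picks out the east-west axis. Running the same search separately on genetic and on phonemic distances and comparing the two angles would mirror, in spirit, the comparison described above.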

In addition, we found that the pattern of changes of phoneme inventory sizes, unlike genetic changes, did not reflect the human expansion out of Africa that led to the peopling of the world by modern humans. Further, in contrast to the well-established detrimental effect of geographic isolation on genetic diversity, geographically isolated languages showed greater variance in their phonemes than languages with many neighbors.


The study of genetic variation has led to many insights about human evolutionary history. In contrast to genes, languages can change much more quickly and are not necessarily inherited from one’s parents. However, beginning in 1988, Luca Cavalli-Sforza and colleagues noted an intriguing similarity between the genetic groupings on an early phylogeny and the groupings of the same populations by language family. These language groupings were based on words (cognates), and there has been controversy in the literature over whether sounds (phonemes) would mirror genes in their patterns of variation. The statistical analyses in this study show that phonemic and genetic variation reveal similar geographic patterns on a regional scale, but phonemes do not exhibit the deep evolutionary signals of genes; the way sounds are transmitted has produced a more complex legacy of relationships.

Future Direction

Studies of human evolutionary history benefit from a multipronged approach, drawing on many disciplines that study the human past. This study’s integration of genetic data with linguistic data, and methodologies for studying the geographic distribution of variation in both data sets, highlight an integrative approach that can be used by more researchers. Future studies, especially those incorporating more genetic samples, could shed light on the extent to which genetic and geographic relationships, in addition to linguistic relationships, can each describe phoneme evolution.


Nicole Creanza, Ph.D., Postdoctoral Fellow, Stanford University, Department of Biology
Marcus Feldman, Ph.D., Professor, Stanford University, Department of Biology
Sohini Ramachandran, Ph.D., Assistant Professor, Brown University, Department of Ecology and Evolutionary Biology and Center for Computational Molecular Biology