What exactly is Paleogenetics? What can Paleogenetics tell us about ancient environments? Professor Steven Benner explains how genetics can give us a window to the past.
Paleogenetics is a field of experimental science, where we use recombinant DNA technology, biotechnology, to bring back to life genes and proteins from ancient organisms that have long ago gone extinct.
We can do that because today many people are sequencing the genomes of many organisms. The human genome has been sequenced, and chimpanzees, gorillas, and many other primates have had their complete genomes sequenced. This is also the case with many other mammals and animals. Because of that we can infer the sequences of genes and proteins from your ancestors. For example, we have a relatively good idea of the genetic makeup of the last common ancestor of you and a chimpanzee. That’s also true for the last common ancestor of you and a dog, cat, bird or even fruit fly.
So paleogenetics is the process of trying to learn about the history, your history, the history of life on Earth, by bringing back to life parts of the genomes of ancient organisms. Many examples of this exist. One of our favourites was concerned with how and when primates started to drink alcohol. Alcohol on Earth is only about 80 million years old. We know that because we have used paleogenetics to bring back to life the enzymes, the genes and the proteins of the yeast that ferment grapes to make alcohol, which is what you drink as wine, or ferment other things to make beer. We know that yeast only started to make alcohol about 80 million years ago. This is partly because fruits, like grapes, only emerged about 100 million years ago. So it took some time in evolution before there was the opportunity to make alcohol at all.
But then we have the ability to infer the sequences of genes and proteins that make enzymes that oxidise alcohol in your digestive tract. So when you drink alcohol, the enzymes in your throat, esophagus, and stomach start to metabolise that ethanol. Those enzymes are found in homologous forms in chimps, in fact in all animals. But in rats, for example, the enzymes that are related to the enzymes that you have in your digestive tract do not themselves oxidise ethanol. So what we can do is infer, by going back in time, the sequences of genes and proteins, and tell when your ancestors started to drink alcohol, because this is when the genes and proteins we resurrected started to be able to oxidise alcohol. So for example the enzymes in the digestive tract of the last common ancestor of you and a chimpanzee were able to oxidise alcohol. Also, the ancestor of you and a gorilla was able to oxidise ethanol, but the ancestor of you and an orangutan or a baboon was not able to oxidise alcohol. So we can say that about 8 million years ago you had a mutation in the enzymes in your digestive tract, and they started to be able to oxidise alcohol. We know that because we do paleogenetics. We infer the gene and protein sequences of these ancient enzymes, bring them back to life using the magic of recombinant DNA, and study them in the laboratory. So we know when those enzymes in your digestive tract started to be able to oxidise alcohol. That’s about 8 million years ago.
So the question is how do you know what the gene sequence is of an organism that lived and died 8 to 10 million years ago. That’s sort of the same way as we infer the sequences of ancient languages. So, the word in English for snow is snow, the word for snow in German is Schnee, and in Russian, it’s sneg. These are all words in Indo-European languages, and by comparing them we can guess what the structure of the word for snow was in the Indo-European language that was spoken 10,000 years ago, even though it has no speakers today. It’s the same thing with proteins. We look at the sequences of the amino acids of the proteins in the human digestive tract, and look at the corresponding sequences in a chimpanzee, an orangutan, a baboon. Just like we can infer the sequences of letters in ancient words by looking at the sequences of letters in the derived words, we can figure out what the sequences of amino acids were in ancient proteins, and that’s how we bring these back to life.
The second thing of course is you can ask, well, why did you learn how to drink alcohol 8 million years ago? That’s roughly the same time as you were coming down out of the trees and starting to walk across the ground. Of course when you’re in the trees you eat fruit by picking fresh fruit and eating it. That fruit has not fallen to the ground, has not had its husk damaged, and has not been infected by yeast, so it hasn’t made any alcohol. The minute you come out of the trees you start picking up fruit off the ground, that is, fruit that has fallen. When it falls the outside of the fruit becomes damaged, then yeast can start to infect it and start to make alcohol. So the idea that you started to be able to drink alcohol at the same time as you came down from the trees and started walking around, which is the first time you started picking up fruits that contained alcohol, makes a consistent picture of your ancient history. Of course this has a major impact on our view of alcoholism. You’re obviously consuming fruits that are fermenting; that’s maybe wine, 10-15% alcohol. It’s not yet vodka, which is distilled. To make vodka, you must have civilisation. You need to be able to heat things, cook things, and distil things.
There are lots of examples of this now. We have resurrected ancient genes from bacteria that lived 3 billion years ago. This is about three quarters of the way back to the origin of life on Earth. These are enzymes that are optimally active at high temperatures, about 65 degrees Celsius. So we know from that that bacteria 3 billion years ago were living in a hot spring; they were living at very high temperatures. That’s the start of the divergence of all modern bacteria on Earth. That comes from paleogenetics, where we resurrect ancient genes and proteins to learn what those ancient organisms were doing.
It’s like resurrecting ancient words from ancient languages also. If we know there is an Indo-European word for snow, then we know Indo-Europeans lived in a place where they had snow. It was not in Africa, it was not even in Italy. It was some place north, in northern Europe. The minute you know something about the language, you know something about the environment of the people who spoke it. It’s the same thing with paleogenetics. When you resurrect ancient genes and proteins from ancient organisms, you not only learn about the genes, but also the environment of the organisms, and how they were adapting to a changing environment and climate.
A third example, in addition to the ancient alcohol enzymes and ancient bacterial proteins, is that we looked at the genes and proteins in cows as they learned to eat grass. Grass is only about 40 million years old on Earth. The steppes of Central Asia are not very old by comparison to many things geological. So, the cows learned how to eat grass about 38 million years ago. We know that in part because we can resurrect the ancient proteins in the cow stomach which are used to digest the grass, and of course used to digest the bacteria that live in the cow rumen. So paleogenetics is providing many examples that connect DNA sequences and genetics to the environment and life. We are learning a lot about how life adapts to changing environments by looking at ancient forms of life, at least in small pieces.
You may wonder how you do these experiments in the laboratory. It starts with sequencing the genomes of many different organisms. It’s like collecting languages: Finnish, Russian, English, German, Italian, and so on. Then you go to a computer, and it analyses the sequences of nucleotides in DNA and amino acids in proteins, just like linguists analyse the sequences of sounds and letters in words. Then the computer builds a model for how a family of proteins has divergently evolved: what amino acids have changed and what amino acids have stayed the same in the proteins. From that computer model you build sequences of the ancient proteins.
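To make the computer step concrete, here is a minimal sketch of one simple way to infer an ancestral sequence from aligned modern ones: the bottom-up pass of Fitch parsimony. This is an assumption for illustration only; the talk does not say which inference method is used, and real studies commonly rely on more elaborate statistical models. The tree shape, the species labels and the five-letter toy sequences are all invented, not real enzyme data.

```python
# Sketch: bottom-up Fitch parsimony on a toy tree (illustrative data only).
# A tree is either a leaf name (str) or a pair (left_subtree, right_subtree).

def fitch_site(tree, column):
    """Return the set of candidate ancestral residues for one alignment column."""
    if isinstance(tree, str):
        return {column[tree]}
    left, right = tree
    a = fitch_site(left, column)
    b = fitch_site(right, column)
    shared = a & b
    # Keep residues both children agree on; otherwise keep every candidate.
    return shared if shared else a | b

def infer_root(tree, alignment):
    """Infer the root sequence; ambiguous sites are written as [XY]."""
    length = len(next(iter(alignment.values())))
    out = []
    for i in range(length):
        column = {name: seq[i] for name, seq in alignment.items()}
        states = sorted(fitch_site(tree, column))
        out.append(states[0] if len(states) == 1 else "[" + "".join(states) + "]")
    return "".join(out)

# Hypothetical branching order and made-up aligned protein fragments.
tree = ((("human", "chimp"), "gorilla"), "orangutan")
alignment = {
    "human": "MKTAV",
    "chimp": "MKTAV",
    "gorilla": "MKSAV",
    "orangutan": "MRSAI",
}

print(infer_root(tree, alignment))
```

Sites where the modern sequences agree are resolved outright, while sites that differ come out as a short list of candidates. This sketch only does the bottom-up pass; a full reconstruction would add a second pass, or a likelihood model, to choose among the candidates at each ancestor.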
Then it becomes time to do recombinant DNA technology. DNA synthesis is relatively easy if you know the sequence of the gene you want to have from an organism that lived 8 million years ago. You go to a DNA supplier and type in the sequence you want them to synthesise. You then pay them some money and they’ll send you the DNA that codes for the ancient protein. Then it’s back to basic biotechnology. Once you have a gene for a protein, you can put that gene into a bacterium, and the bacterium is a little machine that will make the protein for you. This is recombinant DNA technology, and we use it all the time. We make insulin by recombinant DNA technology also, from synthetic genes. But here we are making the ancient protein from an ancient synthetic gene whose sequence we have inferred by looking at the sequences of all the derived genes. Then the graduate student goes to work. They purify the protein and study it in the laboratory to learn whether it oxidises alcohol, whether it is able to survive at warm temperatures, 65 degrees, or whether it digests grass in the stomach. This is how we connect the laboratory work on an ancient gene, whose sequence has been inferred by looking at many modern genes, to an ancient environment and ancient life.
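As a rough illustration of the "type in the sequence" step, the sketch below turns an inferred protein sequence into a DNA sequence that could be sent for synthesis. Everything here is an assumption made for illustration: the function name `back_translate` is hypothetical, the table simply picks one valid standard-code codon per amino acid, and a real order would normally be adjusted for the host organism and for the supplier's requirements.

```python
# Sketch: back-translate a protein into DNA using one codon per amino acid.
# The codon choices are illustrative, not an optimised or authoritative table.
CODON_FOR = {
    "A": "GCG", "R": "CGT", "N": "AAC", "D": "GAT", "C": "TGC",
    "Q": "CAG", "E": "GAA", "G": "GGC", "H": "CAT", "I": "ATT",
    "L": "CTG", "K": "AAA", "M": "ATG", "F": "TTT", "P": "CCG",
    "S": "AGC", "T": "ACC", "W": "TGG", "Y": "TAT", "V": "GTG",
}
STOP_CODON = "TAA"

def back_translate(protein):
    """Return a DNA coding sequence for the protein, ending in a stop codon."""
    unknown = sorted(set(protein) - set(CODON_FOR))
    if unknown:
        # Ambiguous or non-standard residues must be resolved before ordering.
        raise ValueError("cannot back-translate residues: " + ", ".join(unknown))
    return "".join(CODON_FOR[residue] for residue in protein) + STOP_CODON

# Made-up five-residue fragment standing in for an inferred ancestral protein.
gene = back_translate("MKSAV")
print(gene)
```

Because several codons encode the same amino acid, many different DNA sequences give the same protein; the choice among them affects how well the bacterium expresses the gene, not which protein it makes.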
There are about 100,000 different gene families in all of life on Earth, but there is only one history of life on Earth. That entire history will be told, together with the geological records, the fossil records, as well as the genomic records and the paleogenetic records. So at some point, perhaps in the next 10 – 15 years, we will have an entire history for life on Earth, at least going back 2 – 3 billion years, and from that we will be able to understand a lot more about the intimate connection between genetics, environments, survival fitness and life itself.