For a university professor, the weekly progress report in your own group can be a very sobering experience. Rarely does the 20-minute summary of a master's thesis in bioinformatics deliver the goosebumps of discovery. But sometimes it happens. And so it was when it was Madeline Weiss' turn to confess about recent progress, or lack thereof, in understanding early evolution from the study of all trees for the roughly 290,000 clusters of all 6.1 million protein-coding genes found among 2,000 sequenced prokaryotic genomes. Looking at roughly 290,000 trees (286,514, to be exact) and trying to decide what they might mean is no simple task, nor is explaining it in 20 minutes. Hence my expectations were adjusted to sea level, at low tide.
And then, about 3 minutes into the presentation, while I was already scribbling notes to myself about what I had to do before lunch, Madeline said: "So by removing all trees with lateral gene transfer (LGT) between prokaryotic domains by looking for domain monophyly while having at least two phyla represented in each domain, we can get a picture of gene content in the last universal common ancestor (LUCA) that goes beyond the standard collection of the same old 30 ribosomal proteins." It was like getting an electric shock. My pen took flight, jumping out of my hand onto the floor. I blurted out: "Whose idea was that?" She said, "Shiju's" (that's Shijulal Nelson-Sathi). Shiju smiled. I blurted further: "So? So what did you get?" Filipa (Sousa) intervened and said in her inimitable soft tone, "She is trying to tell you, but you interrupted, so maybe just let her give her progress report." Sinje (Neukirchen) confirmed: "It's not like she is going to hide anything, her progress report has lots of lists, really." Natalia (Mrnjavac) added, "But we still need to look at the annotations very carefully," and Mayo (Roettger) pitched in: "Yes, it really is interesting, but by the way, the server is now completely full with her stuff, so we need to buy a new one." Group vs. professor, 6:0.
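For readers who think in code, the criterion Madeline described can be sketched roughly as follows. This is only an illustration under stated assumptions, not the pipeline used in the paper: the nested-tuple tree format, the `taxonomy` lookup, and the function names `clades` and `passes_luca_filter` are all hypothetical. A tree passes if each domain is represented by at least two phyla and some branch separates all archaeal sequences from all bacterial ones.

```python
# Hypothetical sketch: trees are nested tuples of leaf names, and
# taxonomy maps each leaf name to a (domain, phylum) pair.
# Any leaf whose domain is not "Archaea" is treated as bacterial.

def clades(tree):
    """Return the leaf set of `tree` and the leaf sets of all its subtrees."""
    if isinstance(tree, str):
        leaf = frozenset([tree])
        return leaf, [leaf]
    leaves = frozenset()
    found = []
    for child in tree:
        child_leaves, child_found = clades(child)
        leaves |= child_leaves
        found.extend(child_found)
    found.append(leaves)
    return leaves, found


def passes_luca_filter(tree, taxonomy, min_phyla=2):
    """True if both domains are monophyletic with >= min_phyla phyla each."""
    all_leaves, found = clades(tree)
    archaea = frozenset(l for l in all_leaves if taxonomy[l][0] == "Archaea")
    bacteria = all_leaves - archaea
    # Each domain must be present and span at least `min_phyla` phyla.
    for group in (archaea, bacteria):
        if len({taxonomy[l][1] for l in group}) < min_phyla:
            return False
    # On an unrooted tree, domain monophyly means one branch splits archaea
    # from bacteria; in any rooting of that tree, at least one of the two
    # groups then shows up as a complete subtree.
    return archaea in found or bacteria in found


taxonomy = {
    "a1": ("Archaea", "Euryarchaeota"),
    "a2": ("Archaea", "Crenarchaeota"),
    "b1": ("Bacteria", "Firmicutes"),
    "b2": ("Bacteria", "Proteobacteria"),
}
clean_tree = (("a1", "a2"), ("b1", "b2"))   # domains separate cleanly
mixed_tree = (("a1", "b1"), ("a2", "b2"))   # domains interleaved, as after LGT
print(passes_luca_filter(clean_tree, taxonomy))
print(passes_luca_filter(mixed_tree, taxonomy))
```

The second toy tree interleaves archaeal and bacterial sequences, which is the pattern the filter treats as a sign of transfer between domains, so it is rejected.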
On went the progress report, and there they were. Lists of genes that were present in LUCA. Cofactor biosynthesis, radical SAM, FeS clusters, anaerobes, redox chemistry, flavins, corrins, molybdenum, nitrogen fixation, acetyl-CoA synthase, and so forth. Three hundred and fifty-five genes. It was microbial physiology. It was LUCA's physiology. The genes were not only telling us how LUCA lived, but where. That was one exciting progress report.
Almost everywhere we looked, LUCA was using methyl groups. There was even information about the genetic code, because many of LUCA's genes were involved with base modifications. In the old days, when people were still sequencing tRNA and rRNA as it exists in its functionally active state in cells, they would find lots of modified bases. The modified bases are needed to make the code work the way it does. The modified bases that are conserved in the common ancestor of bacteria and archaea appeared to contain chemical imprints of the environment where LUCA arose: methyl groups, radical SAM enzymes (with FeS clusters and radical reaction mechanisms), sulfur, and selenium. On top of all that, the trees were reciprocally rooted, so we could ask "Who branches nearest to the root?" It turned out that clostridia and methanogens did, and that put acetogenesis and methanogenesis, pathways that conserve energy via the reduction of CO2 to small carbon compounds, at the base of the bacterial and archaeal trees, respectively. That is exciting in many ways, not the least of which is that modern hydrothermal vent geochemistry appears to perform some of the same CO2-reducing reactions today. That genomes have retained traces of life's earliest history is an exciting prospect. Even more exciting is the prospect that we can decipher some of that history. But genomes will not tell us about their early history all by themselves. We have to pose specific questions to the data, which is what we did here. We like to think that Margaret Dayhoff would have liked our paper. We hope that microbiologists like it.
Weiss, M.C., Sousa, F.L., Mrnjavac, N., Neukirchen, S., Roettger, M., Nelson-Sathi, S. & Martin, W.F. The physiology and habitat of the last universal common ancestor. Nature Microbiology (2016). DOI: 10.1038/NMICROBIOL.2016.116.