Osborn is part of a group of scientists who are mounting a kind of scientific salvage mission. It is known as the Earth BioGenome Project, or E.B.P., and its goal is to sequence a genome from every plant, animal, and fungus on the planet, as well as from many single-celled organisms, such as algae, retrieving the results of life’s grand experiment before it’s too late. “This is a completely wonderful and insane goal,” Hank Greely, a Stanford law professor who works with the E.B.P., told me. The effort, described by its organizers as a “moonshot for biology,” will likely cost billions of dollars—yet it does not currently have any direct funding, and depends instead on the volunteer work of scientists who do. Researchers will need to scour oceans, deserts, and rain forests to collect samples before species die out. And, as new species are discovered, the task of sequencing all of them will only grow. “That’s a heavy aspiration that will probably never be entirely achieved,” Greely, who is seventy-one, told me. “It’s like, when you’re my age, planting a young oak tree in your yard. You’re not going to live to see that be a mature oak, but your hope is somebody will.” (...)
Scientists didn’t even begin to sequence a DNA molecule until 1968. In 1977, they sequenced the roughly five thousand base pairs in a virus that invades bacteria. And, in 1990, the Human Genome Project started the thirteen-year process of sequencing almost all of the three billion base pairs in our DNA. Its organizers called the endeavor “one of the most ambitious scientific undertakings of all time, even compared to splitting the atom or going to the moon.” Since then, researchers have been filling in gaps and improving the quality of their sequences, in part by using a new format known as a telomere-to-telomere, or T2T, genome. The first T2T human genome was sequenced only last year, but already scientists with the Earth BioGenome Project are talking about repeating this process for every known eukaryotic species. (Eukaryotes are organisms whose cells have nuclei.)
Because the E.B.P. does not have its own funding, it does not sample or sequence species on its own. Instead, it’s a network of networks; its organizers set ethical and scientific standards for more than fifty projects, including the Darwin Tree of Life, the Vertebrate Genomes Project, the African BioGenome Project, and the Butterfly Genome Project. This way, “when we get to the end of the project, it’s not the Tower of Babel,” Harris Lewin, an evolutionary biologist at the University of California, Davis, who chairs the E.B.P. executive council, told me. “You know, your genomes are produced this way, and mine are produced that way, and they’re of different quality, so that, when you compare them, you get different results.”
By 2025, the participants hope to assemble about nine thousand sequences, one from every known family of eukaryotes. By 2029, they aim to have one sequence from every genus—a hundred and eighty thousand in all. After the third and final phase, which could be completed a decade from now, they aim to have sequenced all 1.8 million species that scientists have documented so far. (Roughly eighty per cent of eukaryotic species are still undiscovered.) This database of genomes, including annotations and metadata, will require close to an exabyte of data, or as much as two hundred million DVDs. The amount of information involved is more than “astronomical,” Lewin said; it’s “genomical.” He compared the project to the Webb Space Telescope, which received about ten billion dollars of government funding. Given how much these projects change the way that humans see the world, Lewin said, “the cost is really not that much.” (...)
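The figures in this paragraph can be sanity-checked with a little arithmetic. The sketch below is only a back-of-the-envelope check, not anything the E.B.P. publishes: it assumes a decimal exabyte (10^18 bytes) and a single-layer 4.7 GB DVD, and the function names are made up for illustration.

```python
# Rough check of the storage and species figures quoted above.
# Assumptions (not from the article): decimal exabyte, single-layer DVD.
EXABYTE_BYTES = 10**18
DVD_BYTES = 4.7 * 10**9

# Phase targets as stated in the article.
PHASE_TARGETS = {"families": 9_000, "genera": 180_000, "species": 1_800_000}


def dvds_needed(total_bytes, disc_bytes=DVD_BYTES):
    """Number of discs needed to hold total_bytes."""
    return total_bytes / disc_bytes


def estimated_total_species(documented, undiscovered_fraction):
    """If a given fraction is undiscovered, the documented species are the rest."""
    return documented / (1 - undiscovered_fraction)


if __name__ == "__main__":
    print(f"DVDs per exabyte: {dvds_needed(EXABYTE_BYTES) / 1e6:.0f} million")
    total = estimated_total_species(PHASE_TARGETS["species"], 0.8)
    print(f"Implied total eukaryotic species: {total / 1e6:.0f} million")
```

Under those assumptions an exabyte works out to a little over two hundred million DVDs, in line with the article, and the eighty-per-cent figure implies roughly nine million eukaryotic species in total.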
One goal of the E.B.P. is to compare and contrast large numbers of genomes, revealing how they are related. Benedict Paten, a computational biologist at the University of California, Santa Cruz, has developed software to align genomes and determine which genes correspond to one another. “It’s a really rich and difficult problem,” he told me, “because genomes evolve by a bunch of really complicated processes.” For a 2020 Nature paper, Paten and several collaborators used powerful computers to align more than a trillion As, Ts, Gs, and Cs and create a tree of six hundred bird and mammal species. On a typical home computer, such an undertaking could have taken more than a million hours. “If you wanted to do it for all plants and animals, it’s just a vast computational challenge,” Paten told me.
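To put "a million hours" in perspective, here is a small conversion sketch. It is illustrative only: the helper names are hypothetical, and the parallel-scaling estimate assumes the work divides perfectly across machines, which real alignment jobs would not achieve.

```python
# Convert the article's "a million hours" into more familiar units.
HOURS_PER_YEAR = 24 * 365  # ignoring leap years


def hours_to_years(hours):
    """Years of continuous running on a single machine."""
    return hours / HOURS_PER_YEAR


def wall_clock_hours(total_hours, machines):
    """Idealized elapsed time if the work split evenly across machines."""
    return total_hours / machines


if __name__ == "__main__":
    print(f"{hours_to_years(1_000_000):.0f} years on one home computer")
    print(f"{wall_clock_hours(1_000_000, 1_000):.0f} hours on 1,000 such machines")
```

A million hours is more than a century of nonstop computing on one machine, which is why work of this kind is spread across large clusters.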
Sooner or later, a global database of genomes will have profound practical implications. Some creatures can regrow their limbs; others do not appear to die unless they suffer an injury. If the basis for such traits can be pinpointed in genes, humans might be able to borrow them, perhaps by using gene therapies. “Evolution has already done nearly every experiment, right?” Lewin told me. “There are organisms that’ll eat oil spills, there are organisms that’ll eat heavy metals. I mean, it’s incredible.” But, when genomes inspire new products, to whom will they belong? This question makes the E.B.P. not only a scientific project but a political one.
by Matthew Hutson, New Yorker
Image: Petra Péterffy
[ed. Awesome.]