Despite stepping away from the limelight of the Human Genome Project, in which he was a pioneer, Craig Venter can't help but make a splash.
This northern spring, Venter is sailing around the French Polynesian Islands scooping up bucketfuls (figuratively) of seawater in an ambitious voyage to sample microbial genomes found in the world's oceans.
His 95-foot yacht, Sorcerer II, has been outfitted with all manner of technical equipment to accommodate the task, as well as a few surfboards should that opportunity arise.
"It's tough duty," Venter chuckled to an audience at the Massachusetts Institute of Technology last month, a few days after the online publication of his group's report in Science in which he describes the proof-of-concept project for his current expedition.
Make no mistake. Venter is back -- with a sizable splash. If the Sargasso results are any indication, many surprises will follow. Venter and colleagues report finding 1.2 million genes, including almost 70,000 entirely novel genes, from an estimated 1800 genomic species, including 148 novel bacterial phylotypes. This diversity is staggering and to a large extent unexpected.
"We chose the Sargasso Sea because it was supposed to be a marine desert," says Venter wryly. "The assumption was low diversity there because of the extremely low nutrients." His team sequenced a total of 1.045 billion base pairs of non-redundant sequence. At the height of the work, "over 100 million letters of genetic code were sequenced every 24 hours." The results have been deposited in GenBank.
In one fell swoop, the controversial Venter is thrusting microbial science and himself (again) into the scientific limelight. Moreover, the chronicling of his voyage -- the Discovery Channel Quest program is an expedition backer -- will surely catch the public's imagination as the Sorcerer II retraces segments of two great voyages of discovery: Charles Darwin's in HMS Beagle and the later expedition of HMS Challenger.
Venter's team has already sampled the cold waters of Nova Scotia and the Gulf of Maine, the warm Caribbean, and Darwin's cherished Galapagos Islands, to name just a few stopping spots. The full route along with updates of sampling activity is available at www.sorcerer2expedition.com.
The scientific insights that should emerge promise to be spectacular. Venter is in effect doing for the microbial gene pool what he did a decade ago for human gene-coding fragments. "We think the next set [of data] will probably be well over ten million new genes," Venter says. "The kind of diversity we're seeing will help us to do the first approach to categorising the Earth's gene pool. My back-of-the-envelope calculation is there's on the order of 10-20 million genes."
Venter is, of course, not the first to sequence the genes of microbes from the ocean. Edward DeLong at Monterey Bay Aquarium Research Institute, in California, is a pioneer in the field and discovered new genes from bacteria in Monterey. DeLong is headed to MIT soon to become professor of biology and bioengineering, and was also a presenter at the symposium at which Venter spoke.
What's new, however, is the sheer scale of Venter's effort and vision. "One of the things about shotgun sequencing is that it gives you the complete genetic repertoire of what's there, without knowing the structure a priori. We decided we had the scale and we had the tools to do an experiment on the environment," he says.
Early response to Venter's Sargasso story and global plan has been positive, if guarded. Venter readily acknowledges the need to return to the Sargasso and conduct more extensive sampling to fully characterise the genomes present. His initial results are based on samples taken from just 1500 litres of water.
Rutgers University researchers Paul Falkowski and Colomban de Vargas wrote a generally positive commentary accompanying Venter's Science paper, while complaining that Venter's strategy relies on "technological capabilities that are not presently accessible to the vast majority of marine microbiologists."
True enough. The DNA sequencing for the Sargasso pilot project and the Sorcerer II expedition is being handled by an army of Applied Biosystems 3730XL machines running around the clock at the Venter Foundation's Joint Technology Centre in Maryland. The Celera Assembler software -- roughly 0.5 million lines of code -- was tweaked slightly to handle the task of identifying overlapping DNA fragments.
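To give a feel for what "identifying overlapping DNA fragments" means, here is a minimal Python sketch of the basic idea: find where the end of one read matches the start of another, then stitch them together. It is a toy illustration only and makes no claim to reflect how the Celera Assembler actually works; the function names, the example reads and the minimum overlap length are all invented for the purpose.

```python
def overlap(a: str, b: str, min_len: int = 3) -> int:
    """Length of the longest suffix of `a` that is also a prefix of `b`.

    Returns 0 if no overlap of at least `min_len` bases exists.
    (Hypothetical helper; exact matching only, no sequencing errors.)
    """
    for n in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:n]):
            return n
    return 0


def merge(a: str, b: str, min_len: int = 3):
    """Join `b` onto `a` at their overlap, or return None if they don't overlap."""
    n = overlap(a, b, min_len)
    return a + b[n:] if n else None


# Three made-up reads, chosen so that each overlaps the next.
reads = ["ATGGCGT", "GCGTACC", "ACCTTAG"]

contig = reads[0]
for read in reads[1:]:
    contig = merge(contig, read)

print(contig)  # ATGGCGTACCTTAG
```

Real assemblers must also cope with sequencing errors, repeats and reads drawn from many different genomes at once, which is a large part of why the software runs to hundreds of thousands of lines.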
Falkowski and de Vargas also note that, "despite their huge sequencing effort, Venter and collaborators were able to reconstruct only two, almost-complete genomes, and this was with the help of fully sequenced templates."
Nevertheless, the commentators acknowledge that Venter's approach "certainly increases the awareness of the vast genetic diversity and complexity present in contemporary oceans. Such an enormous number of new genes from so few samples obtained in one of the world's most nutrient-impoverished bodies of water poses significant challenges to the emerging field of marine molecular microbial ecology and evolutionary biology."
Venter remains one of the giant lightning rods in science today. He draws thunderous bolts of criticism and praise in unequal measure. Following his departure from Celera in 2002, he used his new wealth to found The Institute for Biological Energy Alternatives, The Centre for the Advancement of Genomics, and the J Craig Venter Science Foundation, to complement his other non-profit, TIGR.
In addition to his own foundation and the Discovery Channel, the Sorcerer II expedition is being funded by the US Department of Energy, which is also supporting the DNA sequencing, along with the Gordon and Betty Moore Foundation. The hope is that identification and analysis of microbial genes will help scientists devise new methods of energy generation and pollutant elimination, perhaps by putting existing microbes to work.
"If you look at all photoreceptors for all species, about 180 have been characterised, so we are particularly excited by finding approximately 800 new rhodopsin-like molecules," Venter says. "Some of these are probably just sensory, but we like to think that in this marine desert, with low nutrients and tremendous diversity, that some of these organisms switch to photobiology in the absence of other energy sources."
Since the Sargasso, samples have also been taken from shallow bottom seeps "with sulphur gushing out". Microscopic examination reveals fluorescent microbes "packed with sulphur for future use. We can't wait to get some of this stuff on a future sequence run," Venter says. While most attention is on capturing bacterial cells, the sampling protocol also collects viral and eukaryotic material.
This may prove useful in health issues. For example, Halifax, Nova Scotia, is the largest city in North America to dump untreated sewage directly into its harbour. "It's a beautiful harbour but there are signs everywhere not to swim. We have to treat those samples under strict laboratory conditions and it will be interesting to see if the viral samples contain things such as HIV. This should help understand human impact [on the ocean]," Venter says. The expedition is also attempting to collect soil samples from each island visited.
Collecting soil or water samples turns out to be a touchy issue for many countries. The Sorcerer II expedition has three people working full-time with the US State Department to obtain necessary permits. "It was a big surprise to me that there's very little international waters left. I thought I was out sailing free in the ocean and somebody's claimed it all," he says. Some countries sought the right to patent any sequences derived from samples taken in their jurisdictions, but Venter says such demands were successfully rebuffed.
Even when the paperwork is in order, problems arise. "We're dealing right now with a group that's protesting us taking biological samples in Ecuador," he said.
Sorting through the dizzying array of data generated by the Sorcerer II expedition will take years. Yet already trends are emerging, and dogmas are being challenged. The tremendous diversity is one surprise. The emergence of 'gene themes' is another.
"We looked at six highly conserved proteins such as recA and we had over 1000 new recA genes. Each gene is a slightly different variation, but they are all relatively similar. There doesn't seem to be 40,000 different solutions for what recA does," says Venter, who suspects this will be a common thread in the sea of newly discovered genes.