Long-read metagenomics using PromethION uncovers oral bacteriophages

Integrating PromethION and HiSeq data of >30 Gb per sample with low human DNA contamination


The two most diverse human microbiomes are the intestinal and oral microbiota, which harbor hundreds of coexisting species, including bacteria and viruses. Among them, bacteriophages (phages), or bacterial viruses, in the intestinal microbiome have received increasing attention over the last decade [1], whereas those in the oral microbiota have been less studied. Although a few metagenomic studies have focused on oral phages, they relied on short-read sequencing, which has limitations, particularly in assembly contiguity.

Here, we conducted a long-read metagenomic study of human saliva using PromethION. Our analyses, which integrate PromethION and HiSeq data of >30 Gb per sample with low human DNA contamination, identify hundreds of viral contigs that do not cluster with those reported previously. They also demonstrate enhanced scaffolding and the ability to place a prophage in its host genomic context and to classify it taxonomically (Figure 1).

Figure 1. Improved scaffolding and host genomic context compared with viral sequences assembled from short reads. (A) A high-coverage oral prophage showing the improvement. Top: coverage plot based on mapping short reads against it. The prophage region is indicated as a rectangle in the coverage plot spanning the entire contig. Bottom: genomic alignment of the high-coverage prophage and its viral sequence registered in the IMG/VR v2.0 database. (B) Distribution of the differences in length between each of the "most confident" and "likely" prophages and their corresponding viral sequence in the IMG/VR v2.0 database.

Our analyses also identify a high-coverage Streptococcus phage/prophage group (Figure 2) and nine jumbo phages/prophages. Remote homologs of antimicrobial resistance genes are found in 86% of the phage/prophage group and 67% of the jumbo phages/prophages. These homologs might serve as a source of recombination or horizontal gene transfer that generates new antimicrobial resistance gene sequences; the “mosaic” penicillin resistance genes are known to be generated by recombination among oral streptococci [2].

Figure 2. A group of high-coverage oral phages/prophages (red at the top right).

The analysis of genes in the jumbo phages/prophages also revealed a caveat, which we state plainly in the paper: consensus-based nucleotide error correction was insufficient, and the remaining errors generated fragmented ORFs [3]. Our deep (>30 Gb per sample) long-read metagenomic sequencing successfully reconstructed large genomic fragments of rare viruses, but their low-coverage contigs can contain errors and should be interpreted carefully in further analyses.
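To make the fragmented-ORF caveat concrete, here is a minimal Python sketch of how a single uncorrected one-base deletion can truncate an open reading frame. It is a toy illustration, not the analysis used in the paper: the sequence is made up, and the helper names `longest_orf` and `delete_base` are hypothetical.

```python
# Toy illustration only: the "gene" below is invented, not taken from the study.
STOP_CODONS = {"TAA", "TAG", "TGA"}


def longest_orf(seq: str) -> int:
    """Length in codons (start included, stop excluded) of the longest
    ATG..stop ORF on the forward strand, over all three reading frames."""
    best = 0
    for frame in range(3):
        codons = [seq[i:i + 3] for i in range(frame, len(seq) - 2, 3)]
        start = None  # codon index at which the currently open ORF began
        for idx, codon in enumerate(codons):
            if start is None and codon == "ATG":
                start = idx
            elif start is not None and codon in STOP_CODONS:
                best = max(best, idx - start)
                start = None
    return best


def delete_base(seq: str, pos: int) -> str:
    """Mimic an uncorrected single-base deletion at 0-based position pos."""
    return seq[:pos] + seq[pos + 1:]


# Invented gene: start codon, filler codons, and a stop codon at the end.
# The "GTGACC" stretch hides an out-of-frame TGA that only becomes a stop
# codon once the reading frame is shifted by one base.
intact = "ATG" + "GCT" * 10 + "GTGACC" + "GCT" * 10 + "TAA"
mutated = delete_base(intact, 10)

print("intact ORF length (codons): ", longest_orf(intact))
print("mutated ORF length (codons):", longest_orf(mutated))
```

In this toy case the intact sequence yields a 23-codon ORF, while the single deletion shifts the frame into a premature stop and leaves an 11-codon ORF. On low-coverage contigs, where consensus polishing has fewer reads to work with, such indels are more likely to survive, which is why gene predictions there should be read with caution.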

A key to these discoveries was DNA extraction by the enzymatic lysis method [4], which is useful for isolating high-molecular-weight DNA sufficient for long-read metagenomic sequencing. Unexpectedly, we observed only a small percentage of human reads when sequencing DNA extracted from salivary samples stored using the OMNIgene ORAL kit. Honestly, it was a real puzzle. During the revision of the manuscript, however, we conducted an additional experiment comparing these data with data from the same samples stored using RNAlater (Figure 3), and succeeded in explaining the reduction of human DNA contamination.

Figure 3. Reduction of human DNA contamination.

The paper was published in Nature Communications (Yahara et al., 2020). All assembled nucleotide sequence data, as well as the raw read data, were deposited in public databases. This is one of the projects supported for four years by the 'Neo-virology' consortium [5] (Figure 4), directed by Professor Yoshihiro Kawaoka.

Figure 4. 'Neo-virology' consortium

  1. Sausset, R., Petit, M.A., Gaboriau-Routhiau, V. & De Paepe, M. New insights into intestinal phages. Mucosal Immunol 13, 205-215 (2020).
  2. Dowson, C.G. et al. Evolution of penicillin resistance in Streptococcus pneumoniae; the role of Streptococcus mitis in the formation of a low affinity PBP2B in S. pneumoniae. Mol Microbiol 9, 635-643 (1993).
  3. Watson, M. & Warr, A. Errors in long-read assemblies can critically affect protein prediction. Nat Biotechnol 37, 124-126 (2019).
  4. Said, H.S. et al. Dysbiosis of salivary microbiota in inflammatory bowel disease and its association with oral immunological biomarkers. DNA Res 21, 15-25 (2014).
  5. Watanabe, T. et al. Neo-virology: The raison d'etre of viruses. Virus Res 274, 197751 (2019). doi: 10.1016/j.virusres.2019.197751

Koji Yahara

Group Leader, National Institute of Infectious Diseases