Analysis of Complex Microbial Samples Using High Definition Mapping
Complex microbial communities play a critical role in a wide variety of biological systems in the environment and throughout the human body. Characterization of these communities has historically been limited to one or a small number of known species-level genetic markers, such as 16S rRNA genes. While the advent of inexpensive shotgun sequencing has enabled a more accurate measure of biodiversity than marker typing, short read lengths prevent accurate analysis of related strains within a mixture. They also prevent consistent characterization of large-scale structural variation, which can distinguish highly related strains and significantly impact pathogenicity.
To address these issues, we have applied the Nabsys HD-Mapping™ platform to strain-level identification of microbes in complex mixtures. HD-Mapping employs fully electronic detection of tagged single DNA molecules, hundreds of kilobases in length, at a resolution superior to existing optical mapping approaches. This combination of long read lengths and high information density means that individual HD-Mapping reads tend to be much more specific to the genomes from which they derive than NGS reads are. As a result, differences between closely related strains of the same species become clear with minimal bioinformatics work.
Here we describe strain-level characterization of the ZymoBIOMICS Microbial Community Standard using Nabsys HD-Mapping. DNA was extracted using a standard kit-based isolation procedure, and single-molecule reads derived from the mixture were mapped to the NCBI database of all ~10,500 completed bacterial reference genomes, including ~1,700 references for species present in the mixture. Through analysis of unique read mapping characteristics, the correct reference was identified for each of the 8 bacterial strains present in the mixture, and relative strain quantitation was obtained. In addition, we show that strain-level detection of the 8 bacterial strains is unaffected by the presence of 20% human DNA co-extracted with the mixture.
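The idea of identifying strains from uniquely mapping reads can be illustrated with a minimal sketch. This is not the Nabsys analysis pipeline: the input format (read, reference, score tuples), the score margin `min_margin`, the function names, and the reference labels are all hypothetical and chosen only for illustration. The sketch keeps reads whose best reference clearly beats the runner-up, selects for each species the reference with the most such reads, and reports read-count fractions as a rough proxy for relative abundance. A real quantitation would likely also need to account for genome size and read length.

```python
from collections import Counter, defaultdict


def unique_best_hits(hits, min_margin=5.0):
    """Map each read to a reference only if its best score beats the
    runner-up by at least min_margin (hypothetical uniqueness rule)."""
    by_read = defaultdict(list)
    for read_id, ref_id, score in hits:
        by_read[read_id].append((score, ref_id))

    assigned = {}
    for read_id, scored in by_read.items():
        scored.sort(reverse=True)  # highest score first
        best_score, best_ref = scored[0]
        runner_up = scored[1][0] if len(scored) > 1 else float("-inf")
        if best_score - runner_up >= min_margin:
            assigned[read_id] = best_ref
    return assigned


def call_strains(assigned, ref_species):
    """For each species, pick the reference with the most unique reads."""
    counts = Counter(assigned.values())
    best = {}
    for ref_id, n in counts.items():
        species = ref_species[ref_id]
        if species not in best or n > counts[best[species]]:
            best[species] = ref_id
    return best, counts


def relative_abundance(best, counts):
    """Fraction of unique reads per called reference (uncorrected proxy)."""
    total = sum(counts[ref] for ref in best.values())
    return {ref: counts[ref] / total for ref in best.values()}


if __name__ == "__main__":
    # Toy data: two E. coli references and one B. subtilis reference.
    hits = [
        ("r1", "Ecoli_A", 50.0), ("r1", "Ecoli_B", 30.0),
        ("r2", "Ecoli_A", 42.0), ("r2", "Ecoli_B", 41.0),  # ambiguous read
        ("r3", "Ecoli_A", 60.0),
        ("r4", "Ecoli_B", 55.0), ("r4", "Ecoli_A", 20.0),
        ("r5", "Bsub_X", 48.0),
    ]
    ref_species = {
        "Ecoli_A": "E. coli",
        "Ecoli_B": "E. coli",
        "Bsub_X": "B. subtilis",
    }
    assigned = unique_best_hits(hits)
    best, counts = call_strains(assigned, ref_species)
    print(best)
    print(relative_abundance(best, counts))
```

In this toy input, read `r2` scores almost equally against both E. coli references and is therefore excluded, which mirrors the intuition that only reads specific to one genome should drive strain-level calls.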