TY - JOUR A1 - Costea, Paul I. A1 - Coelho, Louis Pedro A1 - Sunagawa, Shinichi A1 - Munch, Robin A1 - Huerta-Cepas, Jaime A1 - Forslund, Kristoffer A1 - Hildebrand, Falk A1 - Kushugulova, Almagul A1 - Zeller, Georg A1 - Bork, Peer T1 - Subspecies in the global human gut microbiome JF - Molecular Systems Biology N2 - Population genomics of prokaryotes has been studied in depth in only a small number of primarily pathogenic bacteria, as genome sequences of isolates of diverse origin are lacking for most species. Here, we conducted a large‐scale survey of population structure in prevalent human gut microbial species, sampled from their natural environment, with a culture‐independent metagenomic approach. We examined the variation landscape of 71 species in 2,144 human fecal metagenomes and found that in 44 of these, accounting for 72% of the total assigned microbial abundance, single‐nucleotide variation clearly indicates the existence of sub‐populations (here termed subspecies). A single subspecies (per species) usually dominates within each host, as expected from ecological theory. At the global scale, geographic distributions of subspecies differ between phyla, with Firmicutes subspecies being significantly more geographically restricted. To investigate the functional significance of the delineated subspecies, we identified genes that consistently distinguish them in a manner that is independent of reference genomes. We further associated these subspecies‐specific genes with properties of the microbial community and the host. For example, two of the three Eubacterium rectale subspecies consistently harbor an accessory pro‐inflammatory flagellum operon that is associated with lower gut community diversity, higher host BMI, and higher blood fasting insulin levels. Using an additional 676 human oral samples, we further demonstrate the existence of niche specialized subspecies in the different parts of the oral cavity. Taken together, we provide evidence for subspecies in the majority of abundant gut prokaryotes, leading to a better functional and ecological understanding of the human gut microbiome in conjunction with its host. KW - biology KW - genetic variation KW - metagenomics KW - microbiome KW - population structure KW - prokaryotic subspecies Y1 - 2017 U6 - http://nbn-resolving.de/urn/resolver.pl?urn:nbn:de:bvb:20-opus-172674 VL - 13 IS - 12 ER - TY - JOUR A1 - Mende, Daniel R. A1 - Letunic, Ivica A1 - Huerta-Cepas, Jaime A1 - Li, Simone S. A1 - Forslund, Kristoffer A1 - Sunagawa, Shinichi A1 - Bork, Peer T1 - proGenomes: a resource for consistent functional and taxonomic annotations of prokaryotic genomes JF - Nucleic Acids Research N2 - The availability of microbial genomes has opened many new avenues of research within microbiology. This has been driven primarily by comparative genomics approaches, which rely on accurate and consistent characterization of genomic sequences. It is nevertheless difficult to obtain consistent taxonomic and integrated functional annotations for defined prokaryotic clades. Thus, we developed proGenomes, a resource that provides user-friendly access to currently 25 038 high-quality genomes whose sequences and consistent annotations can be retrieved individually or by taxonomic clade. These genomes are assigned to 5306 consistent and accurate taxonomic species clusters based on previously established methodology. proGenomes also contains functional information for almost 80 million protein-coding genes, including a comprehensive set of general annotations and more focused annotations for carbohydrate-active enzymes and antibiotic resistance genes. Additionally, broad habitat information is provided for many genomes. All genomes and associated information can be downloaded by user-selected clade or multiple habitat-specific sets of representative genomes. We expect that the availability of high-quality genomes with comprehensive functional annotations will promote advances in clinical microbial genomics, functional evolution and other subfields of microbiology. proGenomes is available at http://progenomes.embl.de. KW - biology KW - genomic sequence KW - prokaryotic clade KW - proGenomes KW - habitat information KW - taxonomic description KW - genome collection KW - comparative genomics KW - genomics research Y1 - 2017 U6 - http://nbn-resolving.de/urn/resolver.pl?urn:nbn:de:bvb:20-opus-171987 VL - 45 IS - D1 ER - TY - JOUR A1 - Coelho, Luis Pedro A1 - Alves, Renato A1 - Monteiro, Paulo A1 - Huerta-Cepas, Jaime A1 - Freitas, Ana Teresa A1 - Bork, Peer T1 - NG-meta-profiler: fast processing of metagenomes using NGLess, a domain-specific language JF - Microbiome N2 - Background Shotgun metagenomes contain a sample of all the genomic material in an environment, allowing for the characterization of a microbial community. In order to understand these communities, bioinformatics methods are crucial. A common first step in processing metagenomes is to compute abundance estimates of different taxonomic or functional groups from the raw sequencing data. Given the breadth of the field, computational solutions need to be flexible and extensible, enabling the combination of different tools into a larger pipeline. Results We present NGLess and NG-meta-profiler. NGLess is a domain specific language for describing next-generation sequence processing pipelines. It was developed with the goal of enabling user-friendly computational reproducibility. It provides built-in support for many common operations on sequencing data and is extensible with external tools with configuration files. Using this framework, we developed NG-meta-profiler, a fast profiler for metagenomes which performs sequence preprocessing, mapping to bundled databases, filtering of the mapping results, and profiling (taxonomic and functional). It is significantly faster than either MOCAT2 or htseq-count and (as it builds on NGLess) its results are perfectly reproducible. Conclusions NG-meta-profiler is a high-performance solution for metagenomics processing built on NGLess. It can be used as-is to execute standard analyses or serve as the starting point for customization in a perfectly reproducible fashion. NGLess and NG-meta-profiler are open source software (under the liberal MIT license) and can be downloaded from https://ngless.embl.de or installed through bioconda. KW - metagenomics KW - next-generation sequencing KW - domain-specific language Y1 - 2019 U6 - http://nbn-resolving.de/urn/resolver.pl?urn:nbn:de:bvb:20-opus-223161 VL - 7 IS - 84 ER -