TY - JOUR A1 - Koetschan, Christian A1 - Foerster, Frank A1 - Keller, Alexander A1 - Schleicher, Tina A1 - Ruderisch, Benjamin A1 - Schwarz, Roland A1 - Mueller, Tobias A1 - Wolf, Matthias A1 - Schultz, Joerg T1 - The ITS2 Database III-sequences and structures for phylogeny N2 - The internal transcribed spacer 2 (ITS2) is a widely used phylogenetic marker. In the past, it has mainly been used for species level classifications. Nowadays, a wider applicability becomes apparent. Here, the conserved structure of the RNA molecule plays a vital role. We have developed the ITS2 Database (http://its2.bioapps.biozentrum.uni-wuerzburg.de) which holds information about sequence, structure and taxonomic classification of all ITS2 in GenBank. In the new version, we use Hidden Markov models (HMMs) for the identification and delineation of the ITS2 resulting in a major redesign of the annotation pipeline. This allowed the identification of more than 160 000 correct full-length and more than 50 000 partial structures. In the web interface, these can now be searched with a modified BLAST considering both sequence and structure, enabling rapid taxon sampling. Novel sequences can be annotated using the HMM based approach and modelled according to multiple template structures. Sequences can be searched for known and newly identified motifs. Together, the database and the web server build an exhaustive resource for ITS2 based phylogenetic analyses. KW - Biologie Y1 - 2010 U6 - http://nbn-resolving.de/urn/resolver.pl?urn:nbn:de:bvb:20-opus-68390 ER - TY - JOUR A1 - Merget, Benjamin A1 - Koetschan, Christian A1 - Hackl, Thomas A1 - Förster, Frank A1 - Dandekar, Thomas A1 - Müller, Tobias A1 - Schultz, Jörg A1 - Wolf, Matthias T1 - The ITS2 Database JF - Journal of Visualized Experiments N2 - The internal transcribed spacer 2 (ITS2) has been used as a phylogenetic marker for more than two decades. 
As ITS2 research mainly focused on the very variable ITS2 sequence, it confined this marker to low-level phylogenetics only. However, the combination of the ITS2 sequence and its highly conserved secondary structure improves the phylogenetic resolution and allows phylogenetic inference at multiple taxonomic ranks, including species delimitation. The ITS2 Database presents an exhaustive dataset of internal transcribed spacer 2 sequences from NCBI GenBank accurately reannotated. Following an annotation by profile Hidden Markov Models (HMMs), the secondary structure of each sequence is predicted. First, it is tested whether a minimum energy based fold (direct fold) results in a correct, four helix conformation. If this is not the case, the structure is predicted by homology modeling. In homology modeling, an already known secondary structure is transferred to another ITS2 sequence, whose secondary structure was not able to fold correctly in a direct fold. The ITS2 Database is not only a database for storage and retrieval of ITS2 sequence-structures. It also provides several tools to process your own ITS2 sequences, including annotation, structural prediction, motif detection and BLAST search on the combined sequence-structure information. Moreover, it integrates trimmed versions of 4SALE and ProfDistS for multiple sequence-structure alignment calculation and Neighbor Joining tree reconstruction. Together they form a coherent analysis pipeline from an initial set of sequences to a phylogeny based on sequence and secondary structure. In a nutshell, this workbench simplifies first phylogenetic analyses to only a few mouse-clicks, while additionally providing tools and data for comprehensive large-scale analyses. 
KW - homology modeling KW - molecular systematics KW - internal transcribed spacer 2 KW - alignment KW - genetics KW - secondary structure KW - ribosomal RNA KW - phylogenetic tree KW - phylogeny Y1 - 2012 U6 - http://nbn-resolving.de/urn/resolver.pl?urn:nbn:de:bvb:20-opus-124600 VL - 61 IS - e3806 ER - TY - JOUR A1 - Buchheim, Mark A. A1 - Keller, Alexander A1 - Koetschan, Christian A1 - Förster, Frank A1 - Merget, Benjamin A1 - Wolf, Matthias T1 - Internal Transcribed Spacer 2 (nu ITS2 rRNA) Sequence-Structure Phylogenetics: Towards an Automated Reconstruction of the Green Algal Tree of Life JF - PLoS ONE N2 - Background: Chloroplast-encoded genes (matK and rbcL) have been formally proposed for use in DNA barcoding efforts targeting embryophytes. Extending such a protocol to chlorophytan green algae, though, is fraught with problems including non-homology (matK) and heterogeneity that prevents the creation of a universal PCR toolkit (rbcL). Some have advocated the use of the nuclear-encoded, internal transcribed spacer two (ITS2) as an alternative to the traditional chloroplast markers. However, the ITS2 is broadly perceived to be insufficiently conserved or to be confounded by introgression or biparental inheritance patterns, precluding its broad use in phylogenetic reconstruction or as a DNA barcode. A growing body of evidence has shown that simultaneous analysis of nucleotide data with secondary structure information can overcome at least some of the limitations of ITS2. The goal of this investigation was to assess the feasibility of an automated, sequence-structure approach for analysis of ITS2 data from a large sampling of phylum Chlorophyta. Methodology/Principal Findings: Sequences and secondary structures from 591 chlorophycean, 741 trebouxiophycean and 938 ulvophycean algae, all obtained from the ITS2 Database, were aligned using a sequence structure-specific scoring matrix. 
Phylogenetic relationships were reconstructed by Profile Neighbor-Joining coupled with a sequence structure-specific, general time reversible substitution model. Results from analyses of the ITS2 data were robust at multiple nodes and showed considerable congruence with results from published phylogenetic analyses. Conclusions/Significance: Our observations on the power of automated, sequence-structure analyses of ITS2 to reconstruct phylum-level phylogenies of the green algae validate this approach to assessing diversity for large sets of chlorophytan taxa. Moreover, our results indicate that objections to the use of ITS2 for DNA barcoding should be weighed against the utility of an automated, data analysis approach with demonstrated power to reconstruct evolutionary patterns for highly divergent lineages. KW - RBCL Gene-sequences KW - Colonial volvocales chlorophyta KW - 26S RDNA Data KW - Land plants KW - Molecular systematics KW - Secondary structure KW - Nuclear RDNA KW - DNA KW - Barcodes KW - Dasycladales chlorophyta KW - Profile distances Y1 - 2011 U6 - http://nbn-resolving.de/urn/resolver.pl?urn:nbn:de:bvb:20-opus-140866 VL - 6 IS - 2 ER - TY - JOUR A1 - Koetschan, Christian A1 - Kittelmann, Sandra A1 - Lu, Jingli A1 - Al-Halbouni, Djamila A1 - Jarvis, Graeme N. A1 - Müller, Tobias A1 - Wolf, Matthias A1 - Janssen, Peter H. T1 - Internal Transcribed Spacer 1 Secondary Structure Analysis Reveals a Common Core throughout the Anaerobic Fungi (Neocallimastigomycota) JF - PLOS ONE N2 - The internal transcribed spacer (ITS) is a popular barcode marker for fungi and in particular the ITS1 has been widely used for the anaerobic fungi (phylum Neocallimastigomycota). A good number of validated reference sequences of isolates as well as a large number of environmental sequences are available in public databases. 
Its highly variable nature predisposes the ITS1 for low level phylogenetics; however, it complicates the establishment of reproducible alignments and the reconstruction of stable phylogenetic trees at higher taxonomic levels (genus and above). Here, we overcame these problems by proposing a common core secondary structure of the ITS1 of the anaerobic fungi employing a Hidden Markov Model-based ITS1 sequence annotation and a helix-wise folding approach. We integrated the additional structural information into phylogenetic analyses and present for the first time an automated sequence-structure-based taxonomy of the ITS1 of the anaerobic fungi. The methodology developed is transferable to the ITS1 of other fungal groups, and the robust taxonomy will facilitate and improve high-throughput anaerobic fungal community structure analysis of samples from various environments. KW - profile distances KW - ITS2 KW - phylogenetic trees KW - RNA sequence KW - reconstruction KW - diversity KW - populations KW - tool KW - systematics KW - herbivores Y1 - 2014 U6 - http://nbn-resolving.de/urn/resolver.pl?urn:nbn:de:bvb:20-opus-117058 VL - 9 IS - 3 ER - TY - JOUR A1 - Keller, Alexander A1 - Foerster, Frank A1 - Mueller, Tobias A1 - Dandekar, Thomas A1 - Schultz, Joerg A1 - Wolf, Matthias T1 - Including RNA secondary structures improves accuracy and robustness in reconstruction of phylogenetic trees N2 - Background: In several studies, secondary structures of ribosomal genes have been used to improve the quality of phylogenetic reconstructions. An extensive evaluation of the benefits of secondary structure, however, is lacking. Results: This is the first study to counter this deficiency. We inspected the accuracy and robustness of phylogenetics with individual secondary structures by simulation experiments for artificial tree topologies with up to 18 taxa and for divergency levels in the range of typical phylogenetic studies. 
We chose the internal transcribed spacer 2 of the ribosomal cistron as an exemplary marker region. Simulation integrated the coevolution process of sequences with secondary structures. Additionally, the phylogenetic power of marker size duplication was investigated and compared with sequence and sequence-structure reconstruction methods. The results clearly show that accuracy and robustness of Neighbor Joining trees are largely improved by structural information in contrast to sequence only data, whereas a doubled marker size only accounts for robustness. Conclusions: Individual secondary structures of ribosomal RNA sequences provide a valuable gain of information content that is useful for phylogenetics. Thus, the usage of ITS2 sequence together with secondary structure for taxonomic inferences is recommended. Other reconstruction methods such as maximum likelihood, Bayesian inference or maximum parsimony may equally profit from secondary structure inclusion. Reviewers: This article was reviewed by Shamil Sunyaev, Andrea Tanzer (nominated by Frank Eisenhaber) and Eugene V. Koonin. For the full reviews, please go to the Reviewers’ comments section. KW - Phylogenie KW - phylogenetics Y1 - 2010 U6 - http://nbn-resolving.de/urn/resolver.pl?urn:nbn:de:bvb:20-opus-67832 ER - TY - JOUR A1 - Wolf, Matthias T1 - How to teach about what is a species JF - Biology N2 - To ask students what a species is always has something rhetorical about it. Too quickly comes the rote answer, often learned by heart without ever thinking about it: “A species is a reproductive community of populations (reproductively isolated from others), which occupies a specific niche in nature” (Mayr 1982). However, do two people look alike because they are twins or are they twins because they look alike? 
“Two organisms do not belong to the same species because they mate and reproduce, but they only are able to do so because they belong to the same species” (Mahner and Bunge 1997). Unfortunately, most biology (pre-university) teachers have no opinion on whether species are real or conceptual, simply because they have never been taught the question themselves, but rather one answer they still pass on to their students today, learned by heart without ever thinking about it. Species are either real or conceptual and, in my opinion, it is this “or” that we should teach about. Only then can we discuss those fundamental questions such as who or what is selected, who or what evolves and, finally, what is biodiversity and phylogenetics all about? Individuals related to each other by the tree of life. KW - biospecies KW - species as individuals KW - species as natural kinds KW - species concept KW - species problem Y1 - 2021 U6 - http://nbn-resolving.de/urn/resolver.pl?urn:nbn:de:bvb:20-opus-241052 SN - 2079-7737 VL - 10 IS - 6 ER - TY - JOUR A1 - Rybalka, Nataliya A1 - Wolf, Matthias A1 - Andersen, Robert A1 - Friedl, Thomas T1 - Congruence of chloroplast- and nuclear-encoded DNA sequence variations used to assess species boundaries in the soil microalga Heterococcus (Stramenopiles, Xanthophyceae) JF - BMC Evolutionary Biology N2 - Background: Heterococcus is a microalgal genus of Xanthophyceae (Stramenopiles) that is common and widespread in soils, especially from cold regions. Species are characterized by extensively branched filaments produced when grown on agarized culture medium. Despite the large number of species described exclusively using light microscopic morphology, the assessment of species diversity is hampered by extensive morphological plasticity. 
Results: Two independent types of molecular data, the chloroplast-encoded psbA/rbcL spacer complemented by rbcL gene and the internal transcribed spacer 2 of the nuclear rDNA cistron (ITS2), congruently recovered a robust phylogenetic structure. With ITS2 considerable sequence and secondary structure divergence existed among the eight species, but a combined sequence and secondary structure phylogenetic analysis confined to helix II of ITS2 corroborated relationships as inferred from the rbcL gene phylogeny. Intra-genomic divergence of ITS2 sequences was revealed in many strains. The 'monophyletic species concept', appropriate for microalgae without known sexual reproduction, revealed eight different species. Species boundaries established using the molecular-based monophyletic species concept were more conservative than the traditional morphological species concept. Within a species, almost identical chloroplast marker sequences (genotypes) were repeatedly recovered from strains of different origins. At least two species had widespread geographical distributions; however, within a given species, genotypes recovered from Antarctic strains were distinct from those in temperate habitats. Furthermore, the sequence diversity may correspond to adaptation to different types of habitats or climates. Conclusions: We established a method and a reference data base for the unambiguous identification of species of the common soil microalgal genus Heterococcus which uses DNA sequence variation in markers from plastid and nuclear genomes. The molecular data were more reliable and more conservative than morphological data. 
KW - xanthophyceae KW - psbA/rbcL spacer KW - ITS2 KW - tool KW - RBCL KW - alignment KW - evolution KW - chlorophyta KW - RNA secondary structure KW - terrestrial habitats KW - phylogenetic trees KW - mixed models KW - green algae KW - heterococcus KW - systematics KW - molecular phylogeny KW - species concept Y1 - 2013 U6 - http://nbn-resolving.de/urn/resolver.pl?urn:nbn:de:bvb:20-opus-121848 SN - 1471-2148 VL - 13 IS - 39 ER - TY - JOUR A1 - Wolf, Matthias A1 - Chen, Shilin A1 - Song, Jingyuan A1 - Ankenbrand, Markus A1 - Müller, Tobias T1 - Compensatory Base Changes in ITS2 Secondary Structures Correlate with the Biological Species Concept Despite Intragenomic Variability in ITS2 Sequences – A Proof of Concept JF - PLoS ONE N2 - Compensatory base changes (CBCs) in internal transcribed spacer 2 (ITS2) rDNA secondary structures correlate with Ernst Mayr’s biological species concept. This hypothesis also referred to as the CBC species concept recently was subjected to large-scale testing, indicating two distinct probabilities. (1) If there is a CBC then there are two different species with a probability of ~0.93. (2) If there is no CBC then there is the same species with a probability of ~0.76. In ITS2 research, however, the main problem is the multicopy nature of ITS2 sequences. Most recently, 454 pyrosequencing data have been used to characterize more than 5000 intragenomic variations of ITS2 regions from 178 plant species, demonstrating that mutation of ITS2 is frequent, with a mean of 35 variants per species, respectively per individual organism. In this study, using those 454 data, the CBC criterion is reconsidered in the light of intragenomic variability, a proof of concept, a necessary criterion, expecting no intragenomic CBCs in variant ITS2 copies. In accordance with the CBC species concept, we could demonstrate that the probability that there is no intragenomic CBC is ~0.99. 
KW - citrus KW - concerted evolution KW - DNA sequences KW - Genome evolution KW - Phylogenetics KW - plant evolution KW - sequence alignment KW - sequence databases Y1 - 2013 U6 - http://nbn-resolving.de/urn/resolver.pl?urn:nbn:de:bvb:20-opus-96450 ER - TY - JOUR A1 - Rackevei, Antonia S. A1 - Borges, Alyssa A1 - Engstler, Markus A1 - Dandekar, Thomas A1 - Wolf, Matthias T1 - About the analysis of 18S rDNA sequence data from trypanosomes in barcoding and phylogenetics: tracing a continuation error occurring in the literature JF - Biology N2 - The variable regions (V1–V9) of the 18S rDNA are routinely used in barcoding and phylogenetics. In handling these data for trypanosomes, we have noticed a misunderstanding that has apparently taken a life of its own in the literature over the years. In particular, in recent years, when studying the phylogenetic relationship of trypanosomes, the use of V7/V8 was systematically established. However, considering the current numbering system for all other organisms (including other Euglenozoa), V7/V8 was never used. In Maia da Silva et al. [Parasitology 2004, 129, 549–561], V7/V8 was promoted for the first time for trypanosome phylogenetics, and since then, more than 70 publications have replicated this nomenclature and even discussed the benefits of the use of this region in comparison to V4. However, the primers used to amplify the variable region of trypanosomes have actually amplified V4 (concerning the current 18S rDNA numbering system). KW - RNA secondary structure KW - variable regions KW - V1–V9 KW - V4 KW - V7/V8 KW - Trypanosoma Y1 - 2022 U6 - http://nbn-resolving.de/urn/resolver.pl?urn:nbn:de:bvb:20-opus-297562 SN - 2079-7737 VL - 11 IS - 11 ER - TY - JOUR A1 - Merget, Benjamin A1 - Wolf, Matthias T1 - A molecular phylogeny of Hypnales (Bryophyta) inferred from ITS2 sequence-structure data N2 - Background: Hypnales comprise over 50% of all pleurocarpous mosses. 
They provide a young radiation complicating phylogenetic analyses. To resolve the hypnalean phylogeny, it is necessary to use a phylogenetic marker providing highly variable features to resolve species on the one hand and conserved features enabling a backbone analysis on the other. Therefore we used highly variable internal transcribed spacer 2 (ITS2) sequences and conserved secondary structures, as deposited with the ITS2 Database, simultaneously. Findings: We built an accurate and in parts robustly resolved large scale phylogeny for 1,634 currently available hypnalean ITS2 sequence-structure pairs. Conclusions: Profile Neighbor-Joining revealed a possible hypnalean backbone, indicating that most of the hypnalean taxa classified as different moss families are polyphyletic assemblages awaiting taxonomic changes. KW - Moose KW - Hypnales KW - Bryophyta Y1 - 2010 U6 - http://nbn-resolving.de/urn/resolver.pl?urn:nbn:de:bvb:20-opus-67997 ER - TY - JOUR A1 - Rackevei, Antonia S. A1 - Karnkowska, Anna A1 - Wolf, Matthias T1 - 18S rDNA sequence–structure phylogeny of the Euglenophyceae (Euglenozoa, Euglenida) JF - Journal of Eukaryotic Microbiology N2 - The phylogeny of Euglenophyceae (Euglenozoa, Euglenida) has been discussed for decades with new genera being described in the last few years. In this study, we reconstruct a phylogeny using 18S rDNA sequence and structural data simultaneously. Using homology modeling, individual secondary structures were predicted. Sequence–structure data are encoded and automatically aligned. Here, we present a sequence–structure neighbor‐joining tree of more than 300 taxa classified as Euglenophyceae. Profile neighbor‐joining was used to resolve the basal branching pattern. Neighbor‐joining, maximum parsimony, and maximum likelihood analyses were performed using sequence–structure information for manually chosen subsets. 
All analyses supported the monophyly of Eutreptiella, Discoplastis, Lepocinclis, Strombomonas, Cryptoglena, Monomorphina, Euglenaria, and Colacium. Well‐supported topologies were generally consistent with previous studies using a combined dataset of genetic markers. Our study supports the simultaneous use of sequence and structural data to reconstruct more accurate and robust trees. The average bootstrap value is significantly higher than the average bootstrap value obtained from sequence‐only analyses, which is promising for resolving relationships between more closely related taxa. KW - euglena KW - euglenids KW - phylogenetics KW - secondary structure Y1 - 2023 U6 - http://nbn-resolving.de/urn/resolver.pl?urn:nbn:de:bvb:20-opus-311896 VL - 70 IS - 2 ER - TY - JOUR A1 - Plieger, Tanja A1 - Wolf, Matthias T1 - 18S and ITS2 rDNA sequence-structure phylogeny of Prototheca (Chlorophyta, Trebouxiophyceae) JF - Biologia N2 - Protothecosis is an infectious disease caused by organisms currently classified within the green algal genus Prototheca. The disease can manifest as cutaneous lesions, olecranon bursitis or disseminated or systemic infections in both immunocompetent and immunosuppressed patients. Concerning diagnostics, taxonomic validity is important. Prototheca, closely related to the Chlorella species complex, is known to be polyphyletic, branching with Auxenochlorella and Helicosporidium. The phylogeny of Prototheca was discussed and revisited several times in the last decade; new species have been described. Phylogenetic analyses were performed using ribosomal DNA (rDNA) and partial mitochondrial cytochrome b (cytb) sequence data. In this work we use Internal Transcribed Spacer 2 (ITS2) as well as 18S rDNA data. However, for the first time, we reconstruct phylogenetic relationships of Prototheca using primary sequence and RNA secondary structure information simultaneously, a concept shown to increase robustness and accuracy of phylogenetic tree estimation. 
Using encoded sequence-structure data, Neighbor-Joining, Maximum-Parsimony and Maximum-Likelihood methods yielded well-supported trees that agree with other trees calculated on rDNA but differ in several aspects from trees using cytb as a phylogenetic marker. ITS2 secondary structures of Prototheca sequences are in agreement with the well-known common core structure of eukaryotes but show unusual differences in their helix lengths. An elongation of the fourth helix of some species seems to have occurred independently in the course of evolution. KW - secondary structure KW - 18S KW - ITS2 KW - phylogeny KW - prototheca Y1 - 2022 U6 - http://nbn-resolving.de/urn/resolver.pl?urn:nbn:de:bvb:20-opus-269897 SN - 1336-9563 VL - 77 IS - 2 ER -