Biodiversity may be investigated and explored by means of genetic sequence information and molecular phylogenetics. With ribosomal genes, however, information for phylogenetic studies can be obtained not only from the primary sequence but also from the secondary structure. Software that can handle such two-dimensional data and is designed to answer taxonomic questions has recently been developed and published as a new scientific pipeline. This thesis expands the pipeline with a tool that facilitates the annotation of a ribosomal region, namely the ITS2. We were also able to show that this annotation is a crucial step for secondary structure phylogenetics and for data allocation in the ITS2 database. The resulting freely available tool produces high-quality annotations. In a further study, the complete phylogenetic pipeline was evaluated on a theoretical basis in a comprehensive simulation study. We were able to show that both the accuracy and the robustness of phylogenetic trees are substantially improved by the approach. The second major part of this thesis concentrates on case studies that applied the pipeline to resolve questions in taxonomy and ecology. We determined several independent phylogenies within the green algae that further corroborate, now from a biological perspective, the idea that secondary structures improve the obtainable phylogenetic signal. The approach was applicable in studies at the species and genus level and, owing to the conservation of the secondary structure, also in investigations at deeper taxonomic levels. An additional case study with blue butterflies indicates that the approach is not restricted to plants but may also be used for metazoan phylogenies. The importance of high-quality phylogenetic trees is illustrated by two ecological studies.
By integrating secondary structure phylogenetics, we were able to answer questions about the evolution of ant-plant interactions and of communities of bacteria residing on different plant tissues. Finally, we speculate on how phylogenetic methods with RNA may be further enhanced by integrating the third dimension. Although this idea is speculative and supported only by a small phylogenetic example, it shows that the potential of structural phylogenetics has not yet been fully exploited. Altogether, this thesis spans several biological disciplines: evolutionary biology and biodiversity research, community and invasion ecology, and molecular and structural biology. It is complemented by statistical approaches and software development. Bioinformatics is the central link that combines these research areas into one comprehensive thesis.
Background: In several studies, secondary structures of ribosomal genes have been used to improve the quality of phylogenetic reconstructions. An extensive evaluation of the benefits of secondary structure, however, is lacking. Results: This is the first study to address this deficiency. We examined the accuracy and robustness of phylogenetics with individual secondary structures in simulation experiments for artificial tree topologies with up to 18 taxa and for divergence levels in the range of typical phylogenetic studies. We chose the internal transcribed spacer 2 (ITS2) of the ribosomal cistron as an exemplary marker region. The simulation integrated the coevolution of sequences with secondary structures. Additionally, the phylogenetic power of doubling the marker size was investigated and compared with sequence and sequence-structure reconstruction methods. The results clearly show that the accuracy and robustness of Neighbor Joining trees are substantially improved by structural information compared with sequence-only data, whereas a doubled marker size improves only robustness. Conclusions: Individual secondary structures of ribosomal RNA sequences provide a valuable gain in information content that is useful for phylogenetics. Thus, the use of the ITS2 sequence together with its secondary structure for taxonomic inferences is recommended. Other reconstruction methods such as maximum likelihood, Bayesian inference or maximum parsimony may equally profit from the inclusion of secondary structure. Reviewers: This article was reviewed by Shamil Sunyaev, Andrea Tanzer (nominated by Frank Eisenhaber) and Eugene V. Koonin. For the full reviews, please go to the Reviewers’ comments section.
In recent years, the internal transcribed spacer 2 (ITS2) has become established as a commonly used molecular phylogenetic marker for the eukaryotes. Its fast-evolving sequence is well suited for low-level phylogenetics. However, the ITS2 also has a highly conserved secondary structure, which enables discrimination between more distantly related species. Combining both in a sequence-structure based analysis increases the resolution of the marker and enables even more robust tree reconstructions over a broader taxonomic range. Performing such an analysis, however, required the application of different programs and databases, making the use of the ITS2 non-trivial for the typical biologist. To overcome this hindrance, I have developed the ITS2 Workbench, a completely web-based tool for automated phylogenetic sequence-structure analyses using the ITS2 (http://its2.bioapps.biozentrum.uni-wuerzburg.de). The development started with an optimization of length modelling topologies for Hidden Markov Models (HMMs), which were successfully applied to a secondary structure prediction model of the ITS2 marker. Here, structure is predicted by considering the composition of the sequences in combination with the length distribution of different helical regions. Next, I integrated HMMs into the sequence-structure generation process for the delineation of the ITS2 within a given sequence. This re-implemented pipeline more than doubled the number of structure predictions and reduced the runtime to a few days. Together with further optimizations of the homology modelling process, I can now exhaustively predict secondary structures in several iterations. These modifications currently provide 380,000 annotated sequences including 288,000 structure predictions. To include these structures in the calculation of alignments and phylogenetic trees, I developed the R package "treeforge". It generates sequence-structure alignments on up to four different coding alphabets. 
For the first time, structural bonds were also considered in alignments, which required the estimation of new scoring matrices. Now, the reconstruction of Maximum Parsimony, Maximum Likelihood and Neighbour Joining trees on all four alphabets requires just a few lines of code. The package was used to resolve the controversial chlorophycean dataset and could be integrated into future versions of the ITS2 Workbench. The platform is based on a modern, feature-rich Web 2.0 user interface equipped with the latest AJAX and Web-service technologies. It performs HMM-based sequence annotation, structure prediction by energy minimization or homology modelling, alignment calculation and tree reconstruction on a flexible data pool that repeats calculations when the data change. Further, it provides sequence motif detection to check annotation and structure prediction, and a sequence-structure based BLAST search, which facilitates the taxon sampling process. All features and the usage of the ITS2 Workbench are explained in a video tutorial. However, the Workbench has some limitations regarding the size of datasets, mainly because of the immense computational power needed for such extensive calculations. To demonstrate the validity of the approach for large-scale analyses as well, a fully automated reconstruction of the Chlorophyta (green algal) Tree of Life was performed. The successful application of the marker even to large datasets underlines the capabilities of ITS2 sequence-structure analysis and suggests its use on further datasets. The ITS2 Workbench provides an excellent starting point for such endeavours.
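The idea of a sequence-structure "coding alphabet" mentioned above can be illustrated with a minimal Python sketch. It assumes one common scheme in which each of the four nucleotides is combined with one of three structural states from dot-bracket notation (unpaired, opening, closing), giving 12 symbols. The abstract does not state which alphabets treeforge actually uses, so the symbol table and the function names below are hypothetical and serve only as an illustration.

```python
# Minimal sketch of a 12-symbol sequence-structure alphabet.
# ASSUMPTION: 4 nucleotides x 3 dot-bracket states; the letter codes
# A..L are arbitrary and are not taken from treeforge.

NUCLEOTIDES = "ACGU"
STATES = ".()"            # unpaired, opening pair, closing pair
ALPHABET = "ABCDEFGHIJKL"  # one letter per (nucleotide, state) pair


def encode_sequence_structure(sequence: str, structure: str) -> str:
    """Combine an RNA sequence and its dot-bracket structure into one string."""
    seq = sequence.upper().replace("T", "U")  # accept DNA input as well
    if len(seq) != len(structure):
        raise ValueError("sequence and structure must have equal length")
    encoded = []
    for nucleotide, state in zip(seq, structure):
        if nucleotide not in NUCLEOTIDES or state not in STATES:
            raise ValueError(f"unsupported symbol: {nucleotide!r}/{state!r}")
        index = NUCLEOTIDES.index(nucleotide) * len(STATES) + STATES.index(state)
        encoded.append(ALPHABET[index])
    return "".join(encoded)


def decode_sequence_structure(encoded: str) -> tuple[str, str]:
    """Recover the sequence and the dot-bracket string from the encoding."""
    sequence, structure = [], []
    for symbol in encoded:
        index = ALPHABET.index(symbol)  # raises ValueError on unknown symbols
        sequence.append(NUCLEOTIDES[index // len(STATES)])
        structure.append(STATES[index % len(STATES)])
    return "".join(sequence), "".join(structure)


if __name__ == "__main__":
    print(encode_sequence_structure("GCAUGC", "((..))"))  # HEAJIF
```

Once both layers are merged into a single string per taxon, standard alignment and tree-building methods can operate on the combined symbols, provided that scoring matrices are estimated for the enlarged alphabet.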
The internal transcribed spacer 2 (ITS2) of the ribosomal gene repeat is an increasingly important phylogenetic marker whose RNA secondary structure is widely conserved across eukaryotic organisms. The ITS2 database aims to be a comprehensive resource on ITS2 sequence and secondary structure, based on direct thermodynamic as well as homology-modelled RNA folds. Results: (a) A rebuild of the original ITS2 database generation scripts, applied to a current NCBI dataset, reveals more than 60,000 ITS2 structures. This more than doubles the contents of the original database and triples it when partial structures are included. (b) The end-user interface was rewritten and extended and now features user-defined homology modelling. (c) Other possible RNA structure discovery methods (namely suboptimal and shape folding) prove helpful but cannot replace homology modelling. (d) A use case of the ITS2 database in conjunction with other tools developed at the department gave insight into molecular phylogenetic analysis with ITS2.
The genus Pogonomyrmex is well suited for analyzing the evolution of ant colony characteristics in general and the sociogenetic structure in particular, owing to the well-known biology of several species and the diversity of mating frequency and queen number. This variation in the sociogenetic structure of colonies produces a high variance in intracolonial relatedness, which can be a major component driving the evolution of various colony characteristics. To determine the variability of intracolonial relatedness in the genus Pogonomyrmex precisely, both the number of matrilines and the number of patrilines were analyzed in selected members of Pogonomyrmex, namely P. (sensu stricto) rugosus, P. (sensu stricto) badius and P. (Ephebomyrmex) pima, using DNA fingerprint techniques. We attempted to explain the evolution of these colony characteristics within a phylogenetic framework. For that purpose we constructed a gene tree of 39 species of the genus Pogonomyrmex. The taxon sampling covered about 83 % of the North American species and 43 % of the South American species. Effective multiple mating of queens was confirmed for P. rugosus (me=4.1) and P. badius (me=6.7). Additionally, both species are monogynous. These results corroborate behavioral observations of multiple mating for these species. Multiple mating is now known from 9 Pogonomyrmex species (behavioral evidence for 3 species; genetic evidence for 6 species). However, in P. (E.) pima all queens that were analyzed were singly mated (me=1.0). Therefore, multiple mating either evolved early during the evolution of the genus Pogonomyrmex and was subsequently lost in the subgenus Ephebomyrmex (plesiomorphic hypothesis), or it first evolved in the subgenus Pogonomyrmex sensu stricto (apomorphic hypothesis). In P. huachucanus, a species basal to the North American sensu stricto complex, a smaller effective mating number of queens than in its sensu stricto relatives (J. Gadau and C.-P. 
Strehl, unpublished) probably mirrors a change from monandry to polyandry during the evolution of more advanced sensu stricto species, which would support the apomorphic hypothesis. The intracolonial relatedness in P. (E.) pima is, however, rather low. This is probably the result of multiple reproducing queens (polygyny). Polygyny is also documented for at least four other species of the subgenus Ephebomyrmex, but so far P. (E.) pima is the only species with genetic evidence. There may have been an evolutionary trade-off within the subgenus Ephebomyrmex between polyandry and polygyny. Therefore, both subgenera retained a high intracolonial genetic diversity. This high genetic diversity might be one cause of the success and radiation of the genus Pogonomyrmex in arid environments. Evolution might have favored high genetic diversity in Pogonomyrmex colonies because it helps colonies to improve their colonial organization and their efficiency in performing external tasks. At least in P. badius, a link between patrilines and physical polyethism was found, indicating an improvement of colonial organization via polyandry. Furthermore, the documented extreme levels of polyandry might help P. badius females to reduce the risk of inbreeding due to restricted dispersal. Restricted dispersal is also found in P. (E.) pima because of its wingless, intermorphic queens. However, in P. (E.) pima inbreeding is probably prevented by outcrossing via males, because no significant inbreeding was found. In the presented gene trees the subgenus Pogonomyrmex Ephebomyrmex was separated from the subgenus Pogonomyrmex sensu stricto. Therefore, P. Ephebomyrmex might be elevated to generic status, also because of its distinct morphological and life history characters. Nevertheless, a precise taxonomic revision requires a broader complement of species. A low number of unrelated workers was regularly found in P. 
rugosus colonies; these workers probably stem from brood raids between mature and founding colonies. It is well known that most founding colonies are destroyed by neighboring conspecific mature colonies, but until now it has been assumed that the brood of these colonies is also destroyed. This often neglected aspect might be an important fitness component for mature colonies.
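The effective mating numbers (me) reported in this abstract can be related to intracolonial relatedness with a short calculation. The Python sketch below assumes the simple, uncorrected estimator me = 1 / sum(p_i^2), where p_i is the proportion of offspring sired by male i, and the standard haplodiploid expectation for workers of a single-queen colony, r = 0.25 + 0.5 / me. The abstract does not state which estimator was used (a sample-size-corrected variant is common in practice), and the function names are hypothetical.

```python
# Minimal sketch: effective mating number and expected worker relatedness.
# ASSUMPTIONS: uncorrected estimator (no sample-size correction) and a
# monogynous, outbred, haplodiploid colony. Not the thesis's own procedure.


def effective_mating_number(offspring_per_patriline: list[int]) -> float:
    """Return me = 1 / sum(p_i^2) from offspring counts per father."""
    total = sum(offspring_per_patriline)
    if total <= 0:
        raise ValueError("at least one genotyped offspring is required")
    sum_of_squares = sum((count / total) ** 2 for count in offspring_per_patriline)
    return 1.0 / sum_of_squares


def expected_worker_relatedness(m_e: float) -> float:
    """Expected relatedness among workers of one queen mated m_e times.

    Full sisters share r = 0.75 and half sisters r = 0.25, which gives
    r = 0.25 + 0.5 / m_e for an effective mating number m_e.
    """
    if m_e < 1.0:
        raise ValueError("effective mating number cannot be below 1")
    return 0.25 + 0.5 / m_e


if __name__ == "__main__":
    # Effective mating numbers taken from the abstract above.
    for species, m_e in [("P. rugosus", 4.1), ("P. badius", 6.7), ("P. (E.) pima", 1.0)]:
        print(species, round(expected_worker_relatedness(m_e), 3))
```

Under these assumptions a singly mated queen (me=1.0) yields workers related by 0.75, so the low relatedness reported for P. (E.) pima cannot be explained by mating frequency alone, which is consistent with the polygyny interpretation given in the abstract.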