Computer science approaches (software, databases, management systems) are powerful tools to boost research. Here they are applied to metabolic modelling in infections as well as to health care management. Starting from a comparative analysis, this thesis presents our own steps and examples towards improving metabolic modelling software and health data management. In section 2, new experimental data on metabolites and enzymes motivate a strong interest in metabolic modelling, including metabolic flux calculations. Data analysis of metabolites and the calculation of metabolic fluxes, pathways and their condition-specific strengths are now possible through an advantageous combination of specific software. How can available software for metabolic modelling be improved from a computational point of view? A number of available and well-established software solutions are first discussed individually. This includes information on software origin, capabilities, development and the methodology used. Performance information is obtained for the compared software using the provided example data sets. A feature-based comparison shows the limitations and advantages of the compared software for specific tasks in metabolic modelling. Frequently found limitations include dependence on third-party software, no comprehensive database management and no standard format for data input and output. Graphical visualization can be improved for complex data and in the web-based graphical interface. Other areas for development are platform independence, product line architecture, data standardization, the open source movement and new methodologies. The comparison clearly shows room for further software development, including steps towards an optimal, user-friendly graphical user interface, platform independence, a database management system and third-party independence, especially in the case of desktop applications.
The limitations found are not restricted to the software compared and are, of course, also being actively tackled in some of the most recent developments. Other improvements should aim at generality and standard data input formats, and at improved visualization of not only the input data set but also the analysed results. We hope that, with the implementation of these suggestions, metabolic software applications will become more professional, cheaper, more reliable and more attractive for the user. Nevertheless, keeping these inherent limitations in mind, we are confident that the tools compared can be recommended for metabolic modelling, for instance to model metabolic fluxes in bacteria or for metabolic data analysis and studies in infection biology. ...
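As a rough illustration of what a metabolic flux calculation has to satisfy, the sketch below checks the usual steady-state condition S·v = 0 for a toy pathway. The abstract does not say which method the compared tools use; the network, the matrix `S` and the function names here are hypothetical and chosen only for illustration.

```python
def net_production(stoichiometry, fluxes):
    """Return S·v: the net production rate of each internal metabolite."""
    for row in stoichiometry:
        if len(row) != len(fluxes):
            raise ValueError("each row needs one coefficient per reaction")
    return [sum(c * v for c, v in zip(row, fluxes)) for row in stoichiometry]


def is_steady_state(stoichiometry, fluxes, tol=1e-9):
    """True if no internal metabolite accumulates or is depleted."""
    return all(abs(rate) <= tol for rate in net_production(stoichiometry, fluxes))


# Hypothetical linear pathway: uptake (v1) -> A -> (v2) -> B -> export (v3)
S = [
    [1, -1, 0],  # metabolite A: produced by v1, consumed by v2
    [0, 1, -1],  # metabolite B: produced by v2, consumed by v3
]

balanced = is_steady_state(S, [2.0, 2.0, 2.0])    # all fluxes equal
unbalanced = is_steady_state(S, [2.0, 1.0, 1.0])  # A accumulates
```

In this toy case any flux vector with equal entries is balanced; real networks have many more reactions, and dedicated software is needed to enumerate or optimise the admissible flux distributions.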
The phylum Tardigrada consists of about 1000 described species to date. The animals live in habitats within marine, freshwater and terrestrial ecosystems all over the world. Tardigrades are polyextremophiles: they are capable of resisting extreme temperature, pressure or radiation. In the event of desiccation, tardigrades enter a so-called tun stage. The reason for their great tolerance of extreme environmental conditions has not yet been discovered. Our Funcrypta project aims to find out which mechanisms underlie these adaptation capabilities, particularly with regard to the species Milnesium tardigradum. The first part of this thesis describes the establishment of expressed sequence tag (EST) libraries for different stages of M. tardigradum. From proteomics data we bioinformatically identified 144 proteins with a known function and an additional 36 proteins which seemed to be specific to M. tardigradum. The generation of a comprehensive web-based database allows us to merge the proteome and transcriptome data. To this end, we created an annotation pipeline for the functional annotation of the protein and nucleotide sequences. Additionally, we clustered the obtained proteome dataset and identified some tardigrade-specific proteins (TSPs) which did not show homology to known proteins. Moreover, we examined the heat shock proteins of M. tardigradum and their different expression levels depending on the actual state of the animals. In further bioinformatic analyses of the whole data set, we discovered promising proteins and pathways which are described as correlated with stress tolerance, e.g. late embryogenesis abundant (LEA) proteins. In addition, we compared the tardigrades with nematodes, rotifers, yeast and man to identify shared and tardigrade-specific stress pathways.
An analysis of the 5′ and 3′ untranslated regions (UTRs) demonstrates a strong usage of stabilising motifs like the 15-lipoxygenase differentiation control element (15-LOX-DICE) but also reveals a lack of other commonly used UTR motifs, e.g. AU-rich elements. The second part of this thesis focuses on the relatedness between several cryptic species within the tardigrade genus Paramacrobiotus. To this end, we used, for the first time in tardigrades, the sequence-structure information of the internal transcribed spacer 2 (ITS2) as a phylogenetic marker. This allowed the description of three new species which were indistinguishable using morphological characters or common molecular markers like the 18S ribosomal ribonucleic acid (rRNA) or the cytochrome c oxidase subunit I (COI). In a large in silico simulation study we also succeeded in showing the benefit for phylogenetic tree reconstruction of adding structure information to the ITS2 sequence. Besides the genus Paramacrobiotus, we used the ITS2 to corroborate a monophyletic DO-group (Sphaeropleales) within the Chlorophyceae. Additionally, we redesigned another comprehensive database, the ITS2 database, resulting in a doubled number of sequence-structure pairs of the ITS2. In conclusion, this thesis provides first insights (6 first-author publications and 4 co-author publications) into the reasons for the enormous adaptation capabilities of tardigrades and offers a solution to the debate on the phylogenetic relatedness within the tardigrade genus Paramacrobiotus.
In recent years, the internal transcribed spacer 2 (ITS2) has been established as a commonly used molecular phylogenetic marker for the eukaryotes. Its fast-evolving sequence is predestined for use in low-level phylogenetics. However, the ITS2 also exhibits a highly conserved secondary structure. This enables discrimination between more distantly related species. The combination of both in a sequence-structure based analysis increases the resolution of the marker and enables even more robust tree reconstructions over a broader taxonomic range. However, performing such an analysis required the application of different programs and databases, making the use of the ITS2 non-trivial for the typical biologist. To overcome this hindrance, I have developed the ITS2 Workbench, a completely web-based tool for automated phylogenetic sequence-structure analyses using the ITS2 (http://its2.bioapps.biozentrum.uni-wuerzburg.de). The development started with an optimization of length modelling topologies for Hidden Markov Models (HMMs), which were successfully applied to a secondary structure prediction model of the ITS2 marker. Here, structure is predicted by considering the sequences' composition in combination with the length distribution of different helical regions. Next, I integrated HMMs into the sequence-structure generation process for the delineation of the ITS2 within a given sequence. This re-implemented pipeline could more than double the number of structure predictions and reduce the runtime to a few days. Together with further optimizations of the homology modelling process, I can now exhaustively predict secondary structures in several iterations. These modifications currently provide 380,000 annotated sequences including 288,000 structure predictions. To include these structures in the calculation of alignments and phylogenetic trees, I developed the R package "treeforge". It generates sequence-structure alignments on up to four different coding alphabets.
For the first time, structural bonds were also considered in alignments, which required the estimation of new scoring matrices. Now, the reconstruction of Maximum Parsimony, Maximum Likelihood and Neighbour Joining trees on all four alphabets requires just a few lines of code. The package was used to resolve the controversial chlorophycean dataset and could be integrated into future versions of the ITS2 workbench. The platform is based on a modern, feature-rich Web 2.0 user interface equipped with the latest AJAX and web-service technologies. It performs HMM-based sequence annotation, structure prediction by energy minimization or homology modelling, alignment calculation and tree reconstruction on a flexible data pool that repeats calculations according to data changes. Further, it provides sequence motif detection to control annotation and structure prediction, and a sequence-structure based BLAST search, which facilitates the taxon sampling process. All features and the usage of the ITS2 workbench are explained in a video tutorial. However, the workbench has some limitations regarding the size of datasets. This is mainly due to the immense computational power needed for such extensive calculations. To demonstrate the validity of the approach for large-scale analyses as well, a fully automated reconstruction of the Chlorophyta (green algal) Tree of Life was performed. The successful application of the marker even to large datasets underlines the capabilities of ITS2 sequence-structure analysis and suggests its use on further datasets. The ITS2 workbench provides an excellent starting point for such endeavours.
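To make the idea of a sequence-structure coding alphabet more concrete, the following sketch merges an RNA sequence with its dot-bracket secondary structure into a single string over 12 symbols (4 nucleotides × 3 structural states). This is only one possible encoding; the abstract does not specify the four alphabets used by treeforge, and the letter assignment and the function name `encode` are assumptions made for illustration.

```python
from itertools import product

NUCLEOTIDES = "ACGU"
STATES = ".()"  # unpaired, opening partner, closing partner

# Hypothetical 12-symbol alphabet: one letter per (nucleotide, state) pair,
# assigned in the order A. A( A) C. C( C) G. G( G) U. U( U).
ALPHABET = dict(zip(product(NUCLEOTIDES, STATES), "ABCDEFGHIJKL"))


def encode(sequence, structure):
    """Merge a sequence and its dot-bracket structure into one coded string."""
    sequence = sequence.upper().replace("T", "U")  # accept DNA spelling
    if len(sequence) != len(structure):
        raise ValueError("sequence and structure must have equal length")
    return "".join(ALPHABET[pair] for pair in zip(sequence, structure))


coded = encode("GCAUGC", "((..))")
```

Once both layers are merged like this, any alignment or tree method that works on a custom alphabet can use the structure, provided a scoring matrix for the combined symbols is available.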
The internal transcribed spacer 2 (ITS2) of the ribosomal gene repeat is an increasingly important phylogenetic marker whose RNA secondary structure is widely conserved across eukaryotic organisms. The ITS2 database aims to be a comprehensive resource on ITS2 sequence and secondary structure, based on direct thermodynamic as well as homology-modelled RNA folds. Results: (a) A rebuild of the original ITS2 database generation scripts, applied to a current NCBI dataset, reveals more than 60,000 ITS2 structures. This more than doubles the contents of the original database, and triples it when partial structures are included. (b) The end-user interface was rewritten and extended, and now features user-defined homology modelling. (c) Other possible RNA structure discovery methods (namely suboptimal and shape folding) prove helpful but are not able to replace homology modelling. (d) A use case of the ITS2 database in conjunction with other tools developed at the department gave insight into molecular phylogenetic analysis with ITS2.
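The principle behind homology modelling of RNA structure can be sketched as transferring base pairs from a template with a known fold onto an aligned target sequence. The code below is a simplified illustration under stated assumptions, not the procedure of the ITS2 database: it takes a pairwise alignment with `-` as the gap character, keeps a pair only if both partners are aligned and form a Watson-Crick or G-U pair in the target, and all function names are hypothetical.

```python
CANONICAL = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def pair_table(structure):
    """Map each opening-bracket index to its closing-bracket index."""
    stack, pairs = [], {}
    for i, ch in enumerate(structure):
        if ch == "(":
            stack.append(i)
        elif ch == ")":
            if not stack:
                raise ValueError("unbalanced structure")
            pairs[stack.pop()] = i
    if stack:
        raise ValueError("unbalanced structure")
    return pairs


def transfer_structure(template_aln, template_struct, target_aln):
    """Copy template base pairs onto the target through a pairwise alignment.

    Returns the target dot-bracket string and the fraction of template
    pairs that could be transferred.
    """
    if len(template_aln) != len(target_aln):
        raise ValueError("alignment rows must have equal length")
    # Map ungapped template positions to ungapped target positions.
    mapping, t_pos, q_pos = {}, 0, 0
    for t_ch, q_ch in zip(template_aln, target_aln):
        if t_ch != "-" and q_ch != "-":
            mapping[t_pos] = q_pos
        t_pos += t_ch != "-"
        q_pos += q_ch != "-"
    if t_pos != len(template_struct):
        raise ValueError("structure length must match ungapped template")
    target = target_aln.replace("-", "")
    result = ["."] * len(target)
    pairs = pair_table(template_struct)
    kept = 0
    for i, j in pairs.items():
        qi, qj = mapping.get(i), mapping.get(j)
        if qi is None or qj is None:
            continue  # one partner is aligned to a gap
        if (target[qi], target[qj]) in CANONICAL:
            result[qi], result[qj] = "(", ")"
            kept += 1
    fraction = kept / len(pairs) if pairs else 0.0
    return "".join(result), fraction


full = transfer_structure("GGGAAACCC", "(((...)))", "GGCAAAGCU")
partial = transfer_structure("GGGAAACCC", "(((...)))", "GG-AAAACC")
```

The returned fraction gives a simple quality measure: a pipeline could accept a modelled structure only when enough of the template's pairs carry over, and fall back to direct folding otherwise.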