TY  - THES
A1  - Zeeshan [geb. Majeed], Saman
T1  - Implementation of Bioinformatics Methods for miRNA and Metabolic Modelling
T1  - Die Umsetzung der Bioinformatik-Methoden für miRNA-und der Metabolischen Modellierung
N2  - Dynamic interactions and their changes are at the forefront of current research in bioinformatics and systems biology. This thesis focusses on two particular dynamic aspects of cellular adaptation: miRNA and metabolites.
miRNAs have an established role in hematopoiesis and megakaryocytopoiesis, and platelet miRNAs have potential as tools for understanding basic mechanisms of platelet function. The thesis highlights the possible role of miRNAs in regulating protein translation during the platelet lifespan, with relevance to platelet apoptosis, and identifies the pathways involved and potential key regulatory molecules. Furthermore, corresponding miRNA/target mRNAs in murine platelets are identified. Moreover, key miRNAs involved in aortic aneurysm are predicted by similar techniques. The clinical relevance of miRNAs as biomarkers, as targets for later translational therapeutics, and as tissue-specific restrictors of gene expression in cardiovascular diseases is also discussed.
In the second part of the thesis, we highlight the importance of scientific software solution development in metabolic modelling and how it can help bioinformatics tool development, along with a software feature analysis such as that performed on metabolic flux analysis applications. We proposed the “Butterfly” approach to implement scientific software programming efficiently. Using this approach, software applications were developed for quantitative Metabolic Flux Analysis and efficient Mass Isotopomer Distribution Analysis (MIDA) in metabolic modelling as well as for data management. “LS-MIDA” allows easy and efficient MIDA analysis and, with a more powerful algorithm and database, the software “Isotopo” allows efficient analysis of metabolic flows, for instance in pathogenic bacteria (Salmonella, Listeria). All three approaches have been published (see Appendices).
N2  - Dynamische Wechselwirkungen und deren Veränderungen sind wichtige Themen der aktuellen Forschung in Bioinformatik und Systembiologie. Diese Promotionsarbeit konzentriert sich auf zwei besonders dynamische Aspekte der zellulären Anpassung: miRNA und Metabolite.
miRNAs spielen eine wichtige Rolle in der Hämatopoese und Megakaryozytopoese, und die Thrombozyten miRNAs helfen uns, grundlegende Mechanismen der Thrombozytenfunktion besser zu verstehen. 
Die Arbeit analysiert die potentielle Rolle von miRNAs bei der Proteintranslation, der Thrombozytenlebensdauer sowie der Apoptose von Thrombozyten und ermöglichte die Identifizierung von beteiligten Signalwegen und möglicher regulatorischer Schlüsselmoleküle. Darüber hinaus wurden entsprechende miRNA / Ziel-mRNAs in murinen Thrombozyten systematisch gesammelt. Zudem wurden wichtige miRNAs, die am Aortenaneurysma beteiligt sein könnten, durch ähnliche Techniken vorhergesagt. Die klinische Relevanz von miRNAs als Biomarker, und resultierende potentielle Therapeutika, etwa über eine gewebsspezifische Beeinflussung der Genexpression bei Herz-Kreislauf Erkrankungen wird ebenfalls diskutiert. 
In einem zweiten Teil der Dissertation wird die Bedeutung der Entwicklung wissenschaftlicher Softwarelösungen für die Stoffwechselmodellierung aufgezeigt; mit einer Software-Feature-Analyse wurden verschiedene Softwarelösungen in der Bioinformatik verglichen. Wir schlugen dann den "Butterfly"-Ansatz vor, um effiziente wissenschaftliche Software-Programmierung zu implementieren. Mit diesem Ansatz wurden für die quantitative Stoffflussanalyse mit Isotopomeren effiziente Software-Anwendungen und ihre Datenverwaltung entwickelt: LS-MIDA ermöglicht eine einfache und effiziente Analyse, die Software "Isotopo" ermöglicht mit einem leistungsfähigeren Algorithmus und einer Datenbank eine noch effizientere Analyse von Stoffwechselflüssen, zum Beispiel in pathogenen Bakterien (Salmonellen, Listerien). Alle drei Ansätze wurden bereits veröffentlicht (siehe Appendix).
KW  - miRNS
KW  - Bioinformatics
KW  - miRNA
KW  - Metabolic Modelling
KW  - Spectral Data Analysis
KW  - Butterfly
KW  - Thrombozyt
KW  - Bioinformatik
KW  - Stoffwechsel
KW  - Modellierung
KW  - Metabolischen Modellierung
Y1  - 2014
U6  - http://nbn-resolving.de/urn/resolver.pl?urn:nbn:de:bvb:20-opus-102900
ER  - 
TY  - THES
A1  - Yu, Sung-Huan
T1  - Development and application of computational tools for RNA-Seq based transcriptome annotations
T1  - Entwicklung und Anwendung bioinformatischer Werkzeuge für RNA-Seq-basierte Transkriptom-Annotationen
N2  - In order to understand the regulation of gene expression in organisms, precise genome annotation is essential. In recent years, RNA-Seq has become a potent method for generating and improving genome annotations. However, this approach is time-consuming and often inconsistently performed when done manually. In particular, the discovery of non-coding RNAs benefits strongly from the application of RNA-Seq data but requires significant amounts of expert knowledge and is labor-intensive. As a part of my doctoral study, I developed a modular tool called ANNOgesic that can detect numerous transcribed genomic features, including non-coding RNAs, based on RNA-Seq data in a precise and automatic fashion with a focus on bacterial and archaeal species. The software performs numerous analyses and generates several visualizations. It can generate high-resolution annotations that are hard to produce using traditional annotation tools that are based only on genome sequences. ANNOgesic can detect numerous novel genomic features like UTR-derived small non-coding RNAs for which no other tool has been developed before. ANNOgesic is available under an open source license (ISCL) at https://github.com/Sung-Huan/ANNOgesic.
My doctoral work not only includes the development of ANNOgesic but also its application to annotate the transcriptome of Staphylococcus aureus HG003 - a strain which has been an insightful model in infection biology. Despite its potential as a model, a complete genome sequence and annotations have been lacking for HG003. In order to fill this gap, the annotations of this strain, including sRNAs and their functions, were generated using ANNOgesic by analyzing differential RNA-Seq data from 14 different samples (two media conditions with seven time points), as well as RNA-Seq data generated after transcript fragmentation. ANNOgesic was
also applied to annotate several bacterial and archaeal genomes, and as part of this its high performance was demonstrated. In summary, ANNOgesic is a powerful computational tool for RNA-Seq based annotations and has been successfully applied to several species.
N2  - Exakte Genomannotationen sind essentiell für das Verständnis der Genexpressionsregulation in verschiedenen Organismen. In den letzten Jahren entwickelte sich RNA-Seq zu einer äußerst wirksamen Methode, um solche Genomannotationen zu erstellen und zu verbessern. Allerdings ist das Erstellen von Genomannotationen bei manueller Durchführung noch immer ein zeitaufwändiger und inkonsistenter Prozess. Die Verwendung von RNA-Seq-Daten begünstigt besonders die Identifizierung von nichtkodierenden RNAs, was allerdings arbeitsintensiv ist und fundiertes Expertenwissen erfordert. Ein Teil meiner Promotion bestand aus der Entwicklung eines modularen Tools namens ANNOgesic, das basierend auf RNA-Seq-Daten in der Lage ist, eine Vielzahl von Genombestandteilen, einschließlich nicht-kodierender RNAs, automatisch und präzise zu ermitteln. Das Hauptaugenmerk lag dabei auf der Anwendbarkeit für bakterielle und archaeale Genome. Die Software führt eine Vielzahl von Analysen durch und stellt die verschiedenen Ergebnisse grafisch dar. Sie generiert hochpräzise Annotationen, die nicht unter Verwendung herkömmlicher Annotations-Tools auf Basis von Genomsequenzen erzeugt werden könnten. Es kann eine Vielzahl neuer Genombestandteile, wie kleine nicht-kodierende RNAs in UTRs, ermitteln, welche von bisherigen Programmen nicht vorhergesagt werden können. ANNOgesic ist unter einer Open-Source-Lizenz (ISCL) auf https://github.com/Sung-Huan/ANNOgesic verfügbar.
Meine Forschungsarbeit beinhaltet nicht nur die Entwicklung von ANNOgesic, sondern auch dessen Anwendung, um das Transkriptom des Staphylococcus aureus-Stamms HG003 zu annotieren. Dieser ist ein Derivat von S. aureus NCTC8325 - ein Stamm, der ein bedeutendes Modell in der Infektionsbiologie darstellt. Zum Beispiel wurde er für die Untersuchung von Antibiotikaresistenzen genutzt, da er anfällig für alle bekannten Antibiotika ist. Der Elternstamm NCTC8325 besitzt zwei Mutationen in regulatorischen Genen (rsbU und tcaR), die Veränderungen der Virulenz zur Folge haben und die in Stamm HG003 auf die Wildtypsequenz zurückmutiert wurden. Dadurch besitzt S. aureus HG003 das vollständige, ursprüngliche Regulationsnetzwerk und stellt deshalb ein besseres Modell zur Untersuchung von sowohl Virulenz als auch Antibiotikaresistenz dar. Trotz seines Modellcharakters fehlten für HG003 bisher eine
vollständige Genomsequenz und deren Annotationen. Um diese Lücke zu schließen habe ich als Teil meiner Promotion mit Hilfe von ANNOgesic Annotationen für diesen Stamm, einschließlich sRNAs und ihrer Funktionen, generiert. Dafür habe ich Differential RNA-Seq-Daten von 14 verschiedenen Proben (zwei Mediumsbedingungen mit sieben Zeitpunkten) sowie RNA-Seq-Daten, die von fragmentierten Transkripten generiert wurden, analysiert. Neben S. aureus HG003 wurde ANNOgesic auf eine Vielzahl von Bakterien- und Archaeengenome angewendet und dabei wurde eine hohe
Performanz demonstriert. Zusammenfassend kann gesagt werden, dass ANNOgesic ein mächtiges bioinformatisches Werkzeug für RNA-Seq-basierte Annotationen ist und für verschiedene Spezies erfolgreich angewandt wurde.
KW  - RNA-Seq
KW  - Genome Annotation
KW  - small RNA
KW  - Genom
KW  - Annotation
KW  - Small RNA
KW  - Bioinformatik
Y1  - 2019
U6  - http://nbn-resolving.de/urn/resolver.pl?urn:nbn:de:bvb:20-opus-176468
ER  - 
TY  - THES
A1  - Wolter, Steve
T1  - Single-molecule localization algorithms in super-resolution microscopy
T1  - Einzelmoleküllokalisierungsalgorithmen in der superauflösenden Mikroskopie
N2  - Lokalisationsmikroskopie ist eine Methodenklasse der superauflösenden Fluoreszenzmikroskopie, deren Methoden sich durch stochastische zeitliche Isolation der
Fluoreszenzemission auszeichnen. Das Blinkverhalten von Fluorophoren wird so verändert, dass gleichzeitige Aktivierung von einander nahen Fluorophoren unwahrscheinlich ist. Bekannte lokalisationsmikroskopische Methoden umfassen dSTORM, STORM, PALM, FPALM, oder GSDIM.
Lokalisationsmikroskopie ist von hohem biologischem Interesse, weil sie die Auflösung des Fluoreszenzmikroskops bei minimalem technischem Aufwand um eine Größenordnung verbessert.
Der verbundene Rechenaufwand ist allerdings erheblich, da Millionen von Fluoreszenzemissionen einzeln mit Nanometergenauigkeit lokalisiert werden müssen.
Der Rechen- und Implementationsaufwand dieser Auswertung hat die Verbreitung der superauflösenden Mikroskopie lange verzögert.

Diese Arbeit beschreibt meine algorithmische Grundstruktur für die Auswertung lokalisationsmikroskopischer Daten. Die Echtzeitfähigkeit, d.h. eine Auswertegeschwindigkeit oberhalb der Datenaufnahmegeschwindigkeit an normalen Messaufbauten, meines neuartigen und quelloffenen Programms wird demonstriert.
Die Geschwindigkeit wird auf verbrauchermarktgängigen Prozessoren erreicht und dadurch spezialisierte Rechenzentren oder der Einsatz von Grafikkarten vermieden.
Die Berechnung wird mit dem allgemein anerkannten Gaussschen Punktantwortmodell und einem
Rauschmodell auf Basis der größten Poissonschen Wahrscheinlichkeit durchgeführt.

Die algorithmische Grundstruktur wird erweitert, um robuste und optimale Zweifarbenauswertung zu realisieren
und damit korrelative Mikroskopie zwischen verschiedenen Proteinen und Strukturen zu ermöglichen.
Durch den Einsatz von kubischen Basissplines wird die Auswertung von dreidimensionalen Proben vereinfacht und stabilisiert, um präzisem Abbilden von mikrometerdicken Proben näher zu kommen. Das Grenzverhalten von Lokalisationsalgorithmen bei hohen Emissionsdichten wird untersucht.

Abschließend werden Algorithmen für die Anwendung der Lokalisationsmikroskopie auf verbreitete Probleme der Biologie aufgezeigt. Zelluläre Bewegung und Motilität werden anhand der in vitro Bewegung von Myosin-Aktin-Filamenten studiert. Lebendzellbildgebung mit hellen und stabilen organischen Fluorophoren wird mittels SNAP-tag-Fusionsproteinen realisiert. Die Analyse des Aufbaus von Proteinklumpen zeigt, wie Lokalisationsmikroskopie neue quantitative Ansätze jenseits reiner Bildgebung bietet.
N2  - Localization microscopy is a class of super-resolution fluorescence microscopy techniques. Localization microscopy methods are characterized by stochastic temporal isolation of fluorophore emission, i.e., making the fluorophores blink so rapidly that no two are
likely to be photoactive at the same time close to each other. Well-known localization microscopy methods include dSTORM, STORM, PALM, FPALM, and GSDIM. The biological community has taken great interest in localization microscopy, since it can enhance the resolution of common fluorescence microscopy by an order of magnitude at little experimental cost.
However, localization microscopy has considerable computational cost since millions of individual stochastic emissions must be located with nanometer precision. The computational cost of this evaluation, and the organizational cost of implementing the complex algorithms, has impeded adoption of super-resolution microscopy for a long time.

In this work, I describe my algorithmic framework for evaluating localization microscopy data.
I demonstrate how my novel open-source software achieves real-time data evaluation, i.e., can evaluate data faster than the common experimental setups can capture them.
I show how this speed is attained on standard consumer-grade CPUs, removing the need for computing on expensive clusters or deploying graphics processing units.
The evaluation is performed with the widely accepted Gaussian PSF model and a Poissonian maximum-likelihood noise model.

I extend the computational model to show how robust, optimal two-color evaluation is realized, allowing correlative microscopy between multiple proteins or structures. By employing cubic B-splines, I show how the evaluation of three-dimensional samples can be made simple and robust, taking an important step towards precise imaging of micrometer-thick samples.
I uncover the behavior and limits of localization algorithms in the face of increasing emission densities.

Finally, I present algorithms to extend localization microscopy to common biological problems.
I investigate cellular movement and motility by considering the in vitro movement of myosin-actin filaments. I show how SNAP-tag fusion proteins enable imaging with bright and stable organic fluorophores in live cells. By analyzing the internal structure of protein clusters, I show how localization microscopy can provide new quantitative approaches beyond pure imaging.
KW  - super-resolution microscopy
KW  - fluorescence
KW  - scientific computing
KW  - dSTORM
KW  - localization microscopy
KW  - PALM
KW  - 3D microscopy
KW  - two-color microscopy
KW  - Fluoreszenzmikroskopie
KW  - Bildauflösung
KW  - Bioinformatik
Y1  - 2014
U6  - http://nbn-resolving.de/urn/resolver.pl?urn:nbn:de:bvb:20-opus-109370
ER  - 
TY  - THES
A1  - Wolf, Beat
T1  - Reducing the complexity of OMICS data analysis
T1  - Verringerung der Komplexität von OMICS Datenanalysen
N2  - The field of genetics faces many challenges and opportunities in both research and diagnostics due to the rise of next generation sequencing (NGS), a technology that allows DNA to be sequenced increasingly quickly and cheaply.
NGS is not only used to analyze DNA, but also RNA, which is a very similar molecule also present in the cell, in both cases producing large amounts of data.
The large amount of data raises both infrastructure and usability problems, as powerful computing infrastructures are required and there are many manual steps in the data analysis which are complicated to execute.
Both of those problems limit the use of NGS in the clinic and research, by producing a bottleneck both computationally and in terms of manpower, as for many analyses geneticists lack the required computing skills.
Over the course of this thesis we investigated how computer science can help to improve this situation to reduce the complexity of this type of analysis.
We looked at how to make the analysis more accessible to increase the number of people that can perform OMICS data analysis (OMICS groups various genomics data-sources).
To approach this problem, we developed, in close collaboration with the Human Genetics Department at the University of Würzburg, a graphical NGS data analysis pipeline aimed at a diagnostics environment while still being useful in research.
The pipeline has been used in research papers covering various subjects, including works with direct author participation in genomics, transcriptomics, and epigenomics.
To further validate the graphical pipeline, a user survey was carried out which confirmed that it lowers the complexity of OMICS data analysis.

We also studied how the data analysis can be improved in terms of computing infrastructure by improving the performance of certain analysis steps.
We did this both in terms of speed improvements on a single computer (notably, variant calling became up to 18 times faster), as well as with distributed computing to make better use of an existing infrastructure.
The improvements were integrated into the previously described graphical pipeline, which itself also was focused on low resource usage.

As a major contribution and to help with future development of parallel and distributed applications, for the usage in genetics or otherwise, we also looked at how to make it easier to develop such applications.
Based on the parallel object programming model (POP), we created a Java language extension called POP-Java, which allows for easy and transparent distribution of objects.
Through this development, we brought the POP model to the cloud and to Hadoop clusters, and we present a new collaborative distributed computing model called FriendComputing.

The advances made in the different domains of this thesis have been published in various works specified in this document.
N2  - Das Gebiet der Genetik steht vor vielen Herausforderungen, sowohl in der Forschung als auch Diagnostik, aufgrund des "next generation sequencing" (NGS), eine Technologie die DNA immer schneller und billiger sequenziert.
NGS wird nicht nur verwendet, um DNA zu analysieren, sondern auch RNA, ein der DNA sehr ähnliches Molekül, wobei in beiden Fällen große Datenmengen erzeugt werden.
Durch die große Menge an Daten entstehen Infrastruktur und Benutzbarkeitsprobleme, da leistungsstarke Computerinfrastrukturen erforderlich sind, und es viele manuelle Schritte in der Datenanalyse gibt die kompliziert auszuführen sind.
Diese beiden Probleme begrenzen die Verwendung von NGS in der Klinik und Forschung, da es einen Engpass sowohl im Bereich der Rechnerleistung als auch beim Personal gibt, da für viele Analysen Genetikern die erforderlichen Computerkenntnisse fehlen.

In dieser Arbeit haben wir untersucht wie die Informatik helfen kann diese Situation zu verbessern indem die Komplexität dieser Art von Analyse reduziert wird.
Wir haben angeschaut, wie die Analyse zugänglicher gemacht werden kann um die Anzahl Personen zu erhöhen, die OMICS (OMICS gruppiert verschiedene Genetische Datenquellen) Datenanalysen durchführen können.
In enger Zusammenarbeit mit dem Institut für Humangenetik der Universität Würzburg wurde eine graphische NGS Datenanalysen Pipeline erstellt um diese Frage zu erläutern.
Die graphische Pipeline wurde für den Diagnostikbereich entwickelt ohne aber die Forschung aus dem Auge zu lassen.
Darum wurde die Pipeline in verschiedenen Forschungsgebieten verwendet, darunter mit direkter Autorenteilnahme Publikationen in der Genomik, Transkriptomik und Epigenomik.
Die Pipeline wurde auch durch eine Benutzerumfrage validiert, welche bestätigt, dass unsere graphische Pipeline die Komplexität der OMICS Datenanalyse reduziert.

Wir haben auch untersucht wie die Leistung der Datenanalyse verbessert werden kann, damit die nötige Infrastruktur zugänglicher wird.
Das wurde sowohl durch das Optimieren der verfügbaren Methoden (wo z.B. die Variantenanalyse bis zu 18 mal schneller wurde) als auch mit verteiltem Rechnen angegangen, um eine bestehende Infrastruktur besser zu verwenden.
Die Verbesserungen wurden in der zuvor beschriebenen graphischen Pipeline integriert, wobei generell der geringe Ressourcenverbrauch ein Fokus war.

Um die künftige Entwicklung von parallelen und verteilten Anwendungen zu unterstützen, ob in der Genetik oder anderswo, haben wir geschaut, wie man es einfacher machen könnte, solche Applikationen zu entwickeln.

Dies führte zu einem wichtigen informatischen Resultat, indem wir, basierend auf dem Modell von „parallel object programming“ (POP), eine Erweiterung der Java-Sprache namens POP-Java entwickelt haben, die eine einfache und transparente Verteilung von Objekten ermöglicht.
Durch diese Entwicklung brachten wir das POP-Modell in die Cloud und auf Hadoop-Cluster und präsentieren ein neues Modell für verteiltes kollaboratives Rechnen, FriendComputing genannt.

Die verschiedenen veröffentlichten Teile dieser Dissertation werden speziell aufgelistet und diskutiert.
KW  - Bioinformatik
KW  - Humangenetik
KW  - OMICS
KW  - Distributed computing
KW  - User interfaces
KW  - Verteiltes Datenbanksystem
Y1  - 2017
U6  - http://nbn-resolving.de/urn/resolver.pl?urn:nbn:de:bvb:20-opus-153687
ER  - 
TY  - THES
A1  - Vainshtein, Yevhen
T1  - Applying microarray‐based techniques to study gene expression patterns: a bio‐computational approach
T1  - Anwendung von Mikroarrayanalysen um Genexpressionsmuster zu untersuchen: Ein bioinformatischer Ansatz
N2  - The regulation and maintenance of iron homeostasis is critical to human health. As a constituent of hemoglobin, iron is essential for oxygen transport and significant iron deficiency leads to anemia. Eukaryotic cells require iron for survival and proliferation. Iron is part of hemoproteins, iron-sulfur (Fe-S) proteins, and other proteins with functional groups that require iron as a cofactor. At the cellular level, iron uptake, utilization, storage, and export are regulated at different molecular levels (transcriptional, mRNA stability, translational, and posttranslational). Iron regulatory proteins (IRPs) 1 and 2 post-transcriptionally control mammalian iron homeostasis by binding to iron-responsive elements (IREs), conserved RNA stem-loop structures located in the 5'- or 3'-untranslated regions of genes involved in iron metabolism (e.g. FTH1, FTL, and TFRC). To identify novel IRE-containing mRNAs, we integrated biochemical, biocomputational, and microarray-based experimental approaches. Gene expression studies greatly contribute to our understanding of complex relationships in gene regulatory networks. However, the complexity of array design, production and manipulation is a limiting factor affecting data quality. The use of customized DNA microarrays improves overall data quality in many situations, but only if analysis tools are available for these specifically designed microarrays. Methods: In this project, the response to iron treatment was examined under different conditions using bioinformatical methods in order to improve our understanding of the iron regulatory network. For these purposes we used microarray gene expression data. To identify novel IRE-containing mRNAs, biochemical, biocomputational, and microarray-based experimental approaches were integrated.
IRP/IRE messenger ribonucleoproteins were immunoselected and their mRNA composition was analysed using an IronChip microarray enriched for genes predicted computationally to contain IRE-like motifs. Analysis of IronChip microarray data requires a specialized tool that can exploit all advantages of a customized microarray platform. A novel decision-tree-based algorithm was implemented in Perl in the IronChip Evaluation Package (ICEP). Results: IRE-like motifs were identified from genomic nucleic acid databases by an algorithm combining primary nucleic acid sequence and RNA structural criteria. Depending on the choice of constraining criteria, such computational screens tend to generate a large number of false positives. To refine the search and reduce the number of false positive hits, additional constraints were introduced. The refined screen yielded 15 IRE-like motifs. A second approach made use of a reported list of 230 IRE-like sequences obtained from screening UTR databases. We selected 6 out of these 230 entries based on the ability of the lower IRE stem to form at least 6 out of 7 bp. Corresponding ESTs were spotted onto the human or mouse versions of the IronChip and the results were analysed using ICEP. Our data show that the immunoselection/microarray strategy is a feasible approach for screening bioinformatically predicted IRE genes and the detection of novel IRE-containing mRNAs. In addition, we identified a novel IRE-containing gene, CDC14A (Sanchez M, et al., 2006). The IronChip Evaluation Package (ICEP) is a collection of Perl utilities and an easy-to-use data evaluation pipeline for the analysis of microarray data with a focus on data quality of custom-designed microarrays. The package has been developed for the statistical and bioinformatical analysis of the custom cDNA microarray IronChip, but can be easily adapted for other custom-designed cDNA or oligonucleotide-based microarray platforms.
ICEP uses decision tree-based algorithms to assign quality flags and performs robust analysis based on chip design properties regarding multiple repetitions, ratio cut-off, background and negative controls (Vainshtein Y, et al., 2010).
N2  - Die Regulierung und Aufrechterhaltung der Eisen-Homeostase ist bedeutend für die menschliche Gesundheit. Als Bestandteil des Hämoglobins ist es wichtig für den Transport von Sauerstoff, ein Mangel führt zu Blutarmut. Eukaryotische Zellen benötigen Eisen zum Überleben und zum Proliferieren. Eisen ist am Aufbau von Hämo- und Eisenschwefelproteinen (Fe-S) beteiligt und kann als Kofaktor dienen. Die Aufnahme, Nutzung, Speicherung und der Export von Eisen ist zellulär auf verschiedenen molekularen Ebenen reguliert (Transkription, mRNA-Level, Translation, Protein-Level). Die iron regulatory proteins (IRPs) 1 und 2 kontrollieren die Eisen-Homeostase in Säugetieren posttranskriptionell durch die Bindung an Iron-responsive elements (IREs). IREs sind konservierte RNA stem-loop Strukturen in den 5' oder 3' untranslatierten Bereichen von Genen, die im Eisenmetabolismus involviert sind (z.B. FTH1, FTL und TFRC). In dieser Arbeit wurden biochemische und bioinformatische Methoden mit Microarray-Experimenten kombiniert, um neue mRNAs mit IREs zu identifizieren. Genexpressionsstudien verbessern unser Verständnis über die komplexen Zusammenhänge in genregulatorischen Netzwerken. Das komplexe Design von Microarrays, deren Produktion und Manipulation sind dabei die limitierenden Faktoren bezüglich der Datenqualität. Die Verwendung von angepassten DNA Microarrays verbessert häufig die Datenqualität, falls entsprechende Analysemöglichkeiten für diese Arrays existieren. Methoden: Um unser Verständnis von eisenregulierten Netzwerken zu verbessern, wurde im Rahmen dieses Projektes die Auswirkung einer Behandlung mit Eisen bzw. von Knockout Mutation unter verschiedenen Bedingungen mittels bioinformatischer Methoden untersucht. Hierfür nutzen wir Expressionsdaten aus Microarray-Experimenten. Durch die Verknüpfung von biochemischen, bioinformatischen und Microarray Ansätzen können neue Proteine mit IREs identifiziert werden. IRP/IRE messenger Ribonucleoproteine wurden immunpräzipitiert.
Die Zusammensetzung der enthaltenen mRNAs wurde mittels einem IronChip Microarray analysiert: Für diesen Chip wurden bioinformatisch Gene vorhergesagt, die IRE-like Motive aufweisen. Der Chip wurde mit solchen Oligonucleotiden beschichtet und durch Hybridisierung überprüft, ob die präzipitierten mRNA sich hieran binden. Die Analyse der erhaltenen Daten erfordert ein spezialisiertes Werkzeug um von allen Vorteilen der angepassten Microarrays zu profitieren. Ein neuer Entscheidungsbaum-basierter Algorithmus wurde in Perl im IronChip Evaluation Package (ICEP) implementiert. Ergebnisse Aus großen Sequenz-Datenbanken wurden IRE-like Motive identifiziert. Dazu kombiniert der Algorithmus, insbesondere RNA-Primärsequenz und RNA-Strukturdaten. Solche Datenbankanalysen tendieren dazu, eine große Anzahl falsch positiver Treffer zu generieren. Daher wurden zusätzliche Bedingungen formuliert, um die Suche zu verfeinern und die Anzahl an falsch positiven Treffer zu reduzieren. Die angepassten Suchkriterien ergaben 15 IRE-like Motive. In einem weiteren Ansatz verwendeten wir eine Liste von 230 IRE-like Sequenzen aus UTR-Datenbanken. Daraus wurden 6 Sequenzen ausgewählt, die auch im unteren Teil stabil sind (untere Helix über 6 bp stabil). Die korrespondierenden Expressed Sequence Tags (ESTs) wurden auf die humane oder murine Version des IronChips aufgetragen. Die Microarray Ergebnisse wurden mit dem ICEP Programm ausgewertet. Unsere Ergebnisse zeigen, dass die Immunpräzipitation mit anschließender Microarrayanalyse ein nützlicher Ansatz ist, um bioinformatisch vorhergesagte IRE-Gene zu identifizieren. Darüber hinaus ermöglicht uns dieser Ansatz die Detektion neuer mRNAs, die IREs enthalten, wie das von uns gefundene Gen CDC14A (Sanchez et al., 2006). ICEP ist ein optimiertes Programmpaket aus Perl Programmen (Vainshtein et al., BMC Bioinformatics, 2010). Es ermöglicht die einfache Auswertung von Microarray Daten mit dem Fokus auf selbst entwickelten Microarray Designs. 
ICEP diente für die statistische und bioinformatische Analyse von selbst entwickelten IronChips, kann aber auch leicht an die Analyse von oligonucleotidbasierten oder cDNA Microarrays adaptiert werden. ICEP nutzt einen Entscheidungsbaum-basierten Algorithmus um die Qualität zu bewerten und führt eine robuste Analyse basierend auf Chipeigenschaften, wie mehrfachen Wiederholungen, Signal/Rausch Verhältnis, Hintergrund und Negativkontrollen durch.
KW  - Microarray
KW  - Genexpression
KW  - Bioinformatik
KW  - geneexpression
KW  - microarrays
KW  - IronChip
KW  - ICEP
Y1  - 2010
U6  - http://nbn-resolving.de/urn/resolver.pl?urn:nbn:de:bvb:20-opus-51967
ER  - 
TY  - THES
A1  - Thakar, Juilee
T1  - Computational models for the study of responses to infections
T1  - Bioinformatische Modelle zur Analyse der Immunantwort auf Infektionen
N2  - In diesem Jahrhundert haben neue experimentelle Techniken und Computer-Verfahren enorme Mengen an Information erzeugt, die bereits viele biologische Rätsel enthüllt haben. Doch die Komplexität biologischer Systeme wirft immer weitere neue Fragen auf. Um ein System zu verstehen, bestand der Hauptansatz bis jetzt darin, es in Komponenten zu zerlegen, die untersucht werden können. Ein neues Paradigma verknüpft die einzelnen Informationsteile, um sie auf globaler Ebene verstehen zu können. In der vorgelegten Doktorarbeit habe ich deshalb versucht, infektiöse Krankheiten mit globalen Methoden („Systembiologie“) bioinformatisch zu untersuchen. Im ersten Teil wird der Apoptose-Signalweg analysiert. Apoptose (Programmierter Zelltod) wird bei verschiedenen Infektionen, zum Beispiel bei Viruserkrankungen, als Abwehrmaßnahme eingesetzt. Die Interaktionen zwischen Proteinen, die ‚death’ Domänen beinhalten, wurden untersucht, um folgende Fragen zu klären: i) wie wird die Spezifität der Interaktionen erzielt? –sie wird durch Adapter erreicht, ii) wie werden Proliferation/ Überlebenssignale während der Aktivierung der Apoptose eingeleitet? – wir fanden Hinweise für eine entscheidende Rolle des RIP Proteins (Rezeptor-Interagierende Serine/Threonine-Proteinkinase 1). Das Modell erlaubte uns, die Interaktions-Oberflächen von RIP vorherzusagen. Der Signalweg wurde anschließend auf globaler Ebene mit Simulationen für verschiedene Zeitpunkte analysiert, um die Evolution der Aktivatoren und Inhibitoren des Signalwegs und seine Struktur besser zu verstehen. Weiterhin wird die Signalverarbeitung für Apoptosis-Signalwege in der Maus detailliert modelliert, um den Konzentrationsverlauf der Effektor-Kaspasen vorherzusagen. Weitere experimentelle Messungen von Kaspase-3 und die Überlebenskurven von Zellen bestätigen das Modell. 
Der zweite Teil der Resultate konzentriert sich auf das Phagosom, eine Organelle, die eine entscheidende Rolle bei der Eliminierung von Krankheitserregern spielt. Dies wird am Beispiel von M. tuberculosis veranschaulicht. Die Fragestellung wird wiederum in zwei Aspekten behandelt: i) Um die Prozesse, die durch M. tuberculosis inhibiert werden zu verstehen, haben wir uns auf das Phospholipid-Netzwerk konzentriert, das bei der Unterdrückung oder Aktivierung der Aktin-Polymerisation eine große Rolle spielt. Wir haben für diese Netzwerkanalyse eine Simulation für verschiedene Zeitpunkte ähnlich wie in Teil eins angewandt. ii) Es wird vermutet, dass Aktin-Polymere bei der Fusion des Phagosoms mit dem Lysosom eine Rolle spielen. Um diese Hypothese zu untersuchen, wurde ein in silico Modell von uns entwickelt. Wir fanden heraus, dass in der Anwesenheit von Aktin-Polymeren die Suchzeit für das Lysosom um das Fünffache reduziert wurde. Weiterhin wurden die Effekte der Länge der Aktin-Polymere, die Größe der Lysosomen sowie der Phagosomen und etliche andere Modellparameter analysiert. Nach der Untersuchung eines Signalwegs und einer Organelle führte der nächste Schritt zur Untersuchung eines komplexen biologischen Systems der Infektabwehr. Dies wurde am Beispiel der Wirt-Pathogen Interaktion bei Bordetella pertussis und Bordetella bronchiseptica dargestellt. Die geringe Menge verfügbarer quantitativer Daten war der ausschlaggebende Faktor bei unserer Modellwahl. Für die dynamische Simulation wurde ein selbst entwickeltes Bool’sches Modell verwendet. Die Ergebnisse sagen wichtige Faktoren bei der Pathologie von Bordetellen hervor, besonders die Bedeutung der Th1 assoziierten Antworten und dagegen nicht der Th2 assoziierten Antworten für die Eliminierung des Pathogens. Einige der quantitativen Vorhersagen wurden durch Experimente wie die Untersuchung des Verlaufs einer Infektion in verschiedenen Mutanten und Wildtyp-Mäusen überprüft. 
Die begrenzte Verfügbarkeit kinetischer Daten war der kritische Faktor bei der Auswahl der computer-gestützten Modelle. Der Erfolg unserer Modelle konnte durch den Vergleich mit experimentellen Beobachtungen belegt werden. Die vergleichenden Modelle in Kapitel 6 und 9 können zur Untersuchung neuer Wirt-Pathogen Interaktionen verwendet werden. Beispielsweise führt in Kapitel 6 die Analyse von Inhibitoren und inhibitorischer Signalwege aus drei Organismen zur Identifikation wichtiger regulatorischer Zentren in komplexen Organismen und in Kapitel 9 ermöglicht die Identifikation von drei Phasen in B. bronchiseptica und der Inhibition von IFN-γ durch den Faktor TTSS die Untersuchung ähnlicher Phasen und die Inhibition von IFN-γ in B. pertussis. Eine weitere wichtige Bedeutung bekommen diese Modelle durch die mögliche Identifikation neuer, essentieller Komponenten in Wirt-Pathogen Interaktionen. In silico Modelle der Effekte von Deletionen zeigen solche Komponenten auf, die anschließend durch experimentelle Mutationen weiter untersucht werden können.
N2  - In this century new experimental and computational techniques are adding an enormous amount of information, revealing many biological mysteries. The complexities of biological systems still raise new questions. Until now the main approach to understanding a system has been to divide it into components that can be studied. The emerging paradigm is to combine the pieces of information in order to understand the system at a global level. In the present thesis we have tried to study infectious diseases with such a global ‘Systems Biology’ approach. In the first part the apoptosis pathway is analyzed. Apoptosis (programmed cell death) is used as a countermeasure in different infections, for example viral infections. The interactions between death domain containing proteins are studied to address the following questions: i) how specificity is maintained, showing that it is achieved through adaptors; ii) how proliferation/survival signals are induced during activation of apoptosis, suggesting the pivotal role of RIP. The model also allowed us to detect new possible interacting surfaces. The pathway is then studied at a global level in a time step simulation to understand the evolution of the topology of activators and inhibitors of the pathway. Signal processing is further modeled in detail for the apoptosis pathway in M. musculus to predict the concentration time course of effector caspases. Further experimental measurements of caspase-3 and of cell viability validate the model. The second part focuses on the phagosome, an organelle which plays an essential role in the removal of pathogens, as exemplified by M. tuberculosis. Again the problem is addressed in two main sections: i) To understand the processes that are inhibited by M. tuberculosis, we focused on the phospholipid network, which plays an important role in inhibition or activation of actin polymerization on the phagosome membrane, applying a time step simulation as in the first part. 
ii) Furthermore, actin polymers are suggested to play a role in the fusion of the phagosome with the lysosome. To check this hypothesis an in silico model was developed; we find that the search time is reduced fivefold in the presence of actin polymers. Further, the effects of the length of the actin polymers, the dimensions of the lysosome and phagosome, and other model parameters are analyzed. After studying a pathway and then an organelle, the next step was to move to the level of a whole system. This was exemplified by the host-pathogen interactions of Bordetella pertussis and Bordetella bronchiseptica. The limited availability of quantitative information was the crucial factor behind the choice of the model type. A Boolean model was developed and used for a dynamic simulation. The results predict important factors playing a role in Bordetella pathology, especially the importance of Th1 related responses, and not Th2 related responses, in the clearance of the pathogen. Some of the quantitative predictions have been cross-checked against experimental results such as the time course of infection in different mutants and wild type mice. All these computational models have been developed in the presence of limited kinetic data. The success of these models has been validated by comparison with experimental observations. Comparative models studied in chapters 6 and 9 can be used to explore new host-pathogen interactions. For example, in chapter 6 the analysis of inhibitors and inhibitory paths in three organisms leads to the identification of regulatory hotspots in complex organisms, and in chapter 9 the identification of three phases in B. bronchiseptica and the inhibition of IFN-γ by TTSS led us to explore similar phases and the inhibition of IFN-γ in B. pertussis. A further important use of these models is to identify new components playing an essential role in host-pathogen interactions. In silico deletions can point out such components, which can be further analyzed by experimental mutations.
KW  - Bordetella pertussis
KW  - Infektion
KW  - Apoptosis
KW  - Signaltransduktion
KW  - Bioinformatik
KW  - Tuberkelbakterium
KW  - Biologische Kaskaden
KW  - Bordetellae
KW  - M. tuberculosis
KW  - Apoptose
KW  - Biological cascades
KW  - Bordetellae
KW  - M. tuberculosis
KW  - Apoptosis
Y1  - 2006
U6  - http://nbn-resolving.de/urn/resolver.pl?urn:nbn:de:bvb:20-opus-17266
ER  - 
TY  - THES
A1  - Selig, Christian
T1  - The ITS2 Database - Application and Extension
N2  - Der internal transcribed spacer 2 (ITS2) des ribosomalen Genrepeats ist ein zunehmend wichtiger phylogenetischer Marker, dessen RNA-Sekundärstruktur innerhalb vieler eukaryontischer Organismen konserviert ist. Die ITS2-Datenbank hat zum Ziel, eine umfangreiche Ressource für ITS2-Sequenzen und -Sekundärstrukturen auf Basis direkter thermodynamischer als auch homologiemodellierter RNA-Faltung zu sein. Ergebnisse: (a) Eine komplette Neufassung der ursprünglichen die ITS2-Datenbank generierenden Skripte, angewandt auf einen aktuellen NCBI-Datensatz, deckte mehr als 65.000 ITS2-Strukturen auf. Dies verdoppelt den Inhalt der ursprünglichen Datenbank und verdreifacht ihn, wenn partielle Strukturen mit einbezogen werden. (b) Die Endbenutzer-Schnittstelle wurde neu geschrieben, erweitert und ist jetzt in der Lage, benutzerdefinierte Homologiemodellierungen durchzuführen. (c) Andere möglichen RNA-Strukturaufklärungsmethoden (suboptimales und formenbasiertes Falten) sind hilfreich, können aber Homologiemodellierung nicht ersetzen. (d) Ein Anwendungsfall der ITS2-Datenbank in Zusammenhang mit anderen am Lehrstuhl entwickelten Werkzeugen gab Einblick in die Verwendung von ITS2 für molekulare Phylogenie.
N2  - The internal transcribed spacer 2 (ITS2) of the ribosomal gene repeat is an increasingly important phylogenetic marker whose RNA secondary structure is widely conserved across eukaryotic organisms. The ITS2 database aims to be a comprehensive resource on ITS2 sequence and secondary structure, based on direct thermodynamic as well as homology modelled RNA folds. Results: (a) A rebuild of the original ITS2 database generation scripts, applied to a current NCBI dataset, reveals more than 60,000 ITS2 structures. This more than doubles the contents of the original database and triples it when including partial structures. (b) The end-user interface was rewritten and extended, and now features user-defined homology modelling. (c) Other possible RNA structure discovery methods (namely suboptimal and shape folding) prove helpful but are not able to replace homology modelling. (d) A use case of the ITS2 database in conjunction with other tools developed at the department gave insight into molecular phylogenetic analysis with ITS2.
KW  - Phylogenie
KW  - Bioinformatik
KW  - Würzburg / Universität / Lehrstuhl für Bioinformatik
KW  - Datenbank
KW  - Perl
KW  - SQL
KW  - Trichoplax adhaerens
KW  - Placozoa
KW  - RNS
KW  - S
KW  - internal transcribed spacer 2
KW  - ITS-2
KW  - ITS2
KW  - Phylogeny
KW  - Database
KW  - Perl
KW  - SQL
KW  - Placozoa
Y1  - 2007
U6  - http://nbn-resolving.de/urn/resolver.pl?urn:nbn:de:bvb:20-opus-23895
ER  - 
TY  - THES
A1  - Schwarz, Roland
T1  - Modellierung von Metabolismus, Transkriptom und Zellentwicklung bei Arabidopsis, Listerien und anderen Organismen
T1  - Modeling of metabolism, transcriptome and cell development in Arabidopsis, Listeria and other organisms
N2  - Im gleichen Maße wie informatisches Wissen mehr und mehr in den wissenschaftlichen Alltag aller Lebenswissenschaften Einzug gehalten hat, hat sich der Schwerpunkt bioinformatischer Forschung in stärker mathematisch und informatisch-orientierte Themengebiete verschoben. Bioinformatik heute ist mehr als die computergestützte Verarbeitung großer Mengen an biologischen Daten, sondern hat einen entscheidenden Fokus auf der Modellierung komplexer biologischer Systeme. Zur Anwendung kommen hierbei insbesondere Theorien aus dem Bereich der Stochastik und Statistik, des maschinellen Lernens und der theoretischen Informatik. In der vorliegenden Dissertation beschreibe ich in Fallstudien die systematische Modellierung biologischer Systeme aus einem informatisch - mathematischen Standpunkt unter Anwendung von Verfahren aus den genannten Teilbereichen und auf unterschiedlichen Ebenen biologischer Abstraktion. Ausgehend von der Sequenzinformation über Transkriptom, Metabolom und deren regulatorischer Interaktion hin zur Modellierung von Populationseffekten werden hierbei aktuelle biologische Fragestellungen mit mathematisch - informatischen Modellen und einer Vielzahl experimenteller Daten kombiniert. Ein besonderer Augenmerk liegt dabei auf dem Vorgang der Modellierung und des Modellbegriffs als solchem im Rahmen moderner bioinformatischer Forschung. Im Detail umfassen die Projekte (mehrere Publikationen) die Entwicklung eines neuen Ansatzes zur Einbettung und Visualisierung von Multiplen Sequenz- und Sequenz-Strukturalignments, illustriert am Beispiel eines Hemagglutininalignments unterschiedlicher H5N1 Varianten, sowie die Modellierung des Transkriptoms von A. thaliana, bei welchem mit Hilfe einer kernelisierten nicht-parametrischen Metaanalyse neue, an der Infektionsabwehr beteiligten, Gene ausfindig gemacht werden konnten. Desweiteren ist uns mit Hilfe unserer Software YANAsquare eine detaillierte Untersuchung des Metabolismus von L. 
monocytogenes unter Aktivierung des Transkriptionsfaktors prfA gelungen, dessen Vorhersagen durch experimentelle 13C Isotopologstudien belegt werden konnten. In einem Anschlußprojekt war der Zusammenhang zwischen Regulation des Metabolismus durch Regulation der Genexpression und der Fluxverteilung des metabolischen Steady- State-Netzwerks das Ziel. Die Modellierung eines komplexen organismischen Phänotyps, der Zellgrößenentwicklung der Diatomee Pseudo-nitzschia delicatissima, schließt die Untersuchungen ab.
N2  - In the same way that informatical knowledge has made its way into almost all areas of research in the Life Sciences, the focus of bioinformatical research has shifted towards topics originating more in the fields of mathematics and theoretical computer science. Bioinformatics today is more than the computer-driven processing of huge amounts of biological data; it has a special focus on the modelling of complex biological systems. Of special importance here are theories from stochastics and statistics, from the field of machine learning and theoretical computer science. In the following dissertation, I describe the systematic modelling of biological systems from an informatical-mathematical point of view in a case studies approach, applying methods from the aforementioned areas of research and on different levels of biological abstraction. Beginning with the sequence information itself, followed by the transcriptome, metabolome and the interaction of both, and finally population effects, I show how current biological questions can be tackled with mathematical models and combined with a variety of different experimental datasets. A special focus lies on the procedure of modelling and the concept and notion of a model as such in the framework of bioinformatical research. In more detail, the projects comprised the development of a new approach for embedding and visualizing Multiple Sequence and Structure Alignments, which was illustrated using a hemagglutinin alignment from different H5N1 variants as an example. Furthermore we investigated the A. thaliana transcriptome by means of a kernelized non-parametric meta-analysis, thus being able to annotate several new genes as pathogen-defense related. Another major part of this work was the modelling of the metabolic network of L. monocytogenes under activation of the transcription factor prfA, establishing predictions which were later verified by experimental 13C isotopologue studies. 
Following this project we investigated the relationship between the regulation of metabolism by changes in the cellular gene expression patterns and the flux distributions of the metabolic steady-state network. Modelling of a complex organismal property, the cell size development of the planktonic diatom Pseudo-nitzschia delicatissima, concludes this work.
KW  - Bioinformatik
KW  - Würzburg / Universität / Lehrstuhl für Bioinformatik
KW  - Modellierung
KW  - Metabolismus
KW  - Stoffwechsel
KW  - Transkriptom
KW  - Transkriptomanalyse
KW  - bioinformatics
KW  - metabolome
KW  - transcriptome
KW  - modeling
KW  - steady-state
Y1  - 2008
U6  - http://nbn-resolving.de/urn/resolver.pl?urn:nbn:de:bvb:20-opus-27622
ER  - 
TY  - JOUR
A1  - Schulze, Katja
A1  - Tillich, Ulrich M.
A1  - Dandekar, Thomas
A1  - Frohme, Marcus
T1  - PlanktoVision – an automated analysis system for the identification of phytoplankton
JF  - BMC Bioinformatics
N2  - Background

Phytoplankton communities are often used as a marker for the determination of fresh water quality. The routine analysis, however, is very time consuming and expensive as it is carried out manually by trained personnel. The goal of this work is to develop a system for an automated analysis.

Results

A novel open source system for the automated recognition of phytoplankton by the use of microscopy and image analysis was developed. It integrates the segmentation of the organisms from the background, the calculation of a large range of features, and a neural network for the classification of imaged organisms into different groups of plankton taxa. The analysis of samples containing 10 different taxa showed an average recognition rate of 94.7% and an average error rate of 5.5%. The presented system has a flexible framework which easily allows expanding it to include additional taxa in the future.

Conclusions

The implemented automated microscopy and the new open source image analysis system - PlanktoVision - showed classification results that were comparable to or better than those of existing systems, and the exclusion of non-plankton particles could be greatly improved. The software package is published as free software and is available to anyone to help make the analysis of water quality more reproducible and cost effective.
KW  - Bioinformatik
Y1  - 2013
U6  - http://nbn-resolving.de/urn/resolver.pl?urn:nbn:de:bvb:20-opus-96395
UR  - http://www.biomedcentral.com/1471-2105/14/115
ER  - 
TY  - THES
A1  - Schulze, Katja
T1  - Automatisierte Klassifizierung und Viabilitätsanalyse von Phytoplankton
T1  - Automated classification and viability analysis for phytoplankton
N2  - Zentrales Ziel dieser Arbeit war es, Methoden der Mikroskopie, Bildverarbeitung und Bilderkennung für die Charakterisierungen verschiedener Phytoplankter zu nutzen, um deren Analyse zu verbessern und zu vereinfachen.
Der erste Schwerpunkt der Arbeit lag auf der Analyse von Phytoplanktongemeinschaften, die im Rahmen der Überprüfung der Süßwasserqualität als Marker dienen. Die konventionelle Analyse ist dabei sehr aufwendig, da diese noch immer vollständig von Hand durchgeführt wird und hierfür speziell ausgebildetes Personal eingesetzt werden muss. Ziel war es, ein System zur automatischen Erkennung aufzubauen, um die Analyse vereinfachen zu können. Mit Hilfe von automatischer Mikroskopie war es möglich, Plankter unterschiedlicher Ausdehnung durch die Integration mehrerer Schärfeebenen besser in einem Bild aufzunehmen. Weiterhin wurden verschiedene Fluoreszenzeigenschaften in die Analyse integriert. Mit einem für ImageJ erstellten Plugin können Organismen vom Hintergrund der Aufnahmen abgetrennt und eine Vielzahl von Merkmalen berechnet werden. Über das Training von neuronalen Netzen wird die Unterscheidung von verschiedenen Gruppen von Planktontaxa möglich. Zudem können weitere Taxa einfach in die Analyse integriert und die Erkennung erweitert werden. Die erste Analyse von Mischproben, bestehend aus 10 verschiedenen Taxa, zeigte dabei eine durchschnittliche Erkennungsrate von 94.7% und eine durchschnittliche Falsch-Positiv-Rate von 5.5%. Im Vergleich mit bestehenden Systemen konnte die Erkennungsrate verbessert und die Falsch-Positiv-Rate deutlich gesenkt werden. Bei einer Erweiterung des Datensatzes auf 22 Taxa wurde darauf geachtet, Arten zu verwenden, die verschiedene Stadien in ihrem Wachstum durchlaufen oder höhere Ähnlichkeiten zu den bereits vorhandenen Arten aufweisen, um evtl. Schwachstellen des Systemes erkennen zu können. Hier ergab sich eine gute Erkennungsrate (86.8%), bei der der Ausschluss von nicht-planktonischen Partikeln (11.9%) weiterhin verbessert war. Der Vergleich mit weiteren Klassifikationsverfahren zeigte, dass neuronale Netze anderen Verfahren bei dieser Problemstellung überlegen sind. 
Ähnlich gute Klassifikationsraten konnten durch Support Vektor Maschinen erzielt werden. Allerdings waren diese bei der Unterscheidung von unbekannten Partikeln dem neuronalen Netz deutlich unterlegen.
Der zweite Abschnitt stellt die Entwicklung einer einfachen Methode zur Viabilitätsanalyse von Cyanobakterien, bei der keine weitere Behandlung der Proben notwendig ist, dar. Dabei wird die rote Chlorophyll-Autofluoreszenz als Marker für lebende Zellen und eine grüne unspezifische Fluoreszenz als Marker für tote Zellen genutzt. Der Assay wurde mit dem Modellorganismus Synechocystis sp. PCC 6803 etabliert und validiert. Die Auswahl eines geeigneten Filtersets ermöglicht es, beide Signale gleichzeitig anzuregen und zu beobachten und somit direkt zwischen lebenden und toten Zellen zu unterscheiden. Die Ergebnisse zur Etablierung des Assays konnten durch Ausplattieren, Chlorophyllbestimmung und Bestimmung des Absorptionsspektrums bestätigt werden. Durch den Einsatz von automatisierter Mikroskopie und einem neu erstellten ImageJ Plugin wurde eine sehr genaue und schnelle Analyse der Proben möglich. Der Einsatz beim Monitoring einer mutagenisierten Kultur zur Erhöhung der Temperaturtoleranz ermöglichte genaue und zeitnahe Einblicke in den Zustand der Kultur. Weitere Ergebnisse weisen darauf hin, dass die Kombination mit Absorptionsspektren es ermöglichen kann, bessere Einblicke in die Vitalität der Kultur zu erhalten.
N2  - The central goal of this work was to improve and simplify the characterization of different phytoplankters by the use of automated microscopy, image processing and image analysis.
The first part of the work dealt with the analysis of phytoplankton communities, which are used as a marker for the determination of fresh water quality. The current routine analysis is very time consuming and expensive, as it is carried out manually by trained personnel. Thus the goal of this work was to develop a system for automating the analysis. With the use of automated microscopy, different focal planes could be integrated into one image, which made it possible to image plankters lying at different focus levels simultaneously. Additionally it allowed the integration of different fluorescence characteristics into the analysis. An image processing routine, developed in ImageJ, allows the segmentation of organisms from the image background and the calculation of a large range of features. Neural networks are then used for the classification into previously defined groups of plankton taxa. The program allows easy integration of additional taxa and expansion of the recognition targets. The analysis of samples containing 10 different taxa showed an average recognition rate of 94.7% and an average error rate of 5.5%. The obtained recognition rate was better than that of existing systems, and the exclusion of non-plankton particles could be greatly improved. After extending the data set to 22 different classes of (more demanding) taxa, a still good recognition rate (86.9 %) and a still improved error rate (11.9 %) were obtained. This extended set was specifically selected in order to target potential weaknesses of the system. It contained mainly taxa that showed strong similarities to each other or taxa that go through various different morphological stages during their growth. The obtained recognition rates were comparable to or better than those of existing systems, and the exclusion of non-plankton particles could be greatly improved. 
A comparison of different classification methods showed that neural networks are superior to all other investigated methods when used for this specific task. While similar recognition rates could be achieved with support vector machines, they were vastly inferior in the differentiation of unknown particles.
The second part focused on the development of a simple live-dead assay for unicellular cyanobacteria without the need for sample preparation. The assay uses red chlorophyll fluorescence, corresponding to viable cells, and an unspecific green autofluorescence that can only be observed in non-viable cells. The assay was established and validated for the model organism Synechocystis sp. PCC 6803. With the selection of a suitable filter set both signals could be excited and observed simultaneously, allowing a direct classification of viable and non-viable cells. The results were confirmed by plating/colony count, absorption spectra and chlorophyll measurements. The use of an automated fluorescence microscope and an ImageJ based image analysis plugin allows a very precise and fast analysis. The monitoring of a randomly mutagenized culture undergoing selection for improved temperature tolerance allowed an accurate and prompt insight into the condition of the culture. Further results indicate that a combination of the new assay with absorption spectra or chlorophyll concentration measurements allows the estimation of the vitality of cells.
KW  - Bilderkennung
KW  - Bioinformatik
KW  - Phytoplankton
KW  - Bilderkennung
KW  - Phytoplankton
KW  - Viabilität
KW  - Mikroskopie
KW  - Bioinformatik
Y1  - 2014
U6  - http://nbn-resolving.de/urn/resolver.pl?urn:nbn:de:bvb:20-opus-107174
ER  -