TY  - THES
A1  - Somody, Joseph Christian Campbell
T1  - Leveraging deep learning for identification and structural determination of novel protein complexes from in situ electron cryotomography of Mycoplasma pneumoniae
T1  - Tiefenlernen als Werkzeug zur Identifizierung und Strukturbestimmung neuer Proteinkomplexe aus der In-situ-Elektronenkryotomographie von Mycoplasma pneumoniae
N2  - The holy grail of structural biology is to study a protein in situ, and this goal has been fast approaching since the resolution revolution and the achievement of atomic resolution. A cell's interior is not a dilute environment, and proteins have evolved to fold and function as needed in that environment; as such, an investigation of a cellular component should ideally include the full complexity of the cellular environment. Imaging whole cells in three dimensions using electron cryotomography is the best method to accomplish this goal, but it comes with a limitation on sample thickness and produces noisy data unamenable to direct analysis. This thesis establishes a novel workflow to systematically analyse whole-cell electron cryotomography data in three dimensions and to find and identify instances of protein complexes in the data, laying the groundwork for determining their structure and identity. Mycoplasma pneumoniae is a very small parasitic bacterium with fewer than 700 protein-coding genes, is thin enough and small enough to be imaged in large quantities by electron cryotomography, and can grow directly on the grids used for imaging, making it ideal for exploratory studies in structural proteomics. As part of the workflow, a methodology for training deep-learning-based particle-picking models is established.

As a proof of principle, a dataset of whole-cell Mycoplasma pneumoniae tomograms is used with this workflow to characterize a novel membrane-associated complex observed in the data. Ultimately, 25431 such particles are picked from 353 tomograms and refined to a density map with a resolution of 11 Å. Using orthogonal datasets to filter the search space and verify results, structures were predicted for candidate proteins and checked for suitable fit in the density map. In the end, with this approach, nine proteins were found to be part of the complex, which appears to be associated with chaperone activity and to interact with translocon machinery.

Visual proteomics refers to the ultimate potential of in situ electron cryotomography: the comprehensive interpretation of tomograms. The workflow presented here is demonstrated to help in reaching that potential.
N2  - Der heilige Gral der Strukturbiologie ist die Untersuchung eines Proteins in situ, und dieses Ziel ist seit der Auflösungsrevolution und dem Erreichen der atomaren Auflösung in greifbare Nähe gerückt. Das Innere einer Zelle ist keine verdünnte Umgebung, und Proteine haben sich so entwickelt, dass sie sich falten und so funktionieren, wie es in dieser Umgebung erforderlich ist; daher sollte die Untersuchung einer zellulären Komponente idealerweise die gesamte Komplexität der zellulären Umgebung umfassen. Die Abbildung ganzer Zellen in drei Dimensionen mit Hilfe der Elektronenkryotomographie ist die beste Methode, um dieses Ziel zu erreichen, aber sie ist mit einer Beschränkung der Probendicke verbunden und erzeugt verrauschte Daten, die sich nicht für eine direkte Analyse eignen. In dieser Dissertation wird ein neuartiger Workflow zur systematischen dreidimensionalen Analyse von Ganzzell-Elektronenkryotomographiedaten und zur Auffindung und Identifizierung von Proteinkomplexen in diesen Daten entwickelt, um eine erfolgreiche Bestimmung ihrer Struktur und Identität zu ermöglichen. Mycoplasma pneumoniae ist ein sehr kleines parasitäres Bakterium mit weniger als 700 proteinkodierenden Genen. Es ist dünn und klein genug, um in grossen Mengen durch Elektronenkryotomographie abgebildet zu werden, und kann direkt auf den für die Abbildung verwendeten Gittern wachsen, was es ideal für Sondierungsstudien in der strukturellen Proteomik macht. Als Teil des Workflows wird eine Methodik für das Training von Deep-Learning-basierten Partikelpicken-Modellen entwickelt.

Als Proof-of-Principle wird ein Dataset von Ganzzell-Tomogrammen von Mycoplasma pneumoniae mit diesem Workflow verwendet, um einen neuartigen membranassoziierten Komplex zu charakterisieren, der in den Daten beobachtet wurde. Insgesamt wurden 25431 solcher Partikel aus 353 Tomogrammen gepickt und zu einer Dichtekarte mit einer Auflösung von 11 Å verfeinert. Unter Verwendung orthogonaler Datensätze zur Filterung des Suchraums und zur Überprüfung der Ergebnisse wurden Strukturen für Protein-Kandidaten vorhergesagt und auf ihre Eignung für die Dichtekarte überprüft. Letztendlich wurden mit diesem Ansatz neun Proteine als Bestandteile des Komplexes gefunden, der offenbar mit der Chaperonaktivität in Verbindung steht und mit der Translocon-Maschinerie interagiert.

Das ultimative Potenzial der In-situ-Elektronenkryotomographie – die umfassende Interpretation von Tomogrammen – wird als visuelle Proteomik bezeichnet. Der hier vorgestellte Workflow soll dabei helfen, dieses Potenzial auszuschöpfen.
KW  - Kryoelektronenmikroskopie
KW  - Tomografie
KW  - Mycoplasma pneumoniae
KW  - Deep learning
KW  - cryo-EM
KW  - cryo-ET
KW  - tomography
KW  - mycoplasma
KW  - pneumoniae
KW  - deep learning
KW  - particle picking
KW  - membrane protein
KW  - visual proteomics
Y1  - 2023
U6  - http://nbn-resolving.de/urn/resolver.pl?urn:nbn:de:bvb:20-opus-313447
ER  - 
TY  - JOUR
A1  - Krenzer, Adrian
A1  - Makowski, Kevin
A1  - Hekalo, Amar
A1  - Fitting, Daniel
A1  - Troya, Joel
A1  - Zoller, Wolfram G.
A1  - Hann, Alexander
A1  - Puppe, Frank
T1  - Fast machine learning annotation in the medical domain: a semi-automated video annotation tool for gastroenterologists
JF  - BioMedical Engineering OnLine
N2  - Background
Machine learning, especially deep learning, is becoming more and more relevant in research and development in the medical domain. For all supervised deep learning applications, data is the most critical factor in securing successful implementation and sustaining the progress of the machine learning model. Gastroenterological data in particular, which often involve endoscopic videos, are cumbersome to annotate. Domain experts are needed to interpret and annotate the videos. To support those domain experts, we developed a framework. With this framework, instead of annotating every frame in the video sequence, experts only perform key annotations at the beginning and the end of sequences with pathologies, e.g., visible polyps. Subsequently, non-expert annotators supported by machine learning add the missing annotations for the frames in between.
Methods
In our framework, an expert reviews the video and annotates a few video frames to verify the object’s annotations for the non-expert. In a second step, a non-expert has visual confirmation of the given object and can annotate all following and preceding frames with AI assistance. After the expert has finished, relevant frames are selected and passed on to an AI model. This information allows the AI model to detect and mark the desired object on all following and preceding frames with an annotation. The non-expert can then adjust and modify the AI predictions and export the results, which can then be used to train the AI model.
Results
Using this framework, we were able to reduce the workload of domain experts by a factor of 20 on average on our data. This is primarily due to the structure of the framework, which is designed to minimize the workload of the domain expert. Pairing this framework with a state-of-the-art semi-automated AI model enhances the annotation speed further. Through a prospective study with 10 participants, we show that semi-automated annotation using our tool doubles the annotation speed of non-expert annotators compared to a well-known state-of-the-art annotation tool.
Conclusion
In summary, we introduce a framework for fast expert annotation for gastroenterologists, which reduces the workload of the domain expert considerably while maintaining a very high annotation quality. The framework incorporates a semi-automated annotation system utilizing trained object detection models. The software and framework are open-source.
KW  - object detection
KW  - machine learning
KW  - deep learning
KW  - annotation
KW  - endoscopy
KW  - gastroenterology
KW  - automation
Y1  - 2022
U6  - http://nbn-resolving.de/urn/resolver.pl?urn:nbn:de:bvb:20-opus-300231
VL  - 21
IS  - 1
ER  - 
TY  - JOUR
A1  - Ankenbrand, Markus J.
A1  - Shainberg, Liliia
A1  - Hock, Michael
A1  - Lohr, David
A1  - Schreiber, Laura M.
T1  - Sensitivity analysis for interpretation of machine learning based segmentation models in cardiac MRI
JF  - BMC Medical Imaging
N2  - Background

Image segmentation is a common task in medical imaging, e.g., for volumetry analysis in cardiac MRI. Artificial neural networks are used to automate this task with performance similar to manual operators. However, this performance is only achieved in the narrow tasks the networks are trained on. Performance drops dramatically when data characteristics differ from the training set properties. Moreover, neural networks are commonly considered black boxes, because it is hard to understand how they make decisions and why they fail. Therefore, it is also hard to predict whether they will generalize and work well with new data. Here we present a generic method for segmentation model interpretation. Sensitivity analysis is an approach where model input is modified in a controlled manner and the effect of these modifications on the model output is evaluated. This method yields insights into the sensitivity of the model to these alterations and therefore into the importance of certain features for segmentation performance.

Results

We present an open-source Python library (misas) that facilitates the use of sensitivity analysis with arbitrary data and models. We show that this method is a suitable approach to answer practical questions regarding the use and functionality of segmentation models. We demonstrate this in two case studies on cardiac magnetic resonance imaging. The first case study explores the suitability of a published network for use on a public dataset the network has not been trained on. The second case study demonstrates how sensitivity analysis can be used to evaluate the robustness of a newly trained model.

Conclusions

Sensitivity analysis is a useful tool for deep learning developers as well as users such as clinicians. It extends their toolbox, enabling and improving interpretability of segmentation models. Enhancing our understanding of neural networks through sensitivity analysis also assists in decision making. Although demonstrated only on cardiac magnetic resonance images, this approach and software are much more broadly applicable.
KW  - deep learning
KW  - neural networks
KW  - cardiac magnetic resonance
KW  - sensitivity analysis
KW  - transformations
KW  - augmentation
KW  - segmentation
Y1  - 2021
U6  - http://nbn-resolving.de/urn/resolver.pl?urn:nbn:de:bvb:20-opus-259169
VL  - 21
IS  - 1
ER  -