The aim of this thesis is to present a visual body weight estimation approach suitable for medical applications. A typical scenario in which estimating the body weight is essential is the emergency treatment of stroke patients: in the case of an ischemic stroke, the patient has to receive a body-weight-adapted drug to dissolve a blood clot in a vessel. The accuracy of the estimated weight directly influences the outcome of the therapy. However, the treatment has to start as early as possible after arrival in the trauma room to be effective. Weighing a patient takes time, and the patient has to be moved. Furthermore, patients are often unable to communicate their body weight because of their stroke symptoms. Therefore, it is state of the art that physicians guess the body weight. A patient receiving too low a dose has an increased risk that the blood clot does not dissolve and brain tissue is permanently damaged; today, about one-third of patients receive an insufficient dose. Conversely, an overdose can cause bleeding and further complications. Physicians are aware of this issue, but a reliable alternative is missing.
The thesis presents state-of-the-art principles and devices for the measurement and estimation of body weight in the context of medical applications. While scales are common and available in a hospital, the process of weighing takes too long and can hardly be integrated into the stroke treatment workflow. Sensor systems and algorithms are presented in the related-work section and provide an overview of different approaches.
The system presented here, called Libra3D, consists of a computer installed in a real trauma room and visual sensors integrated into the ceiling. For the estimation of the body weight, the patient lies on a stretcher placed in the field of view of the sensors. The three sensors, two RGB-D cameras and a thermal camera, are calibrated intrinsically and extrinsically. In addition, algorithms for sensor fusion are presented to align the data from all sensors, which is the basis for a reliable segmentation of the patient.
A combination of state-of-the-art image and point cloud algorithms is used to localize the patient on the stretcher. The main challenge in this scenario is the dynamic environment, which includes other people and medical devices in the field of view.
After successful segmentation, a set of hand-crafted features is extracted from the patient's point cloud. These features rely on geometric and statistical values and provide a robust input to a subsequent machine learning approach. The final estimation is performed by a previously trained artificial neural network.
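As an illustration of this pipeline, the following minimal sketch shows how geometric and statistical features could be extracted from a segmented patient point cloud and fed to a small neural-network regressor; the feature set, array shapes, and network size are assumptions for illustration and not the exact Libra3D feature vector.

```python
# Minimal sketch: hand-crafted geometric/statistical features from a segmented
# patient point cloud, fed to a small neural-network regressor. The feature set
# and network size are illustrative, not the exact Libra3D design.
import numpy as np
from sklearn.neural_network import MLPRegressor

def extract_features(points: np.ndarray) -> np.ndarray:
    """points: (N, 3) array of the patient's segmented point cloud in metres."""
    extent = points.max(axis=0) - points.min(axis=0)      # bounding-box length, width, height
    spread = points.std(axis=0)                           # statistical spread along each axis
    height = points[:, 2] - points[:, 2].min()            # height above the stretcher plane
    volume_proxy = extent[0] * extent[1] * height.mean()  # coarse body-volume surrogate
    return np.concatenate([extent, spread, [height.mean(), volume_proxy]])

def train_weight_estimator(clouds, weights_kg):
    """Train on previously recorded (point cloud, ground-truth weight) pairs."""
    X = np.vstack([extract_features(c) for c in clouds])
    model = MLPRegressor(hidden_layer_sizes=(32, 16), max_iter=2000, random_state=0)
    model.fit(X, weights_kg)
    return model
```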
The experiment section evaluates different configurations of the extracted feature vector. Additionally, the approach presented here is compared to state-of-the-art methods: the patient's own assessment, the physician's guess, and an anthropometric estimation. Apart from the patient's own estimation, Libra3D outperforms all of these methods: 95 percent of all patients are estimated with a relative error of less than 10 percent with respect to the ground truth body weight. The measurement takes only a minimal amount of time, and the approach can easily be integrated into the treatment of stroke patients without hindering physicians.
Furthermore, the experiment section demonstrates two additional applications: the extracted features can also be used to estimate the body weight of people standing, or even walking, in front of a 3D camera. It is also possible to determine or classify the BMI of a subject on a stretcher. A potential application of this approach is reducing the radiation dose of patients exposed to X-rays during a CT examination.
During the work on this thesis, several data sets were recorded. These data sets contain the ground truth body weight as well as the sensor data, and they are available for collaboration in the field of body weight estimation for medical applications.
Acceleration is a central aim of clinical and technical research in magnetic resonance imaging (MRI) today, with the potential to increase robustness, accessibility and patient comfort, reduce cost, and enable entirely new kinds of examinations. A key component in this endeavor is image reconstruction, as most modern approaches build on advanced signal and image processing. Here, deep learning (DL)-based methods have recently shown considerable potential, with numerous publications demonstrating benefits for MRI reconstruction. However, these methods often come at the cost of an increased risk for subtle yet critical errors. Therefore, the aim of this thesis is to advance DL-based MRI reconstruction while ensuring high quality and fidelity with measured data. A network architecture specifically suited for this purpose is the variational network (VN). To investigate the benefits VNs can bring to non-Cartesian cardiac imaging, the first part presents an application of VNs specifically adapted to the reconstruction of accelerated spiral acquisitions. The proposed method is compared to a segmented exam, a U-Net, and a compressed sensing (CS) model using qualitative and quantitative measures. While the U-Net performed poorly, the VN as well as the CS reconstruction showed good output quality. In functional cardiac imaging, the proposed real-time method with VN reconstruction substantially accelerates examinations over the gold standard, from more than 10 minutes to just 1 minute. Clinical parameters agreed on average.
Generally in MRI reconstruction, the assessment of image quality is complex, in particular for modern non-linear methods. Therefore, advanced techniques for precise evaluation of quality were subsequently demonstrated.
With two distinct methods, resolution and amplification or suppression of noise are quantified locally in each pixel of a reconstruction. Using these, local maps of resolution and noise in parallel imaging (GRAPPA), CS, U-Net and VN reconstructions were determined for MR images of the brain. In the tested images, GRAPPA delivers uniform and ideal resolution, but amplifies noise noticeably. The other methods adapt their behavior to image structure, where different levels of local blurring were observed at edges compared to homogeneous areas, and noise was suppressed except at edges. Overall, VNs were found to combine a number of advantageous properties, including a good trade-off between resolution and noise, fast reconstruction times, and high overall image quality and fidelity of the produced output. Therefore, this network architecture seems highly promising for MRI reconstruction.
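To illustrate the underlying principle, the sketch below shows a single unrolled reconstruction step combining a data-consistency gradient with a regularizer term; it assumes Cartesian FFT undersampling and a fixed smoothing filter in place of the learned filters and activations of an actual VN, and the non-Cartesian spiral sampling used in the thesis is not modeled.

```python
# Simplified sketch of one unrolled variational-network-style step for MRI
# reconstruction: a data-consistency gradient step on undersampled k-space plus a
# regularizer term. The thesis uses non-Cartesian (spiral) sampling and trained
# filter/activation parameters; here Cartesian FFT sampling and a fixed smoothing
# filter stand in for the learned components.
import numpy as np
from scipy.ndimage import convolve

def vn_step(x, y, mask, lam=1.0, step=0.5):
    """x: current image estimate, y: measured k-space, mask: sampling mask (0/1)."""
    # Data-consistency gradient: A^H (A x - y) with A = mask * FFT
    k = mask * np.fft.fft2(x)
    grad_dc = np.fft.ifft2(mask * (k - y))
    # Stand-in for the learned regularizer: a small convolutional penalty gradient
    kernel = np.array([[0, -1, 0], [-1, 4, -1], [0, -1, 0]], dtype=float)
    grad_reg = convolve(x.real, kernel, mode="nearest")
    return x - step * (grad_dc + lam * grad_reg)

def reconstruct(y, mask, n_steps=10):
    """Reconstruction as a fixed number of unrolled steps from a zero-filled image."""
    x = np.fft.ifft2(mask * y)
    for _ in range(n_steps):
        x = vn_step(x, y, mask)
    return np.abs(x)
```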
This paper discusses the categorization of Quranic chapters by the major phases of Prophet Mohammad’s messengership using machine learning algorithms. First, the chapters were categorized by place of revelation using Support Vector Machine and naïve Bayesian classifiers separately, and their results were compared to each other as well as to the existing traditional Islamic and Western orientalist classifications. The chapters were categorized into Meccan (revealed in Mecca) and Medinan (revealed in Medina). After that, the chapters of each category were clustered using a fuzzy single-linkage clustering approach in order to correspond to the major phases of Prophet Mohammad’s life. These major phases were derived manually from the Quranic text as well as from secondary Islamic literature, e.g., hadiths and exegesis. Previous studies on computing the places of revelation of Quranic chapters relied heavily on features extracted from existing background knowledge of the chapters. For instance, it is known that Meccan chapters contain mostly verses about faith and related problems, while Medinan ones encompass verses dealing with social issues, battles, etc. These features are by themselves insufficient as a basis for assigning the chapters to their respective places of revelation; in fact, there are exceptions, since some chapters contain both Meccan and Medinan features. In this study, the features of each category were created automatically from very few chapters whose places of revelation had been determined through the identification of historical facts and events such as battles or the migration to Medina. Chapters with unanimously agreed places of revelation were used as the initial training set, while the remaining chapters formed the testing set. The classification process was made recursive by regularly augmenting the training set with correctly classified chapters in order to classify the whole testing set. Each chapter was preprocessed by removing unimportant words, stemming, and representation with a vector space model. The results of this study show that the two classifiers produced usable results, with the support vector machine classifier outperforming the naïve Bayesian classifier. The study indicates that the proposed methodology yields encouraging results for arranging Quranic chapters by the phases of Prophet Mohammad’s messengership.
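The recursive classification scheme described above can be conveyed with the following sketch, in which a TF-IDF representation stands in for the described preprocessing (stopword removal, stemming, vector space model) and a linear SVM labels the most confidently classified chapters before adding them to the training set; all names and thresholds are illustrative.

```python
# Sketch of the recursive (self-training) classification scheme: chapters with an
# agreed place of revelation form the initial training set, the classifier labels
# the remaining chapters, and confidently classified chapters are moved into the
# training set until all chapters are labelled. TfidfVectorizer stands in for the
# stopword-removal/stemming/vector-space preprocessing described in the paper.
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.svm import LinearSVC

def recursive_classify(train_texts, train_labels, test_texts, batch=5):
    vec = TfidfVectorizer().fit(list(train_texts) + list(test_texts))
    train_X, train_y = list(train_texts), list(train_labels)
    remaining = list(test_texts)
    predictions = {}
    while remaining:
        clf = LinearSVC().fit(vec.transform(train_X), train_y)
        scores = clf.decision_function(vec.transform(remaining))
        # Take the most confident chapters, label them, and add them to training.
        order = np.argsort(-np.abs(scores))[:batch]
        for i in sorted(order, reverse=True):
            text = remaining.pop(i)
            label = clf.classes_[int(scores[i] > 0)]
            predictions[text] = label
            train_X.append(text)
            train_y.append(label)
    return predictions  # e.g. {"chapter text ...": "Meccan", ...}
```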
Deep Learning (DL) models are trained on a downstream task by feeding (potentially preprocessed) input data through a trainable Neural Network (NN) and updating its parameters to minimize the loss function between the predicted and the desired output. While this general framework has mainly remained unchanged over the years, the architectures of the trainable models have greatly evolved. Even though it is undoubtedly important to choose the right architecture, we argue that it is also beneficial to develop methods that address other components of the training process. We hypothesize that utilizing domain knowledge can be helpful to improve DL models in terms of performance and/or efficiency. Such model-agnostic methods can be applied to any existing or future architecture. Furthermore, the black box nature of DL models motivates the development of techniques to understand their inner workings. Considering the rapid advancement of DL architectures, it is again crucial to develop model-agnostic methods.
In this thesis, we explore six principles that incorporate domain knowledge to understand or improve models. They are applied either on the input or output side of the trainable model. Each principle is applied to at least two DL tasks, leading to task-specific implementations. To understand DL models, we propose to use Generated Input Data coming from a controllable generation process requiring knowledge about the data properties. This way, we can understand the model’s behavior by analyzing how it changes when one specific high-level input feature changes in the generated data. On the output side, Gradient-Based Attribution methods create a gradient at the end of the NN and then propagate it back to the input, indicating which low-level input features have a large influence on the model’s prediction. The resulting input features can be interpreted by humans using domain knowledge.
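A minimal sketch of such a gradient-based attribution, assuming a PyTorch classifier, is shown below; it simply backpropagates the predicted class score to the input and returns the absolute input gradient as the attribution map.

```python
# Minimal sketch of a gradient-based attribution (saliency) map: the gradient of
# the predicted class score with respect to the input indicates which low-level
# input features most influence the model's prediction.
import torch

def input_gradient_attribution(model: torch.nn.Module, x: torch.Tensor) -> torch.Tensor:
    """x: a single input example, e.g. an image tensor of shape (1, C, H, W)."""
    model.eval()
    x = x.clone().requires_grad_(True)
    scores = model(x)                        # class scores, shape (1, n_classes)
    top_class = int(scores.argmax(dim=-1))   # the predicted class
    scores[0, top_class].backward()          # backpropagate its score to the input
    return x.grad.abs()                      # attribution map, same shape as the input
```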
To improve the trainable model in terms of downstream performance, data and compute efficiency, or robustness to unwanted features, we explore principles that each address one of the training components besides the trainable model. Input Masking and Augmentation directly modifies the training input data, integrating knowledge about the data and its impact on the model’s output. We also explore the use of Feature Extraction using Pretrained Multimodal Models which can be seen as a beneficial preprocessing step to extract useful features. When no training data is available for the downstream task, using such features and domain knowledge expressed in other modalities can result in a Zero-Shot Learning (ZSL) setting, completely eliminating the trainable model. The Weak Label Generation principle produces new desired outputs using knowledge about the labels, giving either a good pretraining or even exclusive training dataset to solve the downstream task. Finally, improving and choosing the right Loss Function is another principle we explore in this thesis. Here, we enrich existing loss functions with knowledge about label interactions or utilize and combine multiple task-specific loss functions in a multitask setting.
We apply the principles to classification, regression, and representation tasks as well as to image and text modalities. We propose, apply, and evaluate existing and novel methods to understand and improve the model. Overall, this thesis introduces and evaluates methods that complement the development and choice of DL model architectures.
Traditional fashion retailers are increasingly hard-pressed to keep up with their digital competitors. In this context, the re-invention of brick-and-mortar stores as smart retail environments is being touted as a crucial step towards regaining a competitive edge. This thesis describes a design-oriented research project that deals with automated product tracking on the sales floor and presents three smart fashion store applications that are tied to such localization information: (i) an electronic article surveillance (EAS) system that distinguishes between theft and non-theft events, (ii) an automated checkout system that detects customers’ purchases when they are leaving the store and associates them with individual shopping baskets to automatically initiate payment processes, and (iii) a smart fitting room that detects the items customers bring into individual cabins and identifies the items they are currently most interested in to offer additional customer services (e.g., product recommendations or omnichannel services). The implementation of such cyber-physical systems in established retail environments is challenging, as architectural constraints, well-established customer processes, and customer expectations regarding privacy and convenience constrain the system design. To overcome these challenges, this thesis leverages Radio Frequency Identification (RFID) technology and machine learning techniques to address the different detection tasks. To optimally configure the systems and draw robust conclusions regarding their economic value contribution beyond technological performance criteria, this thesis furthermore introduces a service operations model that allows mapping the systems’ technical detection characteristics to business-relevant metrics such as service quality and profitability. This analytical model reveals that the same system component for the detection of object transitions is well suited for the EAS application but does not have the necessary high detection accuracy to be used as a component of an automated checkout system.
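For illustration, the following sketch frames one such detection task, deciding whether a tagged item actually passed the exit gate, as supervised classification over features aggregated from RFID read events; the feature layout and classifier choice are assumptions, not the system design reported in the thesis.

```python
# Illustrative sketch: framing a gate-transition detection task (e.g. theft vs.
# non-theft at the store exit) as supervised classification over features
# aggregated from RFID read events. The feature layout and classifier choice are
# assumptions for illustration, not the exact system described in the thesis.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

def train_transition_detector(read_features: np.ndarray, labels: np.ndarray):
    """read_features: one row per tagged item passing the exit zone, e.g.
    [number of reads, mean RSSI, RSSI slope over time, share of reads at exit antenna];
    labels: 1 if the item actually left the store, 0 otherwise."""
    return RandomForestClassifier(n_estimators=200, random_state=0).fit(read_features, labels)

def is_exit_event(model, read_features: np.ndarray) -> bool:
    """Classify a new aggregated read-event feature vector."""
    return bool(model.predict(read_features.reshape(1, -1))[0])
```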
Digitization and artificial intelligence are radically changing virtually all areas across business and society. These developments are mainly driven by the technology of machine learning (ML), which is enabled by the coming together of large amounts of training data, statistical learning theory, and sufficient computational power. This technology forms the basis for the development of new approaches to solve classical planning problems of Operations Research (OR): prescriptive analytics approaches integrate ML prediction and OR optimization into a single prescription step, so they learn from historical observations of demand and a set of features (co-variates) and provide a model that directly prescribes future decisions. These novel approaches provide enormous potential to improve planning decisions, as first case reports showed, and, consequently, constitute a new field of research in Operations Management (OM).
First works in this new field of research have studied approaches to solving comparatively simple planning problems in the area of inventory management. However, common OM planning problems often have a more complex structure, and many of these complex planning problems are within the domain of capacity planning. Therefore, this dissertation focuses on developing new prescriptive analytics approaches for complex capacity management problems. This dissertation consists of three independent articles that develop new prescriptive approaches and use these to solve realistic capacity planning problems.
The first article, “Prescriptive Analytics for Flexible Capacity Management”, develops two prescriptive analytics approaches, weighted sample average approximation (wSAA) and kernelized empirical risk minimization (kERM), to solve a complex two-stage capacity planning problem that has been studied extensively in the literature: a logistics service provider sorts daily incoming mail items on three service lines that must be staffed on a weekly basis. This article is the first to develop a kERM approach to solve a complex two-stage stochastic capacity planning problem with matrix-valued observations of demand and vector-valued decisions. The article develops out-of-sample performance guarantees for kERM and various kernels, and shows the universal approximation property when using a universal kernel. The results of the numerical study suggest that prescriptive analytics approaches may lead to significant improvements in performance compared to traditional two-step approaches or SAA and that their performance is more robust to variations in the exogenous cost parameters.
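The wSAA principle can be illustrated with the following sketch for a simple newsvendor-style capacity decision, where k-nearest-neighbor weights derived from the feature data reweight the historical demand observations; the actual planning problems in the article are two-stage with matrix-valued demand, so this only conveys the idea.

```python
# Sketch of weighted sample average approximation (wSAA) for a newsvendor-style
# capacity decision: historical demand observations are weighted by the similarity
# of their features to the current feature vector (here via k-nearest neighbours),
# and the decision minimizes the weighted sample-average cost. Cost parameters and
# the kNN weighting are illustrative assumptions.
import numpy as np
from sklearn.neighbors import NearestNeighbors

def wsaa_capacity(X_hist, d_hist, x_new, underage=10.0, overage=3.0, k=20):
    """X_hist: (n, p) historical features, d_hist: (n,) demands, x_new: (p,) features."""
    nn = NearestNeighbors(n_neighbors=min(k, len(d_hist))).fit(X_hist)
    _, idx = nn.kneighbors(x_new.reshape(1, -1))
    weights = np.zeros(len(d_hist))
    weights[idx[0]] = 1.0 / len(idx[0])          # uniform kNN weights
    # Evaluate the weighted cost over candidate capacities and pick the minimizer.
    candidates = np.unique(d_hist)
    costs = [np.sum(weights * (underage * np.maximum(d_hist - q, 0)
                               + overage * np.maximum(q - d_hist, 0)))
             for q in candidates]
    return candidates[int(np.argmin(costs))]
```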
The second article, “Prescriptive Analytics for a Multi-Shift Staffing Problem”, uses prescriptive analytics approaches to solve the (queuing-type) multi-shift staffing problem (MSSP) of an aviation maintenance provider that receives customer requests of uncertain number and at uncertain arrival times throughout each day and plans staff capacity for two shifts. This planning problem is particularly complex because the order inflow and processing are modelled as a queuing system, and the demand in each day is non-stationary. The article addresses this complexity by deriving an approximation of the MSSP that enables the planning problem to be solved using wSAA, kERM, and a novel Optimization Prediction approach. A numerical evaluation shows that wSAA leads to the best performance in this particular case. The solution method developed in this article builds a foundation for solving queuing-type planning problems using prescriptive analytics approaches, so it bridges the “worlds” of queuing theory and prescriptive analytics.
The third article, “Explainable Subgradient Tree Boosting for Prescriptive Analytics in Operations Management”, proposes a novel prescriptive analytics approach to solve the two capacity planning problems studied in the first and second articles that allows decision-makers to derive explanations for prescribed decisions: Subgradient Tree Boosting (STB). STB combines the machine learning method Gradient Boosting with SAA and relies on subgradients because the cost function of OR planning problems often cannot be differentiated. A comprehensive numerical analysis suggests that STB can lead to a prescription performance that is comparable to that of wSAA and kERM. The explainability of STB prescriptions is demonstrated by breaking exemplary decisions down into the impacts of individual features. The novel STB approach is an attractive choice not only because of its prescription performance, but also because of the explainability that helps decision-makers understand the causality behind the prescriptions.
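A compact sketch of the STB idea, with a newsvendor cost standing in for the capacity-planning cost functions of the articles, is given below; each regression tree is fitted to negative subgradients of the cost evaluated at the current prescriptions.

```python
# Sketch of the Subgradient Tree Boosting (STB) idea: boosting where each
# regression tree is fitted to negative subgradients of a non-differentiable
# operational cost, evaluated at the current prescriptions. A newsvendor cost and
# the hyperparameters stand in for the problems and tuning used in the article.
import numpy as np
from sklearn.tree import DecisionTreeRegressor

def newsvendor_subgrad(q, d, underage=10.0, overage=3.0):
    """A subgradient of cu*(d-q)^+ + co*(q-d)^+ with respect to the decision q."""
    return np.where(q < d, -underage, overage)

def fit_stb(X, d, n_rounds=100, lr=0.1, max_depth=3):
    q = np.full(len(d), d.mean())             # initial constant prescription
    trees = []
    for _ in range(n_rounds):
        residual = -newsvendor_subgrad(q, d)  # pseudo-targets: negative subgradients
        tree = DecisionTreeRegressor(max_depth=max_depth).fit(X, residual)
        trees.append(tree)
        q = q + lr * tree.predict(X)
    return d.mean(), trees

def prescribe(model, X_new, lr=0.1):
    base, trees = model
    return base + lr * sum(tree.predict(X_new) for tree in trees)
```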
The results presented in these three articles demonstrate that using prescriptive analytics approaches, such as wSAA, kERM, and STB, to solve complex planning problems can lead to significantly better decisions compared to traditional approaches that neglect feature data or rely on a parametric distribution estimation.
In the course of the growth of the Internet and due to the increasing availability of data, the field of network science has established itself over the last two decades as a research area in its own right. With quantitative scientists from computer science, mathematics, and physics working on datasets from biology, economics, sociology, political science, and many other fields, network science serves as a paradigm for interdisciplinary research.
One of the major goals in network science is to unravel the relationship between topological graph structure and a network’s function. As evidence suggests, systems from the same fields, i.e. with similar function, tend to exhibit similar structure. However, it is still vague whether a similar graph structure automatically implies likewise function. This dissertation aims at helping to bridge this gap, while particularly focusing on the role of triadic structures.
After a general introduction to the main concepts of network science, existing work devoted to the relevance of triadic substructures is reviewed. A major challenge in modeling triadic structure is the fact that not all three-node subgraphs can be specified independently of each other, as pairs of nodes may participate in multiple of those triadic subgraphs.
In order to overcome this obstacle, we suggest a novel class of generative network models based on so-called Steiner triple systems. The latter are partitions of a graph’s vertices into pair-disjoint triples (Steiner triples). Thus, the configurations on Steiner triples can be specified independently of each other without overdetermining the network’s link structure.
Subsequently, we investigate the most basic realization of this new class of models, which we call the triadic random graph model (TRGM). The TRGM is parametrized by a probability distribution over all possible triadic subgraph patterns. In order to generate a network instantiation of the model, a pattern is drawn from the distribution for every Steiner triple in the system and placed randomly on that triple. We calculate the degree distribution of the TRGM analytically and find it to be similar to a Poissonian distribution. Furthermore, it is shown that TRGMs possess non-trivial triadic structure. We discover inevitable correlations in the abundance of certain triadic subgraph patterns which should be taken into account when attributing functional relevance to particular motifs, i.e. patterns which occur significantly more frequently than expected at random. Beyond that, the strong impact of the probability distributions on the Steiner triples on the occurrence of triadic subgraphs over the whole network is demonstrated. This interdependence allows us to design ensembles of networks with predefined triadic substructure. Hence, TRGMs help to overcome the lack of generative models needed for assessing the relevance of triadic structure.
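The generation principle can be conveyed with the following toy sketch, which draws a triadic pattern per node triple from a given distribution; note that it uses merely disjoint triples rather than a full Steiner triple system covering every node pair, so it is a simplification of the actual TRGM.

```python
# Toy sketch of the triadic-random-graph idea: draw, for each triple of nodes, a
# triadic pattern from a probability distribution over the possible undirected
# three-node patterns and realize it with a random orientation on that triple.
# A proper Steiner triple system covers every pair of nodes in exactly one triple;
# for simplicity this sketch only uses disjoint triples, so it is not the full TRGM.
import random
import networkx as nx

# Undirected three-node patterns identified here by their number of edges (0..3).
PATTERN_DISTRIBUTION = {0: 0.2, 1: 0.3, 2: 0.3, 3: 0.2}

def toy_trgm(n_nodes=30, seed=0):
    rng = random.Random(seed)
    nodes = list(range(n_nodes))
    rng.shuffle(nodes)
    g = nx.empty_graph(n_nodes)
    for i in range(0, n_nodes - n_nodes % 3, 3):
        triple = nodes[i:i + 3]
        n_edges = rng.choices(list(PATTERN_DISTRIBUTION),
                              weights=list(PATTERN_DISTRIBUTION.values()))[0]
        pairs = [(triple[0], triple[1]), (triple[0], triple[2]), (triple[1], triple[2])]
        rng.shuffle(pairs)                  # place the pattern in a random orientation
        g.add_edges_from(pairs[:n_edges])
    return g
```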
We further investigate whether motifs occur homogeneously or heterogeneously distributed over a graph. To this end, we study triadic subgraph structures in each node’s neighborhood individually. In order to quantitatively measure structure from an individual node’s perspective, we introduce an algorithm for node-specific pattern mining for both directed unsigned and undirected signed networks. Analyzing real-world datasets, we find that there are networks in which motifs are distributed highly heterogeneously, bound to the proximity of only very few nodes. Moreover, we observe indications of a potential sensitivity of biological systems to a targeted removal of these critical vertices. In addition, we study whole graphs with respect to the homogeneity and homophily of their node-specific triadic structure. The former describes the similarity of subgraph distributions in the neighborhoods of individual vertices; the latter quantifies whether connected vertices are structurally more similar than non-connected ones. We discover these features to be characteristic of the networks’ origins. Moreover, by clustering the vertices of graphs with regard to their triadic structure, we investigate structural groups in the neural network of C. elegans, the international airport-connection network, and the global network of diplomatic sentiments between countries. For the latter we find evidence for the instability of triangles considered socially unbalanced according to sociological theories.
Finally, we utilize our TRGM to explore ensembles of networks with similar triadic substructure in terms of the evolution of dynamical processes acting on their nodes. Focusing on oscillators coupled along the graphs’ edges, we observe that certain triad motifs impose a clear signature on the systems’ dynamics, even when embedded in a larger network structure.
Computer-aided drug design techniques play an important role in the development of new drugs. The present thesis deals with both the development and the practical application of structure-based drug design methods and is therefore divided into two parts.
The first part deals with the development of empirical scoring functions, which play a key role in structure-based computer-aided drug design. This work builds on the empirical descriptors and scoring functions of the SFCscore program package.
First, it was investigated how the composition of the training data affects the predictions of empirical scoring functions. By deliberately compiling a new training data set, an attempt was made to increase the range of the predictions and thereby, above all, to achieve a better recognition of high- and low-affinity complexes. The resulting function yielded improved predictions, especially in the low-affinity range.
The second topic likewise deals with the improved separation of active and inactive compounds. Using the machine learning method Random Forest, classification models were derived that, in contrast to classical scoring functions, do not provide an exact score but instead classify the compounds according to their potential activity.
Using the mycobacterial enzyme InhA as an example, it was shown that such models are clearly superior to classical scoring functions with respect to the recognition of active compounds.
In the next step, the Random Forest algorithm was also used to derive a new scoring function for the prediction of binding affinities. This function was implemented in the SFCscore program package under the name SFCscoreRF. It differs from the original SFCscore functions in several essential respects.
First, the RF algorithm is a non-linear method that, unlike the classical methods used to derive scoring functions, does not assume additivity of the individual descriptors. The algorithm also allows the use of all available SFCscore descriptors, which enables a considerably more comprehensive representation of protein-ligand complexes as the basis for scoring. A total of 1005 complexes were used in the training data set for the derivation of SFCscoreRF, making it one of the largest data sets used so far to derive an empirical scoring function.
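To illustrate the principle, the following minimal sketch trains a Random Forest regressor on descriptor vectors of protein-ligand complexes with experimental affinities as targets; the descriptor matrix and hyperparameters are placeholders and do not reproduce the actual SFCscore descriptors or the SFCscoreRF calibration.

```python
# Minimal sketch of deriving a Random Forest based scoring function in the spirit
# of SFCscoreRF: a regressor is trained on descriptor vectors of protein-ligand
# complexes with experimental affinities (e.g. pKd) as targets. Descriptor input
# and hyperparameters are placeholders, not the actual SFCscore descriptors.
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import cross_val_score

def train_rf_scoring_function(descriptors: np.ndarray, pkd: np.ndarray):
    """descriptors: (n_complexes, n_descriptors) matrix; pkd: experimental affinities."""
    rf = RandomForestRegressor(n_estimators=500, random_state=0, n_jobs=-1)
    # Cross-validated check of predictive performance before the final fit.
    r2 = cross_val_score(rf, descriptors, pkd, cv=5, scoring="r2")
    rf.fit(descriptors, pkd)
    return rf, r2.mean()

# Scoring new complexes and inspecting descriptor contributions:
# rf.predict(new_descriptors), rf.feature_importances_
```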
Evaluation against two benchmark data sets showed considerably better predictions by SFCscoreRF compared to the original SFCscore functions. In an international comparison with other scoring functions, top results were achieved for both data sets.
Further extensive testing within a leave-cluster-out validation and participation in the CSAR 2012 benchmark exercise showed that SFCscoreRF, too, exhibits performance fluctuations when applied to protein-specific data sets, a phenomenon that is consistently observed for scoring functions. The analysis of the CSAR 2012 data sets also yielded important insights regarding the prediction of docked poses and the statistical significance of scoring function evaluations.
The fact that empirical scoring functions are trained within a certain chemical space is an important factor in the protein-dependent performance fluctuations observed in this work. Reliable predictions are only possible within the calibrated chemical space. In this work, several approaches were investigated for defining this "applicability domain" of the SFCscore functions. PCA analyses made it possible to visualize the applicability domain of individual functions. In addition, a number of numerical descriptors were tested with which the prediction reliability could be estimated based on the applicability domain. The RF proximity proved to be a promising starting point for further developments.
The second part of the thesis deals with the development of new inhibitors of the chaperone Hsp70, which represents a promising target for the therapy of multiple myeloma.
This work was based on a lead structure that had been discovered in a previous project and that presumably acts at a novel binding site in the interface region between the two large domains of Hsp70.
The further development and optimization of this lead structure, a tetrahydroisoquinolinone derivative, was the initial focus. Detailed docking analyses were used to investigate the potential binding mode of the lead structure in the interface region of Hsp70. Based on these results, a compound library was designed, which was synthesized and biologically tested by cooperation partners within the KFO 216. The structure-activity relationships derived from these experimental data could in part be correlated well with the docking models, whereas other effects could not be explained by the docking poses. For the development of new derivatives, a more comprehensive experimental characterization and, building on it, a refinement of the binding models are therefore necessary.
Structurally, Hsp70 is a two-domain system that can adopt different allosteric states. To investigate the effects of the resulting flexibility on the stability of the structure and on inhibitor binding, molecular dynamics simulations of the protein were performed.
These show that the protein indeed exhibits above-average flexibility, which is dominated mainly by the relative movement of the two large domains with respect to each other. The protein conformation observed in the crystal structure hscaz is, however, preserved in its basic arrangement in all four simulations performed. Conversely, no evidence was found that the mutations of the crystal structure used for the structure-based work, compared to the wild type, have a critical influence on the overall stability of the system.
Although the interface region between NBD and SBD is thus preserved in all simulations, the conformation in this region is nevertheless substantially influenced by the domain movement and varies accordingly. Since this protein region represents the most likely site of action of the tetrahydroisoquinolinones, its conformational space was investigated in detail. As expected, the region exhibits considerable flexibility, which in addition is strongly influenced by the presence of a ligand (Apoptozole) in the sense of an induced-fit mechanism. It is therefore very likely that the dynamics of the interface region also have a substantial influence on the binding of the tetrahydroisoquinolinones. Molecular dynamics calculations will therefore continue to play an important role in future work in this field.
The analyses also show that the conformation of the interface region is closely linked to the conformation of the entire protein, especially with respect to the relative orientation of SBD and NBD. This supports the hypothesis that the interface binding pocket is a viable target site for inhibition of the protein.
One consequence of the recent coronavirus pandemic is increased demand and use of online services around the globe. At the same time, performance requirements for modern technologies are becoming more stringent as users become accustomed to higher standards. These increased performance and availability requirements, coupled with the unpredictable usage growth, are driving an increasing proportion of applications to run on public cloud platforms as they promise better scalability and reliability.
With data centers already responsible for about one percent of the world's power consumption, optimizing resource usage is of paramount importance. Simultaneously, meeting the increasing and changing resource and performance requirements is only possible by optimizing resource management without introducing additional overhead. This requires the research and development of new modeling approaches to understand the behavior of running applications with minimal information.
However, the emergence of modern software paradigms makes it increasingly difficult to derive such models and renders previous performance modeling techniques infeasible. Modern cloud applications are often deployed as a collection of fine-grained and interconnected components called microservices. Microservice architectures offer massive benefits but also have broad implications for the performance characteristics of the respective systems. In addition, the microservices paradigm is typically paired with a DevOps culture, resulting in frequent application and deployment changes. Such applications are often referred to as cloud-native applications. In summary, the increasing use of ever-changing cloud-hosted microservice applications introduces a number of unique challenges for modeling the performance of modern applications. These include the amount, type, and structure of monitoring data, frequent behavioral changes, or infrastructure variabilities. This violates common assumptions of the state of the art and opens a research gap for our work.
In this thesis, we present five techniques for automated learning of performance models for cloud-native software systems. We achieve this by combining machine learning with traditional performance modeling techniques. Unlike previous work, our focus is on cloud-hosted and continuously evolving microservice architectures, so-called cloud-native applications. Therefore, our contributions aim to solve the above challenges to deliver automated performance models with minimal computational overhead and no manual intervention. Depending on the cloud computing model, privacy agreements, or monitoring capabilities of each platform, we identify different scenarios where performance modeling, prediction, and optimization techniques can provide great benefits. Specifically, the contributions of this thesis are as follows:
Monitorless: Application-agnostic prediction of performance degradations.
To manage application performance with only platform-level monitoring, we propose Monitorless, the first truly application-independent approach to detecting performance degradation. We use machine learning to bridge the gap between platform-level monitoring and application-specific measurements, eliminating the need for application-level monitoring. Monitorless creates a single and holistic resource saturation model that can be used for heterogeneous and untrained applications. Results show that Monitorless infers resource-based performance degradation with 97% accuracy. Moreover, it can achieve similar performance to typical autoscaling solutions, despite using less monitoring information.
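The underlying idea can be sketched as follows: during training, application-level measurements provide a binary degradation label, while the classifier itself sees only platform-level metrics; the metric layout and the classifier choice are assumptions for illustration.

```python
# Illustrative sketch of the Monitorless idea: during training, application-level
# measurements provide a binary "degraded" label, while the classifier itself only
# sees platform-level metrics (e.g. CPU, memory, network, disk utilization). At
# inference time the model flags resource-based degradation without any
# application-level monitoring. Metric layout and classifier are assumptions.
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier

def train_saturation_model(platform_metrics: np.ndarray, degraded: np.ndarray):
    """platform_metrics: (n_samples, n_metrics) platform-level measurements;
    degraded: binary labels derived from application-level response times,
    needed during training only."""
    return GradientBoostingClassifier(random_state=0).fit(platform_metrics, degraded)

def degradation_expected(model, current_metrics: np.ndarray) -> bool:
    return bool(model.predict(current_metrics.reshape(1, -1))[0])
```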
SuanMing: Predicting performance degradation using tracing.
We introduce SuanMing to mitigate performance issues before they impact the user experience. This contribution is applied in scenarios where tracing tools enable application-level monitoring. SuanMing predicts explainable causes of expected performance degradations and prevents performance degradations before they occur. Evaluation results show that SuanMing can predict and pinpoint future performance degradations with an accuracy of over 90%.
SARDE: Continuous and autonomous estimation of resource demands.
We present SARDE to learn application models for highly variable application deployments. This contribution focuses on the continuous estimation of application resource demands, a key parameter of performance models. SARDE represents an autonomous ensemble estimation technique. It dynamically and continuously optimizes, selects, and executes an ensemble of approaches to estimate resource demands in response to changes in the application or its environment. Through continuous online adaptation, SARDE efficiently achieves an average resource demand estimation error of 15.96% in our evaluation.
DepIC: Learning parametric dependencies from monitoring data.
DepIC utilizes feature selection techniques in combination with an ensemble regression approach to automatically identify and characterize parametric dependencies. Although parametric dependencies can massively improve the accuracy of performance models, DepIC is the first approach to automatically learn such parametric dependencies from passive monitoring data streams. Our evaluation shows that DepIC achieves 91.7% precision in identifying dependencies and reduces the characterization prediction error by 30% compared to the best individual approach.
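A simplified sketch of this two-step idea, using mutual-information feature selection followed by an ensemble regressor, is shown below; the concrete estimators and parameter names are illustrative and not DepIC's actual implementation.

```python
# Sketch of a DepIC-style two-step procedure: (1) feature selection scores which
# monitored request parameters a resource demand actually depends on, and
# (2) an ensemble regressor characterizes the functional form of that dependency.
# Parameter names, threshold, and the concrete estimators are assumptions.
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.feature_selection import mutual_info_regression

def identify_and_characterize(params: np.ndarray, demand: np.ndarray,
                              param_names, score_threshold=0.1):
    """params: (n_requests, n_params) monitored parameters; demand: resource demand."""
    scores = mutual_info_regression(params, demand, random_state=0)
    selected = [i for i, s in enumerate(scores) if s > score_threshold]
    selected = selected or list(range(params.shape[1]))   # fall back to all parameters
    model = RandomForestRegressor(n_estimators=200, random_state=0)
    model.fit(params[:, selected], demand)
    identified = {param_names[i]: float(scores[i]) for i in selected}
    return identified, model   # which parameters matter, and the fitted dependency
```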
Baloo: Modeling the configuration space of databases.
To study the impact of different configurations within distributed DBMSs, we introduce Baloo. Our last contribution models the configuration space of databases considering measurement variabilities in the cloud. More specifically, Baloo dynamically estimates the required benchmarking measurements and automatically builds a configuration space model of a given DBMS. Our evaluation of Baloo on a dataset consisting of 900 configuration points shows that the framework achieves a prediction error of less than 11% while saving up to 80% of the measurement effort.
Although the contributions themselves are orthogonally aligned, taken together they provide a holistic approach to performance management of modern cloud-native microservice applications.
Our contributions are a significant step forward as they specifically target novel and cloud-native software development and operation paradigms, surpassing the capabilities and limitations of previous approaches.
In addition, the research presented in this thesis also has a significant impact on industry, as the contributions were developed in collaboration with research teams from Nokia Bell Labs, Huawei, and Google.
Overall, our solutions open up new possibilities for managing and optimizing cloud applications and improve cost and energy efficiency.
This work introduced the reader to all fields relevant to an ultrasound-based state of charge estimation and provided a blueprint for the procedure to establish and test the fundamentals of such an approach. It ranged from an in-depth electrochemical characterization of the studied battery cells, over establishing the measurement technique, the digital processing of ultrasonic transmission signals, and the characterization of the SoC-dependent property changes of those signals, to a proof of concept of an ultrasound-based state of charge estimation.
The State of the art & theoretical background chapter focused, in its battery section, on the mechanical property changes of lithium-ion batteries during operation. The components and processes involved in manufacturing a battery cell were described to establish the fundamentals for the later interrogation. A comprehensive summary of methods for state estimation was given, with an emphasis on mechanical methods, including a critical review of the most recent research on ultrasound-based state estimation. Afterward, the fundamentals of ultrasonic non-destructive evaluation were introduced, starting with the sound propagation modes in isotropic boundary-free media, followed by the introduction of boundaries and non-isotropic structure, to finally approach the class of fluid-saturated porous media, to which batteries can be counted. As the processing of the ultrasonic signals transmitted through lithium-ion battery cells with the aim of feature extraction was one of the main goals of this work, the fundamentals of digital signal processing and methods for time-of-flight estimation were reviewed and compared in a separate section.
All available information on the interrogated battery cell and the instrumentation was collected in the Experimental methods & instrumentation chapter, including a detailed step-by-step manual of the process developed in this work to create and attach a sensor stack for ultrasonic interrogation based on low-cost off-the-shelf piezo elements.
The Results & discussion chapter opened with an in-depth electrochemical and post-mortem interrogation to reverse engineer the battery cell design and its internal structure. The combination of inductively coupled plasma-optical emission spectrometry and incremental capacity analysis, applied to three-electrode lab cells constructed from the studied battery cell’s materials, made it possible to identify the SoC ranges in which phase transitions and staging occur. This directly links changes in the ultrasonic signal properties to the state of the active materials, which makes this work stand out among other studies on ultrasound-based state estimation. Additional dilatometer experiments proved that the measured effect in the ultrasonic time of flight cannot originate from the thickness increase of the battery cells alone, as this thickness increase is smaller than, and in the opposite direction to, the change in time of flight. Therefore, changes in elastic modulus and density have to be responsible for the observed effect.
The construction of the sensor stack from off-the-shelf piezo elements, its electromagnetic shielding, and its attachment to both sides of the battery cells were treated in a subsequent section. Experiments verified the necessity of the shielding and its negligible influence on the ultrasonic signals. A hypothesis describing the metal layer in the pouch foil as the transport medium of an electrical coupling/distortion between the sending and receiving sensor was formulated and tested. Impedance spectroscopy was shown to be a useful tool to characterize the resonant behavior of piezo elements and to ensure their mechanical coupling to the surface of the battery cells. The excitation of the piezo elements by a raised cosine (RCn) waveform with center frequencies varied in the range of 50 kHz to 250 kHz was studied in the frequency domain, and the influence of the resonant behavior, as identified previously by impedance spectroscopy, on waveform and frequency content was evaluated to be uncritical. Therefore, the forced oscillation produced by this excitation was assumed to be mechanically coupled into the battery cells as ultrasonic waves.
The ultrasonic waves transmitted through the battery cell were recorded by piezo elements on the opposing side. A first inspection of the raw, unprocessed signals identified the transmission of two main wave packages and allowed two major trends to be identified: the time of flight of the ultrasonic wave packages decreases with the center frequency of the RCn waveform and with the state of charge. These trends were assessed further in the subsequent sections. To this end, methods for the extraction of features (properties) from the ultrasonic signals were established, compared, and tested in a dedicated section. Several simple and advanced thresholding methods were compared with envelope-based and cross-correlation methods for estimating the time of flight (ToF). It was demonstrated that the envelope-based method yields the most robust estimate for the first and second wave package. This finding is in accordance with the literature, which states that an envelope-based method is best suited for dispersive, absorptive media [204], among which lithium-ion batteries are counted. The respective trends were already suggested by the heatmap plots of the raw signals vs. RCn frequency and SoC. To enable such a robust estimate, an FIR filter had to be designed to preprocess the transmitted signals and thereby attenuate frequency components that verifiably lead to a distorted shape of the envelope.
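The described signal chain can be sketched as follows, assuming a sampled transmission trace: FIR band-pass preprocessing, envelope extraction via the Hilbert transform, and a ToF estimate taken from the envelope of the dominant wave package; the filter band, tap count, and the use of the envelope maximum are illustrative choices.

```python
# Sketch of the described signal chain for time-of-flight (ToF) estimation:
# FIR band-pass filtering of the transmitted signal, envelope extraction via the
# Hilbert transform, and ToF taken from the envelope of the dominant wave package.
# Filter band, sampling rate, and the use of the envelope maximum are illustrative.
import numpy as np
from scipy.signal import firwin, filtfilt, hilbert

def estimate_tof(signal: np.ndarray, fs: float, band=(50e3, 250e3), numtaps=101):
    """signal: received ultrasonic trace, fs: sampling rate in Hz. Returns ToF in s."""
    taps = firwin(numtaps, band, pass_zero=False, fs=fs)   # FIR band-pass
    filtered = filtfilt(taps, 1.0, signal)                 # zero-phase filtering
    envelope = np.abs(hilbert(filtered))                   # analytic-signal envelope
    return np.argmax(envelope) / fs                        # ToF of the dominant wave package

def transmitted_energy(signal: np.ndarray) -> float:
    """Transmitted energy content (EC) of the received trace."""
    return float(np.sum(signal.astype(float) ** 2))
```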
With a robust ToF estimation method selected, the signal properties ToF and transmitted energy content (EC) were characterized in depth. A study of cycle-to-cycle variations revealed that the signal properties are affected by a long rest period and the associated relaxation of the multi-particle system “battery cell” toward equilibrium. In detail, during cycling, the signal properties do not reach the same value at a given SoC in two subsequent cycles if the first of the two cycles follows a long rest period. In accordance with the literature, a break-in period lasting more than ten cycles post-formation was observed, during which the mechanical properties of the system are said to change until a steady state is reached [25]. Experiments at different C-rates showed that the ultrasonic signal properties can sense the non-equilibrium state of a battery cell, characterized by an increasing area between the charge and discharge curves of the respective signal property vs. SoC plot. This non-equilibrium state relaxes in the rest period following the discharge after the cut-off voltage is reached. The relaxation in the rest period following the charge is much smaller and shows little C-rate dependency, as the state is prepared by constant-voltage charging at the end-of-charge voltage. For a purely statistical SoC estimation approach, as employed in this work, where only instantaneous measurements are taken into account and the historic course of the measurement is not utilized as a source of information, the presence of hysteresis and relaxation leads to a reduced estimation accuracy. Future research should address this issue or even utilize the relaxation to improve the estimation accuracy by incorporating historic information, e.g., by using the derivative of a signal property as an additional feature. The signal properties were then tested for their correlation with SoC as a function of RCn frequency. This allowed trends in the behavior of the signal properties as a function of RCn frequency and C-rate to be identified in a condensed fashion and thereby made it possible to predict the frequency range, about 50 kHz to 125 kHz, in which the course of the signal properties is best suited for SoC estimation.
The final section provided a proof of concept of the ultrasound-based SoC estimation by applying support vector regression (SVR) to the thoroughly studied ultrasonic signal properties as well as to current and battery cell voltage. The included case study was split into different parts that assessed the ability of an SVR to estimate the SoC in a variety of scenarios. Seven battery cells, prepared with sensor stacks attached to both faces, were used to generate 14 datasets. First, a comparison of self-tests, where a portion of a dataset is used for training and another for testing, and cross-tests, which use the dataset of one cell for training and the dataset of another for testing, was performed. A root mean square error (RMSE) of 3.9% to 4.8% SoC and 3.6% to 10.0% SoC was achieved, respectively. In general, it was observed that the SVR is prone to overestimation at low SoCs and underestimation at high SoCs, which was attributed to the pronounced hysteresis and relaxation of the ultrasonic signal properties in these SoC ranges. The fact that higher accuracy is achieved if the exact cell is known to the model indicates that a variation between cells exists. This variation can originate from differences in mechanical properties as a result of production variations, or from differences in manual sensor placement, mechanical coupling, or resonant behavior of the ultrasonic sensors. To mitigate the effect of the cell-to-cell variations, a test was performed where the datasets of six of the seven cells were combined as training data and the dataset of the seventh cell was used for testing. This reduced the spread of the RMSE from (3.6 - 10.0)% SoC to (5.9 - 8.5)% SoC, once again showing that a data-based approach for state estimation becomes more reliable with a large data basis. Utilizing self-tests on seven datasets, the effect of additional features on the state estimation result was tested. The inclusion of an additional feature did not necessarily improve the estimation accuracy, but it was shown that a combination of ultrasonic and electrical features is superior to training with either feature set alone. To test the ability of the model to estimate the SoC under unknown cycling conditions, a test was performed where the C-rate of the test dataset was not included in the training data. The result suggests that, for practical applications, it might be sufficient to perform training with the boundaries of the use cases in a controlled laboratory environment in order to handle the estimation in a broad spectrum of use cases.
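As a minimal illustration of this statistical estimation step, the sketch below fits a support vector regression to a feature matrix combining ultrasonic and electrical features and reports the RMSE in percent SoC; the feature layout and hyperparameters are assumptions, not the tuned setup of the case study.

```python
# Sketch of the statistical SoC estimation used in the proof of concept: ultrasonic
# features (ToF, transmitted energy) together with electrical features (voltage,
# current) feed a support vector regression; accuracy is reported as RMSE in % SoC.
# The feature layout and hyperparameters are illustrative.
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

def train_soc_estimator(features: np.ndarray, soc: np.ndarray):
    """features: (n, 4) columns [tof_s, energy, voltage_V, current_A]; soc in %."""
    model = make_pipeline(StandardScaler(), SVR(kernel="rbf", C=10.0, epsilon=0.5))
    model.fit(features, soc)
    return model

def rmse_soc(model, features_test: np.ndarray, soc_test: np.ndarray) -> float:
    pred = model.predict(features_test)
    return float(np.sqrt(np.mean((pred - soc_test) ** 2)))  # RMSE in % SoC
```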
In comparison with the literature, this study stands out by utilizing and modifying off-the-shelf piezo elements to equip state-of-the-art lithium-ion battery cells with ultrasonic sensors, by employing a range of center frequencies for the waveform transmitted through the battery cell instead of a fixed frequency, and by allowing the SVR to choose the frequency that yields the best result. The characterization of the ultrasonic signal properties as a function of RCn frequency and SoC, and the assignment of characteristic changes in the signal properties to electrochemical processes such as phase transitions and staging, make this work unique. By studying a range of use cases, it was demonstrated that an improved SoC estimation accuracy can be achieved with the aid of ultrasonic measurements, thanks to the correlation of the mechanical properties of the battery cells with the SoC.