TY  - JOUR
A1  - Liman, Leon
A1  - May, Bernd
A1  - Fette, Georg
A1  - Krebs, Jonathan
A1  - Puppe, Frank
T1  - Using a clinical data warehouse to calculate and present key metrics for the radiology department: implementation and performance evaluation
JF  - JMIR Medical Informatics
N2  - Background: Due to the importance of radiologic examinations, such as X-rays or computed tomography scans, for many clinical diagnoses, the optimal use of the radiology department is one of the primary goals of many hospitals.

Objective: This study aims to calculate the key metrics of this use by creating a radiology data warehouse solution, where data from radiology information systems (RISs) can be imported and then queried using a query language as well as a graphical user interface (GUI).

Methods: Using a simple configuration file, the developed system allowed for the processing of radiology data exported from any kind of RIS into a Microsoft Excel, comma-separated value (CSV), or JavaScript Object Notation (JSON) file. These data were then imported into a clinical data warehouse. Additional values based on the radiology data were calculated during this import process by implementing one of several provided interfaces. Afterward, the query language and GUI of the data warehouse were used to configure and calculate reports on these data. For the most common types of requested reports, a web interface was created to view their numbers as graphics.

Results: The tool was successfully tested with the data of 4 different German hospitals from 2018 to 2021, with a total of 1,436,111 examinations. The user feedback was good, since all their queries could be answered if the available data were sufficient. The initial processing of the radiology data for using them with the clinical data warehouse took (depending on the amount of data provided by each hospital) between 7 minutes and 1 hour 11 minutes. Calculating 3 reports of different complexities on the data of each hospital was possible in 1-3 seconds for reports with up to 200 individual calculations and in up to 1.5 minutes for reports with up to 8200 individual calculations.

Conclusions: A system was developed with the main advantage of being generic concerning the export of different RISs as well as concerning the configuration of queries for various reports. The queries could be configured easily using the GUI of the data warehouse, and their results could be exported into the standard formats Excel and CSV for further processing.
KW  - data warehouse
KW  - eHealth
KW  - hospital data
KW  - electronic health records
KW  - radiology
KW  - statistics and numerical data
KW  - medical records
Y1  - 2023
U6  - http://nbn-resolving.de/urn/resolver.pl?urn:nbn:de:bvb:20-opus-349411
SN  - 2291-9694
VL  - 11
ER  - 
TY  - JOUR
A1  - Wick, Christoph
A1  - Hartelt, Alexander
A1  - Puppe, Frank
T1  - Staff, symbol and melody detection of Medieval manuscripts written in square notation using deep Fully Convolutional Networks
JF  - Applied Sciences
N2  - Even today, the automatic digitisation of scanned documents in general, and especially the automatic optical music recognition (OMR) of historical manuscripts, remains an enormous challenge, since both handwritten musical symbols and text have to be identified. This paper focuses on the Medieval so-called square notation developed in the 11th–12th century, which is already composed of staff lines, staves, clefs, accidentals, and neumes, which are, roughly speaking, connected single notes. The aim is to develop an algorithm that captures the neumes and, in particular, their melody, which can be used to reconstruct the original writing. Our pipeline is similar to the standard OMR approach and comprises a novel staff line and symbol detection algorithm based on deep Fully Convolutional Networks (FCN), which perform pixel-based predictions for either staff lines or symbols and their respective types. Then, the staff line detection combines the extracted lines into staves and yields an F\(_1\)-score of over 99% for detecting both lines and complete staves. For the music symbol detection, we choose a novel approach that skips the step of identifying neumes and instead directly predicts note components (NCs) and their respective affiliation to a neume. Furthermore, the algorithm detects clefs and accidentals. Our algorithm predicts the symbol sequence of a staff with a diplomatic symbol accuracy rate (dSAR) of about 87%, which includes symbol type and location. If only the NCs (without their respective connection to a neume), the clefs, and the accidentals are of interest, the algorithm reaches a harmonic symbol accuracy rate (hSAR) of approximately 90%. In general, the algorithm recognises a symbol in the manuscript with an F\(_1\)-score of over 96%.
KW  - optical music recognition
KW  - historical document analysis
KW  - medieval manuscripts
KW  - neume notation
KW  - fully convolutional neural networks
Y1  - 2019
U6  - http://nbn-resolving.de/urn/resolver.pl?urn:nbn:de:bvb:20-opus-197248
SN  - 2076-3417
VL  - 9
IS  - 13
ER  - 
TY  - JOUR
A1  - Djebko, Kirill
A1  - Puppe, Frank
A1  - Kayal, Hakan
T1  - Model-based fault detection and diagnosis for spacecraft with an application for the SONATE triple cube nano-satellite
JF  - Aerospace
N2  - The correct behavior of spacecraft components is the foundation of unhindered mission operation. However, no technical system is free of wear and degradation. A malfunction of a single component might significantly alter the behavior of the whole spacecraft and may even lead to a complete mission failure. Therefore, abnormal component behavior must be detected early so that countermeasures can be taken. A dedicated fault detection system can be employed, as opposed to classical health monitoring performed by human operators, to decrease the response time to a malfunction. In this paper, we present a generic model-based diagnosis system, which detects faults by analyzing the spacecraft’s housekeeping data. The observed behavior of the spacecraft components, given by the housekeeping data, is compared to their expected behavior, obtained through simulation. Each discrepancy between the observed and the expected behavior of a component generates a so-called symptom. Given the symptoms, the diagnoses are derived by computing sets of components whose malfunction might cause the observed discrepancies. We demonstrate the applicability of the diagnosis system by using modified housekeeping data of the qualification model of an actual spacecraft and outline the advantages and drawbacks of our approach.
KW  - fault detection
KW  - model-based diagnosis
KW  - nano-satellite
Y1  - 2019
U6  - http://nbn-resolving.de/urn/resolver.pl?urn:nbn:de:bvb:20-opus-198836
SN  - 2226-4310
VL  - 6
IS  - 10
ER  - 
TY  - JOUR
A1  - Fischer, Norbert
A1  - Hartelt, Alexander
A1  - Puppe, Frank
T1  - Line-level layout recognition of historical documents with background knowledge
JF  - Algorithms
N2  - Digitization and transcription of historic documents offer new research opportunities for humanists and are the topics of many edition projects. However, manual work is still required for the main phases of layout recognition and the subsequent optical character recognition (OCR) of early printed documents. This paper describes and evaluates how deep learning approaches recognize text lines and can be extended to layout recognition using background knowledge. The evaluation was performed on five corpora of early prints from the 15th and 16th centuries, representing a variety of layout features. While the main text with standard layouts could be recognized in the correct reading order with a precision and recall of up to 99.9%, complex layouts were also recognized at a rate as high as 90% by using background knowledge, the full potential of which was revealed if many pages of the same source were transcribed.
KW  - layout recognition
KW  - background knowledge
KW  - historical document analysis
KW  - fully convolutional neural networks
KW  - baseline detection
KW  - text line detection
Y1  - 2023
U6  - http://nbn-resolving.de/urn/resolver.pl?urn:nbn:de:bvb:20-opus-310938
SN  - 1999-4893
VL  - 16
IS  - 3
ER  - 
TY  - JOUR
A1  - Kempf, Sebastian
A1  - Krug, Markus
A1  - Puppe, Frank
T1  - KIETA: Key-insight extraction from scientific tables
JF  - Applied Intelligence
N2  - An important but very time-consuming part of the research process is literature review. An already large and still growing body of publications, together with a steadily increasing publication rate, continues to worsen the situation. Consequently, automating this task as far as possible is desirable. Experimental results of systems are key insights of high importance during literature review and are usually represented in the form of tables. Our pipeline KIETA exploits these tables to contribute to the endeavor of automation by extracting them and their contained knowledge from scientific publications. The pipeline is split into multiple steps to guarantee modularity, analyzability, and agnosticism regarding the specific scientific domain up until the knowledge extraction step, which is based upon an ontology. Additionally, a dataset of corresponding articles has been manually annotated with information regarding table and knowledge extraction. Experiments show promising results that signal the possibility of an automated system, while also indicating limits of extracting knowledge from tables without any context.
KW  - table extraction
KW  - table understanding
KW  - ontology
KW  - key-insight extraction
KW  - information extraction
Y1  - 2023
U6  - http://nbn-resolving.de/urn/resolver.pl?urn:nbn:de:bvb:20-opus-324180
SN  - 0924-669X
VL  - 53
IS  - 8
ER  - 
TY  - JOUR
A1  - Puppe, Frank
T1  - Gesellschaftliche Perspektiven einer fachspezifischen KI für automatisierte Entscheidungen
JF  - Informatik Spektrum
N2  - Artificial intelligence (AI) is developing rapidly and has already achieved impressive successes, including superhuman competence in most games and many quiz shows, intelligent search engines, personalized advertising, speech recognition, speech synthesis, and translation at a very high level, and outstanding performance in image processing, for example in medicine, optical character recognition, and autonomous driving, but also in recognizing people in images and videos and in deep fakes for photos and videos. AI can be expected to surpass humans in decision-making as well; this is an old dream of expert systems that is coming within reach through learning methods, big data, and access to the knowledge collected on the web. The subject of this article, however, is less the technical developments than the possible societal effects of a specialized, competent AI for various areas of autonomous (i.e., not merely supportive) decision-making: as a football referee, in medicine, for judicial decisions, and, very speculatively, in politics. The advantages and disadvantages of these scenarios are discussed from a societal perspective.
KW  - artificial intelligence
KW  - ethics
KW  - decision-making
Y1  - 2022
U6  - http://nbn-resolving.de/urn/resolver.pl?urn:nbn:de:bvb:20-opus-324197
SN  - 0170-6012
VL  - 45
IS  - 2
ER  - 
TY  - JOUR
A1  - Toepfer, Martin
A1  - Corovic, Hamo
A1  - Fette, Georg
A1  - Klügl, Peter
A1  - Störk, Stefan
A1  - Puppe, Frank
T1  - Fine-grained information extraction from German transthoracic echocardiography reports
JF  - BMC Medical Informatics and Decision Making
N2  - Background
Information extraction techniques that derive structured representations from unstructured data make a large amount of clinically relevant information about patients accessible for semantic applications. These methods typically rely on standardized terminologies that guide this process. Many languages and clinical domains, however, lack appropriate resources and tools, as well as evaluations of their applications, especially if detailed conceptualizations of the domain are required. For instance, German transthoracic echocardiography reports have not been targeted sufficiently before, despite their importance for clinical trials. This work therefore aimed at the development and evaluation of an information extraction component with a fine-grained terminology that enables the recognition of almost all relevant information stated in German transthoracic echocardiography reports at the University Hospital of Würzburg.

Methods
A domain expert validated and iteratively refined an automatically inferred base terminology. The terminology was used by an ontology-driven information extraction system that outputs attribute value pairs. The final component has been mapped to the central elements of a standardized terminology, and it has been evaluated according to documents with different layouts.

Results
The final system achieved state-of-the-art precision (micro average .996) and recall (micro average .961) on 100 test documents that represent more than 90% of all reports. In particular, principal aspects as defined in a standardized external terminology were recognized with F1 = .989 (micro average) and F1 = .963 (macro average). As a result of keyword matching and restraint concept extraction, the system obtained high precision also on unstructured or exceptionally short documents, and documents with uncommon layout.

Conclusions
The developed terminology and the proposed information extraction system allow the extraction of fine-grained information from German semi-structured transthoracic echocardiography reports with very high precision and high recall on the majority of documents at the University Hospital of Würzburg. Extracted results populate a clinical data warehouse which supports clinical research.
Y1  - 2015
U6  - http://nbn-resolving.de/urn/resolver.pl?urn:nbn:de:bvb:20-opus-125509
VL  - 15
IS  - 91
ER  - 
TY  - JOUR
A1  - Krenzer, Adrian
A1  - Makowski, Kevin
A1  - Hekalo, Amar
A1  - Fitting, Daniel
A1  - Troya, Joel
A1  - Zoller, Wolfram G.
A1  - Hann, Alexander
A1  - Puppe, Frank
T1  - Fast machine learning annotation in the medical domain: a semi-automated video annotation tool for gastroenterologists
JF  - BioMedical Engineering OnLine
N2  - Background
Machine learning, especially deep learning, is becoming more and more relevant in research and development in the medical domain. For all supervised deep learning applications, data is the most critical factor in securing successful implementation and sustaining the progress of the machine learning model. Gastroenterological data in particular, which often involve endoscopic videos, are cumbersome to annotate. Domain experts are needed to interpret and annotate the videos. To support those domain experts, we developed a framework. With this framework, instead of annotating every frame in the video sequence, experts only perform key annotations at the beginning and the end of sequences with pathologies, e.g., visible polyps. Subsequently, non-expert annotators supported by machine learning add the missing annotations for the frames in between.
Methods
In our framework, an expert reviews the video and annotates a few video frames to verify the object’s annotations for the non-expert. In a second step, a non-expert has visual confirmation of the given object and can annotate all following and preceding frames with AI assistance. After the expert has finished, relevant frames will be selected and passed on to an AI model. This information allows the AI model to detect and mark the desired object on all following and preceding frames with an annotation. Therefore, the non-expert can adjust and modify the AI predictions and export the results, which can then be used to train the AI model.
Results
Using this framework, we were able to reduce the workload of domain experts on average by a factor of 20 on our data. This is primarily due to the structure of the framework, which is designed to minimize the workload of the domain expert. Pairing this framework with a state-of-the-art semi-automated AI model enhances the annotation speed further. Through a prospective study with 10 participants, we show that semi-automated annotation using our tool doubles the annotation speed of non-expert annotators compared to a well-known state-of-the-art annotation tool.
Conclusion
In summary, we introduce a framework for fast expert annotation for gastroenterologists, which reduces the workload of the domain expert considerably while maintaining a very high annotation quality. The framework incorporates a semi-automated annotation system utilizing trained object detection models. The software and framework are open-source.
KW  - object detection
KW  - machine learning
KW  - deep learning
KW  - annotation
KW  - endoscopy
KW  - gastroenterology
KW  - automation
Y1  - 2022
U6  - http://nbn-resolving.de/urn/resolver.pl?urn:nbn:de:bvb:20-opus-300231
VL  - 21
IS  - 1
ER  - 
TY  - JOUR
A1  - Loda, Sophia
A1  - Krebs, Jonathan
A1  - Danhof, Sophia
A1  - Schreder, Martin
A1  - Solimando, Antonio G.
A1  - Strifler, Susanne
A1  - Rasche, Leo
A1  - Kortüm, Martin
A1  - Kerscher, Alexander
A1  - Knop, Stefan
A1  - Puppe, Frank
A1  - Einsele, Hermann
A1  - Bittrich, Max
T1  - Exploration of artificial intelligence use with ARIES in multiple myeloma research
JF  - Journal of Clinical Medicine
N2  - Background: Natural language processing (NLP) is a powerful tool supporting the generation of Real-World Evidence (RWE). There is no NLP system that enables the extensive querying of parameters specific to multiple myeloma (MM) from unstructured medical reports. We therefore created an MM-specific ontology to accelerate information extraction (IE) from unstructured text. Methods: Our MM ontology consists of extensive MM-specific and hierarchically structured attributes and values. We implemented “A Rule-based Information Extraction System” (ARIES) that uses this ontology. We evaluated ARIES on 200 randomly selected medical reports of patients diagnosed with MM. Results: Our system achieved a high F1-score of 0.92 on the evaluation dataset with a precision of 0.87 and recall of 0.98. Conclusions: Our rule-based IE system enables the comprehensive querying of medical reports. The IE accelerates the extraction of data and enables clinicians to generate RWE on hematological issues faster. RWE helps clinicians to make decisions in an evidence-based manner. Our tool accelerates the integration of research evidence into everyday clinical practice.
KW  - natural language processing
KW  - ontology
KW  - artificial intelligence
KW  - multiple myeloma
KW  - real world evidence
Y1  - 2019
U6  - http://nbn-resolving.de/urn/resolver.pl?urn:nbn:de:bvb:20-opus-197231
SN  - 2077-0383
VL  - 8
IS  - 7
ER  - 
TY  - JOUR
A1  - Gehrke, Alexander
A1  - Balbach, Nico
A1  - Rauch, Yong-Mi
A1  - Degkwitz, Andreas
A1  - Puppe, Frank
T1  - Erkennung von handschriftlichen Unterstreichungen in Alten Drucken
JF  - Bibliothek Forschung und Praxis
N2  - Die Erkennung handschriftlicher Artefakte wie Unterstreichungen in Buchdrucken ermöglicht Rückschlüsse auf das Rezeptionsverhalten und die Provenienzgeschichte und wird auch für eine OCR benötigt. Dabei soll zwischen handschriftlichen Unterstreichungen und waagerechten Linien im Druck (z. B. Trennlinien usw.) unterschieden werden, da letztere nicht ausgezeichnet werden sollen. Im Beitrag wird ein Ansatz basierend auf einem auf Unterstreichungen trainierten Neuronalen Netz gemäß der U-Net Architektur vorgestellt, dessen Ergebnisse in einem zweiten Schritt mit heuristischen Regeln nachbearbeitet werden. Die Evaluationen zeigen, dass Unterstreichungen sehr gut erkannt werden, wenn bei der Binarisierung der Scans nicht zu viele Pixel der Unterstreichung wegen geringem Kontrast verloren gehen. Zukünftig sollen die Worte oberhalb der Unterstreichung mit OCR transkribiert werden und auch andere Artefakte wie handschriftliche Notizen in alten Drucken erkannt werden.
N2  - The recognition of handwritten artefacts such as underlines in historical printings allows inferences about reception and provenance history and is necessary for OCR (optical character recognition). In this context it is important to differentiate between handwritten and printed lines, since the latter are common in printings but should be ignored. We present an approach based on neural nets with the U-Net architecture, whose segmentation results are post-processed with heuristic rules. The evaluations show that handwritten underlines are recognized very well if the binarisation of the scans is adequate. Future work includes transcription of the underlined words with OCR and recognition of other artefacts such as handwritten notes in historical printings.
T2  - Recognition of handwritten underlines in historical printings
KW  - Brüder Grimm Privatbibliothek
KW  - Erkennung handschriftlicher Artefakte
KW  - Convolutional Neural Network
KW  - regelbasierte Nachbearbeitung
KW  - Grimm brothers personal library
KW  - handwritten artefact recognition
KW  - convolutional neural network
KW  - rule based post processing
Y1  - 2019
U6  - http://nbn-resolving.de/urn/resolver.pl?urn:nbn:de:bvb:20-opus-193377
SN  - 1865-7648
SN  - 0341-4183
N1  - This article is freely accessible with the consent of the rights holder under a (DFG-funded) Alliance or National Licence.
VL  - 43
IS  - 3
SP  - 447
EP  - 452
ER  - 
TY  - JOUR
A1  - Mandel, Alexander
A1  - Hörnlein, Alexander
A1  - Ifland, Marianus
A1  - Lüneburg, Edeltraud
A1  - Deckert, Jürgen
A1  - Puppe, Frank
T1  - Aufwandsanalyse für computerunterstützte Multiple-Choice Papierklausuren
T1  - Cost analysis for computer supported multiple-choice paper examinations
JF  - GMS Journal for Medical Education
N2  - Introduction: 

Multiple-choice examinations are still fundamental for assessment in medical degree programs. In addition to content-related research, the optimization of the technical procedure is an important question. Medical examiners face three options: paper-based examinations with or without computer support, or completely electronic examinations. Critical aspects are the effort for formatting, the logistic effort during the actual examination, the quality, promptness, and effort of the correction, the time for making the documents available for inspection by the students, and the statistical analysis of the examination results.

Methods: 

For three semesters, a computer program for the input and formatting of MC questions in medical and other paper-based examinations has been used and continuously improved at Wuerzburg University. In the winter semester (WS) 2009/10 eleven, in the summer semester (SS) 2010 twelve, and in WS 2010/11 thirteen medical examinations were created with the program and automatically evaluated. For the last two semesters, the remaining manual workload was recorded.

Results: 

The effort for formatting and subsequent analysis, including adjustments of the analysis, of an average examination with about 140 participants and about 35 questions was 5-7 hours for exams without complications in WS 2009/10, about 2 hours in SS 2010, and about 1.5 hours in WS 2010/11. Including exams with complications, the average time was about 3 hours per exam in SS 2010 and 2.67 hours in WS 2010/11.

Discussion: 

For conventional multiple-choice exams, the computer-based formatting and evaluation of paper-based exams offers a significant time reduction for lecturers in comparison with the manual correction of paper-based exams; compared to purely electronically conducted exams, it needs a much simpler technological infrastructure and fewer staff during the exam.
N2  - Einleitung: 

Multiple-Choice-Klausuren spielen immer noch eine herausragende Rolle für fakultätsinterne medizinische Prüfungen. Neben inhaltlichen Arbeiten stellt sich die Frage, wie die technische Abwicklung optimiert werden kann. Für Dozenten in der Medizin gibt es zunehmend drei Optionen zur Durchführung von MC-Klausuren: Papierklausuren mit oder ohne Computerunterstützung oder vollständig elektronische Klausuren. Kritische Faktoren sind der Aufwand für die Formatierung der Klausur, der logistische Aufwand bei der Klausurdurchführung, die Qualität, Schnelligkeit und der Aufwand der Klausurkorrektur, die Bereitstellung der Dokumente für die Einsichtnahme, und die statistische Analyse der Klausurergebnisse.

Methoden: 

An der Universität Würzburg wird seit drei Semestern ein Computerprogramm zur Eingabe und Formatierung der MC-Fragen in medizinischen und anderen Papierklausuren verwendet und optimiert, mit dem im Wintersemester (WS) 2009/2010 elf, im Sommersemester (SS) 2010 zwölf und im WS 2010/11 dreizehn medizinische Klausuren erstellt und anschließend die eingescannten Antwortblätter automatisch ausgewertet wurden. In den letzten beiden Semestern wurden die Aufwände protokolliert.

Ergebnisse: 

Der Aufwand der Formatierung und der Auswertung einschl. nachträglicher Anpassung der Auswertung einer Durchschnittsklausur mit ca. 140 Teilnehmern und ca. 35 Fragen ist von 5-7 Stunden für Klausuren ohne Komplikation im WS 2009/2010 über ca. 2 Stunden im SS 2010 auf ca. 1,5 Stunden im WS 2010/11 gefallen. Einschließlich der Klausuren mit Komplikationen bei der Auswertung betrug die durchschnittliche Zeit im SS 2010 ca. 3 Stunden und im WS 10/11 ca. 2,67 Stunden pro Klausur.

Diskussion: 

Für konventionelle Multiple-Choice-Klausuren bietet die computergestützte Formatierung und Auswertung von Papierklausuren einen beträchtlichen Zeitvorteil für die Dozenten im Vergleich zur manuellen Korrektur von Papierklausuren und benötigt im Vergleich zu rein elektronischen Klausuren eine deutlich einfachere technische Infrastruktur und weniger Personal bei der Klausurdurchführung.
KW  - Multiple-Choice Prüfungen
KW  - Automatisierte Prüfungskorrektur
KW  - Aufwandsanalyse
KW  - Educational Measurement (I2.399)
KW  - Self-Evaluation Programs (I2.399.780)
KW  - Multiple-Choice Examination
KW  - Cost Analysis
Y1  - 2011
U6  - http://nbn-resolving.de/urn/resolver.pl?urn:nbn:de:bvb:20-opus-134386
VL  - 28
IS  - 4
ER  -