TY - JOUR
A1 - Maron, Roman C.
A1 - Haggenmüller, Sarah
A1 - von Kalle, Christof
A1 - Utikal, Jochen S.
A1 - Meier, Friedegund
A1 - Gellrich, Frank F.
A1 - Hauschild, Axel
A1 - French, Lars E.
A1 - Schlaak, Max
A1 - Ghoreschi, Kamran
A1 - Kutzner, Heinz
A1 - Heppt, Markus V.
A1 - Haferkamp, Sebastian
A1 - Sondermann, Wiebke
A1 - Schadendorf, Dirk
A1 - Schilling, Bastian
A1 - Hekler, Achim
A1 - Krieghoff-Henning, Eva
A1 - Kather, Jakob N.
A1 - Fröhling, Stefan
A1 - Lipka, Daniel B.
A1 - Brinker, Titus J.
T1 - Robustness of convolutional neural networks in recognition of pigmented skin lesions
JF - European Journal of Cancer
N2 - Background A basic requirement for artificial intelligence (AI)–based image analysis systems that are to be integrated into clinical practice is high robustness. Minor changes in how those images are acquired, for example, during routine skin cancer screening, should not change the diagnosis of such assistance systems. Objective To quantify to what extent minor image perturbations affect the convolutional neural network (CNN)–mediated skin lesion classification and to evaluate three possible solutions for this problem (additional data augmentation, test-time augmentation, anti-aliasing). Methods We trained three commonly used CNN architectures to differentiate between dermoscopic melanoma and nevus images. Subsequently, their performance and susceptibility to minor changes (‘brittleness’) were tested on two distinct test sets with multiple images per lesion. For the first set, image changes, such as rotations or zooms, were generated artificially. The second set contained natural changes that stemmed from multiple photographs taken of the same lesions. Results All architectures exhibited brittleness on the artificial and natural test sets. The three reviewed methods were able to decrease brittleness to varying degrees while still maintaining performance.
The observed improvement was greater for the artificial than for the natural test set, where enhancements were minor. Conclusions Minor image changes, relatively inconspicuous for humans, can have an effect on the robustness of CNNs differentiating skin lesions. By the methods tested here, this effect can be reduced, but not fully eliminated. Thus, further research to sustain the performance of AI classifiers is needed to facilitate the translation of such systems into the clinic.
KW - artificial intelligence
KW - machine learning
KW - deep learning
KW - neural networks
KW - dermatology
KW - skin neoplasms
KW - melanoma
KW - nevus
Y1 - 2021
U6 - http://nbn-resolving.de/urn/resolver.pl?urn:nbn:de:bvb:20-opus-370245
VL - 145
ER -

TY - JOUR
A1 - Brinker, Titus J.
A1 - Hekler, Achim
A1 - Enk, Alexander H.
A1 - Berking, Carola
A1 - Haferkamp, Sebastian
A1 - Hauschild, Axel
A1 - Weichenthal, Michael
A1 - Klode, Joachim
A1 - Schadendorf, Dirk
A1 - Holland-Letz, Tim
A1 - von Kalle, Christof
A1 - Fröhling, Stefan
A1 - Schilling, Bastian
A1 - Utikal, Jochen S.
T1 - Deep neural networks are superior to dermatologists in melanoma image classification
JF - European Journal of Cancer
N2 - Background Melanoma is the most dangerous type of skin cancer but is curable if detected early. Recent publications demonstrated that artificial intelligence is capable of classifying images of benign nevi and melanoma with dermatologist-level precision. However, a statistically significant improvement compared with dermatologist classification has not been reported to date. Methods For this comparative study, 4204 biopsy-proven images of melanoma and nevi (1:1) were used for the training of a convolutional neural network (CNN). New techniques of deep learning were integrated.
For the experiment, an additional 804 biopsy-proven dermoscopic images of melanoma and nevi (1:1) were randomly presented to dermatologists of nine German university hospitals, who evaluated the quality of each image and stated their recommended treatment (19,296 recommendations in total). Three McNemar's tests comparing the results of the CNN's test runs in terms of sensitivity, specificity and overall correctness were predefined as the main outcomes. Findings The respective sensitivity and specificity of lesion classification by the dermatologists were 67.2% (95% confidence interval [CI]: 62.6%–71.7%) and 62.2% (95% CI: 57.6%–66.9%). In comparison, the trained CNN achieved a higher sensitivity of 82.3% (95% CI: 78.3%–85.7%) and a higher specificity of 77.9% (95% CI: 73.8%–81.8%). The three McNemar's tests in 2 × 2 tables all reached a significance level of p < 0.001. This significance level was sustained for both subgroups. Interpretation For the first time, automated dermoscopic melanoma image classification was shown to be significantly superior to both junior and board-certified dermatologists (p < 0.001).
KW - deep learning
KW - melanoma
KW - skin cancer
KW - artificial intelligence
Y1 - 2019
U6 - http://nbn-resolving.de/urn/resolver.pl?urn:nbn:de:bvb:20-opus-220539
VL - 119
ER -

TY - JOUR
A1 - Brinker, Titus J.
A1 - Hekler, Achim
A1 - Hauschild, Axel
A1 - Berking, Carola
A1 - Schilling, Bastian
A1 - Enk, Alexander H.
A1 - Haferkamp, Sebastian
A1 - Karoglan, Ante
A1 - von Kalle, Christof
A1 - Weichenthal, Michael
A1 - Sattler, Elke
A1 - Schadendorf, Dirk
A1 - Gaiser, Maria R.
A1 - Klode, Joachim
A1 - Utikal, Jochen S.
T1 - Comparing artificial intelligence algorithms to 157 German dermatologists: the melanoma classification benchmark
JF - European Journal of Cancer
N2 - Background Several recent publications have demonstrated the use of convolutional neural networks to classify images of melanoma on par with board-certified dermatologists.
However, the non-availability of a public human benchmark restricts the comparability of the performance of these algorithms and thereby the technical progress in this field. Methods An electronic questionnaire was sent to dermatologists at 12 German university hospitals. Each questionnaire comprised 100 dermoscopic and 100 clinical images (80 nevi images and 20 biopsy-verified melanoma images, each), all open-source. The questionnaire recorded factors such as the years of experience in dermatology, performed skin checks, age, sex and the rank within the university hospital or the status as resident physician. For each image, the dermatologists were asked to provide a management decision (treat/biopsy lesion or reassure the patient). Main outcome measures were sensitivity, specificity and the receiver operating characteristics (ROC). Results In total, 157 dermatologists assessed all 100 dermoscopic images with an overall sensitivity of 74.1%, specificity of 60.0% and an ROC of 0.67 (range = 0.538–0.769); 145 dermatologists assessed all 100 clinical images with an overall sensitivity of 89.4%, specificity of 64.4% and an ROC of 0.769 (range = 0.613–0.9). Results between test-sets were significantly different (P < 0.05), confirming the need for a standardised benchmark. Conclusions We present the first public melanoma classification benchmark for both non-dermoscopic and dermoscopic images for comparing artificial intelligence algorithms with the diagnostic performance of 145 or 157 dermatologists. The Melanoma Classification Benchmark should be considered as a reference standard for white-skinned Western populations in the field of binary algorithmic melanoma classification.
KW - benchmark
KW - artificial intelligence
KW - deep learning
KW - melanoma
Y1 - 2019
U6 - http://nbn-resolving.de/urn/resolver.pl?urn:nbn:de:bvb:20-opus-220569
VL - 111
ER -

TY - JOUR
A1 - Vollmer, Andreas
A1 - Nagler, Simon
A1 - Hörner, Marius
A1 - Hartmann, Stefan
A1 - Brands, Roman C.
A1 - Breitenbücher, Niko
A1 - Straub, Anton
A1 - Kübler, Alexander
A1 - Vollmer, Michael
A1 - Gubik, Sebastian
A1 - Lang, Gernot
A1 - Wollborn, Jakob
A1 - Saravi, Babak
T1 - Performance of artificial intelligence-based algorithms to predict prolonged length of stay after head and neck cancer surgery
JF - Heliyon
N2 - Background Medical resource management can be improved by assessing the likelihood of prolonged length of stay (LOS) for head and neck cancer surgery patients. The objective of this study was to develop predictive models that could be used to determine whether a patient's LOS after cancer surgery falls within the normal range of the cohort. Methods We conducted a retrospective analysis of a dataset consisting of 300 consecutive patients who underwent head and neck cancer surgery between 2017 and 2022 at a single university medical center. Prolonged LOS was defined as LOS exceeding the 75th percentile of the cohort. Feature importance analysis was performed to evaluate the most important predictors for prolonged LOS. We then constructed 7 machine learning and deep learning algorithms for the prediction modeling of prolonged LOS. Results The algorithms reached accuracy values of 75.40 (radial basis function neural network) to 97.92 (Random Trees) for the training set and 64.90 (multilayer perceptron neural network) to 84.14 (Random Trees) for the testing set. The leading parameters predicting prolonged LOS were operation time, ischemia time, the graft used, the ASA score, the intensive care stay, and the pathological stages. The results revealed that patients who had a higher number of harvested lymph nodes (LN) had a lower probability of recurrence but also a greater LOS. However, patients with prolonged LOS were also at greater risk of recurrence, particularly when fewer LN were extracted.
Further, LOS was more strongly correlated with the overall number of extracted lymph nodes than with the number of positive lymph nodes or the ratio of positive to overall extracted lymph nodes, indicating that particularly unnecessary lymph node extraction might be associated with prolonged LOS. Conclusions The results emphasize the need for a closer follow-up of patients who experience prolonged LOS. Prospective trials are warranted to validate the present results.
KW - prediction
KW - head and neck cancer
KW - machine learning
KW - deep learning
KW - artificial intelligence
KW - length of stay
KW - cancer
Y1 - 2023
U6 - http://nbn-resolving.de/urn/resolver.pl?urn:nbn:de:bvb:20-opus-350416
SN - 2405-8440
VL - 9
IS - 11
ER -

TY - JOUR
A1 - Lux, Thomas J.
A1 - Banck, Michael
A1 - Saßmannshausen, Zita
A1 - Troya, Joel
A1 - Krenzer, Adrian
A1 - Fitting, Daniel
A1 - Sudarevic, Boban
A1 - Zoller, Wolfram G.
A1 - Puppe, Frank
A1 - Meining, Alexander
A1 - Hann, Alexander
T1 - Pilot study of a new freely available computer-aided polyp detection system in clinical practice
JF - International Journal of Colorectal Disease
N2 - Purpose Computer-aided polyp detection (CADe) systems for colonoscopy have already been shown to increase adenoma detection rate (ADR) in randomized clinical trials. Those commercially available closed systems often do not allow for data collection and algorithm optimization, for example regarding the usage of different endoscopy processors. Here, we present the first clinical experiences with a CADe system that is publicly available for research purposes. Methods We developed an end-to-end data acquisition and polyp detection system named EndoMind. Examiners of four centers utilizing four different endoscopy processors used EndoMind during their clinical routine. Detected polyps, ADR, time to first detection of a polyp (TFD), and system usability were evaluated (NCT05006092).
Results During 41 colonoscopies, EndoMind detected 29 of 29 adenomas in 66 of 66 polyps resulting in an ADR of 41.5%. Median TFD was 130 ms (95%-CI, 80–200 ms) while maintaining a median false positive rate of 2.2% (95%-CI, 1.7–2.8%). The four participating centers rated the system using the System Usability Scale with a median of 96.3 (95%-CI, 70–100). Conclusion EndoMind’s ability to acquire data and detect polyps in real time and its high usability score indicate substantial practical value for research and clinical practice. Still, clinical benefit, measured by ADR, has to be determined in a prospective randomized controlled trial.
KW - colonoscopy
KW - polyp
KW - artificial intelligence
KW - deep learning
KW - CADe
Y1 - 2022
U6 - http://nbn-resolving.de/urn/resolver.pl?urn:nbn:de:bvb:20-opus-324459
VL - 37
IS - 6
ER -

TY - JOUR
A1 - Kunz, Felix
A1 - Stellzig-Eisenhauer, Angelika
A1 - Boldt, Julian
T1 - Applications of artificial intelligence in orthodontics — an overview and perspective based on the current state of the art
JF - Applied Sciences
N2 - Artificial intelligence (AI) has already arrived in many areas of our lives and, because of the increasing availability of computing power, can now be used for complex tasks in medicine and dentistry. This is reflected by an exponential increase in scientific publications aiming to integrate AI into everyday clinical routines. Applications of AI in orthodontics are already manifold and range from the identification of anatomical/pathological structures or reference points in imaging to the support of complex decision-making in orthodontic treatment planning. The aim of this article is to give the reader an overview of the current state of the art regarding applications of AI in orthodontics and to provide a perspective for the use of such AI solutions in clinical routine. For this purpose, we present various use cases for AI in orthodontics, for which research is already available.
Considering the current scientific progress, it is not unreasonable to assume that AI will become an integral part of orthodontic diagnostics and treatment planning in the near future. Although it is equally likely that AI will not be able to replace the knowledge and experience of human experts in the not-too-distant future, it will probably be able to support practitioners, thus serving as a quality-assuring component in orthodontic patient care.
KW - orthodontics
KW - artificial intelligence
KW - machine learning
KW - deep learning
KW - cephalometry
KW - age determination by skeleton
KW - tooth extraction
KW - orthognathic surgery
Y1 - 2023
U6 - http://nbn-resolving.de/urn/resolver.pl?urn:nbn:de:bvb:20-opus-310940
SN - 2076-3417
VL - 13
IS - 6
ER -

TY - JOUR
A1 - Vollmer, Andreas
A1 - Vollmer, Michael
A1 - Lang, Gernot
A1 - Straub, Anton
A1 - Kübler, Alexander
A1 - Gubik, Sebastian
A1 - Brands, Roman C.
A1 - Hartmann, Stefan
A1 - Saravi, Babak
T1 - Automated assessment of radiographic bone loss in the posterior maxilla utilizing a multi-object detection artificial intelligence algorithm
JF - Applied Sciences
N2 - Periodontitis is one of the most prevalent diseases worldwide. The degree of radiographic bone loss can be used to assess the course of therapy or the severity of the disease. Since automated bone loss detection has many benefits, our goal was to develop a multi-object detection algorithm based on artificial intelligence that would be able to detect and quantify radiographic bone loss using standard two-dimensional radiographic images in the maxillary posterior region. This study was conducted by combining three recent online databases and validating the results using an external validation dataset from our organization. There were 1414 images for training and testing and 341 for external validation in the final dataset. We applied a Keypoint RCNN with a ResNet-50-FPN backbone network for both boundary box and keypoint detection.
The intersection over union (IoU) and the object keypoint similarity (OKS) were used for model evaluation. The evaluation of the boundary box metrics showed moderate overlap with the ground truth, revealing an average precision of up to 0.758. The average precision and recall over all five folds were 0.694 and 0.611, respectively. Mean average precision and recall for the keypoint detection were 0.632 and 0.579, respectively. Despite only using a small and heterogeneous set of images for training, our results indicate that the algorithm is able to learn the objects of interest, although without sufficient accuracy due to the limited number of images and the large amount of information available in panoramic radiographs. Considering the widespread availability of panoramic radiographs as well as the increasing use of online databases, the presented model can be further improved in the future to facilitate its implementation in clinics.
KW - radiographic bone loss
KW - alveolar bone loss
KW - maxillofacial surgery
KW - deep learning
KW - classification
KW - artificial intelligence
KW - object detection
Y1 - 2023
U6 - http://nbn-resolving.de/urn/resolver.pl?urn:nbn:de:bvb:20-opus-305050
SN - 2076-3417
VL - 13
IS - 3
ER -

TY - JOUR
A1 - Henckert, David
A1 - Malorgio, Amos
A1 - Schweiger, Giovanna
A1 - Raimann, Florian J.
A1 - Piekarski, Florian
A1 - Zacharowski, Kai
A1 - Hottenrott, Sebastian
A1 - Meybohm, Patrick
A1 - Tscholl, David W.
A1 - Spahn, Donat R.
A1 - Roche, Tadzio R.
T1 - Attitudes of anesthesiologists toward artificial intelligence in anesthesia: a multicenter, mixed qualitative–quantitative study
JF - Journal of Clinical Medicine
N2 - Artificial intelligence (AI) is predicted to play an increasingly important role in perioperative medicine in the very near future. However, little is known about what anesthesiologists know and think about AI in this context.
This is important because the successful introduction of new technologies depends on the understanding and cooperation of end users. We sought to investigate how much anesthesiologists know about AI and what they think about the introduction of AI-based technologies into the clinical setting. In order to better understand what anesthesiologists think of AI, we recruited 21 anesthesiologists from 2 university hospitals for face-to-face structured interviews. The interview transcripts were subdivided sentence-by-sentence into discrete statements, and statements were then grouped into key themes. Subsequently, a survey of closed questions based on these themes was sent to 70 anesthesiologists from 3 university hospitals for rating. In the interviews, the base level of knowledge of AI was good at 86 of 90 statements (96%), although awareness of the potential applications of AI in anesthesia was poor at only 7 of 42 statements (17%). Regarding the implementation of AI in anesthesia, statements were split roughly evenly between pros (46 of 105, 44%) and cons (59 of 105, 56%). Interviewees considered that AI could usefully be used in diverse tasks such as risk stratification, the prediction of vital sign changes, or as a treatment guide. The validity of these themes was probed in a follow-up survey of 70 anesthesiologists with a response rate of 70%, which confirmed an overall positive view of AI in this group. Anesthesiologists hold a range of opinions, both positive and negative, regarding the application of AI in their field of work. Survey-based studies do not always uncover the full breadth of nuance of opinion amongst clinicians. Engagement with specific concerns, both technical and ethical, will prove important as this technology moves from research to the clinic. 
KW - artificial intelligence
KW - machine learning
KW - anesthesia
KW - anesthesiology
KW - qualitative research
KW - clinical decision support
Y1 - 2023
U6 - http://nbn-resolving.de/urn/resolver.pl?urn:nbn:de:bvb:20-opus-311189
SN - 2077-0383
VL - 12
IS - 6
ER -

TY - JOUR
A1 - Herm, Lukas-Valentin
A1 - Steinbach, Theresa
A1 - Wanner, Jonas
A1 - Janiesch, Christian
T1 - A nascent design theory for explainable intelligent systems
JF - Electronic Markets
N2 - Due to computational advances in the past decades, so-called intelligent systems can learn from increasingly complex data, analyze situations, and support users in their decision-making to address them. However, in practice, the complexity of these intelligent systems renders the user hardly able to comprehend the inherent decision logic of the underlying machine learning model. As a result, the adoption of this technology, especially for high-stake scenarios, is hampered. In this context, explainable artificial intelligence offers numerous starting points for making the inherent logic explainable to people. While research manifests the necessity for incorporating explainable artificial intelligence into intelligent systems, there is still a lack of knowledge about how to socio-technically design these systems to address acceptance barriers among different user groups. In response, we have derived and evaluated a nascent design theory for explainable intelligent systems based on a structured literature review, two qualitative expert studies, a real-world use case application, and quantitative research. Our design theory includes design requirements, design principles, and design features covering the topics of global explainability, local explainability, personalized interface design, as well as psychological/emotional factors.
KW - artificial intelligence
KW - explainable artificial intelligence
KW - XAI
KW - design science research
KW - design theory
KW - intelligent systems
Y1 - 2022
U6 - http://nbn-resolving.de/urn/resolver.pl?urn:nbn:de:bvb:20-opus-323809
SN - 1019-6781
VL - 32
IS - 4
ER -

TY - JOUR
A1 - Wanner, Jonas
A1 - Herm, Lukas-Valentin
A1 - Heinrich, Kai
A1 - Janiesch, Christian
T1 - The effect of transparency and trust on intelligent system acceptance: evidence from a user-based study
JF - Electronic Markets
N2 - Contemporary decision support systems are increasingly relying on artificial intelligence technology such as machine learning algorithms to form intelligent systems. These systems have human-like decision capacity for selected applications based on a decision rationale which cannot be looked-up conveniently and constitutes a black box. As a consequence, acceptance by end-users remains somewhat hesitant. While lacking transparency has been said to hinder trust and enforce aversion towards these systems, studies that connect user trust to transparency and subsequently acceptance are scarce. In response, our research is concerned with the development of a theoretical model that explains end-user acceptance of intelligent systems. We utilize the unified theory of acceptance and use in information technology as well as explanation theory and related theories on initial trust and user trust in information systems. The proposed model is tested in an industrial maintenance workplace scenario using maintenance experts as participants to represent the user group. Results show that acceptance is performance-driven at first sight. However, transparency plays an important indirect role in regulating trust and the perception of performance.
KW - user acceptance
KW - intelligent system
KW - artificial intelligence
KW - trust
KW - system transparency
Y1 - 2022
U6 - http://nbn-resolving.de/urn/resolver.pl?urn:nbn:de:bvb:20-opus-323829
SN - 1019-6781
VL - 32
IS - 4
ER -