TY - JOUR A1 - Woznicki, Piotr A1 - Laqua, Fabian Christopher A1 - Al-Haj, Adam A1 - Bley, Thorsten A1 - Baeßler, Bettina T1 - Addressing challenges in radiomics research: systematic review and repository of open-access cancer imaging datasets JF - Insights into Imaging N2 - Objectives Open-access cancer imaging datasets have become integral for evaluating novel AI approaches in radiology. However, their use in quantitative analysis with radiomics features presents unique challenges, such as incomplete documentation, low visibility, non-uniform data formats, data inhomogeneity, and complex preprocessing. These issues may cause problems with reproducibility and standardization in radiomics studies. Methods We systematically reviewed imaging datasets with public copyright licenses, published up to March 2023 across four large online cancer imaging archives. We included only datasets with tomographic images (CT, MRI, or PET), segmentations, and clinical annotations, specifically identifying those suitable for radiomics research. Reproducible preprocessing and feature extraction were performed for each dataset to enable their easy reuse. Results We discovered 29 datasets with corresponding segmentations and labels in the form of health outcomes, tumor pathology, staging, imaging-based scores, genetic markers, or repeated imaging. We compiled a repository encompassing 10,354 patients and 49,515 scans. Of the 29 datasets, 15 were licensed under Creative Commons licenses, allowing both non-commercial and commercial usage and redistribution, while others featured custom or restricted licenses. Studies spanned from the early 1990s to 2021, with the majority concluding after 2013. Seven different formats were used for the imaging data. Preprocessing and feature extraction were successfully performed for each dataset. Conclusion RadiomicsHub is a comprehensive public repository with radiomics features derived from a systematic review of public cancer imaging datasets. 
By converting all datasets to a standardized format and ensuring reproducible and traceable processing, RadiomicsHub addresses key reproducibility and standardization challenges in radiomics. Critical relevance statement This study critically addresses the challenges associated with locating, preprocessing, and extracting quantitative features from open-access datasets, to facilitate more robust and reliable evaluations of radiomics models. Key points - Through a systematic review, we identified 29 cancer imaging datasets suitable for radiomics research. - A public repository with collection overview and radiomics features, encompassing 10,354 patients and 49,515 scans, was compiled. - Most datasets can be shared, used, and built upon freely under a Creative Commons license. - All 29 identified datasets have been converted into a common format to enable reproducible radiomics feature extraction. KW - radiomics KW - radiology KW - cancer imaging KW - machine learning KW - reproducibility of results Y1 - 2023 U6 - http://nbn-resolving.de/urn/resolver.pl?urn:nbn:de:bvb:20-opus-357936 SN - 1869-4101 VL - 14 ER - TY - JOUR A1 - Rosales-Alvarez, Reyna Edith A1 - Rettkowski, Jasmin A1 - Herman, Josip Stefan A1 - Dumbović, Gabrijela A1 - Cabezas-Wallscheid, Nina A1 - Grün, Dominic T1 - VarID2 quantifies gene expression noise dynamics and unveils functional heterogeneity of ageing hematopoietic stem cells JF - Genome Biology N2 - Variability of gene expression due to stochasticity of transcription or variation of extrinsic signals, termed biological noise, is a potential driving force of cellular differentiation. Utilizing single-cell RNA-sequencing, we develop VarID2 for the quantification of biological noise at single-cell resolution. VarID2 reveals enhanced nuclear versus cytoplasmic noise, and distinct regulatory modes stratified by correlation between noise, expression, and chromatin accessibility. 
Noise levels are minimal in murine hematopoietic stem cells (HSCs) and increase during differentiation and ageing. Differential noise identifies myeloid-biased Dlk1+ long-term HSCs in aged mice with enhanced quiescence and self-renewal capacity. VarID2 reveals noise dynamics invisible to conventional single-cell transcriptome analysis. KW - gene expression noise KW - single-cell RNA sequencing KW - stem cell differentiation KW - cell state variability KW - ageing KW - hematopoietic stem cells KW - machine learning KW - mathematical modeling Y1 - 2023 U6 - http://nbn-resolving.de/urn/resolver.pl?urn:nbn:de:bvb:20-opus-358042 VL - 24 ER - TY - JOUR A1 - Wehrheim, Maren H. A1 - Faskowitz, Joshua A1 - Sporns, Olaf A1 - Fiebach, Christian J. A1 - Kaschube, Matthias A1 - Hilger, Kirsten T1 - Few temporally distributed brain connectivity states predict human cognitive abilities JF - NeuroImage N2 - Highlights • Brain connectivity states identified by cofluctuation strength. • CMEP as new method to robustly predict human traits from brain imaging data. • Network-identifying connectivity ‘events’ are not predictive of cognitive ability. • Sixteen temporally independent fMRI time frames allow for significant prediction. • Neuroimaging-based assessment of cognitive ability requires sufficient scan lengths. Abstract Human functional brain connectivity can be temporally decomposed into states of high and low cofluctuation, defined as coactivation of brain regions over time. Rare states of particularly high cofluctuation have been shown to reflect fundamentals of intrinsic functional network architecture and to be highly subject-specific. However, it is unclear whether such network-defining states also contribute to individual variations in cognitive abilities – which strongly rely on the interactions among distributed brain regions. 
By introducing CMEP, a new eigenvector-based prediction framework, we show that as few as 16 temporally separated time frames (< 1.5% of 10 min resting-state fMRI) can significantly predict individual differences in intelligence (N = 263, p < .001). Against previous expectations, individuals' network-defining time frames of particularly high cofluctuation do not predict intelligence. Multiple functional brain networks contribute to the prediction, and all results replicate in an independent sample (N = 831). Our results suggest that although fundamentals of person-specific functional connectomes can be derived from few time frames of highest connectivity, temporally distributed information is necessary to extract information about cognitive abilities. This information is not restricted to specific connectivity states, like network-defining high-cofluctuation states, but rather reflected across the entire length of the brain connectivity time series. KW - functional connectivity KW - resting state KW - machine learning KW - predictive modeling KW - general cognitive ability Y1 - 2023 U6 - http://nbn-resolving.de/urn/resolver.pl?urn:nbn:de:bvb:20-opus-349874 VL - 277 ER - TY - JOUR A1 - Beierle, Felix A1 - Pryss, Rüdiger A1 - Aizawa, Akiko T1 - Sentiments about mental health on Twitter — before and during the COVID-19 pandemic JF - Healthcare N2 - During the COVID-19 pandemic, the novel coronavirus had an impact not only on public health but also on the mental health of the population. Public sentiment on mental health and depression is often captured only in small, survey-based studies, while work based on Twitter data often only looks at the period during the pandemic and does not make comparisons with the pre-pandemic situation. We collected tweets that included the hashtags #MentalHealth and #Depression from before and during the pandemic (8.5 months each). 
We used LDA (Latent Dirichlet Allocation) for topic modeling and LIWC, VADER, and NRC for sentiment analysis. We used three machine-learning classifiers to seek evidence regarding an automatically detectable change in tweets before vs. during the pandemic: (1) based on TF-IDF values, (2) based on the values from the sentiment libraries, (3) based on tweet content (deep-learning BERT classifier). Topic modeling revealed that Twitter users who explicitly used the hashtags #Depression and especially #MentalHealth did so to raise awareness. We observed an overall positive sentiment, and in tough times such as during the COVID-19 pandemic, tweets with #MentalHealth were often associated with gratitude. Among the three classification approaches, the BERT classifier showed the best performance, with an accuracy of 81% for #MentalHealth and 79% for #Depression. Although the data may have come from users familiar with mental health, these findings can help gauge public sentiment on the topic. The combination of (1) sentiment analysis, (2) topic modeling, and (3) tweet classification with machine learning proved useful in gaining comprehensive insight into public sentiment and could be applied to other data sources and topics. KW - COVID-19 KW - coronavirus KW - public health KW - sentiment analysis KW - topic modeling KW - machine learning Y1 - 2023 U6 - http://nbn-resolving.de/urn/resolver.pl?urn:nbn:de:bvb:20-opus-355192 SN - 2227-9032 VL - 11 IS - 21 ER - TY - JOUR A1 - Griebel, Matthias A1 - Segebarth, Dennis A1 - Stein, Nikolai A1 - Schukraft, Nina A1 - Tovote, Philip A1 - Blum, Robert A1 - Flath, Christoph M. T1 - Deep learning-enabled segmentation of ambiguous bioimages with deepflash2 JF - Nature Communications N2 - Bioimages frequently exhibit low signal-to-noise ratios due to experimental conditions, specimen characteristics, and imaging trade-offs. Reliable segmentation of such ambiguous images is difficult and laborious. 
Here we introduce deepflash2, a deep learning-enabled segmentation tool for bioimage analysis. The tool addresses typical challenges that may arise during the training, evaluation, and application of deep learning models on ambiguous data. The tool’s training and evaluation pipeline uses multiple expert annotations and deep model ensembles to achieve accurate results. The application pipeline supports various use-cases for expert annotations and includes a quality assurance mechanism in the form of uncertainty measures. Benchmarked against other tools, deepflash2 offers both high predictive accuracy and efficient computational resource usage. The tool is built upon established deep learning libraries and enables sharing of trained model ensembles with the research community. deepflash2 aims to simplify the integration of deep learning into bioimage analysis projects while improving accuracy and reliability. KW - machine learning KW - microscopy KW - quality control KW - software Y1 - 2023 U6 - http://nbn-resolving.de/urn/resolver.pl?urn:nbn:de:bvb:20-opus-357286 VL - 14 ER - TY - JOUR A1 - Vollmer, Andreas A1 - Nagler, Simon A1 - Hörner, Marius A1 - Hartmann, Stefan A1 - Brands, Roman C. A1 - Breitenbücher, Niko A1 - Straub, Anton A1 - Kübler, Alexander A1 - Vollmer, Michael A1 - Gubik, Sebastian A1 - Lang, Gernot A1 - Wollborn, Jakob A1 - Saravi, Babak T1 - Performance of artificial intelligence-based algorithms to predict prolonged length of stay after head and neck cancer surgery JF - Heliyon N2 - Background Medical resource management can be improved by assessing the likelihood of prolonged length of stay (LOS) for head and neck cancer surgery patients. The objective of this study was to develop predictive models that could be used to determine whether a patient's LOS after cancer surgery falls within the normal range of the cohort. 
Methods We conducted a retrospective analysis of a dataset consisting of 300 consecutive patients who underwent head and neck cancer surgery between 2017 and 2022 at a single university medical center. Prolonged LOS was defined as LOS exceeding the 75th percentile of the cohort. Feature importance analysis was performed to evaluate the most important predictors for prolonged LOS. We then constructed 7 machine learning and deep learning algorithms for the prediction modeling of prolonged LOS. Results The algorithms reached accuracy values of 75.40 (radial basis function neural network) to 97.92 (Random Trees) for the training set and 64.90 (multilayer perceptron neural network) to 84.14 (Random Trees) for the testing set. The leading parameters predicting prolonged LOS were operation time, ischemia time, the graft used, the ASA score, the intensive care stay, and the pathological stages. The results revealed that patients who had a higher number of harvested lymph nodes (LN) had a lower probability of recurrence but also a greater LOS. However, patients with prolonged LOS were also at greater risk of recurrence, particularly when fewer LNs were extracted. Further, LOS was more strongly correlated with the overall number of extracted lymph nodes than with the number of positive lymph nodes or the ratio of positive to overall extracted lymph nodes, indicating that particularly unnecessary lymph node extraction might be associated with prolonged LOS. Conclusions The results emphasize the need for a closer follow-up of patients who experience prolonged LOS. Prospective trials are warranted to validate the present results. 
KW - prediction KW - head and neck cancer KW - machine learning KW - deep learning KW - artificial intelligence KW - length of stay KW - cancer Y1 - 2023 U6 - http://nbn-resolving.de/urn/resolver.pl?urn:nbn:de:bvb:20-opus-350416 SN - 2405-8440 VL - 9 IS - 11 ER - TY - JOUR A1 - Caliskan, Aylin A1 - Caliskan, Deniz A1 - Rasbach, Lauritz A1 - Yu, Weimeng A1 - Dandekar, Thomas A1 - Breitenbach, Tim T1 - Optimized cell type signatures revealed from single-cell data by combining principal feature analysis, mutual information, and machine learning JF - Computational and Structural Biotechnology Journal N2 - Machine learning techniques are well suited to analyzing expression data from single cells. These techniques impact all fields ranging from cell annotation and clustering to signature identification. The presented framework evaluates how well selected gene sets separate defined phenotypes or cell groups. This innovation overcomes the present difficulty of objectively and correctly identifying a small gene set of high information content for separating phenotypes; corresponding code scripts are provided. The small but meaningful subset of the original genes (or feature space) facilitates human interpretability of the differences between the phenotypes, including those found by machine learning, and may even turn correlations between genes and phenotypes into a causal explanation. For the feature selection task, principal feature analysis is used, which reduces redundant information while selecting genes that carry the information for separating the phenotypes. In this context, the presented framework shows explainability of unsupervised learning as it reveals cell-type specific signatures. Apart from a Seurat preprocessing tool and the PFA script, the pipeline uses mutual information to balance accuracy and size of the gene set if desired. 
A validation part that evaluates the information content of the gene selection with respect to separating the phenotypes is provided as well; binary and multiclass classification of 3 or 4 groups are studied. Results from different single-cell data are presented. In each, only about ten out of more than 30000 genes are identified as carrying the relevant information. The code is provided in a GitHub repository at https://github.com/AC-PHD/Seurat_PFA_pipeline. KW - single cell analysis KW - machine learning KW - explainability of machine learning KW - principal feature analysis KW - model reduction KW - feature selection Y1 - 2023 U6 - http://nbn-resolving.de/urn/resolver.pl?urn:nbn:de:bvb:20-opus-349989 SN - 2001-0370 VL - 21 ER - TY - JOUR A1 - Dresia, Kai A1 - Kurudzija, Eldin A1 - Deeken, Jan A1 - Waxenegger-Wilfing, Günther T1 - Improved wall temperature prediction for the LUMEN rocket combustion chamber with neural networks JF - Aerospace N2 - Accurate calculations of the heat transfer and the resulting maximum wall temperature are essential for the optimal design of reliable and efficient regenerative cooling systems. However, predicting the heat transfer of supercritical methane flowing in cooling channels of a regeneratively cooled rocket combustor presents a significant challenge. High-fidelity CFD calculations provide sufficient accuracy but are computationally too expensive to be used within elaborate design optimization routines. In a previous work it has been shown that a surrogate model based on neural networks is able to predict the maximum wall temperature along straight cooling channels with convincing precision when trained with data from CFD simulations for simple cooling channel segments. In this paper, the methodology is extended to cooling channels with curvature. 
The predictions of the extended model are tested against CFD simulations with different boundary conditions for the representative LUMEN combustor contour with varying geometries and heat flux densities. The high accuracy of the extended model’s predictions suggests that it will be a valuable tool for designing and analyzing regenerative cooling systems with greater efficiency and effectiveness. KW - neural network KW - surrogate model KW - heat transfer KW - machine learning KW - LUMEN KW - rocket engine KW - regenerative cooling Y1 - 2023 U6 - http://nbn-resolving.de/urn/resolver.pl?urn:nbn:de:bvb:20-opus-319169 SN - 2226-4310 VL - 10 IS - 5 ER - TY - JOUR A1 - Haufe, Stefan A1 - Isaias, Ioannis U. A1 - Pellegrini, Franziska A1 - Palmisano, Chiara T1 - Gait event prediction using surface electromyography in parkinsonian patients JF - Bioengineering N2 - Gait disturbances are common manifestations of Parkinson’s disease (PD), with unmet therapeutic needs. Inertial measurement units (IMUs) are capable of monitoring gait, but they lack neurophysiological information that may be crucial for studying gait disturbances in these patients. Here, we present a machine learning approach to approximate IMU angular velocity profiles and subsequently gait events using electromyographic (EMG) channels during overground walking in patients with PD. We recorded six parkinsonian patients while they walked for at least three minutes. Patient-agnostic regression models were trained on temporally embedded EMG time series of different combinations of up to five leg muscles bilaterally (i.e., tibialis anterior, soleus, gastrocnemius medialis, gastrocnemius lateralis, and vastus lateralis). Gait events could be detected with high temporal precision (median displacement of <50 ms), low numbers of missed events (<2%), and next to no false-positive event detections (<0.1%). Swing and stance phases could thus be determined with high fidelity (median F1-score of ~0.9). 
Interestingly, the best performance was obtained using as few as two EMG probes placed on the left and right vastus lateralis. Our results demonstrate the practical utility of the proposed EMG-based system for gait event prediction, which allows an electromyographic signal to be acquired simultaneously. This gait analysis approach has the potential to make additional measurement devices such as IMUs and force plates less essential, thereby reducing financial and preparation overheads and discomfort factors in gait studies. KW - electromyography KW - inertial measurement units KW - gait-phase prediction KW - machine learning KW - Parkinson’s disease Y1 - 2023 U6 - http://nbn-resolving.de/urn/resolver.pl?urn:nbn:de:bvb:20-opus-304380 SN - 2306-5354 VL - 10 IS - 2 ER - TY - JOUR A1 - Kunz, Felix A1 - Stellzig-Eisenhauer, Angelika A1 - Boldt, Julian T1 - Applications of artificial intelligence in orthodontics — an overview and perspective based on the current state of the art JF - Applied Sciences N2 - Artificial intelligence (AI) has already arrived in many areas of our lives and, because of the increasing availability of computing power, can now be used for complex tasks in medicine and dentistry. This is reflected by an exponential increase in scientific publications aiming to integrate AI into everyday clinical routines. Applications of AI in orthodontics are already manifold and range from the identification of anatomical/pathological structures or reference points in imaging to the support of complex decision-making in orthodontic treatment planning. The aim of this article is to give the reader an overview of the current state of the art regarding applications of AI in orthodontics and to provide a perspective for the use of such AI solutions in clinical routine. For this purpose, we present various use cases for AI in orthodontics, for which research is already available. 
Considering the current scientific progress, it is not unreasonable to assume that AI will become an integral part of orthodontic diagnostics and treatment planning in the near future. Although AI is equally unlikely to replace the knowledge and experience of human experts in the not-too-distant future, it will probably be able to support practitioners, thus serving as a quality-assuring component in orthodontic patient care. KW - orthodontics KW - artificial intelligence KW - machine learning KW - deep learning KW - cephalometry KW - age determination by skeleton KW - tooth extraction KW - orthognathic surgery Y1 - 2023 U6 - http://nbn-resolving.de/urn/resolver.pl?urn:nbn:de:bvb:20-opus-310940 SN - 2076-3417 VL - 13 IS - 6 ER -