TY - JOUR A1 - Marquardt, André A1 - Hartrampf, Philipp A1 - Kollmannsberger, Philip A1 - Solimando, Antonio G. A1 - Meierjohann, Svenja A1 - Kübler, Hubert A1 - Bargou, Ralf A1 - Schilling, Bastian A1 - Serfling, Sebastian E. A1 - Buck, Andreas A1 - Werner, Rudolf A. A1 - Lapa, Constantin A1 - Krebs, Markus T1 - Predicting microenvironment in CXCR4- and FAP-positive solid tumors — a pan-cancer machine learning workflow for theranostic target structures JF - Cancers N2 - (1) Background: C-X-C Motif Chemokine Receptor 4 (CXCR4) and Fibroblast Activation Protein Alpha (FAP) are promising theranostic targets. However, it is unclear whether CXCR4 and FAP positivity mark distinct microenvironments, especially in solid tumors. (2) Methods: Using Random Forest (RF) analysis, we searched for entity-independent mRNA and microRNA signatures related to CXCR4 and FAP overexpression in our pan-cancer cohort from The Cancer Genome Atlas (TCGA) database — representing n = 9242 specimens from 29 tumor entities. CXCR4- and FAP-positive samples were assessed via StringDB cluster analysis, EnrichR, Metascape, and Gene Set Enrichment Analysis (GSEA). Findings were validated via correlation analyses in n = 1541 tumor samples. TIMER2.0 analyzed the association of CXCR4 / FAP expression and infiltration levels of immune-related cells. (3) Results: We identified entity-independent CXCR4 and FAP gene signatures representative for the majority of solid cancers. While CXCR4 positivity marked an immune-related microenvironment, FAP overexpression highlighted an angiogenesis-associated niche. TIMER2.0 analysis confirmed characteristic infiltration levels of CD8+ cells for CXCR4-positive tumors and endothelial cells for FAP-positive tumors. (4) Conclusions: CXCR4- and FAP-directed PET imaging could provide a non-invasive decision aid for entity-agnostic treatment of microenvironment in solid malignancies. Moreover, this machine learning workflow can easily be transferred towards other theranostic targets. KW - machine learning KW - tumor microenvironment KW - immune infiltration KW - angiogenesis KW - mRNA KW - miRNA KW - transcriptome Y1 - 2023 U6 - http://nbn-resolving.de/urn/resolver.pl?urn:nbn:de:bvb:20-opus-305036 SN - 2072-6694 VL - 15 IS - 2 ER - TY - JOUR A1 - Marquardt, André A1 - Solimando, Antonio Giovanni A1 - Kerscher, Alexander A1 - Bittrich, Max A1 - Kalogirou, Charis A1 - Kübler, Hubert A1 - Rosenwald, Andreas A1 - Bargou, Ralf A1 - Kollmannsberger, Philip A1 - Schilling, Bastian A1 - Meierjohann, Svenja A1 - Krebs, Markus T1 - Subgroup-Independent Mapping of Renal Cell Carcinoma — Machine Learning Reveals Prognostic Mitochondrial Gene Signature Beyond Histopathologic Boundaries JF - Frontiers in Oncology N2 - Background: Renal cell carcinoma (RCC) is divided into three major histopathologic groups—clear cell (ccRCC), papillary (pRCC) and chromophobe RCC (chRCC). We performed a comprehensive re-analysis of publicly available RCC datasets from the TCGA (The Cancer Genome Atlas) database, thereby combining samples from all three subgroups, for an exploratory transcriptome profiling of RCC subgroups. Materials and Methods: We used FPKM (fragments per kilobase per million) files derived from the ccRCC, pRCC and chRCC cohorts of the TCGA database, representing transcriptomic data of 891 patients. Using principal component analysis, we visualized datasets as t-SNE plot for cluster detection. Clusters were characterized by machine learning, resulting gene signatures were validated by correlation analyses in the TCGA dataset and three external datasets (ICGC RECA-EU, CPTAC-3-Kidney, and GSE157256). Results: Many RCC samples co-clustered according to histopathology. However, a substantial number of samples clustered independently from histopathologic origin (mixed subgroup)—demonstrating divergence between histopathology and transcriptomic data. Further analyses of mixed subgroup via machine learning revealed a predominant mitochondrial gene signature—a trait previously known for chRCC—across all histopathologic subgroups. Additionally, ccRCC samples from mixed subgroup presented an inverse correlation of mitochondrial and angiogenesis-related genes in the TCGA and in three external validation cohorts. Moreover, mixed subgroup affiliation was associated with a highly significant shorter overall survival for patients with ccRCC—and a highly significant longer overall survival for chRCC patients. Conclusions: Pan-RCC clustering according to RNA-sequencing data revealed a distinct histology-independent subgroup characterized by strengthened mitochondrial and weakened angiogenesis-related gene signatures. Moreover, affiliation to mixed subgroup went along with a significantly shorter overall survival for ccRCC and a longer overall survival for chRCC patients. Further research could offer a therapy stratification by specifically addressing the mitochondrial metabolism of such tumors and its microenvironment. KW - kidney cancer KW - pan-RCC KW - machine learning KW - mitochondrial DNA KW - mtDNA KW - mTOR Y1 - 2021 U6 - http://nbn-resolving.de/urn/resolver.pl?urn:nbn:de:bvb:20-opus-232107 SN - 2234-943X VL - 11 ER - TY - JOUR A1 - Pook, Torsten A1 - Freudenthal, Jan A1 - Korte, Arthur A1 - Simianer, Henner T1 - Using Local Convolutional Neural Networks for Genomic Prediction JF - Frontiers in Genetics N2 - The prediction of breeding values and phenotypes is of central importance for both livestock and crop breeding. In this study, we analyze the use of artificial neural networks (ANN) and, in particular, local convolutional neural networks (LCNN) for genomic prediction, as a region-specific filter corresponds much better with our prior genetic knowledge on the genetic architecture of traits than traditional convolutional neural networks. Model performances are evaluated on a simulated maize data panel (n = 10,000; p = 34,595) and real Arabidopsis data (n = 2,039; p = 180,000) for a variety of traits based on their predictive ability. The baseline LCNN, containing one local convolutional layer (kernel size: 10) and two fully connected layers with 64 nodes each, is outperforming commonly proposed ANNs (multi layer perceptrons and convolutional neural networks) for basically all considered traits. For traits with high heritability and large training population as present in the simulated data, LCNN are even outperforming state-of-the-art methods like genomic best linear unbiased prediction (GBLUP), Bayesian models and extended GBLUP, indicated by an increase in predictive ability of up to 24%. However, for small training populations, these state-of-the-art methods outperform all considered ANNs. Nevertheless, the LCNN still outperforms all other considered ANNs by around 10%. Minor improvements to the tested baseline network architecture of the LCNN were obtained by increasing the kernel size and of reducing the stride, whereas the number of subsequent fully connected layers and their node sizes had neglectable impact. Although gains in predictive ability were obtained for large scale data sets by using LCNNs, the practical use of ANNs comes with additional problems, such as the need of genotyping all considered individuals, the lack of estimation of heritability and reliability. Furthermore, breeding values are additive by design, whereas ANN-based estimates are not. However, ANNs also comes with new opportunities, as networks can easily be extended to account for additional inputs (omics, weather etc.) and outputs (multi-trait models), and computing time increases linearly with the number of individuals. With advances in high-throughput phenotyping and cheaper genotyping, ANNs can become a valid alternative for genomic prediction. KW - phenotype prediction KW - Keras KW - genomic selection KW - selection KW - breeding KW - machine learning KW - deep learning Y1 - 2020 U6 - http://nbn-resolving.de/urn/resolver.pl?urn:nbn:de:bvb:20-opus-216436 VL - 11 ER - TY - JOUR A1 - Marquardt, André A1 - Landwehr, Laura-Sophie A1 - Ronchi, Cristina L. A1 - di Dalmazi, Guido A1 - Riester, Anna A1 - Kollmannsberger, Philip A1 - Altieri, Barbara A1 - Fassnacht, Martin A1 - Sbiera, Silviu T1 - Identifying New Potential Biomarkers in Adrenocortical Tumors Based on mRNA Expression Data Using Machine Learning JF - Cancers N2 - Simple Summary Using a visual-based clustering method on the TCGA RNA sequencing data of a large adrenocortical carcinoma (ACC) cohort, we were able to classify these tumors in two distinct clusters largely overlapping with previously identified ones. As previously shown, the identified clusters also correlated with patient survival. Applying the visual clustering method to a second dataset also including benign adrenocortical samples additionally revealed that one of the ACC clusters is more closely located to the benign samples, providing a possible explanation for the better survival of this ACC cluster. Furthermore, the subsequent use of machine learning identified new possible biomarker genes with prognostic potential for this rare disease, that are significantly differentially expressed in the different survival clusters and should be further evaluated. Abstract Adrenocortical carcinoma (ACC) is a rare disease, associated with poor survival. Several “multiple-omics” studies characterizing ACC on a molecular level identified two different clusters correlating with patient survival (C1A and C1B). We here used the publicly available transcriptome data from the TCGA-ACC dataset (n = 79), applying machine learning (ML) methods to classify the ACC based on expression pattern in an unbiased manner. UMAP (uniform manifold approximation and projection)-based clustering resulted in two distinct groups, ACC-UMAP1 and ACC-UMAP2, that largely overlap with clusters C1B and C1A, respectively. However, subsequent use of random-forest-based learning revealed a set of new possible marker genes showing significant differential expression in the described clusters (e.g., SOAT1, EIF2A1). For validation purposes, we used a secondary dataset based on a previous study from our group, consisting of 4 normal adrenal glands and 52 benign and 7 malignant tumor samples. The results largely confirmed those obtained for the TCGA-ACC cohort. In addition, the ENSAT dataset showed a correlation between benign adrenocortical tumors and the good prognosis ACC cluster ACC-UMAP1/C1B. In conclusion, the use of ML approaches re-identified and redefined known prognostic ACC subgroups. On the other hand, the subsequent use of random-forest-based learning identified new possible prognostic marker genes for ACC. KW - adrenocortical carcinoma KW - in silico analysis KW - machine learning KW - bioinformatic clustering KW - biomarker prediction Y1 - 2021 U6 - http://nbn-resolving.de/urn/resolver.pl?urn:nbn:de:bvb:20-opus-246245 SN - 2072-6694 VL - 13 IS - 18 ER - TY - JOUR A1 - Kaltdorf, Kristin Verena A1 - Theiss, Maria A1 - Markert, Sebastian Matthias A1 - Zhen, Mei A1 - Dandekar, Thomas A1 - Stigloher, Christian A1 - Kollmannsberger, Philipp T1 - Automated classification of synaptic vesicles in electron tomograms of C. elegans using machine learning JF - PLoS ONE N2 - Synaptic vesicles (SVs) are a key component of neuronal signaling and fulfil different roles depending on their composition. In electron micrograms of neurites, two types of vesicles can be distinguished by morphological criteria, the classical “clear core” vesicles (CCV) and the typically larger “dense core” vesicles (DCV), with differences in electron density due to their diverse cargos. Compared to CCVs, the precise function of DCVs is less defined. DCVs are known to store neuropeptides, which function as neuronal messengers and modulators [1]. In C. elegans, they play a role in locomotion, dauer formation, egg-laying, and mechano- and chemosensation [2]. Another type of DCVs, also referred to as granulated vesicles, are known to transport Bassoon, Piccolo and further constituents of the presynaptic density in the center of the active zone (AZ), and therefore are important for synaptogenesis [3]. To better understand the role of different types of SVs, we present here a new automated approach to classify vesicles. We combine machine learning with an extension of our previously developed vesicle segmentation workflow, the ImageJ macro 3D ART VeSElecT. With that we reliably distinguish CCVs and DCVs in electron tomograms of C. elegans NMJs using image-based features. Analysis of the underlying ground truth data shows an increased fraction of DCVs as well as a higher mean distance between DCVs and AZs in dauer larvae compared to young adult hermaphrodites. Our machine learning based tools are adaptable and can be applied to study properties of different synaptic vesicle pools in electron tomograms of diverse model organisms. KW - synaptic vesicles KW - Caenorhabditis elegans KW - machine learning Y1 - 2018 U6 - http://nbn-resolving.de/urn/resolver.pl?urn:nbn:de:bvb:20-opus-176831 VL - 13 IS - 10 ER -