TY - JOUR A1 - Bencurova, Elena A1 - Akash, Aman A1 - Dobson, Renwick C.J. A1 - Dandekar, Thomas T1 - DNA storage-from natural biology to synthetic biology JF - Computational and Structural Biotechnology Journal N2 - Natural DNA storage allows cellular differentiation, evolution, the growth of our children and controls all our ecosystems. Here, we discuss the fundamental aspects of DNA storage and recent advances in this field, with special emphasis on natural processes and solutions that can be exploited. We point out new ways of efficient DNA and nucleotide storage that are inspired by nature. Within a few years DNA-based information storage may become an attractive and natural complement to current electronic data storage systems. We discuss rapid and directed access (e.g. DNA elements such as promoters, enhancers), regulatory signals and modulation (e.g. lncRNA) as well as integrated high-density storage and processing modules (e.g. chromosomal territories). There is pragmatic DNA storage for use in biotechnology and human genetics. We examine DNA storage as an approach for synthetic biology (e.g. light-controlled nucleotide processing enzymes). The natural polymers of DNA and RNA offer much for direct storage operations (read-in, read-out, access control). The inbuilt parallelism (many molecules at many places working at the same time) is important for fast processing of information. Using biology concepts from chromosomal storage, nucleic acid processing as well as polymer material sciences such as electronic effects in enzymes, graphene, nanocellulose up to DNA macramé, DNA wires and DNA-based aptamer field effect transistors will open up new applications gradually replacing classical information storage methods in ever more areas over time (decades).
KW - DNA KW - RNA KW - data storage KW - natural processing KW - synthetic biology Y1 - 2023 U6 - http://nbn-resolving.de/urn/resolver.pl?urn:nbn:de:bvb:20-opus-349971 SN - 2001-0370 VL - 21 ER - TY - JOUR A1 - Caliskan, Aylin A1 - Dangwal, Seema A1 - Dandekar, Thomas T1 - Metadata integrity in bioinformatics: bridging the gap between data and knowledge JF - Computational and Structural Biotechnology Journal N2 - In the fast-evolving landscape of biomedical research, the emergence of big data has presented researchers with extraordinary opportunities to explore biological complexities. In biomedical research, big data also imply a big responsibility. This is not only due to genomics data being sensitive information but also due to genomics data being shared and re-analysed among the scientific community. This saves valuable resources and can even help to find new insights in silico. To fully use these opportunities, detailed and correct metadata are imperative. This includes not only the availability of metadata but also their correctness. Metadata integrity serves as a fundamental determinant of research credibility, supporting the reliability and reproducibility of data-driven findings. Ensuring metadata availability, curation, and accuracy is therefore essential for bioinformatic research. Not only must metadata be readily available, but they must also be meticulously curated and ideally error-free. Motivated by an accidental discovery of a critical metadata error in patient data published in two high-impact journals, we aim to raise awareness of the need for correct, complete, and curated metadata. We describe how the metadata error was found and addressed, and present examples of metadata-related challenges in omics research, along with supporting measures, including tools for checking metadata and software to facilitate various steps from data analysis to published research.
Highlights • Data awareness and data integrity underpin the trustworthiness of results and subsequent further analysis. • Big data and bioinformatics enable efficient resource use by repurposing publicly available RNA-Sequencing data. • Manual checks of data quality and integrity are insufficient due to the overwhelming volume and rapidly growing data. • Automation and artificial intelligence provide cost-effective and efficient solutions for data integrity and quality checks. • FAIR data management, various software solutions and analysis tools assist metadata maintenance. KW - meta-data KW - error KW - annotation KW - error-transfer KW - wrong labelling KW - patient data KW - control group KW - tools overview Y1 - 2023 U6 - http://nbn-resolving.de/urn/resolver.pl?urn:nbn:de:bvb:20-opus-349990 SN - 2001-0370 VL - 21 ER - TY - JOUR A1 - Caliskan, Aylin A1 - Caliskan, Deniz A1 - Rasbach, Lauritz A1 - Yu, Weimeng A1 - Dandekar, Thomas A1 - Breitenbach, Tim T1 - Optimized cell type signatures revealed from single-cell data by combining principal feature analysis, mutual information, and machine learning JF - Computational and Structural Biotechnology Journal N2 - Machine learning techniques are well suited to analyzing expression data from single cells. These techniques impact all fields ranging from cell annotation and clustering to signature identification. The presented framework evaluates how well gene selection sets separate defined phenotypes or cell groups. This innovation overcomes the present limitation in objectively and correctly identifying a small gene set of high information content for separating phenotypes; corresponding code scripts are provided. The small but meaningful subset of the original genes (or feature space) facilitates human interpretability of the differences between the phenotypes, including those found by machine learning, and may even turn correlations between genes and phenotypes into a causal explanation.
For the feature selection task, principal feature analysis is used, which reduces redundant information while selecting genes that carry the information for separating the phenotypes. In this context, the presented framework shows explainability of unsupervised learning as it reveals cell-type specific signatures. Apart from a Seurat preprocessing tool and the PFA script, the pipeline uses mutual information to balance accuracy and size of the gene set if desired. A validation part that evaluates the gene selection for its information content regarding the separation of the phenotypes is provided as well; binary and multiclass classification of 3 or 4 groups is studied. Results from different single-cell data are presented. In each, only about ten out of more than 30000 genes are identified as carrying the relevant information. The code is provided in a GitHub repository at https://github.com/AC-PHD/Seurat_PFA_pipeline. KW - single cell analysis KW - machine learning KW - explainability of machine learning KW - principal KW - feature analysis KW - model reduction KW - feature selection Y1 - 2023 U6 - http://nbn-resolving.de/urn/resolver.pl?urn:nbn:de:bvb:20-opus-349989 SN - 2001-0370 VL - 21 ER - TY - JOUR A1 - Salihoglu, Rana A1 - Srivastava, Mugdha A1 - Liang, Chunguang A1 - Schilling, Klaus A1 - Szalay, Aladar A1 - Bencurova, Elena A1 - Dandekar, Thomas T1 - PRO-Simat: Protein network simulation and design tool JF - Computational and Structural Biotechnology Journal N2 - PRO-Simat is a simulation tool for analysing protein interaction networks, their dynamic change and pathway engineering. It provides GO enrichment, KEGG pathway analyses, and network visualisation from an integrated database of more than 8 million protein-protein interactions across 32 model organisms and the human proteome. We integrated dynamical network simulation using the Jimena framework, which quickly and efficiently simulates Boolean genetic regulatory networks.
It enables simulation outputs with in-depth analysis of the type, strength, duration and pathway of the protein interactions on the website. Furthermore, the user can efficiently edit and analyse the effect of network modifications and engineering experiments. In case studies, applications of PRO-Simat are demonstrated: (i) understanding mutually exclusive differentiation pathways in Bacillus subtilis, (ii) making Vaccinia virus oncolytic by switching on its viral replication mainly in cancer cells and triggering cancer cell apoptosis and (iii) optogenetic control of nucleotide processing protein networks to operate DNA storage. Multilevel communication between components is critical for efficient network switching, as demonstrated by a general census on prokaryotic and eukaryotic networks and comparing design with synthetic networks using PRO-Simat. The tool is available at https://prosimat.heinzelab.de/ as a web-based query server. KW - network simulation KW - protein analysis KW - signalling pathways KW - dynamic protein-protein interactions KW - optogenetics KW - oncolytic virus KW - DNA storage Y1 - 2023 U6 - http://nbn-resolving.de/urn/resolver.pl?urn:nbn:de:bvb:20-opus-350034 SN - 2001-0370 VL - 21 ER - TY - JOUR A1 - Rackevei, Antonia S. A1 - Borges, Alyssa A1 - Engstler, Markus A1 - Dandekar, Thomas A1 - Wolf, Matthias T1 - About the analysis of 18S rDNA sequence data from trypanosomes in barcoding and phylogenetics: tracing a continuation error occurring in the literature JF - Biology N2 - The variable regions (V1–V9) of the 18S rDNA are routinely used in barcoding and phylogenetics. In handling these data for trypanosomes, we have noticed a misunderstanding that has apparently taken a life of its own in the literature over the years. In particular, in recent years, when studying the phylogenetic relationship of trypanosomes, the use of V7/V8 was systematically established. 
However, considering the current numbering system for all other organisms (including other Euglenozoa), V7/V8 was never used. In Maia da Silva et al. [Parasitology 2004, 129, 549–561], V7/V8 was promoted for the first time for trypanosome phylogenetics, and since then, more than 70 publications have replicated this nomenclature and even discussed the benefits of the use of this region in comparison to V4. However, the primers used to amplify the variable region of trypanosomes have actually amplified V4 (concerning the current 18S rDNA numbering system). KW - RNA secondary structure KW - variable regions KW - V1–V9 KW - V4 KW - V7/V8 KW - Trypanosoma Y1 - 2022 U6 - http://nbn-resolving.de/urn/resolver.pl?urn:nbn:de:bvb:20-opus-297562 SN - 2079-7737 VL - 11 IS - 11 ER - TY - JOUR A1 - Han, Chao A1 - Ren, Pengxuan A1 - Mamtimin, Medina A1 - Kruk, Linus A1 - Sarukhanyan, Edita A1 - Li, Chenyu A1 - Anders, Hans-Joachim A1 - Dandekar, Thomas A1 - Krueger, Irena A1 - Elvers, Margitta A1 - Goebel, Silvia A1 - Adler, Kristin A1 - Münch, Götz A1 - Gudermann, Thomas A1 - Braun, Attila A1 - Mammadova-Bach, Elmina T1 - Minimal collagen-binding epitope of glycoprotein VI in human and mouse platelets JF - Biomedicines N2 - Glycoprotein VI (GPVI) is a platelet-specific receptor for collagen and fibrin, regulating important platelet functions such as platelet adhesion and thrombus growth. Although the blockade of GPVI function is widely recognized as a potent anti-thrombotic approach, there are limited studies focused on site-specific targeting of GPVI. Using computational modeling and bioinformatics, we analyzed collagen- and CRP-binding surfaces of GPVI monomers and dimers, and compared the interacting surfaces with other mammalian GPVI isoforms. We could predict a minimal collagen-binding epitope of GPVI dimer and designed an EA-20 antibody that recognizes a linear epitope of this surface. 
Using platelets and whole blood samples donated from wild-type and humanized GPVI transgenic mice and also humans, our experimental results show that the EA-20 antibody inhibits platelet adhesion and aggregation in response to collagen and CRP, but not to fibrin. The EA-20 antibody also prevents thrombus formation in whole blood, on the collagen-coated surface, in arterial flow conditions. We also show that EA-20 does not influence GPVI clustering or receptor shedding. Therefore, we propose that blockade of this minimal collagen-binding epitope of GPVI with the EA-20 antibody could represent a new anti-thrombotic approach by inhibiting specific interactions between GPVI and the collagen matrix. KW - GPVI KW - collagen KW - blood platelets KW - thrombosis KW - anti-thrombotic therapies Y1 - 2023 U6 - http://nbn-resolving.de/urn/resolver.pl?urn:nbn:de:bvb:20-opus-304148 SN - 2227-9059 VL - 11 IS - 2 ER - TY - JOUR A1 - Gupta, Shishir K. A1 - Srivastava, Mugdha A1 - Minocha, Rashmi A1 - Akash, Aman A1 - Dangwal, Seema A1 - Dandekar, Thomas T1 - Alveolar regeneration in COVID-19 patients: a network perspective JF - International Journal of Molecular Sciences N2 - A viral infection involves entry and replication of viral nucleic acid in a host organism, subsequently leading to biochemical and structural alterations in the host cell. In the case of SARS-CoV-2 viral infection, over-activation of the host immune system may lead to lung damage. Although regeneration and fibrotic repair are the two protective host responses, prolonged injury may lead to excessive fibrosis, a pathological state that can result in lung collapse. In this review, we discuss regeneration and fibrosis processes in response to SARS-CoV-2 and provide our viewpoint on the triggering of alveolar regeneration in coronavirus disease 2019 (COVID-19) patients.
KW - COVID-19 KW - SARS-CoV-2 KW - alveolar regeneration KW - alveolar fibrosis KW - signaling pathway KW - network biology Y1 - 2021 U6 - http://nbn-resolving.de/urn/resolver.pl?urn:nbn:de:bvb:20-opus-284307 SN - 1422-0067 VL - 22 IS - 20 ER - TY - JOUR A1 - Naseem, Muhammad A1 - Osmanoğlu, Özge A1 - Kaltdorf, Martin A1 - Alblooshi, Afnan Ali M. A. A1 - Iqbal, Jibran A1 - Howari, Fares M. A1 - Srivastava, Mugdha A1 - Dandekar, Thomas T1 - Integrated framework of the immune-defense transcriptional signatures in the Arabidopsis shoot apical meristem JF - International Journal of Molecular Sciences N2 - The growing tips of plants grow sterile; therefore, disease-free plants can be generated from them. How plants safeguard growing apices from pathogen infection is still a mystery. The shoot apical meristem (SAM) is one of the three stem cell niches that give rise to the above-ground plant organs. This is very well explored; however, how signaling networks orchestrate immune responses against pathogen infections in the SAM remains unclear. To reconstruct a transcriptional framework of the differentially expressed genes (DEGs) pertaining to various SAM cellular populations, we acquired large-scale transcriptome datasets from the public repository Gene Expression Omnibus (GEO). We identify here distinct sets of genes for various SAM cellular populations that are enriched in immune functions, such as immune defense, pathogen infection, biotic stress, and response to salicylic acid and jasmonic acid and their biosynthetic pathways in the SAM. We further linked those immune genes to their respective proteins and identified interactions among them by mapping a transcriptome-guided SAM-interactome. Furthermore, we compared the stem-cell-regulated transcriptome with innate immune responses in plants, showing transcriptional separation among their DEGs in Arabidopsis.
Besides unleashing a repertoire of immune-related genes in the SAM, our analysis provides a SAM-interactome that will help the community in designing functional experiments to study the specific defense dynamics of the SAM-cellular populations. Moreover, our study promotes the essence of large-scale omics data re-analysis, allowing a fresh look at the SAM-cellular transcriptome repurposing data-sets for new questions. KW - defense signaling KW - shoot apical meristem KW - CLV3p KW - meta-transcriptome KW - system inference KW - stem-cell-triggered immunity Y1 - 2020 U6 - http://nbn-resolving.de/urn/resolver.pl?urn:nbn:de:bvb:20-opus-285730 SN - 1422-0067 VL - 21 IS - 16 ER - TY - JOUR A1 - Breitenbach, Tim A1 - Lorenz, Kristina A1 - Dandekar, Thomas T1 - How to steer and control ERK and the ERK signaling cascade exemplified by looking at cardiac insufficiency JF - International Journal of Molecular Sciences N2 - A mathematical optimization framework allows the identification of certain nodes within a signaling network. In this work, we analyzed the complex extracellular-signal-regulated kinase 1 and 2 (ERK1/2) cascade in cardiomyocytes using the framework to find efficient adjustment screws for this cascade that is important for cardiomyocyte survival and maladaptive heart muscle growth. We modeled optimal pharmacological intervention points that are beneficial for the heart, but avoid the occurrence of a maladaptive ERK1/2 modification, the autophosphorylation of ERK at threonine 188 (ERK\(^{Thr188}\) phosphorylation), which causes cardiac hypertrophy. For this purpose, a network of a cardiomyocyte that was fitted to experimental data was equipped with external stimuli that model the pharmacological intervention points. Specifically, two situations were considered. In the first one, the cardiomyocyte was driven to a desired expression level with different treatment strategies.
These strategies were quantified with respect to beneficial effects and maleficent side effects, and the best treatment strategy was then determined. In the second situation, it was shown how to model constitutively activated pathways and how to identify drug targets to obtain a desired activity level that is associated with a healthy state, in contrast to the maleficent expression pattern caused by the constitutively activated pathway. An implementation of the algorithms used for the calculations is also presented in this paper, which simplifies the application of the presented framework for drug targeting, optimal drug combinations and the systematic and automatic search for pharmacological intervention points. The codes were designed such that they can be combined with any mathematical model given by ordinary differential equations. KW - optimal pharmacological modulation KW - efficient intervention points KW - ERK signaling KW - optimal treatment strategies KW - optimal drug targeting KW - optimal drug combination Y1 - 2019 U6 - http://nbn-resolving.de/urn/resolver.pl?urn:nbn:de:bvb:20-opus-285164 SN - 1422-0067 VL - 20 IS - 9 ER - TY - JOUR A1 - Kaltdorf, Kristin Verena A1 - Schulze, Katja A1 - Helmprobst, Frederik A1 - Kollmannsberger, Philip A1 - Dandekar, Thomas A1 - Stigloher, Christian T1 - Fiji macro 3D ART VeSElecT: 3D automated reconstruction tool for vesicle structures of electron tomograms JF - PLoS Computational Biology N2 - Automatic image reconstruction is critical to cope with steadily increasing data from advanced microscopy. We describe here the Fiji macro 3D ART VeSElecT which we developed to study synaptic vesicles in electron tomograms. We apply this tool to quantify vesicle properties (i) in embryonic Danio rerio 4 and 8 days post fertilization (dpf) and (ii) to compare Caenorhabditis elegans N2 neuromuscular junctions (NMJ) wild-type and its septin mutant (unc-59(e261)).
We demonstrate development-specific and mutant-specific changes in synaptic vesicle pools in both models. We confirm the functionality of our macro by applying our 3D ART VeSElecT on zebrafish NMJ showing smaller vesicles in 8 dpf embryos than in 4 dpf embryos, which was validated by manual reconstruction of the vesicle pool. Furthermore, we analyze the impact of C. elegans septin mutant unc-59(e261) on vesicle pool formation and vesicle size. Automated vesicle registration and characterization were implemented in Fiji as two macros (registration and measurement). This flexible arrangement allows in particular reducing false positives by an optional manual revision step. Preprocessing and contrast enhancement work on image-stacks of 1 nm/pixel in x and y direction. Semi-automated cell selection was integrated. 3D ART VeSElecT removes interfering components, detects vesicles by 3D segmentation and calculates vesicle volume and diameter (spherical approximation, inner/outer diameter). Results are collected in color using the RoiManager plugin including the possibility of manual removal of non-matching confounder vesicles. Detailed evaluation considered performance (detected vesicles) and specificity (true vesicles) as well as precision and recall. We furthermore show gain in segmentation and morphological filtering compared to learning based methods and a large time gain compared to manual segmentation. 3D ART VeSElecT shows small error rates and can be up to 68 times faster than manual annotation. Both automatic and semi-automatic modes are explained including a tutorial. KW - Biology KW - Vesicles KW - Caenorhabditis elegans KW - Zebrafish KW - Septins KW - Synaptic vesicles KW - Neuromuscular junctions KW - Computer software KW - Synapses Y1 - 2017 U6 - http://nbn-resolving.de/urn/resolver.pl?urn:nbn:de:bvb:20-opus-172112 VL - 13 IS - 1 ER -