TY - JOUR A1 - Kaltdorf, Martin A1 - Breitenbach, Tim A1 - Karl, Stefan A1 - Fuchs, Maximilian A1 - Kessie, David Komla A1 - Psota, Eric A1 - Prelog, Martina A1 - Sarukhanyan, Edita A1 - Ebert, Regina A1 - Jakob, Franz A1 - Dandekar, Gudrun A1 - Naseem, Muhammad A1 - Liang, Chunguang A1 - Dandekar, Thomas T1 - Software JimenaE allows efficient dynamic simulations of Boolean networks, centrality and system state analysis JF - Scientific Reports N2 - The signal modelling framework JimenaE dynamically simulates Boolean networks. In contrast to SQUAD, there is systematic and not just heuristic calculation of all system states. These specific features are not present in CellNetAnalyzer and BoolNet. JimenaE is an expert extension of Jimena, with new optimized code, network conversion into different formats, rapid convergence both for system state calculation as well as for all three network centralities. It allows higher accuracy in determining network states and allows dissection of networks and identification of network control type and amount for each protein with high accuracy. Biological examples demonstrate this: (i) High plasticity of mesenchymal stromal cells for differentiation into chondrocytes, osteoblasts and adipocytes and differentiation-specific network control focusses on wnt-, TGF-beta and PPAR-gamma signaling. JimenaE allows the study of individual proteins and the removal or addition of interactions (or autocrine loops) and accurately quantifies effects as well as number of system states. (ii) Dynamical modelling of cell–cell interactions of plant Arabidopsis thaliana against Pseudomonas syringae DC3000: We analyze for the first time the pathogen perspective and its interaction with the host. We next provide a detailed analysis on how plant hormonal regulation stimulates specific proteins and which protein has which type and amount of network control including a detailed heatmap of the A. thaliana response distinguishing between two states of the immune response. 
(iii) In an immune response network of dendritic cells confronted with Aspergillus fumigatus, JimenaE calculates now accurately the specific values for centralities and protein-specific network control including chemokine and pattern recognition receptors. KW - cellular signalling networks KW - computer modelling Y1 - 2023 U6 - http://nbn-resolving.de/urn/resolver.pl?urn:nbn:de:bvb:20-opus-313303 VL - 13 ER - TY - JOUR A1 - Han, Chao A1 - Ren, Pengxuan A1 - Mamtimin, Medina A1 - Kruk, Linus A1 - Sarukhanyan, Edita A1 - Li, Chenyu A1 - Anders, Hans-Joachim A1 - Dandekar, Thomas A1 - Krueger, Irena A1 - Elvers, Margitta A1 - Goebel, Silvia A1 - Adler, Kristin A1 - Münch, Götz A1 - Gudermann, Thomas A1 - Braun, Attila A1 - Mammadova-Bach, Elmina T1 - Minimal collagen-binding epitope of glycoprotein VI in human and mouse platelets JF - Biomedicines N2 - Glycoprotein VI (GPVI) is a platelet-specific receptor for collagen and fibrin, regulating important platelet functions such as platelet adhesion and thrombus growth. Although the blockade of GPVI function is widely recognized as a potent anti-thrombotic approach, there are limited studies focused on site-specific targeting of GPVI. Using computational modeling and bioinformatics, we analyzed collagen- and CRP-binding surfaces of GPVI monomers and dimers, and compared the interacting surfaces with other mammalian GPVI isoforms. We could predict a minimal collagen-binding epitope of GPVI dimer and designed an EA-20 antibody that recognizes a linear epitope of this surface. Using platelets and whole blood samples donated from wild-type and humanized GPVI transgenic mice and also humans, our experimental results show that the EA-20 antibody inhibits platelet adhesion and aggregation in response to collagen and CRP, but not to fibrin. The EA-20 antibody also prevents thrombus formation in whole blood, on the collagen-coated surface, in arterial flow conditions. 
We also show that EA-20 does not influence GPVI clustering or receptor shedding. Therefore, we propose that blockade of this minimal collagen-binding epitope of GPVI with the EA-20 antibody could represent a new anti-thrombotic approach by inhibiting specific interactions between GPVI and the collagen matrix. KW - GPVI KW - collagen KW - blood platelets KW - thrombosis KW - anti-thrombotic therapies Y1 - 2023 U6 - http://nbn-resolving.de/urn/resolver.pl?urn:nbn:de:bvb:20-opus-304148 SN - 2227-9059 VL - 11 IS - 2 ER - TY - JOUR A1 - Salihoglu, Rana A1 - Srivastava, Mugdha A1 - Liang, Chunguang A1 - Schilling, Klaus A1 - Szalay, Aladar A1 - Bencurova, Elena A1 - Dandekar, Thomas T1 - PRO-Simat: Protein network simulation and design tool JF - Computational and Structural Biotechnology Journal N2 - PRO-Simat is a simulation tool for analysing protein interaction networks, their dynamic change and pathway engineering. It provides GO enrichment, KEGG pathway analyses, and network visualisation from an integrated database of more than 8 million protein-protein interactions across 32 model organisms and the human proteome. We integrated dynamical network simulation using the Jimena framework, which quickly and efficiently simulates Boolean genetic regulatory networks. It enables simulation outputs with in-depth analysis of the type, strength, duration and pathway of the protein interactions on the website. Furthermore, the user can efficiently edit and analyse the effect of network modifications and engineering experiments. In case studies, applications of PRO-Simat are demonstrated: (i) understanding mutually exclusive differentiation pathways in Bacillus subtilis, (ii) making Vaccinia virus oncolytic by switching on its viral replication mainly in cancer cells and triggering cancer cell apoptosis and (iii) optogenetic control of nucleotide processing protein networks to operate DNA storage. 
Multilevel communication between components is critical for efficient network switching, as demonstrated by a general census on prokaryotic and eukaryotic networks and comparing design with synthetic networks using PRO-Simat. The tool is available at https://prosimat.heinzelab.de/ as a web-based query server. KW - network simulation KW - protein analysis KW - signalling pathways KW - dynamic protein-protein interactions KW - optogenetics KW - oncolytic virus KW - DNA storage Y1 - 2023 U6 - http://nbn-resolving.de/urn/resolver.pl?urn:nbn:de:bvb:20-opus-350034 SN - 2001-0370 VL - 21 ER - TY - JOUR A1 - Caliskan, Aylin A1 - Caliskan, Deniz A1 - Rasbach, Lauritz A1 - Yu, Weimeng A1 - Dandekar, Thomas A1 - Breitenbach, Tim T1 - Optimized cell type signatures revealed from single-cell data by combining principal feature analysis, mutual information, and machine learning JF - Computational and Structural Biotechnology Journal N2 - Machine learning techniques are excellent to analyze expression data from single cells. These techniques impact all fields ranging from cell annotation and clustering to signature identification. The presented framework evaluates gene selection sets how far they optimally separate defined phenotypes or cell groups. This innovation overcomes the present limitation to objectively and correctly identify a small gene set of high information content regarding separating phenotypes for which corresponding code scripts are provided. The small but meaningful subset of the original genes (or feature space) facilitates human interpretability of the differences of the phenotypes including those found by machine learning results and may even turn correlations between genes and phenotypes into a causal explanation. For the feature selection task, the principal feature analysis is utilized which reduces redundant information while selecting genes that carry the information for separating the phenotypes. 
In this context, the presented framework shows explainability of unsupervised learning as it reveals cell-type specific signatures. Apart from a Seurat preprocessing tool and the PFA script, the pipeline uses mutual information to balance accuracy and size of the gene set if desired. A validation part to evaluate the gene selection for their information content regarding the separation of the phenotypes is provided as well; binary and multiclass classification of 3 or 4 groups are studied. Results from different single-cell data are presented. In each, only about ten out of more than 30000 genes are identified as carrying the relevant information. The code is provided in a GitHub repository at https://github.com/AC-PHD/Seurat_PFA_pipeline. KW - single cell analysis KW - machine learning KW - explainability of machine learning KW - principal feature analysis KW - model reduction KW - feature selection Y1 - 2023 U6 - http://nbn-resolving.de/urn/resolver.pl?urn:nbn:de:bvb:20-opus-349989 SN - 2001-0370 VL - 21 ER - TY - JOUR A1 - Caliskan, Aylin A1 - Dangwal, Seema A1 - Dandekar, Thomas T1 - Metadata integrity in bioinformatics: bridging the gap between data and knowledge JF - Computational and Structural Biotechnology Journal N2 - In the fast-evolving landscape of biomedical research, the emergence of big data has presented researchers with extraordinary opportunities to explore biological complexities. In biomedical research, big data imply also a big responsibility. This is not only due to genomics data being sensitive information but also due to genomics data being shared and re-analysed among the scientific community. This saves valuable resources and can even help to find new insights in silico. To fully use these opportunities, detailed and correct metadata are imperative. This includes not only the availability of metadata but also their correctness. 
Metadata integrity serves as a fundamental determinant of research credibility, supporting the reliability and reproducibility of data-driven findings. Ensuring metadata availability, curation, and accuracy are therefore essential for bioinformatic research. Not only must metadata be readily available, but they must also be meticulously curated and ideally error-free. Motivated by an accidental discovery of a critical metadata error in patient data published in two high-impact journals, we aim to raise awareness for the need of correct, complete, and curated metadata. We describe how the metadata error was found, addressed, and present examples for metadata-related challenges in omics research, along with supporting measures, including tools for checking metadata and software to facilitate various steps from data analysis to published research. Highlights • Data awareness and data integrity underpins the trustworthiness of results and subsequent further analysis. • Big data and bioinformatics enable efficient resource use by repurposing publicly available RNA-Sequencing data. • Manual checks of data quality and integrity are insufficient due to the overwhelming volume and rapidly growing data. • Automation and artificial intelligence provide cost-effective and efficient solutions for data integrity and quality checks. • FAIR data management, various software solutions and analysis tools assist metadata maintenance. KW - meta-data KW - error KW - annotation KW - error-transfer KW - wrong labelling KW - patient data KW - control group KW - tools overview Y1 - 2023 U6 - http://nbn-resolving.de/urn/resolver.pl?urn:nbn:de:bvb:20-opus-349990 SN - 2001-0370 VL - 21 ER - TY - JOUR A1 - Bencurova, Elena A1 - Akash, Aman A1 - Dobson, Renwick C.J. 
A1 - Dandekar, Thomas T1 - DNA storage-from natural biology to synthetic biology JF - Computational and Structural Biotechnology Journal N2 - Natural DNA storage allows cellular differentiation, evolution, the growth of our children and controls all our ecosystems. Here, we discuss the fundamental aspects of DNA storage and recent advances in this field, with special emphasis on natural processes and solutions that can be exploited. We point out new ways of efficient DNA and nucleotide storage that are inspired by nature. Within a few years DNA-based information storage may become an attractive and natural complementation to current electronic data storage systems. We discuss rapid and directed access (e.g. DNA elements such as promoters, enhancers), regulatory signals and modulation (e.g. lncRNA) as well as integrated high-density storage and processing modules (e.g. chromosomal territories). There is pragmatic DNA storage for use in biotechnology and human genetics. We examine DNA storage as an approach for synthetic biology (e.g. light-controlled nucleotide processing enzymes). The natural polymers of DNA and RNA offer much for direct storage operations (read-in, read-out, access control). The inbuilt parallelism (many molecules at many places working at the same time) is important for fast processing of information. Using biology concepts from chromosomal storage, nucleic acid processing as well as polymer material sciences such as electronic effects in enzymes, graphene, nanocellulose up to DNA macramé, DNA wires and DNA-based aptamer field effect transistors will open up new applications gradually replacing classical information storage methods in ever more areas over time (decades). 
KW - DNA KW - RNA KW - data storage KW - natural processing KW - synthetic biology Y1 - 2023 U6 - http://nbn-resolving.de/urn/resolver.pl?urn:nbn:de:bvb:20-opus-349971 SN - 2001-0370 VL - 21 ER - TY - JOUR A1 - Bencurova, Elena A1 - Shityakov, Sergey A1 - Schaack, Dominik A1 - Kaltdorf, Martin A1 - Sarukhanyan, Edita A1 - Hilgarth, Alexander A1 - Rath, Christin A1 - Montenegro, Sergio A1 - Roth, Günter A1 - Lopez, Daniel A1 - Dandekar, Thomas T1 - Nanocellulose composites as smart devices with chassis, light-directed DNA Storage, engineered electronic properties, and chip integration JF - Frontiers in Bioengineering and Biotechnology N2 - The rapid development of green and sustainable materials opens up new possibilities in the field of applied research. Such materials include nanocellulose composites that can integrate many components into composites and provide a good chassis for smart devices. In our study, we evaluate four approaches for turning a nanocellulose composite into an information storage or processing device: 1) nanocellulose can be a suitable carrier material and protect information stored in DNA. 2) Nucleotide-processing enzymes (polymerase and exonuclease) can be controlled by light after fusing them with light-gating domains; nucleotide substrate specificity can be changed by mutation or pH change (read-in and read-out of the information). 3) Semiconductors and electronic capabilities can be achieved: we show that nanocellulose is rendered electronic by iodine treatment replacing silicon including microstructures. Nanocellulose semiconductor properties are measured, and the resulting potential including single-electron transistors (SET) and their properties are modeled. Electric current can also be transported by DNA through G-quadruplex DNA molecules; these as well as classical silicon semiconductors can easily be integrated into the nanocellulose composite. 
4) To elaborate upon miniaturization and integration for a smart nanocellulose chip device, we demonstrate pH-sensitive dyes in nanocellulose, nanopore creation, and kinase micropatterning on bacterial membranes as well as digital PCR micro-wells. Future application potential includes nano-3D printing and fast molecular processors (e.g., SETs) integrated with DNA storage and conventional electronics. This would also lead to environment-friendly nanocellulose chips for information processing as well as smart nanocellulose composites for biomedical applications and nano-factories. KW - nanocellulose KW - DNA storage KW - light-gated proteins KW - single-electron transistors KW - protein chip Y1 - 2022 U6 - http://nbn-resolving.de/urn/resolver.pl?urn:nbn:de:bvb:20-opus-283033 SN - 2296-4185 VL - 10 ER - TY - JOUR A1 - Gupta, Shishir K. A1 - Osmanoglu, Özge A1 - Minocha, Rashmi A1 - Bandi, Sourish Reddy A1 - Bencurova, Elena A1 - Srivastava, Mugdha A1 - Dandekar, Thomas T1 - Genome-wide scan for potential CD4+ T-cell vaccine candidates in Candida auris by exploiting reverse vaccinology and evolutionary information JF - Frontiers in Medicine N2 - Candida auris is a globally emerging fungal pathogen responsible for causing nosocomial outbreaks in healthcare associated settings. It is known to cause infection in all age groups and exhibits multi-drug resistance with high potential for horizontal transmission. Because of this reason combined with limited therapeutic choices available, C. auris infection has been acknowledged as a potential risk for causing a future pandemic, and thus seeking a promising strategy for its treatment is imperative. Here, we combined evolutionary information with reverse vaccinology approach to identify novel epitopes for vaccine design that could elicit CD4+ T-cell responses against C. auris. To this end, we extensively scanned the family of proteins encoded by C. auris genome. 
In addition, a pathogen may acquire substitutions in epitopes over a period of time which could cause its escape from the immune response thus rendering the vaccine ineffective. To lower this possibility in our design, we eliminated all rapidly evolving genes of C. auris with positive selection. We further employed highly conserved regions of multiple C. auris strains and identified two immunogenic and antigenic T-cell epitopes that could generate the most effective immune response against C. auris. The antigenicity scores of our predicted vaccine candidates were calculated as 0.85 and 1.88 where 0.5 is the threshold for prediction of fungal antigenic sequences. Based on our results, we conclude that our vaccine candidates have the potential to be successfully employed for the treatment of C. auris infection. However, in vivo experiments are imperative to further demonstrate the efficacy of our design. KW - T-cell epitope KW - epitope prediction KW - positive selection KW - evolution KW - immune-informatics Y1 - 2022 U6 - http://nbn-resolving.de/urn/resolver.pl?urn:nbn:de:bvb:20-opus-293953 SN - 2296-858X VL - 9 ER - TY - JOUR A1 - Prada, Juan Pablo A1 - Maag, Luca Estelle A1 - Siegmund, Laura A1 - Bencurova, Elena A1 - Liang, Chunguang A1 - Koutsilieri, Eleni A1 - Dandekar, Thomas A1 - Scheller, Carsten T1 - Estimation of R0 for the spread of SARS-CoV-2 in Germany from excess mortality JF - Scientific Reports N2 - For SARS-CoV-2, R0 calculations in the range of 2–3 dominate the literature, but much higher estimates have also been published. Because capacity for RT-PCR testing increased greatly in the early phase of the Covid-19 pandemic, R0 determinations based on these incidence values are subject to strong bias. We propose to use Covid-19-induced excess mortality to determine R0 regardless of RT-PCR testing capacity. 
We used data from the Robert Koch Institute (RKI) on the incidence of Covid cases, Covid-related deaths, number of RT-PCR tests performed, and excess mortality calculated from data from the Federal Statistical Office in Germany. We determined R0 using exponential growth estimates with a serial interval of 4.7 days. We used only datasets that were not yet under the influence of policy measures (e.g., lockdowns or school closures). The uncorrected R0 value for the spread of SARS-CoV-2 based on RT-PCR incidence data was 2.56 (95% CI 2.52–2.60) for Covid-19 cases and 2.03 (95% CI 1.96–2.10) for Covid-19-related deaths. However, because the number of RT-PCR tests increased by a growth factor of 1.381 during the same period, these R0 values must be corrected accordingly (R0corrected = R0uncorrected/1.381), yielding 1.86 for Covid-19 cases and 1.47 for Covid-19 deaths. The R0 value based on excess deaths was calculated to be 1.34 (95% CI 1.32–1.37). A sine-function-based adjustment for seasonal effects of 40% corresponds to a maximum value of R0January = 1.68 and a minimum value of R0July = 1.01. Our calculations show an R0 that is much lower than previously thought. This relatively low range of R0 fits very well with the observed seasonal pattern of infection across Europe in 2020 and 2021, including the emergence of more contagious escape variants such as delta or omicron. In general, our study shows that excess mortality can be used as a reliable surrogate to determine the R0 in pandemic situations. KW - SARS-CoV-2 KW - R0 KW - mortality Y1 - 2022 U6 - http://nbn-resolving.de/urn/resolver.pl?urn:nbn:de:bvb:20-opus-301415 VL - 12 IS - 1 ER - TY - JOUR A1 - Aydinli, Muharrem A1 - Liang, Chunguang A1 - Dandekar, Thomas T1 - Motif and conserved module analysis in DNA (promoters, enhancers) and RNA (lncRNA, mRNA) using AIModules JF - Scientific Reports N2 - Nucleic acid motifs consist of conserved and variable nucleotide regions. 
For functional action, several motifs are combined to modules. The tool AIModules allows identification of such motifs including combinations of them and conservation in several nucleic acid stretches. AIModules recognizes conserved motifs and combinations of motifs (modules) allowing a number of interesting biological applications such as analysis of promoter and transcription factor binding sites (TFBS), identification of conserved modules shared between several gene families, e.g. promoter regions, but also analysis of shared and conserved other DNA motifs such as enhancers and silencers, in mRNA (motifs or regulatory elements e.g. for polyadenylation) and lncRNAs. The tool AIModules presented here is an integrated solution for motif analysis, offered as a Web service as well as downloadable software. Several nucleotide sequences are queried for TFBSs using predefined matrices from the JASPAR DB or by using one’s own matrices for diverse types of DNA or RNA motif discovery. Furthermore, AIModules can find TFBSs common to two or more sequences. Demanding high or low conservation, AIModules outperforms other solutions in speed and finds more modules (specific combinations of TFBS) than alternative available software. The application also searches RNA motifs such as polyadenylation site or RNA–protein binding motifs as well as DNA motifs such as enhancers as well as user-specified motif combinations (https://bioinfo-wuerz.de/aimodules/; alternative entry pages: https://aimodules.heinzelab.de or https://www.biozentrum.uni-wuerzburg.de/bioinfo/computing/aimodules). The application is free and open source whether used online, on-site, or locally. KW - AIModules KW - nucleic acid motifs KW - DNA Y1 - 2022 U6 - http://nbn-resolving.de/urn/resolver.pl?urn:nbn:de:bvb:20-opus-301268 VL - 12 IS - 1 ER -