TY  - JOUR
A1  - Kaltdorf, Martin
A1  - Breitenbach, Tim
A1  - Karl, Stefan
A1  - Fuchs, Maximilian
A1  - Kessie, David Komla
A1  - Psota, Eric
A1  - Prelog, Martina
A1  - Sarukhanyan, Edita
A1  - Ebert, Regina
A1  - Jakob, Franz
A1  - Dandekar, Gudrun
A1  - Naseem, Muhammad
A1  - Liang, Chunguang
A1  - Dandekar, Thomas
T1  - Software JimenaE allows efficient dynamic simulations of Boolean networks, centrality and system state analysis
JF  - Scientific Reports
N2  - The signal modelling framework JimenaE dynamically simulates Boolean networks. In contrast to SQUAD, it calculates all system states systematically rather than just heuristically. These specific features are not present in CellNetAnalyzer and BoolNet. JimenaE is an expert extension of Jimena, with new optimized code, network conversion into different formats, and rapid convergence both for system state calculation and for all three network centralities. It allows higher accuracy in determining network states, dissection of networks, and identification of the type and amount of network control for each protein with high accuracy. Biological examples demonstrate this: (i) High plasticity of mesenchymal stromal cells for differentiation into chondrocytes, osteoblasts and adipocytes; differentiation-specific network control focusses on wnt-, TGF-beta and PPAR-gamma signaling. JimenaE allows the study of individual proteins and of removing or adding interactions (or autocrine loops), and accurately quantifies the effects as well as the number of system states. (ii) Dynamical modelling of cell–cell interactions of the plant Arabidopsis thaliana against Pseudomonas syringae DC3000: We analyze for the first time the pathogen perspective and its interaction with the host. We next provide a detailed analysis of how plant hormonal regulation stimulates specific proteins and of which protein has which type and amount of network control, including a detailed heatmap of the A. thaliana response distinguishing between two states of the immune response. (iii) In an immune response network of dendritic cells confronted with Aspergillus fumigatus, JimenaE now accurately calculates the specific values for centralities and protein-specific network control, including chemokine and pattern recognition receptors.
KW  - cellular signalling networks
KW  - computer modelling
Y1  - 2023
U6  - http://nbn-resolving.de/urn/resolver.pl?urn:nbn:de:bvb:20-opus-313303
VL  - 13
ER  - 
TY  - JOUR
A1  - Han, Chao
A1  - Ren, Pengxuan
A1  - Mamtimin, Medina
A1  - Kruk, Linus
A1  - Sarukhanyan, Edita
A1  - Li, Chenyu
A1  - Anders, Hans-Joachim
A1  - Dandekar, Thomas
A1  - Krueger, Irena
A1  - Elvers, Margitta
A1  - Goebel, Silvia
A1  - Adler, Kristin
A1  - Münch, Götz
A1  - Gudermann, Thomas
A1  - Braun, Attila
A1  - Mammadova-Bach, Elmina
T1  - Minimal collagen-binding epitope of glycoprotein VI in human and mouse platelets
JF  - Biomedicines
N2  - Glycoprotein VI (GPVI) is a platelet-specific receptor for collagen and fibrin, regulating important platelet functions such as platelet adhesion and thrombus growth. Although the blockade of GPVI function is widely recognized as a potent anti-thrombotic approach, there are limited studies focused on site-specific targeting of GPVI. Using computational modeling and bioinformatics, we analyzed collagen- and CRP-binding surfaces of GPVI monomers and dimers, and compared the interacting surfaces with other mammalian GPVI isoforms. We could predict a minimal collagen-binding epitope of GPVI dimer and designed an EA-20 antibody that recognizes a linear epitope of this surface. Using platelets and whole blood samples donated from wild-type and humanized GPVI transgenic mice and also humans, our experimental results show that the EA-20 antibody inhibits platelet adhesion and aggregation in response to collagen and CRP, but not to fibrin. The EA-20 antibody also prevents thrombus formation in whole blood, on the collagen-coated surface, in arterial flow conditions. We also show that EA-20 does not influence GPVI clustering or receptor shedding. Therefore, we propose that blockade of this minimal collagen-binding epitope of GPVI with the EA-20 antibody could represent a new anti-thrombotic approach by inhibiting specific interactions between GPVI and the collagen matrix.
KW  - GPVI
KW  - collagen
KW  - blood platelets
KW  - thrombosis
KW  - anti-thrombotic therapies
Y1  - 2023
U6  - http://nbn-resolving.de/urn/resolver.pl?urn:nbn:de:bvb:20-opus-304148
SN  - 2227-9059
VL  - 11
IS  - 2
ER  - 
TY  - JOUR
A1  - Salihoglu, Rana
A1  - Srivastava, Mugdha
A1  - Liang, Chunguang
A1  - Schilling, Klaus
A1  - Szalay, Aladar
A1  - Bencurova, Elena
A1  - Dandekar, Thomas
T1  - PRO-Simat: Protein network simulation and design tool
JF  - Computational and Structural Biotechnology Journal
N2  - PRO-Simat is a simulation tool for analysing protein interaction networks, their dynamic change and pathway engineering. It provides GO enrichment, KEGG pathway analyses, and network visualisation from an integrated database of more than 8 million protein-protein interactions across 32 model organisms and the human proteome. We integrated dynamical network simulation using the Jimena framework, which quickly and efficiently simulates Boolean genetic regulatory networks. It enables simulation outputs with in-depth analysis of the type, strength, duration and pathway of the protein interactions on the website. Furthermore, the user can efficiently edit and analyse the effect of network modifications and engineering experiments. In case studies, applications of PRO-Simat are demonstrated: (i) understanding mutually exclusive differentiation pathways in Bacillus subtilis, (ii) making Vaccinia virus oncolytic by switching on its viral replication mainly in cancer cells and triggering cancer cell apoptosis and (iii) optogenetic control of nucleotide processing protein networks to operate DNA storage. Multilevel communication between components is critical for efficient network switching, as demonstrated by a general census on prokaryotic and eukaryotic networks and comparing design with synthetic networks using PRO-Simat. The tool is available at https://prosimat.heinzelab.de/ as a web-based query server.
KW  - network simulation
KW  - protein analysis
KW  - signalling pathways
KW  - dynamic protein-protein interactions
KW  - optogenetics
KW  - oncolytic virus
KW  - DNA storage
Y1  - 2023
U6  - http://nbn-resolving.de/urn/resolver.pl?urn:nbn:de:bvb:20-opus-350034
SN  - 2001-0370
VL  - 21
ER  - 
TY  - JOUR
A1  - Caliskan, Aylin
A1  - Caliskan, Deniz
A1  - Rasbach, Lauritz
A1  - Yu, Weimeng
A1  - Dandekar, Thomas
A1  - Breitenbach, Tim
T1  - Optimized cell type signatures revealed from single-cell data by combining principal feature analysis, mutual information, and machine learning
JF  - Computational and Structural Biotechnology Journal
N2  - Machine learning techniques are well suited to analyzing expression data from single cells. These techniques impact all fields ranging from cell annotation and clustering to signature identification. The presented framework evaluates how well gene selection sets separate defined phenotypes or cell groups. This innovation overcomes the present limitation in objectively and correctly identifying a small gene set of high information content for separating phenotypes; corresponding code scripts are provided. The small but meaningful subset of the original genes (or feature space) facilitates human interpretability of the differences between the phenotypes, including those found by machine learning, and may even turn correlations between genes and phenotypes into a causal explanation. For the feature selection task, principal feature analysis (PFA) is utilized, which reduces redundant information while selecting genes that carry the information for separating the phenotypes. In this context, the presented framework shows explainability of unsupervised learning as it reveals cell-type specific signatures. Apart from a Seurat preprocessing tool and the PFA script, the pipeline uses mutual information to balance accuracy and size of the gene set if desired. A validation part that evaluates the gene selection for its information content regarding the separation of the phenotypes is provided as well; binary and multiclass classification of 3 or 4 groups are studied. Results from different single-cell data are presented. In each, only about ten out of more than 30000 genes are identified as carrying the relevant information. The code is provided in a GitHub repository at https://github.com/AC-PHD/Seurat_PFA_pipeline.
KW  - single cell analysis
KW  - machine learning
KW  - explainability of machine learning
KW  - principal feature analysis
KW  - model reduction
KW  - feature selection
Y1  - 2023
U6  - http://nbn-resolving.de/urn/resolver.pl?urn:nbn:de:bvb:20-opus-349989
SN  - 2001-0370
VL  - 21
ER  - 
TY  - JOUR
A1  - Caliskan, Aylin
A1  - Dangwal, Seema
A1  - Dandekar, Thomas
T1  - Metadata integrity in bioinformatics: bridging the gap between data and knowledge
JF  - Computational and Structural Biotechnology Journal
N2  - In the fast-evolving landscape of biomedical research, the emergence of big data has presented researchers with extraordinary opportunities to explore biological complexities. In biomedical research, big data also imply a big responsibility. This is not only due to genomics data being sensitive information but also due to genomics data being shared and re-analysed among the scientific community. This saves valuable resources and can even help to find new insights in silico. To fully use these opportunities, detailed and correct metadata are imperative. This includes not only the availability of metadata but also their correctness. Metadata integrity serves as a fundamental determinant of research credibility, supporting the reliability and reproducibility of data-driven findings. Ensuring metadata availability, curation, and accuracy is therefore essential for bioinformatic research. Not only must metadata be readily available, but they must also be meticulously curated and ideally error-free. Motivated by an accidental discovery of a critical metadata error in patient data published in two high-impact journals, we aim to raise awareness of the need for correct, complete, and curated metadata. We describe how the metadata error was found and addressed, and present examples of metadata-related challenges in omics research, along with supporting measures, including tools for checking metadata and software to facilitate various steps from data analysis to published research. Highlights: • Data awareness and data integrity underpin the trustworthiness of results and subsequent further analysis. • Big data and bioinformatics enable efficient resource use by repurposing publicly available RNA-Sequencing data. • Manual checks of data quality and integrity are insufficient due to the overwhelming volume and rapid growth of data. • Automation and artificial intelligence provide cost-effective and efficient solutions for data integrity and quality checks. • FAIR data management, various software solutions and analysis tools assist metadata maintenance.
KW  - meta-data
KW  - error
KW  - annotation
KW  - error-transfer
KW  - wrong labelling
KW  - patient data
KW  - control group
KW  - tools overview
Y1  - 2023
U6  - http://nbn-resolving.de/urn/resolver.pl?urn:nbn:de:bvb:20-opus-349990
SN  - 2001-0370
VL  - 21
ER  - 
TY  - JOUR
A1  - Bencurova, Elena
A1  - Akash, Aman
A1  - Dobson, Renwick C.J.
A1  - Dandekar, Thomas
T1  - DNA storage - from natural biology to synthetic biology
JF  - Computational and Structural Biotechnology Journal
N2  - Natural DNA storage allows cellular differentiation, evolution, the growth of our children and controls all our ecosystems. Here, we discuss the fundamental aspects of DNA storage and recent advances in this field, with special emphasis on natural processes and solutions that can be exploited. We point out new ways of efficient DNA and nucleotide storage that are inspired by nature. Within a few years DNA-based information storage may become an attractive and natural complement to current electronic data storage systems. We discuss rapid and directed access (e.g. DNA elements such as promoters, enhancers), regulatory signals and modulation (e.g. lncRNA) as well as integrated high-density storage and processing modules (e.g. chromosomal territories). There is pragmatic DNA storage for use in biotechnology and human genetics. We examine DNA storage as an approach for synthetic biology (e.g. light-controlled nucleotide processing enzymes). The natural polymers of DNA and RNA offer much for direct storage operations (read-in, read-out, access control). The inbuilt parallelism (many molecules at many places working at the same time) is important for fast processing of information. Using biology concepts from chromosomal storage, nucleic acid processing as well as polymer material sciences such as electronic effects in enzymes, graphene, nanocellulose up to DNA macramé, DNA wires and DNA-based aptamer field effect transistors will open up new applications, gradually replacing classical information storage methods in ever more areas over time (decades).
KW  - DNA
KW  - RNA
KW  - data storage
KW  - natural processing
KW  - synthetic biology
Y1  - 2023
U6  - http://nbn-resolving.de/urn/resolver.pl?urn:nbn:de:bvb:20-opus-349971
SN  - 2001-0370
VL  - 21
ER  - 
TY  - JOUR
A1  - Bencurova, Elena
A1  - Shityakov, Sergey
A1  - Schaack, Dominik
A1  - Kaltdorf, Martin
A1  - Sarukhanyan, Edita
A1  - Hilgarth, Alexander
A1  - Rath, Christin
A1  - Montenegro, Sergio
A1  - Roth, Günter
A1  - Lopez, Daniel
A1  - Dandekar, Thomas
T1  - Nanocellulose composites as smart devices with chassis, light-directed DNA Storage, engineered electronic properties, and chip integration
JF  - Frontiers in Bioengineering and Biotechnology
N2  - The rapid development of green and sustainable materials opens up new possibilities in the field of applied research. Such materials include nanocellulose composites that can integrate many components into composites and provide a good chassis for smart devices. In our study, we evaluate four approaches for turning a nanocellulose composite into an information storage or processing device: 1) nanocellulose can be a suitable carrier material and protect information stored in DNA. 2) Nucleotide-processing enzymes (polymerase and exonuclease) can be controlled by light after fusing them with light-gating domains; nucleotide substrate specificity can be changed by mutation or pH change (read-in and read-out of the information). 3) Semiconductors and electronic capabilities can be achieved: we show that nanocellulose is rendered electronic by iodine treatment replacing silicon including microstructures. Nanocellulose semiconductor properties are measured, and the resulting potential including single-electron transistors (SET) and their properties are modeled. Electric current can also be transported by DNA through G-quadruplex DNA molecules; these as well as classical silicon semiconductors can easily be integrated into the nanocellulose composite. 4) To elaborate upon miniaturization and integration for a smart nanocellulose chip device, we demonstrate pH-sensitive dyes in nanocellulose, nanopore creation, and kinase micropatterning on bacterial membranes as well as digital PCR micro-wells. Future application potential includes nano-3D printing and fast molecular processors (e.g., SETs) integrated with DNA storage and conventional electronics. This would also lead to environment-friendly nanocellulose chips for information processing as well as smart nanocellulose composites for biomedical applications and nano-factories.
KW  - nanocellulose
KW  - DNA storage
KW  - light-gated proteins
KW  - single-electron transistors
KW  - protein chip
Y1  - 2022
U6  - http://nbn-resolving.de/urn/resolver.pl?urn:nbn:de:bvb:20-opus-283033
SN  - 2296-4185
VL  - 10
ER  - 
TY  - JOUR
A1  - Gupta, Shishir K.
A1  - Osmanoglu, Özge
A1  - Minocha, Rashmi
A1  - Bandi, Sourish Reddy
A1  - Bencurova, Elena
A1  - Srivastava, Mugdha
A1  - Dandekar, Thomas
T1  - Genome-wide scan for potential CD4+ T-cell vaccine candidates in Candida auris by exploiting reverse vaccinology and evolutionary information
JF  - Frontiers in Medicine
N2  - Candida auris is a globally emerging fungal pathogen responsible for causing nosocomial outbreaks in healthcare-associated settings. It is known to cause infection in all age groups and exhibits multi-drug resistance with high potential for horizontal transmission. For this reason, combined with the limited therapeutic choices available, C. auris infection has been acknowledged as a potential risk for causing a future pandemic, and thus seeking a promising strategy for its treatment is imperative. Here, we combined evolutionary information with a reverse vaccinology approach to identify novel epitopes for vaccine design that could elicit CD4+ T-cell responses against C. auris. To this end, we extensively scanned the family of proteins encoded by the C. auris genome. In addition, a pathogen may acquire substitutions in epitopes over a period of time, which could cause its escape from the immune response, thus rendering the vaccine ineffective. To lower this possibility in our design, we eliminated all rapidly evolving genes of C. auris with positive selection. We further employed highly conserved regions of multiple C. auris strains and identified two immunogenic and antigenic T-cell epitopes that could generate the most effective immune response against C. auris. The antigenicity scores of our predicted vaccine candidates were calculated as 0.85 and 1.88, where 0.5 is the threshold for prediction of fungal antigenic sequences. Based on our results, we conclude that our vaccine candidates have the potential to be successfully employed for the treatment of C. auris infection. However, in vivo experiments are imperative to further demonstrate the efficacy of our design.
KW  - T-cell epitope
KW  - epitope prediction
KW  - positive selection
KW  - evolution
KW  - immune-informatics
Y1  - 2022
U6  - http://nbn-resolving.de/urn/resolver.pl?urn:nbn:de:bvb:20-opus-293953
SN  - 2296-858X
VL  - 9
ER  - 
TY  - JOUR
A1  - Prada, Juan Pablo
A1  - Maag, Luca Estelle
A1  - Siegmund, Laura
A1  - Bencurova, Elena
A1  - Liang, Chunguang
A1  - Koutsilieri, Eleni
A1  - Dandekar, Thomas
A1  - Scheller, Carsten
T1  - Estimation of R0 for the spread of SARS-CoV-2 in Germany from excess mortality
JF  - Scientific Reports
N2  - For SARS-CoV-2, R0 calculations in the range of 2–3 dominate the literature, but much higher estimates have also been published. Because capacity for RT-PCR testing increased greatly in the early phase of the Covid-19 pandemic, R0 determinations based on these incidence values are subject to strong bias. We propose to use Covid-19-induced excess mortality to determine R0 regardless of RT-PCR testing capacity. We used data from the Robert Koch Institute (RKI) on the incidence of Covid cases, Covid-related deaths, number of RT-PCR tests performed, and excess mortality calculated from data from the Federal Statistical Office in Germany. We determined R0 using exponential growth estimates with a serial interval of 4.7 days. We used only datasets that were not yet under the influence of policy measures (e.g., lockdowns or school closures). The uncorrected R0 value for the spread of SARS-CoV-2 based on RT-PCR incidence data was 2.56 (95% CI 2.52–2.60) for Covid-19 cases and 2.03 (95% CI 1.96–2.10) for Covid-19-related deaths. However, because the number of RT-PCR tests increased by a growth factor of 1.381 during the same period, these R0 values must be corrected accordingly (R0corrected = R0uncorrected/1.381), yielding 1.86 for Covid-19 cases and 1.47 for Covid-19 deaths. The R0 value based on excess deaths was calculated to be 1.34 (95% CI 1.32–1.37). A sine-function-based adjustment for seasonal effects of 40% corresponds to a maximum value of R0January = 1.68 and a minimum value of R0July = 1.01. Our calculations show an R0 that is much lower than previously thought. This relatively low range of R0 fits very well with the observed seasonal pattern of infection across Europe in 2020 and 2021, including the emergence of more contagious escape variants such as delta or omicron. In general, our study shows that excess mortality can be used as a reliable surrogate to determine the R0 in pandemic situations.
KW  - SARS-CoV-2
KW  - R0
KW  - mortality
Y1  - 2022
U6  - http://nbn-resolving.de/urn/resolver.pl?urn:nbn:de:bvb:20-opus-301415
VL  - 12
IS  - 1
ER  - 
TY  - JOUR
A1  - Aydinli, Muharrem
A1  - Liang, Chunguang
A1  - Dandekar, Thomas
T1  - Motif and conserved module analysis in DNA (promoters, enhancers) and RNA (lncRNA, mRNA) using AIModules
JF  - Scientific Reports
N2  - Nucleic acid motifs consist of conserved and variable nucleotide regions. For functional action, several motifs are combined to modules. The tool AIModules allows identification of such motifs including combinations of them and conservation in several nucleic acid stretches. AIModules recognizes conserved motifs and combinations of motifs (modules) allowing a number of interesting biological applications such as analysis of promoter and transcription factor binding sites (TFBS), identification of conserved modules shared between several gene families, e.g. promoter regions, but also analysis of shared and conserved other DNA motifs such as enhancers and silencers, in mRNA (motifs or regulatory elements e.g. for polyadenylation) and lncRNAs. The tool AIModules presented here is an integrated solution for motif analysis, offered as a Web service as well as downloadable software. Several nucleotide sequences are queried for TFBSs using predefined matrices from the JASPAR DB or by using one’s own matrices for diverse types of DNA or RNA motif discovery. Furthermore, AIModules can find TFBSs common to two or more sequences. Demanding high or low conservation, AIModules outperforms other solutions in speed and finds more modules (specific combinations of TFBS) than alternative available software. The application also searches RNA motifs such as polyadenylation site or RNA–protein binding motifs as well as DNA motifs such as enhancers as well as user-specified motif combinations (https://bioinfo-wuerz.de/aimodules/; alternative entry pages: https://aimodules.heinzelab.de or https://www.biozentrum.uni-wuerzburg.de/bioinfo/computing/aimodules). The application is free and open source whether used online, on-site, or locally.
KW  - AIModules
KW  - nucleic acid motifs
KW  - DNA
Y1  - 2022
U6  - http://nbn-resolving.de/urn/resolver.pl?urn:nbn:de:bvb:20-opus-301268
VL  - 12
IS  - 1
ER  -