TY  - JOUR
A1  - García-Betancur, Juan-Carlos
A1  - Goñi-Moreno, Angel
A1  - Horger, Thomas
A1  - Schott, Melanie
A1  - Sharan, Malvika
A1  - Eikmeier, Julian
A1  - Wohlmuth, Barbara
A1  - Zernecke, Alma
A1  - Ohlsen, Knut
A1  - Kuttler, Christina
A1  - Lopez, Daniel
T1  - Cell differentiation defines acute and chronic infection cell types in Staphylococcus aureus
JF  - eLife
N2  - A central question in biology is how pathogenic bacteria initiate acute or chronic infections. Here we describe a genetic program for cell-fate decision in the opportunistic human pathogen Staphylococcus aureus, which generates the phenotypic bifurcation of the cells into two genetically identical but different cell types during the course of an infection. Whereas one cell type promotes the formation of biofilms that contribute to chronic infections, the second type is planktonic and produces the toxins that contribute to acute bacteremia. We identified a bimodal switch in the agr quorum sensing system that antagonistically regulates the differentiation of these two physiologically distinct cell types. We found that extracellular signals affect the behavior of the agr bimodal switch and modify the size of the specialized subpopulations in specific colonization niches. For instance, magnesium-enriched colonization niches cause magnesium binding to S. aureus teichoic acids and increase bacterial cell wall rigidity. This signal triggers a genetic program that ultimately downregulates the agr bimodal switch. Colonization niches with different magnesium concentrations influence the bimodal system activity, which defines a distinct ratio between these subpopulations; this in turn leads to distinct infection outcomes in vitro and in an in vivo murine infection model. Cell differentiation generates physiological heterogeneity in clonal bacterial infections and helps to determine the distinct infection types.
KW  - Staphylococcus aureus
KW  - infection
KW  - cell differentiation
KW  - pathogenic bacteria
Y1  - 2017
U6  - http://nbn-resolving.de/urn/resolver.pl?urn:nbn:de:bvb:20-opus-170346
VL  - 6
IS  - e28023
ER  - 
TY  - JOUR
A1  - Jiang, Yuxiang
A1  - Oron, Tal Ronnen
A1  - Clark, Wyatt T.
A1  - Bankapur, Asma R.
A1  - D'Andrea, Daniel
A1  - Lepore, Rosalba
A1  - Funk, Christopher S.
A1  - Kahanda, Indika
A1  - Verspoor, Karin M.
A1  - Ben-Hur, Asa
A1  - Koo, Da Chen Emily
A1  - Penfold-Brown, Duncan
A1  - Shasha, Dennis
A1  - Youngs, Noah
A1  - Bonneau, Richard
A1  - Lin, Alexandra
A1  - Sahraeian, Sayed M. E.
A1  - Martelli, Pier Luigi
A1  - Profiti, Giuseppe
A1  - Casadio, Rita
A1  - Cao, Renzhi
A1  - Zhong, Zhaolong
A1  - Cheng, Jianlin
A1  - Altenhoff, Adrian
A1  - Skunca, Nives
A1  - Dessimoz, Christophe
A1  - Dogan, Tunca
A1  - Hakala, Kai
A1  - Kaewphan, Suwisa
A1  - Mehryary, Farrokh
A1  - Salakoski, Tapio
A1  - Ginter, Filip
A1  - Fang, Hai
A1  - Smithers, Ben
A1  - Oates, Matt
A1  - Gough, Julian
A1  - Törönen, Petri
A1  - Koskinen, Patrik
A1  - Holm, Liisa
A1  - Chen, Ching-Tai
A1  - Hsu, Wen-Lian
A1  - Bryson, Kevin
A1  - Cozzetto, Domenico
A1  - Minneci, Federico
A1  - Jones, David T.
A1  - Chapman, Samuel
A1  - BKC, Dukka
A1  - Khan, Ishita K.
A1  - Kihara, Daisuke
A1  - Ofer, Dan
A1  - Rappoport, Nadav
A1  - Stern, Amos
A1  - Cibrian-Uhalte, Elena
A1  - Denny, Paul
A1  - Foulger, Rebecca E.
A1  - Hieta, Reija
A1  - Legge, Duncan
A1  - Lovering, Ruth C.
A1  - Magrane, Michele
A1  - Melidoni, Anna N.
A1  - Mutowo-Meullenet, Prudence
A1  - Pichler, Klemens
A1  - Shypitsyna, Aleksandra
A1  - Li, Biao
A1  - Zakeri, Pooya
A1  - ElShal, Sarah
A1  - Tranchevent, Léon-Charles
A1  - Das, Sayoni
A1  - Dawson, Natalie L.
A1  - Lee, David
A1  - Lees, Jonathan G.
A1  - Sillitoe, Ian
A1  - Bhat, Prajwal
A1  - Nepusz, Tamás
A1  - Romero, Alfonso E.
A1  - Sasidharan, Rajkumar
A1  - Yang, Haixuan
A1  - Paccanaro, Alberto
A1  - Gillis, Jesse
A1  - Sedeño-Cortés, Adriana E.
A1  - Pavlidis, Paul
A1  - Feng, Shou
A1  - Cejuela, Juan M.
A1  - Goldberg, Tatyana
A1  - Hamp, Tobias
A1  - Richter, Lothar
A1  - Salamov, Asaf
A1  - Gabaldon, Toni
A1  - Marcet-Houben, Marina
A1  - Supek, Fran
A1  - Gong, Qingtian
A1  - Ning, Wei
A1  - Zhou, Yuanpeng
A1  - Tian, Weidong
A1  - Falda, Marco
A1  - Fontana, Paolo
A1  - Lavezzo, Enrico
A1  - Toppo, Stefano
A1  - Ferrari, Carlo
A1  - Giollo, Manuel
A1  - Piovesan, Damiano
A1  - Tosatto, Silvio C. E.
A1  - del Pozo, Angela
A1  - Fernández, José M.
A1  - Maietta, Paolo
A1  - Valencia, Alfonso
A1  - Tress, Michael L.
A1  - Benso, Alfredo
A1  - Di Carlo, Stefano
A1  - Politano, Gianfranco
A1  - Savino, Alessandro
A1  - Rehman, Hafeez Ur
A1  - Re, Matteo
A1  - Mesiti, Marco
A1  - Valentini, Giorgio
A1  - Bargsten, Joachim W.
A1  - van Dijk, Aalt D. J.
A1  - Gemovic, Branislava
A1  - Glisic, Sanja
A1  - Perovic, Vladimir
A1  - Veljkovic, Veljko
A1  - Almeida-e-Silva, Danillo C.
A1  - Vencio, Ricardo Z. N.
A1  - Sharan, Malvika
A1  - Vogel, Jörg
A1  - Kansakar, Lakesh
A1  - Zhang, Shanshan
A1  - Vucetic, Slobodan
A1  - Wang, Zheng
A1  - Sternberg, Michael J. E.
A1  - Wass, Mark N.
A1  - Huntley, Rachael P.
A1  - Martin, Maria J.
A1  - O'Donovan, Claire
A1  - Robinson, Peter N.
A1  - Moreau, Yves
A1  - Tramontano, Anna
A1  - Babbitt, Patricia C.
A1  - Brenner, Steven E.
A1  - Linial, Michal
A1  - Orengo, Christine A.
A1  - Rost, Burkhard
A1  - Greene, Casey S.
A1  - Mooney, Sean D.
A1  - Friedberg, Iddo
A1  - Radivojac, Predrag
A1  - Veljkovic, Nevena
T1  - An expanded evaluation of protein function prediction methods shows an improvement in accuracy
JF  - Genome Biology
N2  - Background
A major bottleneck in our understanding of the molecular underpinnings of life is the assignment of function to proteins. While molecular experiments provide the most reliable annotation of proteins, their relatively low throughput and restricted purview have led to an increasing role for computational function prediction. However, assessing methods for protein function prediction and tracking progress in the field remain challenging.

Results
We conducted the second critical assessment of functional annotation (CAFA), a timed challenge to assess computational methods that automatically assign protein function. We evaluated 126 methods from 56 research groups for their ability to predict biological functions using Gene Ontology and gene-disease associations using Human Phenotype Ontology on a set of 3681 proteins from 18 species. CAFA2 featured expanded analysis compared with CAFA1, with regard to data set size, variety, and assessment metrics. To review progress in the field, the analysis compared the best methods from CAFA1 to those of CAFA2.

Conclusions
The top-performing methods in CAFA2 outperformed those from CAFA1. This increased accuracy can be attributed to a combination of the growing number of experimental annotations and improved methods for function prediction. The assessment also revealed that the definition of top-performing algorithms is ontology specific, that different performance metrics can be used to probe the nature of accurate predictions, and the relative diversity of predictions in the biological process and human phenotype ontologies. While there was methodological improvement between CAFA1 and CAFA2, the interpretation of results and usefulness of individual methods remain context-dependent.
KW  - Protein function prediction
KW  - Disease gene prioritization
Y1  - 2016
U6  - http://nbn-resolving.de/urn/resolver.pl?urn:nbn:de:bvb:20-opus-166293
VL  - 17
IS  - 184
ER  - 
TY  - THES
A1  - Sharan, Malvika
T1  - Bio-computational identification and characterization of RNA-binding proteins in bacteria
T1  - Bioinformatische Identifikation und Charakterisierung von RNA-bindenden Proteinen in Bakterien
N2  - RNA-binding proteins (RBPs) have been extensively studied in eukaryotes, where they post-transcriptionally regulate many cellular events including RNA transport, translation, and stability. Experimental techniques, such as cross-linking and co-purification followed by either mass spectrometry or RNA sequencing, have enabled the identification and characterization of RBPs, their conserved RNA-binding domains (RBDs), and the regulatory roles of these proteins on a genome-wide scale. These developments in quantitative, high-resolution, and high-throughput screening techniques have greatly expanded our understanding of RBPs in human and yeast cells. In contrast, our knowledge of the number and potential diversity of RBPs in bacteria is comparatively poor, in part due to the technical challenges associated with existing global screening approaches developed in eukaryotes.
Genome- and proteome-wide screening approaches performed in silico may circumvent these technical issues to obtain a broad picture of the RNA interactome of bacteria and identify strong RBP candidates for more detailed experimental study. Here, I report APRICOT (“Analyzing Protein RNA Interaction by Combined Output Technique”), a computational pipeline for the sequence-based identification and characterization of candidate RNA-binding proteins encoded in the genomes of all domains of life using RBDs known from experimental studies. The pipeline identifies functional motifs in protein sequences of an input proteome using position-specific scoring matrices and hidden Markov models of all conserved domains available in the databases and then statistically scores them based on a series of sequence-based features. Subsequently, APRICOT identifies putative RBPs and characterizes them according to functionally relevant structural properties. APRICOT performed better than other existing tools for the sequence-based prediction on the known RBP data sets. The applications and adaptability of the software were demonstrated on several large bacterial RBP data sets including the complete proteome of Salmonella Typhimurium strain SL1344. APRICOT reported 1068 Salmonella proteins as RBP candidates, which were subsequently categorized using the RBDs that have been reported in both eukaryotic and bacterial proteins. A set of 131 strong RBP candidates was selected for experimental confirmation and characterization of RNA-binding activity using RNA co-immunoprecipitation followed by high-throughput sequencing (RIP-Seq) experiments. Based on the relative abundance of transcripts across the RIP-Seq libraries, a catalogue of enriched genes was established for each candidate, which shows the RNA-binding potential of 90% of these proteins. Furthermore, the direct targets of a few of these putative RBPs were validated by means of cross-linking and co-immunoprecipitation (CLIP) experiments.
This thesis presents the computational pipeline APRICOT for the global screening of protein primary sequences for potential RBPs in bacteria using RBD information from all kingdoms of life. Furthermore, it provides the first bio-computational resource of putative RBPs in Salmonella, which could now be further studied for their biological and regulatory roles. The command line tool and its documentation are available at https://malvikasharan.github.io/APRICOT/.
N2  - RNA-binding proteins (RBPs) have been studied extensively in eukaryotes, where they post-transcriptionally regulate many processes such as RNA transport, translation, and stability. Experimental methods such as cross-linking and co-immunoprecipitation followed by mass spectrometry or RNA sequencing have enabled a far-reaching characterization of RBPs, RNA-binding domains (RBDs), and their regulatory roles in eukaryotic species such as human and yeast. Further developments in high-throughput screening methods have greatly expanded the understanding of RBPs in eukaryotes. In contrast, knowledge of the number and potential diversity of RBPs in bacteria is sparse.
In this work, I present APRICOT, a bioinformatic pipeline for the sequence-based identification and characterization of proteins from all domains of life that builds on RBD information from experimental studies. The pipeline uses position-specific scoring matrices and hidden Markov models of conserved domains to identify functional motifs in protein sequences and to score them statistically on the basis of sequence-based features. APRICOT then identifies putative RBPs and characterizes them on the basis of their biological properties. In comparisons with similar tools, APRICOT outperformed other programs for the sequence-based prediction of RBPs. The applications and flexibility of the software are demonstrated on several large RBP collections, which include the complete proteome of Salmonella Typhimurium SL1344. APRICOT identifies 1068 Salmonella proteins as RBP candidates, which were subsequently classified using the known bacterial and eukaryotic RBDs. Of these RBP candidates, 131 were selected for characterization by RNA co-immunoprecipitation followed by high-throughput sequencing (RIP-seq). Based on the relative abundance of transcripts in the RIP-seq libraries, a catalogue of enriched genes was established, which points to a potential RNA-binding function in 90% of these proteins. Furthermore, the binding sites of some of these putative RBPs were determined by cross-linking and co-immunoprecipitation (CLIP).
This doctoral thesis describes the bioinformatic pipeline APRICOT, which enables a global screening for RBPs in bacteria using information on known RBDs. In addition, it contains a compilation of all potential RBPs in Salmonella, which can now be examined for their biological function. The command-line program and its documentation are available at https://malvikasharan.github.io/APRICOT/.
KW  - Bioinformatics
Y1  - 2017
U6  - http://nbn-resolving.de/urn/resolver.pl?urn:nbn:de:bvb:20-opus-153573
ER  - 
TY  - JOUR
A1  - Sharan, Malvika
A1  - Förstner, Konrad U.
A1  - Eulalio, Ana
A1  - Vogel, Jörg
T1  - APRICOT: an integrated computational pipeline for the sequence-based identification and characterization of RNA-binding proteins
JF  - Nucleic Acids Research
N2  - RNA-binding proteins (RBPs) have been established as core components of several post-transcriptional gene regulation mechanisms. Experimental techniques such as cross-linking and co-immunoprecipitation have enabled the large-scale identification of RBPs, RNA-binding domains (RBDs) and their regulatory roles in eukaryotic species such as human and yeast. In contrast, our knowledge of the number and potential diversity of RBPs in bacteria is poorer due to the technical challenges associated with the existing global screening approaches. We introduce APRICOT, a computational pipeline for the sequence-based identification and characterization of proteins using RBDs known from experimental studies. The pipeline identifies functional motifs in protein sequences using position-specific scoring matrices and Hidden Markov Models of the functional domains and statistically scores them based on a series of sequence-based features. Subsequently, APRICOT identifies putative RBPs and characterizes them by several biological properties. Here we demonstrate the application and adaptability of the pipeline on large-scale protein sets, including the bacterial proteome of Escherichia coli. APRICOT showed better performance on various datasets compared to other existing tools for the sequence-based prediction of RBPs by achieving an average sensitivity and specificity of 0.90 and 0.91 respectively. The command-line tool and its documentation are available at https://pypi.python.org/pypi/bio-apricot.
KW  - RNA-binding proteins
KW  - identification
KW  - characterization
Y1  - 2017
U6  - http://nbn-resolving.de/urn/resolver.pl?urn:nbn:de:bvb:20-opus-157963
VL  - 45
IS  - 11
ER  - 
TY  - JOUR
A1  - Sunkavalli, Ushasree
A1  - Aguilar, Carmen
A1  - Silva, Ricardo Jorge
A1  - Sharan, Malvika
A1  - Cruz, Ana Rita
A1  - Tawk, Caroline
A1  - Maudet, Claire
A1  - Mano, Miguel
A1  - Eulalio, Ana
T1  - Analysis of host microRNA function uncovers a role for miR-29b-2-5p in Shigella capture by filopodia
JF  - PLoS Pathogens
N2  - MicroRNAs play an important role in the interplay between bacterial pathogens and host cells, participating in host defense mechanisms as well as being exploited by bacteria to subvert host cellular functions. Here, we show that microRNAs modulate infection by Shigella flexneri, a major causative agent of bacillary dysentery in humans. Specifically, we characterize the dual regulatory role of miR-29b-2-5p during infection, showing that this microRNA strongly favors Shigella infection by promoting both bacterial binding to host cells and intracellular replication. Using a combination of transcriptome analysis and targeted high-content RNAi screening, we identify UNC5C as a direct target of miR-29b-2-5p and show its pivotal role in the modulation of Shigella binding to host cells. MiR-29b-2-5p, through repression of UNC5C, strongly enhances filopodia formation thus increasing Shigella capture and promoting bacterial invasion. The increase of filopodia formation mediated by miR-29b-2-5p is dependent on RhoF and Cdc42 Rho-GTPases. Interestingly, the levels of miR-29b-2-5p, but not of other mature microRNAs from the same precursor, are decreased upon Shigella replication at late times post-infection, through degradation of the mature microRNA by the exonuclease PNPT1. While the relatively high basal levels of miR-29b-2-5p at the start of infection ensure efficient Shigella capture by host cell filopodia, dampening of miR-29b-2-5p levels later during infection may constitute a bacterial strategy to favor a balanced intracellular replication to avoid premature cell death and favor dissemination to neighboring cells, or alternatively, part of the host response to counteract Shigella infection. Overall, these findings reveal a previously unappreciated role of microRNAs, and in particular miR-29b-2-5p, in the interaction of Shigella with host cells.
KW  - host cells
KW  - Salmonellosis
KW  - Shigellosis
KW  - microRNAs
KW  - Shigella
KW  - small interfering RNAs
KW  - HeLa cells
KW  - Cell binding
Y1  - 2017
U6  - http://nbn-resolving.de/urn/resolver.pl?urn:nbn:de:bvb:20-opus-158204
VL  - 13
IS  - 4
ER  - 
TY  - JOUR
A1  - Tawk, Caroline
A1  - Sharan, Malvika
A1  - Eulalio, Ana
A1  - Vogel, Jörg
T1  - A systematic analysis of the RNA-targeting potential of secreted bacterial effector proteins
JF  - Scientific Reports
N2  - Many pathogenic bacteria utilize specialized secretion systems to deliver proteins called effectors into eukaryotic cells for manipulation of host pathways. The vast majority of known effector targets are host proteins, whereas a potential targeting of host nucleic acids remains little explored. There is only one family of effectors known to target DNA directly, and effectors binding host RNA are unknown. Here, we take a two-pronged approach to search for RNA-binding effectors, combining biocomputational prediction of RNA-binding domains (RBDs) in a newly assembled comprehensive dataset of bacterial secreted proteins, and experimental screening for RNA binding in mammalian cells. Only a small subset of effectors were predicted to carry an RBD, indicating that if RNA targeting was common, it would likely involve new types of RBDs. Our experimental evaluation of effectors with predicted RBDs further argues for a general paucity of RNA binding activities amongst bacterial effectors. We obtained evidence that PipB2 and Lpg2844, effector proteins of Salmonella and Legionella species, respectively, may harbor novel biochemical activities. Our study presenting the first systematic evaluation of the RNA-targeting potential of bacterial effectors offers a basis for discussion of whether or not host RNA is a prominent target of secreted bacterial proteins.
KW  - pathogens
KW  - bacterial secretion
Y1  - 2017
U6  - http://nbn-resolving.de/urn/resolver.pl?urn:nbn:de:bvb:20-opus-158815
VL  - 7
ER  - 
TY  - JOUR
A1  - Wagner, Ines
A1  - Volkmer, Michael
A1  - Sharan, Malvika
A1  - Villaveces, Jose M.
A1  - Oswald, Felix
A1  - Surendranath, Vineeth
A1  - Habermann, Bianca H.
T1  - morFeus: a web-based program to detect remotely conserved orthologs using symmetrical best hits and orthology network scoring
JF  - BMC Bioinformatics
N2  - Background: Searching for the orthologs of a given protein or DNA sequence is one of the most important and most commonly used Bioinformatics methods in Biology. Programs like BLAST or the orthology search engine Inparanoid can be used to find orthologs when the similarity between two sequences is sufficiently high. However, they fail when the level of conservation is low. The detection of remotely conserved proteins oftentimes involves sophisticated manual intervention that is difficult to automate.
Results: Here, we introduce morFeus, a search program to find remotely conserved orthologs. Based on relaxed sequence similarity searches, morFeus selects sequences based on the similarity of their alignments to the query, tests for orthology by iterative reciprocal BLAST searches and calculates a network score for the resulting network of orthologs that is a measure of orthology independent of the E-value. Detecting remotely conserved orthologs of a protein using morFeus thus requires no manual intervention. We demonstrate the performance of morFeus by comparing it to state-of-the-art orthology resources and methods. We provide an example of remotely conserved orthologs, which were experimentally shown to be functionally equivalent in the respective organisms and therefore meet the criteria of the orthology-function conjecture. 
Conclusions: Based on our results, we conclude that morFeus is a powerful and specific search method for detecting remotely conserved orthologs.
KW  - reciprocal best hit
KW  - finder using symmetrical best hits
KW  - sequences
KW  - annotation
KW  - identification
KW  - database
KW  - genomes
KW  - proteins
KW  - homologs
KW  - hidden markov-models
KW  - phylogenetic trees
KW  - PSI-blast
KW  - eigenvector centrality
KW  - meta-analysis based orthology
KW  - orthology
KW  - remote sequence conservation
KW  - alignment clustering
KW  - orthology network
Y1  - 2014
U6  - http://nbn-resolving.de/urn/resolver.pl?urn:nbn:de:bvb:20-opus-115590
VL  - 15
IS  - 263
ER  -