TY - JOUR A1 - García-Betancur, Juan-Carlos A1 - Goñi-Moreno, Angel A1 - Horger, Thomas A1 - Schott, Melanie A1 - Sharan, Malvika A1 - Eikmeier, Julian A1 - Wohlmuth, Barbara A1 - Zernecke, Alma A1 - Ohlsen, Knut A1 - Kuttler, Christina A1 - Lopez, Daniel T1 - Cell differentiation defines acute and chronic infection cell types in Staphylococcus aureus JF - eLife N2 - A central question in biology is how pathogenic bacteria initiate acute or chronic infections. Here we describe a genetic program for cell-fate decision in the opportunistic human pathogen Staphylococcus aureus, which generates the phenotypic bifurcation of the cells into two genetically identical but different cell types during the course of an infection. Whereas one cell type promotes the formation of biofilms that contribute to chronic infections, the second type is planktonic and produces the toxins that contribute to acute bacteremia. We identified a bimodal switch in the agr quorum sensing system that antagonistically regulates the differentiation of these two physiologically distinct cell types. We found that extracellular signals affect the behavior of the agr bimodal switch and modify the size of the specialized subpopulations in specific colonization niches. For instance, magnesium-enriched colonization niches cause magnesium binding to S. aureus teichoic acids and increase bacterial cell wall rigidity. This signal triggers a genetic program that ultimately downregulates the agr bimodal switch. Colonization niches with different magnesium concentrations influence the bimodal system activity, which defines a distinct ratio between these subpopulations; this in turn leads to distinct infection outcomes in vitro and in an in vivo murine infection model. Cell differentiation generates physiological heterogeneity in clonal bacterial infections and helps to determine the distinct infection types.
KW - Staphylococcus aureus KW - infection KW - cell differentiation KW - pathogenic bacteria Y1 - 2017 U6 - http://nbn-resolving.de/urn/resolver.pl?urn:nbn:de:bvb:20-opus-170346 VL - 6 IS - e28023 ER - TY - JOUR A1 - Jiang, Yuxiang A1 - Oron, Tal Ronnen A1 - Clark, Wyatt T. A1 - Bankapur, Asma R. A1 - D'Andrea, Daniel A1 - Lepore, Rosalba A1 - Funk, Christopher S. A1 - Kahanda, Indika A1 - Verspoor, Karin M. A1 - Ben-Hur, Asa A1 - Koo, Da Chen Emily A1 - Penfold-Brown, Duncan A1 - Shasha, Dennis A1 - Youngs, Noah A1 - Bonneau, Richard A1 - Lin, Alexandra A1 - Sahraeian, Sayed M. E. A1 - Martelli, Pier Luigi A1 - Profiti, Giuseppe A1 - Casadio, Rita A1 - Cao, Renzhi A1 - Zhong, Zhaolong A1 - Cheng, Jianlin A1 - Altenhoff, Adrian A1 - Skunca, Nives A1 - Dessimoz, Christophe A1 - Dogan, Tunca A1 - Hakala, Kai A1 - Kaewphan, Suwisa A1 - Mehryary, Farrokh A1 - Salakoski, Tapio A1 - Ginter, Filip A1 - Fang, Hai A1 - Smithers, Ben A1 - Oates, Matt A1 - Gough, Julian A1 - Törönen, Petri A1 - Koskinen, Patrik A1 - Holm, Liisa A1 - Chen, Ching-Tai A1 - Hsu, Wen-Lian A1 - Bryson, Kevin A1 - Cozzetto, Domenico A1 - Minneci, Federico A1 - Jones, David T. A1 - Chapman, Samuel A1 - BKC, Dukka A1 - Khan, Ishita K. A1 - Kihara, Daisuke A1 - Ofer, Dan A1 - Rappoport, Nadav A1 - Stern, Amos A1 - Cibrian-Uhalte, Elena A1 - Denny, Paul A1 - Foulger, Rebecca E. A1 - Hieta, Reija A1 - Legge, Duncan A1 - Lovering, Ruth C. A1 - Magrane, Michele A1 - Melidoni, Anna N. A1 - Mutowo-Meullenet, Prudence A1 - Pichler, Klemens A1 - Shypitsyna, Aleksandra A1 - Li, Biao A1 - Zakeri, Pooya A1 - ElShal, Sarah A1 - Tranchevent, Léon-Charles A1 - Das, Sayoni A1 - Dawson, Natalie L. A1 - Lee, David A1 - Lees, Jonathan G. A1 - Sillitoe, Ian A1 - Bhat, Prajwal A1 - Nepusz, Tamás A1 - Romero, Alfonso E. A1 - Sasidharan, Rajkumar A1 - Yang, Haixuan A1 - Paccanaro, Alberto A1 - Gillis, Jesse A1 - Sedeño-Cortés, Adriana E. A1 - Pavlidis, Paul A1 - Feng, Shou A1 - Cejuela, Juan M. 
A1 - Goldberg, Tatyana A1 - Hamp, Tobias A1 - Richter, Lothar A1 - Salamov, Asaf A1 - Gabaldon, Toni A1 - Marcet-Houben, Marina A1 - Supek, Fran A1 - Gong, Qingtian A1 - Ning, Wei A1 - Zhou, Yuanpeng A1 - Tian, Weidong A1 - Falda, Marco A1 - Fontana, Paolo A1 - Lavezzo, Enrico A1 - Toppo, Stefano A1 - Ferrari, Carlo A1 - Giollo, Manuel A1 - Piovesan, Damiano A1 - Tosatto, Silvio C. E. A1 - del Pozo, Angela A1 - Fernández, José M. A1 - Maietta, Paolo A1 - Valencia, Alfonso A1 - Tress, Michael L. A1 - Benso, Alfredo A1 - Di Carlo, Stefano A1 - Politano, Gianfranco A1 - Savino, Alessandro A1 - Rehman, Hafeez Ur A1 - Re, Matteo A1 - Mesiti, Marco A1 - Valentini, Giorgio A1 - Bargsten, Joachim W. A1 - van Dijk, Aalt D. J. A1 - Gemovic, Branislava A1 - Glisic, Sanja A1 - Perovic, Vladmir A1 - Veljkovic, Veljko A1 - Almeida-e-Silva, Danillo C. A1 - Vencio, Ricardo Z. N. A1 - Sharan, Malvika A1 - Vogel, Jörg A1 - Kansakar, Lakesh A1 - Zhang, Shanshan A1 - Vucetic, Slobodan A1 - Wang, Zheng A1 - Sternberg, Michael J. E. A1 - Wass, Mark N. A1 - Huntley, Rachael P. A1 - Martin, Maria J. A1 - O'Donovan, Claire A1 - Robinson, Peter N. A1 - Moreau, Yves A1 - Tramontano, Anna A1 - Babbitt, Patricia C. A1 - Brenner, Steven E. A1 - Linial, Michal A1 - Orengo, Christine A. A1 - Rost, Burkhard A1 - Greene, Casey S. A1 - Mooney, Sean D. A1 - Friedberg, Iddo A1 - Radivojac, Predrag A1 - Veljkovic, Nevena T1 - An expanded evaluation of protein function prediction methods shows an improvement in accuracy JF - Genome Biology N2 - Background A major bottleneck in our understanding of the molecular underpinnings of life is the assignment of function to proteins. While molecular experiments provide the most reliable annotation of proteins, their relatively low throughput and restricted purview have led to an increasing role for computational function prediction. However, assessing methods for protein function prediction and tracking progress in the field remain challenging. 
Results We conducted the second critical assessment of functional annotation (CAFA), a timed challenge to assess computational methods that automatically assign protein function. We evaluated 126 methods from 56 research groups for their ability to predict biological functions using Gene Ontology and gene-disease associations using Human Phenotype Ontology on a set of 3681 proteins from 18 species. CAFA2 featured expanded analysis compared with CAFA1, with regard to data set size, variety, and assessment metrics. To review progress in the field, the analysis compared the best methods from CAFA1 to those of CAFA2. Conclusions The top-performing methods in CAFA2 outperformed those from CAFA1. This increased accuracy can be attributed to a combination of the growing number of experimental annotations and improved methods for function prediction. The assessment also revealed that the definition of top-performing algorithms is ontology-specific, that different performance metrics can be used to probe the nature of accurate predictions, and the relative diversity of predictions in the biological process and human phenotype ontologies. While there was methodological improvement between CAFA1 and CAFA2, the interpretation of results and usefulness of individual methods remain context-dependent. KW - Protein function prediction KW - Disease gene prioritization Y1 - 2016 U6 - http://nbn-resolving.de/urn/resolver.pl?urn:nbn:de:bvb:20-opus-166293 VL - 17 IS - 184 ER - TY - THES A1 - Sharan, Malvika T1 - Bio-computational identification and characterization of RNA-binding proteins in bacteria T1 - Bioinformatische Identifikation und Charakterisierung von RNA-bindenden Proteinen in Bakterien N2 - RNA-binding proteins (RBPs) have been extensively studied in eukaryotes, where they post-transcriptionally regulate many cellular events including RNA transport, translation, and stability.
Experimental techniques, such as cross-linking and co-purification followed by either mass spectrometry or RNA sequencing, have enabled the identification and characterization of RBPs, their conserved RNA-binding domains (RBDs), and the regulatory roles of these proteins on a genome-wide scale. These developments in quantitative, high-resolution, and high-throughput screening techniques have greatly expanded our understanding of RBPs in human and yeast cells. In contrast, our knowledge of the number and potential diversity of RBPs in bacteria is comparatively poor, in part due to the technical challenges associated with existing global screening approaches developed in eukaryotes. Genome- and proteome-wide screening approaches performed in silico may circumvent these technical issues to obtain a broad picture of the RNA interactome of bacteria and identify strong RBP candidates for more detailed experimental study. Here, I report APRICOT (“Analyzing Protein RNA Interaction by Combined Output Technique”), a computational pipeline for the sequence-based identification and characterization of candidate RNA-binding proteins encoded in the genomes of all domains of life using RBDs known from experimental studies. The pipeline identifies functional motifs in protein sequences of an input proteome using position-specific scoring matrices and hidden Markov models of all conserved domains available in the databases and then statistically scores them based on a series of sequence-based features. Subsequently, APRICOT identifies putative RBPs and characterizes them according to functionally relevant structural properties. APRICOT performed better than other existing tools for sequence-based prediction on known RBP data sets. The applications and adaptability of the software were demonstrated on several large bacterial RBP data sets including the complete proteome of Salmonella Typhimurium strain SL1344.
APRICOT reported 1068 Salmonella proteins as RBP candidates, which were subsequently categorized using the RBDs that have been reported in both eukaryotic and bacterial proteins. A set of 131 strong RBP candidates was selected for experimental confirmation and characterization of RNA-binding activity using RNA co-immunoprecipitation followed by high-throughput sequencing (RIP-Seq) experiments. Based on the relative abundance of transcripts across the RIP-Seq libraries, a catalogue of enriched genes was established for each candidate, which shows the RNA-binding potential of 90% of these proteins. Furthermore, the direct targets of a few of these putative RBPs were validated by means of cross-linking and co-immunoprecipitation (CLIP) experiments. This thesis presents the computational pipeline APRICOT for the global screening of protein primary sequences for potential RBPs in bacteria using RBD information from all kingdoms of life. Furthermore, it provides the first bio-computational resource of putative RBPs in Salmonella, which can now be further studied for their biological and regulatory roles. The command line tool and its documentation are available at https://malvikasharan.github.io/APRICOT/. N2 - RNA-binding proteins (RBPs) have been studied extensively in eukaryotes, where they post-transcriptionally regulate many processes such as RNA transport, translation, and stability. Experimental methods such as cross-linking and co-immunoprecipitation followed by mass spectrometry or RNA sequencing have enabled a far-reaching characterization of RBPs, RNA-binding domains (RBDs), and their regulatory roles in eukaryotic species such as human and yeast. Further developments in high-throughput screening methods have greatly expanded the understanding of RBPs in eukaryotes. In contrast, knowledge of the number and potential diversity of RBPs in bacteria is sparse.
In this thesis I present APRICOT, a bioinformatics pipeline for the sequence-based identification and characterization of proteins from all domains of life that builds on RBD information from experimental studies. The pipeline uses position-specific scoring matrices and hidden Markov models of conserved domains to identify functional motifs in protein sequences and to score them statistically on the basis of sequence-based features. APRICOT then identifies possible RBPs and characterizes them on the basis of their biological properties. In comparisons with similar tools, APRICOT outperformed other programs for the sequence-based prediction of RBPs. The possible applications and flexibility of the software are demonstrated on several large RBP collections, which include the complete proteome of Salmonella Typhimurium SL1344. APRICOT identifies 1068 Salmonella proteins as RBP candidates, which were subsequently classified using the known bacterial and eukaryotic RBDs. 131 of the RBP candidates were selected for characterization by RNA co-immunoprecipitation followed by high-throughput sequencing (RIP-seq). Based on the relative abundance of transcripts in the RIP-seq libraries, a catalogue of enriched genes was compiled, which points to a potential RNA-binding function for 90% of these proteins. Furthermore, the binding sites of some of these possible RBPs were determined by cross-linking and co-immunoprecipitation (CLIP). This doctoral thesis describes the bioinformatics pipeline APRICOT, which enables a global screening for RBPs in bacteria based on information from known RBDs. It also contains a compilation of all potential RBPs in Salmonella, which can now be investigated for their biological function.
The command-line program and its documentation are available at https://malvikasharan.github.io/APRICOT/. KW - Bioinformatics Y1 - 2017 U6 - http://nbn-resolving.de/urn/resolver.pl?urn:nbn:de:bvb:20-opus-153573 ER - TY - JOUR A1 - Sharan, Malvika A1 - Förstner, Konrad U. A1 - Eulalio, Ana A1 - Vogel, Jörg T1 - APRICOT: an integrated computational pipeline for the sequence-based identification and characterization of RNA-binding proteins JF - Nucleic Acids Research N2 - RNA-binding proteins (RBPs) have been established as core components of several post-transcriptional gene regulation mechanisms. Experimental techniques such as cross-linking and co-immunoprecipitation have enabled the large-scale identification of RBPs, RNA-binding domains (RBDs) and their regulatory roles in eukaryotic species such as human and yeast. In contrast, our knowledge of the number and potential diversity of RBPs in bacteria is poorer due to the technical challenges associated with the existing global screening approaches. We introduce APRICOT, a computational pipeline for the sequence-based identification and characterization of proteins using RBDs known from experimental studies. The pipeline identifies functional motifs in protein sequences using position-specific scoring matrices and Hidden Markov Models of the functional domains and statistically scores them based on a series of sequence-based features. Subsequently, APRICOT identifies putative RBPs and characterizes them by several biological properties. Here we demonstrate the application and adaptability of the pipeline on large-scale protein sets, including the bacterial proteome of Escherichia coli. APRICOT showed better performance on various datasets compared to other existing tools for the sequence-based prediction of RBPs, achieving an average sensitivity and specificity of 0.90 and 0.91, respectively. The command-line tool and its documentation are available at https://pypi.python.org/pypi/bio-apricot.
KW - RNA-binding proteins KW - identification KW - characterization Y1 - 2017 U6 - http://nbn-resolving.de/urn/resolver.pl?urn:nbn:de:bvb:20-opus-157963 VL - 45 IS - 11 ER - TY - JOUR A1 - Sunkavalli, Ushasree A1 - Aguilar, Carmen A1 - Silva, Ricardo Jorge A1 - Sharan, Malvika A1 - Cruz, Ana Rita A1 - Tawk, Caroline A1 - Maudet, Claire A1 - Mano, Miguel A1 - Eulalio, Ana T1 - Analysis of host microRNA function uncovers a role for miR-29b-2-5p in Shigella capture by filopodia JF - PLoS Pathogens N2 - MicroRNAs play an important role in the interplay between bacterial pathogens and host cells, participating in host defense mechanisms as well as being exploited by bacteria to subvert host cellular functions. Here, we show that microRNAs modulate infection by Shigella flexneri, a major causative agent of bacillary dysentery in humans. Specifically, we characterize the dual regulatory role of miR-29b-2-5p during infection, showing that this microRNA strongly favors Shigella infection by promoting both bacterial binding to host cells and intracellular replication. Using a combination of transcriptome analysis and targeted high-content RNAi screening, we identify UNC5C as a direct target of miR-29b-2-5p and show its pivotal role in the modulation of Shigella binding to host cells. MiR-29b-2-5p, through repression of UNC5C, strongly enhances filopodia formation, thus increasing Shigella capture and promoting bacterial invasion. The increase of filopodia formation mediated by miR-29b-2-5p is dependent on RhoF and Cdc42 Rho-GTPases. Interestingly, the levels of miR-29b-2-5p, but not of other mature microRNAs from the same precursor, are decreased upon Shigella replication at late times post-infection, through degradation of the mature microRNA by the exonuclease PNPT1.
While the relatively high basal levels of miR-29b-2-5p at the start of infection ensure efficient Shigella capture by host cell filopodia, dampening of miR-29b-2-5p levels later during infection may constitute a bacterial strategy to favor a balanced intracellular replication to avoid premature cell death and favor dissemination to neighboring cells, or alternatively, part of the host response to counteract Shigella infection. Overall, these findings reveal a previously unappreciated role of microRNAs, and in particular miR-29b-2-5p, in the interaction of Shigella with host cells. KW - host cells KW - Salmonellosis KW - Shigellosis KW - microRNAs KW - Shigella KW - small interfering RNAs KW - HeLa cells KW - Cell binding Y1 - 2017 U6 - http://nbn-resolving.de/urn/resolver.pl?urn:nbn:de:bvb:20-opus-158204 VL - 13 IS - 4 ER - TY - JOUR A1 - Tawk, Caroline A1 - Sharan, Malvika A1 - Eulalio, Ana A1 - Vogel, Jörg T1 - A systematic analysis of the RNA-targeting potential of secreted bacterial effector proteins JF - Scientific Reports N2 - Many pathogenic bacteria utilize specialized secretion systems to deliver proteins called effectors into eukaryotic cells for manipulation of host pathways. The vast majority of known effector targets are host proteins, whereas a potential targeting of host nucleic acids remains little explored. There is only one family of effectors known to target DNA directly, and effectors binding host RNA are unknown. Here, we take a two-pronged approach to search for RNA-binding effectors, combining biocomputational prediction of RNA-binding domains (RBDs) in a newly assembled comprehensive dataset of bacterial secreted proteins, and experimental screening for RNA binding in mammalian cells. Only a small subset of effectors were predicted to carry an RBD, indicating that if RNA targeting was common, it would likely involve new types of RBDs.
Our experimental evaluation of effectors with predicted RBDs further argues for a general paucity of RNA binding activities amongst bacterial effectors. We obtained evidence that PipB2 and Lpg2844, effector proteins of Salmonella and Legionella species, respectively, may harbor novel biochemical activities. Our study presenting the first systematic evaluation of the RNA-targeting potential of bacterial effectors offers a basis for discussion of whether or not host RNA is a prominent target of secreted bacterial proteins. KW - pathogens KW - bacterial secretion Y1 - 2017 U6 - http://nbn-resolving.de/urn/resolver.pl?urn:nbn:de:bvb:20-opus-158815 VL - 7 ER - TY - JOUR A1 - Wagner, Ines A1 - Volkmer, Michael A1 - Sharan, Malvika A1 - Villaveces, Jose M. A1 - Oswald, Felix A1 - Surendranath, Vineeth A1 - Habermann, Bianca H. T1 - morFeus: a web-based program to detect remotely conserved orthologs using symmetrical best hits and orthology network scoring JF - BMC Bioinformatics N2 - Background: Searching for the orthologs of a given protein or DNA sequence is one of the most important and most commonly used Bioinformatics methods in Biology. Programs like BLAST or the orthology search engine Inparanoid can be used to find orthologs when the similarity between two sequences is sufficiently high. However, they fail when the level of conservation is low. The detection of remotely conserved proteins often involves sophisticated manual intervention that is difficult to automate. Results: Here, we introduce morFeus, a search program to find remotely conserved orthologs. Based on relaxed sequence similarity searches, morFeus selects sequences based on the similarity of their alignments to the query, tests for orthology by iterative reciprocal BLAST searches and calculates a network score for the resulting network of orthologs that is a measure of orthology independent of the E-value.
Detecting remotely conserved orthologs of a protein using morFeus thus requires no manual intervention. We demonstrate the performance of morFeus by comparing it to state-of-the-art orthology resources and methods. We provide an example of remotely conserved orthologs, which were experimentally shown to be functionally equivalent in the respective organisms and therefore meet the criteria of the orthology-function conjecture. Conclusions: Based on our results, we conclude that morFeus is a powerful and specific search method for detecting remotely conserved orthologs. KW - reciprocal best hit KW - finder using symmetrical best hits KW - sequences KW - annotation KW - identification KW - database KW - genomes KW - proteins KW - homologs KW - hidden markov-models KW - phylogenetic trees KW - PSI-blast KW - eigenvector centrality KW - meta-analysis based orthology KW - orthology KW - remote sequence conservation KW - alignment clustering KW - orthology network Y1 - 2014 U6 - http://nbn-resolving.de/urn/resolver.pl?urn:nbn:de:bvb:20-opus-115590 VL - 15 IS - 263 ER -