In this thesis, the development of a phylogenetic DNA microarray, the analysis of several gene expression microarray datasets and new approaches for improved data analysis and interpretation are described. In the first publication, the development and analysis of a phylogenetic microarray is presented. I show that species detection with phylogenetic DNA microarrays can be significantly improved when the microarray data are analyzed with a linear regression modeling approach. Standard methods have so far relied on the raw signal intensities of the array spots, applying a simple cutoff criterion to call a species present or absent. This procedure is not applicable to very closely related species with high sequence similarity because cross-hybridization of non-target DNA renders species detection impossible based on signal intensities alone. By modeling hybridization and cross-hybridization with linear regression, as presented in this thesis, even species with a sequence similarity of 97% in the marker gene can be detected and distinguished from related species. Another advantage of the modeling approach over existing methods is that the model also performs well on mixtures of different species. In principle, quantitative predictions can also be made. To make better use of the large amounts of microarray data stored in public databases, meta-analysis approaches need to be developed. In the second publication, an exploratory meta-analysis, exemplified with Arabidopsis thaliana gene expression datasets, is presented. By integrating datasets that study effects such as the influence of plant hormones, pathogens and different mutations on gene expression levels, clusters of similarly treated datasets were found. From the clusters of pathogen-treated and indole-3-acetic acid (IAA) treated datasets, representative genes were selected that pointed to functions previously associated with pathogen attack or IAA effects. 
Additionally, hypotheses about the functions of as yet uncharacterized genes could be formulated. Thus, this kind of meta-analysis could be used to propose gene functions and their regulation under different conditions. This work also presents primary data analysis of Arabidopsis thaliana datasets. The third publication describes an experiment conducted to find out whether microwave irradiation affects the gene expression of a plant cell culture. During the first steps, the data analysis was carried out blinded, and exploratory analysis methods were applied to find out whether the irradiation had an effect on gene expression of plant cells. Small but statistically significant changes in a few genes were found and could be experimentally confirmed. From the functions of the regulated genes and a meta-analysis with publicly available microarray data, it could be suspected that the plant cell culture perceived the irradiation as energy, similar to the perception of light. The fourth publication describes the functional analysis of another Arabidopsis thaliana gene expression dataset. The gene expression data of the plant tumor dataset pointed to a switch from a mainly aerobic, autotrophic metabolism to an anaerobic and heterotrophic metabolism in the plant tumor. Genes involved in photosynthesis were found to be repressed in tumors; genes of amino acid and lipid metabolism, cell wall and solute transporters were regulated in a way that sustains tumor growth and development. Furthermore, in the fifth publication, GEPAT (Genome Expression Pathway Analysis Tool), a tool for the analysis and integration of microarray data with other data types, is described. It consists of a web application and database which allow convenient data upload and data analysis. In later chapters of this thesis (publication 6 and publication 7), GEPAT is used to analyze human microarray datasets and to integrate results from gene expression analysis with other data types. 
Gene expression and comparative genomic hybridization (CGH) data from 71 Mantle Cell Lymphoma (MCL) patients were analyzed, which allowed a seven-gene predictor to be proposed that facilitates survival predictions for patients compared with existing predictors. In this study, it was shown that CGH data can be used for survival predictions. For the dataset of Diffuse Large B-cell lymphoma (DLBCL) patients, an improved survival predictor could be found based on the gene expression data. From the genes differentially expressed between long and short surviving MCL patients, as well as from the regulated genes of DLBCL patients, interaction networks were constructed. They point to differences in the regulation of cell cycle and proliferation genes between patients with good and bad prognosis.
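The linear regression idea from the first publication in the abstract above can be illustrated with a minimal sketch. It assumes that each probe signal is a weighted sum of the abundances of all species in the sample, with weights describing hybridization and cross-hybridization. The affinity values, species names, the detection threshold and the function names are hypothetical and chosen only for illustration; the thesis may use a different model formulation.

```python
from typing import List

# Hypothetical affinity matrix: rows are probes, columns are species.
# 1.0 marks a perfect-match probe; smaller values model cross-hybridization.
SPECIES = ["species_A", "species_B", "species_C"]
AFFINITY = [
    [1.0, 0.8, 0.1],  # probe designed for species_A
    [0.7, 1.0, 0.1],  # probe designed for species_B
    [0.1, 0.2, 1.0],  # probe designed for species_C
    [0.9, 0.9, 0.0],  # probe shared by species_A and species_B
]


def predict_signal(affinity: List[List[float]], abundances: List[float]) -> List[float]:
    """Forward model: each probe signal is a weighted sum of species abundances."""
    return [sum(a * c for a, c in zip(row, abundances)) for row in affinity]


def solve_linear_system(a: List[List[float]], b: List[float]) -> List[float]:
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("species cannot be separated: matrix is singular")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_abundances(affinity: List[List[float]], signal: List[float]) -> List[float]:
    """Ordinary least squares via the normal equations (A^T A) c = A^T y."""
    k = len(affinity[0])
    ata = [[sum(row[i] * row[j] for row in affinity) for j in range(k)] for i in range(k)]
    aty = [sum(row[i] * y for row, y in zip(affinity, signal)) for i in range(k)]
    return solve_linear_system(ata, aty)


def call_present(coefficients: List[float], names: List[str], threshold: float = 0.5) -> List[str]:
    """Call a species present when its fitted coefficient exceeds the threshold."""
    return [name for name, c in zip(names, coefficients) if c > threshold]
```

In this toy setup, a sample containing only species_A and species_C still produces a strong signal on the species_B probe through cross-hybridization, so a plain intensity cutoff would report species_B, whereas the fitted coefficient for species_B stays near zero.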
The Popeye domain containing (Popdc) gene family of membrane proteins is predominantly expressed in striated and smooth muscle tissues and has been shown to act as novel cAMP-binding proteins. In mice, loss of Popdc1 and Popdc2, respectively, affects sinus node function in the postnatal heart in an age and stress-dependent manner. In this thesis, I examined gene expression pattern and function of the Popdc gene family during zebrafish development with an emphasis on popdc2. Expression of the zebrafish popdc2 was exclusively present in cardiac and skeletal muscle during cardiac development, whereas popdc3 was expressed in striated muscle tissue and in distinct regions of the brain. In order to study the function of these genes, an antisense morpholino-based knockdown approach was used. Knockdown of popdc2 resulted in aberrant development of facial and tail musculature. In the heart, popdc2 morphants displayed irregular ventricular contractions with 2:1 and 3:1 ventricular pauses. Recordings of calcium transients using a transgenic indicator line Tg(cmlc2:gCaMP)s878 and selective plane illumination microscopy (SPIM) revealed the presence of an atrioventricular (AV) block in popdc2 morphants as well as a complete heart block. Interestingly, preliminary data revealed that popdc3 morphants developed a similar phenotype. In order to find a morphological correlate for the observed AV conduction defect, I studied the structure of the AV canal in popdc2 morphants using confocal analysis of hearts of the transgenic line Tg(cmlc2:eGFP-ras)s883, which outlines individual cardiac myocytes with the help of membrane-localized GFP. However, no evidence for morphological alterations was obtained. To ensure that the observed arrhythmia phenotype in the popdc2 morphant was based on a myocardial defect and not caused by defective valve development, live imaging was performed revealing properly formed valves. 
Thus, in agreement with the data obtained in knockout mice, popdc2 and popdc3 genes in zebrafish are involved in the regulation of cardiac electrical activity. However, neither gene is required for cardiac pacemaking; instead, both play essential roles in AV conduction. In order to elucidate the biological importance of cAMP-binding, wild type Popdc1 as well as mutants with a significant reduction in binding affinity for cAMP in vitro were overexpressed in zebrafish embryos. Expression of wild type Popdc1 led to a cardiac insufficiency phenotype characterized by pericardial edema and venous blood retention. Strikingly, the ability of the Popdc1 mutants to induce a cardiac phenotype correlated with the binding affinity for cAMP. These data suggest that cAMP-binding represents an important biological property of the Popdc protein family.
Various types of cancer involve aberrant cell cycle regulation. Among the pathways responsible for tumor growth, the YAP oncogene, a key downstream effector of the Hippo pathway, drives oncogenic processes including cell proliferation and metastasis by controlling the expression of cell cycle genes. In turn, the MMB multiprotein complex (which is formed when B-MYB binds to the MuvB core) is a master regulator of mitotic gene expression, which has also been associated with cancer. Previously, our laboratory identified a novel crosstalk between the MMB complex and YAP. By binding to enhancers of MMB target genes and promoting B-MYB binding to promoters, YAP and MMB co-regulate a set of mitotic and cytokinetic target genes which promote cell proliferation. This doctoral thesis addresses the mechanisms of YAP- and MMB-mediated transcription, and it characterizes the role of YAP-regulated enhancers in transcription of cell cycle genes.
The results reported in this thesis indicate that expression of constitutively active, oncogenic YAP5SA leads to widespread changes in chromatin accessibility in untransformed human MCF10A cells. ATAC-seq showed that newly accessible and active regions include YAP-bound enhancers, while the MMB-bound promoters were found to be already accessible and to remain open during YAP induction. By means of CRISPR-interference (CRISPRi) and chromatin immunoprecipitation (ChIP), we identified a role of YAP-bound enhancers in recruitment of CDK7 to MMB-regulated promoters and in RNA Pol II driven transcriptional initiation and elongation of G2/M genes. Moreover, by interfering with the YAP-B-MYB protein interaction, we show that binding of YAP to B-MYB is also critical for the initiation of transcription at MMB-regulated genes. Unexpectedly, overexpression of YAP5SA also leads to less accessible chromatin regions, or chromatin closing. Motif analysis revealed that the newly closed regions contain binding motifs for the p53 family of transcription factors. Interestingly, chromatin closing by YAP is linked to the reduced expression and loss of chromatin-binding of the p53 family member ∆Np63. Furthermore, I demonstrate that downregulation of ∆Np63 following expression of YAP is a key step in driving cellular migration.
Together, the findings of this thesis provide insights into the role of YAP in the chromatin changes that contribute to the oncogenic activities of YAP. The overexpression of YAP5SA not only leads to the opening of chromatin at YAP-bound enhancers, which together with the MMB complex stimulate the expression of G2/M genes, but also promotes the closing of chromatin at ∆Np63-bound regions, which in turn leads to cell migration.
Background: The incidence of non-Hodgkin lymphoma (NHL), one of the most frequently observed cancers, continues to rise. Diffuse large B-cell lymphoma (DLBCL) is the most common of the NHLs. There are two subgroups of DLBCL with different gene expression patterns: ABC (“Activated B-like DLBCL”) and GCB (“Germinal Center B-like DLBCL”). Without therapy the patients often die within a few months; the ABC type exhibits the more aggressive behaviour. Another B-cell lymphoma is mantle cell lymphoma (MCL). It is rare and shows a very poor prognosis. There is no cure yet. Methods: In this project these B-cell lymphomas were examined with methods from bioinformatics to find new characteristics or undiscovered events on the molecular level. This would improve understanding and therapy of lymphomas. For this purpose we used survival, gene expression and comparative genomic hybridization (CGH) data. Clinical studies can yield large data sets from which as yet unknown trends can be revealed. Results (MCL): The published proliferation signature correlates directly with survival. Exploratory analyses of gene expression and CGH data of MCL samples (n=71) revealed a valid grouping according to the median of the proliferation signature values. The second axis of correspondence analysis distinguishes between good and bad prognosis. Statistical testing (moderated t-test, Wilcoxon rank-sum test) showed differences in the cell cycle and delivered a network of kinases which are responsible for the difference between good and bad prognosis. A set of seven genes (CENPE, CDC20, HPRT1, CDC2, BIRC5, ASPM, IGF2BP3) predicted survival patterns about as well as the proliferation signature with 20 genes. Furthermore, some chromosomal bands could be associated with prognosis in the exploratory analysis (chromosome 9: 9p24, 9p23, 9p22, 9p21, 9q33 and 9q34). 
Results (DLBCL): New normalization of gene expression data of DLBCL patients revealed better separation of risk groups by the signature-based predictor published in 2002. We achieved a similarly good separation with six genes. Exploratory analysis of gene expression data could confirm the subgroups ABC and GCB. We recognized a clear difference between early and late cell cycle stages of cell cycle genes, which can separate ABC and GCB. Classical lymphoma genes and the best-separating genes, together with gene sets which identify ABC and GCB, form a network which can classify and explain the ABC and GCB groups (ASB13, BCL2, BCL6, BCL7A, CCND2, COL3A1, CTGF, FN1, FOXP1, IGHM, IRF4, LMO2, LRMP, MAPK10, MME, MYBL1, NEIL1 and SH3BP5). Altogether, these findings are useful for diagnosis, prognosis and therapy (cytostatic drugs).
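The grouping of MCL samples by the median of a signature value, as described in the abstract above, can be sketched as follows. This is a simplified illustration only: it assumes the signature score of a patient is the plain average expression of the seven listed genes, which is an assumption and not necessarily how the thesis computes the predictor. The function names and any expression values passed in are hypothetical.

```python
from statistics import mean, median
from typing import Dict, List, Tuple

# Gene names taken from the abstract; everything else here is illustrative.
SEVEN_GENES = ["CENPE", "CDC20", "HPRT1", "CDC2", "BIRC5", "ASPM", "IGF2BP3"]


def signature_score(expression: Dict[str, float], genes: List[str] = SEVEN_GENES) -> float:
    """Average expression of the signature genes found in one patient profile.

    Assumption: an unweighted mean. A real predictor may weight genes.
    """
    values = [expression[g] for g in genes if g in expression]
    if not values:
        raise ValueError("profile contains none of the signature genes")
    return mean(values)


def median_split(scores: Dict[str, float]) -> Tuple[List[str], List[str]]:
    """Split patients into low and high groups at the median score.

    Patients with a score equal to the median are placed in the low group.
    """
    cutoff = median(scores.values())
    low = sorted(p for p, s in scores.items() if s <= cutoff)
    high = sorted(p for p, s in scores.items() if s > cutoff)
    return low, high
```

The two resulting groups could then be compared with a survival analysis; that step is left out here because the abstract does not specify the procedure.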
Applying microarray‐based techniques to study gene expression patterns: a bio‐computational approach
(2010)
The regulation and maintenance of iron homeostasis is critical to human health. As a constituent of hemoglobin, iron is essential for oxygen transport, and significant iron deficiency leads to anemia. Eukaryotic cells require iron for survival and proliferation. Iron is part of hemoproteins, iron-sulfur (Fe-S) proteins, and other proteins with functional groups that require iron as a cofactor. At the cellular level, iron uptake, utilization, storage, and export are regulated at different molecular levels (transcriptional, mRNA stability, translational, and posttranslational). Iron regulatory proteins (IRPs) 1 and 2 post-transcriptionally control mammalian iron homeostasis by binding to iron-responsive elements (IREs), conserved RNA stem-loop structures located in the 5'- or 3'-untranslated regions of genes involved in iron metabolism (e.g. FTH1, FTL, and TFRC). To identify novel IRE-containing mRNAs, we integrated biochemical, biocomputational, and microarray-based experimental approaches. Gene expression studies greatly contribute to our understanding of complex relationships in gene regulatory networks. However, the complexity of array design, production and manipulation is a limiting factor that affects data quality. The use of customized DNA microarrays improves overall data quality in many situations, but only if analysis tools are available for these specifically designed microarrays. Methods: In this project, the response to iron treatment was examined under different conditions using bioinformatic methods. This would improve our understanding of the iron regulatory network. For these purposes we used microarray gene expression data. To identify novel IRE-containing mRNAs, biochemical, biocomputational, and microarray-based experimental approaches were integrated. 
IRP/IRE messenger ribonucleoproteins were immunoselected and their mRNA composition was analysed using an IronChip microarray enriched for genes predicted computationally to contain IRE-like motifs. Analysis of IronChip microarray data requires a specialized tool which can use all advantages of a customized microarray platform. A novel decision-tree based algorithm was implemented in Perl in the IronChip Evaluation Package (ICEP). Results: IRE-like motifs were identified from genomic nucleic acid databases by an algorithm combining primary nucleic acid sequence and RNA structural criteria. Depending on the choice of constraining criteria, such computational screens tend to generate a large number of false positives. To refine the search and reduce the number of false positive hits, additional constraints were introduced. The refined screen yielded 15 IRE-like motifs. A second approach made use of a reported list of 230 IRE-like sequences obtained from screening UTR databases. We selected 6 out of these 230 entries based on the ability of the lower IRE stem to form at least 6 out of 7 bp. Corresponding ESTs were spotted onto the human or mouse versions of the IronChip and the results were analysed using ICEP. Our data show that the immunoselection/microarray strategy is a feasible approach for screening bioinformatically predicted IRE genes and for detecting novel IRE-containing mRNAs. In addition, we identified a novel IRE-containing gene, CDC14A (Sanchez M, et al. 2006). The IronChip Evaluation Package (ICEP) is a collection of Perl utilities and an easy-to-use data evaluation pipeline for the analysis of microarray data with a focus on data quality of custom-designed microarrays. The package has been developed for the statistical and bioinformatical analysis of the custom cDNA microarray IronChip, but can be easily adapted for other cDNA or oligonucleotide-based designed microarray platforms. 
ICEP uses decision tree-based algorithms to assign quality flags and performs robust analysis based on chip design properties regarding multiple repetitions, ratio cut-off, background and negative controls (Vainshtein Y, et al., 2010).
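The selection criterion mentioned in the abstract above, that the lower IRE stem can form at least 6 out of 7 base pairs, can be illustrated with a short sketch. It assumes the two strands of the lower stem are given as separate sequences written 5' to 3', and it optionally counts G-U wobble pairs; whether the original screen counted wobble pairs is not stated, so that option is an assumption. Function names and example sequences are hypothetical.

```python
# Canonical Watson-Crick pairs and, as an optional assumption, G-U wobble pairs.
WATSON_CRICK = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}
WOBBLE = {("G", "U"), ("U", "G")}


def normalise(seq: str) -> str:
    """Upper-case the sequence and treat DNA-style T as RNA U."""
    return seq.upper().replace("T", "U")


def count_stem_pairs(five_prime: str, three_prime: str, allow_wobble: bool = True) -> int:
    """Count base pairs between two stem strands, both written 5' to 3'.

    The strands run antiparallel, so the first base of the 5' strand
    pairs with the last base of the 3' strand.
    """
    if len(five_prime) != len(three_prime):
        raise ValueError("stem strands must have equal length")
    allowed = WATSON_CRICK | WOBBLE if allow_wobble else WATSON_CRICK
    pairs = zip(normalise(five_prime), reversed(normalise(three_prime)))
    return sum(1 for pair in pairs if pair in allowed)


def passes_lower_stem_filter(five_prime: str, three_prime: str,
                             min_pairs: int = 6, allow_wobble: bool = True) -> bool:
    """Return True when the lower stem forms at least min_pairs base pairs."""
    return count_stem_pairs(five_prime, three_prime, allow_wobble) >= min_pairs
```

With a 7-nt stem, a candidate with a single mismatch still passes the default filter, while a candidate with two mismatches is rejected.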
The eukaryotic parasite Trypanosoma brucei has evolved sophisticated strategies to persist within its mammalian host. Trypanosomes evade the host's immune system by antigenic variation of their surface coat, consisting of variant surface glycoproteins (VSGs). Out of a repertoire of thousands of VSG genes, only one is expressed at any given time from one of the 15 telomeric expression sites (ES). The VSG is stochastically exchanged either by a transcriptional switch of the active ES (in situ switch) or by a recombinational exchange of the VSG within the active ES. However, for infections to persist, the parasite burden has to be limited. The slender (sl) bloodstream form secretes the stumpy induction factor (SIF), which accumulates with rising parasitemia. SIF induces the irreversible developmental transition from the proliferative sl to the cell cycle-arrested but fly-infective stumpy (st) stage once a concentration threshold is reached. Thus, antigenic variation and st development ensure persistent infections and transmissibility. A previous study in monomorphic cells indicated that the attenuation of the active ES could be relevant for the development of trypanosomes. The present thesis investigated this hypothesis using the inducible overexpression of an ectopic VSG in pleomorphic trypanosomes, which possess full developmental competence. These studies revealed a surprising phenotypic plasticity: while the endogenous VSG was always down-regulated upon induction, the ES activity determined whether the VSG overexpressors arrested in growth or kept proliferating. Full ES attenuation induced the differentiation of bona fide st parasites independent of the cell density and thus represents the sole natural SIF-independent differentiation trigger to date. A milder decrease of the ES activity did not induce phenotypic changes, but appeared to prime the parasites for SIF-induced differentiation. 
These results demonstrate that antigenic variation and development are linked and indicate that the ES and the VSG are independently regulated. Therefore, I investigated in the second part of my thesis how ES attenuation and VSG silencing can be mediated. Integration of reporters with a functional or defective VSG 3'UTR into different genomic loci showed that the maintenance of the active state of the ES depends on a conserved motif within the VSG 3'UTR. In situ switching was only triggered when the telomere-proximal motif was partially deleted, suggesting that it serves as a DNA-binding motif for a telomere-associated protein. The VSG levels seem to be additionally regulated in trans based on the VSG 3'UTR, independent of the genomic context, which was reinforced by the regulation of a constitutively expressed reporter with VSG 3'UTR upon ectopic VSG overexpression.