In order to test the effects of environmental factors on different characteristics of plant leaf waxes, barley plants (Hordeum vulgare) were abiotically stress treated (exposure to darkness, heavy metal, high salt concentrations and drought), and biotically stressed by the infection with powdery mildew (Blumeria graminis f.sp. hordei; Bgh). Different wax parameters like amount, chemical composition, and micromorphology of epicuticular wax crystals, were investigated. Etiolated leaves of barley showed distinctly reduced wax amounts and modifications in their relative composition. The alterations of these wax parameters might be a result of a developmental delay, which could have been caused by a decreased availability of energy for cellular processes, due to lack of light. Cadmium exposure led to a 1.5-fold increase of wax amount, while chemical composition was unaffected. In drought- and salt-stressed plants, all investigated leaf wax parameters remained unaltered. In each of the abiotic treatments, the microstructure of epicuticular wax crystals, formed as typical platelets, was not modified. Even after 6d infection with powdery mildew (Bgh), neither locally nor systemically enforced modifications of wax features were revealed.
The analyzed leaf surfaces resulting from these four abiotic treatments and the biotic treatment (phenotypic approach) were compared to the altered leaf surface characteristics of 18 analyzed eceriferum (cer-) wax mutants (genotypic approach). Within the screening, 5 mutants were selected which distinctly differed from the wild-type in wax amount, portions of epi- and intracuticular wax fraction, relative chemical composition, crystal morphology, and surface wettability (hydrophobicity).
Apart from quantitative and qualitative effects on the leaf waxes, environmentally enforced modifications in cuticular waxes might be reflected in molecular processes of wax biogenesis. Therefore, a barley wax-microarray was established. A total of 254 genes were selected that are putatively involved in processes of de novo fatty acid biosynthesis, fatty acid elongation, and modification, and that are supposed to take part in lipid-trafficking between cell compartments, and transport of wax components to the outer cell surface. The regulations within the expression pattern evoked by the respective treatments were correlated with the corresponding analytical wax data, and the observed molecular effects of a 3 d powdery mildew infection were compared with succeeding fungal morphogenesis. Etiolation and cadmium exposure pointed to transcriptional modifications in the de novo fatty acid synthesis, and in the screened, transport-related mechanisms, which correlate with respective alterations in surface wax characteristics. Moderate changes in the gene expression pattern, evoked by drought- and salinity-stress, might give hints for evolved adaptations in barley to such common habitat stresses. The invasion of powdery mildew into the epidermal host cells was reflected in the regulation of several genes. Among other functions, these genes take part in pathogen defense, and intracellular component transport, or they encode transcription factors. The different modifications within the molecular responses evoked by the investigated abiotic treatments, and the effects of powdery mildew infection representing a biotic stressor, were compared between the different treatments.
In order to test the potential impact of different wax parameters on Bgh, conidia germination and differentiation was comparably investigated on leaf surfaces of abiotically stressed wild-type and cer-mutants, isolated cuticles, and further artificial surfaces. The rates of conidial development were similar on each of the leaf surfaces resulting from the abiotic treatments, while a significant reduction of the germination and differentiation success was revealed for the wax mutant cer-yp.949. Compared to the wild-type, developmental rates on isolated cuticles and extracted leaf waxes of the mutant cer-yp.949 indicated a modified embedding of cuticular waxes, and a possibly changed three-dimensional structure of the cer-yp.949 cuticle, which might explain the reduced conidial developmental rates on leaf surfaces of this particular mutant.
Experiments with Bgh conidia on mechanically de-waxed leaf surfaces (selective mechanical removal of the epicuticular leaf waxes with glue-like gum arabic, followed by an extraction of the intracuticular wax portion with chloroform) demonstrated the importance of the wax coverage for the germination and differentiation of the fungal conidia. On all dewaxed leaf surfaces, except those of cer-yp.949, the differentiation success of the germlings was significantly reduced, by about 20% (“wax-effect”). This result was verified through an artificial system with increased conidia developmental rates on glass slides covered with extracted leaf waxes. Further comparative tests with the major components of barley leaf wax, hexacosanol and hexacosanal, showed that the germination and differentiation of powdery mildew conidia not only depends on the different chemistry, but is also influenced by the respective surface hydrophobicity. Compared to hexacosanol, on hexacosanal coated glass surfaces, higher germination and differentiation rates were achieved, which correlated with increased levels of surface hydrophobicity. Developmental rates of conidia on hydrophobic foils demonstrated that hydrophobicity, as a sole surface factor, may stimulate the conidial germination and differentiation processes. Moreover, the survival of conidia on artificial surfaces is determined by additional surface derived factors, e.g. the availability of water, and a pervadable matrix.
The IronChip evaluation package: a package of perl modules for robust analysis of custom microarrays
(2010)
Background: Gene expression studies greatly contribute to our understanding of complex relationships in gene regulatory networks. However, the complexity of array design, production and manipulations are limiting factors, affecting data quality. The use of customized DNA microarrays improves overall data quality in many situations, but only if analysis tools are available for these specifically designed microarrays. Results: The IronChip Evaluation Package (ICEP) is a collection of Perl utilities and an easy to use data evaluation pipeline for the analysis of microarray data with a focus on data quality of custom-designed microarrays. The package has been developed for the statistical and bioinformatical analysis of the custom cDNA microarray IronChip but can be easily adapted for other cDNA or oligonucleotide-based designed microarray platforms. ICEP uses decision tree-based algorithms to assign quality flags and performs robust analysis based on chip design properties regarding multiple repetitions, ratio cut-off, background and negative controls. Conclusions: ICEP is a stand-alone Windows application to obtain optimal data quality from custom-designed microarrays and is freely available here (see “Additional Files” section) and at: http://www.alice-dsl.net/evgeniy.vainshtein/ICEP/
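As an illustration of the decision tree-based flagging described in this abstract, the following Python sketch assigns a quality flag to one gene spotted in several replicates. It is not ICEP code (ICEP itself is written in Perl); the function name `flag_spot`, the flag labels, and all thresholds are hypothetical and only show how background, negative controls, replicate agreement, and a ratio cut-off could be checked in sequence.

```python
from statistics import median


def flag_spot(signals, background, negative_control, log_ratios,
              min_replicates=3, ratio_cutoff=1.0, max_spread=0.5):
    """Return a quality flag for one gene spotted in several replicates.

    All thresholds are illustrative assumptions, not ICEP defaults.
    """
    # Keep only replicates whose signal exceeds both the local background
    # and the negative-control level.
    floor = max(background, negative_control)
    usable = [r for s, r in zip(signals, log_ratios) if s > floor]
    if len(usable) < min_replicates:
        return "low_signal"
    # The remaining replicates must agree with each other.
    if max(usable) - min(usable) > max_spread:
        return "inconsistent"
    # Robust summary: the median log2 ratio decides the call.
    m = median(usable)
    if m >= ratio_cutoff:
        return "up"
    if m <= -ratio_cutoff:
        return "down"
    return "unchanged"


print(flag_spot([900, 950, 1000, 880], 200, 300, [1.2, 1.3, 1.1, 1.25]))
```

Using the median rather than the mean keeps a single outlying replicate from changing the call, which is one simple way to make such an analysis robust.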
Applying microarray‐based techniques to study gene expression patterns: a bio‐computational approach
(2010)
The regulation and maintenance of iron homeostasis is critical to human health. As a constituent of hemoglobin, iron is essential for oxygen transport and significant iron deficiency leads to anemia. Eukaryotic cells require iron for survival and proliferation. Iron is part of hemoproteins, iron-sulfur (Fe-S) proteins, and other proteins with functional groups that require iron as a cofactor. At the cellular level, iron uptake, utilization, storage, and export are regulated at different molecular levels (transcriptional, mRNA stability, translational, and posttranslational). Iron regulatory proteins (IRPs) 1 and 2 post-transcriptionally control mammalian iron homeostasis by binding to iron-responsive elements (IREs), conserved RNA stem-loop structures located in the 5’- or 3’-untranslated regions of genes involved in iron metabolism (e.g. FTH1, FTL, and TFRC). To identify novel IRE-containing mRNAs, we integrated biochemical, biocomputational, and microarray-based experimental approaches. Gene expression studies greatly contribute to our understanding of complex relationships in gene regulatory networks. However, the complexity of array design, production and manipulations are limiting factors, affecting data quality. The use of customized DNA microarrays improves overall data quality in many situations, but only if analysis tools are available for these specifically designed microarrays. Methods: In this project, the response to iron treatment was examined under different conditions using bioinformatical methods, with the aim of improving our understanding of the iron regulatory network. For these purposes we used microarray gene expression data. To identify novel IRE-containing mRNAs, biochemical, biocomputational, and microarray-based experimental approaches were integrated.
IRP/IRE messenger ribonucleoproteins were immunoselected and their mRNA composition was analysed using an IronChip microarray enriched for genes predicted computationally to contain IRE-like motifs. Analysis of IronChip microarray data requires a specialized tool that can exploit all advantages of a customized microarray platform. A novel decision tree-based algorithm was implemented in Perl in the IronChip Evaluation Package (ICEP). Results: IRE-like motifs were identified from genomic nucleic acid databases by an algorithm combining primary nucleic acid sequence and RNA structural criteria. Depending on the choice of constraining criteria, such computational screens tend to generate a large number of false positives. To refine the search and reduce the number of false positive hits, additional constraints were introduced. The refined screen yielded 15 IRE-like motifs. A second approach made use of a reported list of 230 IRE-like sequences obtained from screening UTR databases. We selected 6 out of these 230 entries based on the ability of the lower IRE stem to form at least 6 out of 7 bp. Corresponding ESTs were spotted onto the human or mouse versions of the IronChip and the results were analysed using ICEP. Our data show that the immunoselection/microarray strategy is a feasible approach for screening bioinformatically predicted IRE genes and the detection of novel IRE-containing mRNAs. In addition, we identified a novel IRE-containing gene, CDC14A (Sanchez M, et al. 2006). The IronChip Evaluation Package (ICEP) is a collection of Perl utilities and an easy to use data evaluation pipeline for the analysis of microarray data with a focus on data quality of custom-designed microarrays. The package has been developed for the statistical and bioinformatical analysis of the custom cDNA microarray IronChip, but can be easily adapted for other cDNA or oligonucleotide-based designed microarray platforms.
ICEP uses decision tree-based algorithms to assign quality flags and performs robust analysis based on chip design properties regarding multiple repetitions, ratio cut-off, background and negative controls (Vainshtein Y, et al., 2010).
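The selection criterion mentioned in this abstract, a lower IRE stem able to form at least 6 out of 7 bp, can be sketched in a few lines of Python. This is a minimal illustration and not the screening code used in the thesis: the function names are hypothetical, the example sequences are made up, and counting G-U wobble pairs as valid base pairs is an assumption of the sketch.

```python
# Watson-Crick pairs plus the G-U wobble pair (the wobble pair is an
# assumption of this sketch).
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def lower_stem_pairs(five_prime, three_prime):
    """Count base pairs between the two strands of a candidate lower stem.

    Both strands are written 5'->3', so the first base of the 5' strand
    faces the last base of the 3' strand.
    """
    if len(five_prime) != len(three_prime):
        raise ValueError("strands must have equal length")
    facing = zip(five_prime.upper(), reversed(three_prime.upper()))
    return sum((a, b) in PAIRS for a, b in facing)


def passes_stem_filter(five_prime, three_prime, min_pairs=6):
    """True if the candidate stem forms at least `min_pairs` base pairs."""
    return lower_stem_pairs(five_prime, three_prime) >= min_pairs


# Made-up 7-nt strands: a perfect stem, one mismatch, two mismatches.
print(passes_stem_filter("GGCUACG", "CGUAGCC"))
print(passes_stem_filter("GGCUACG", "CGUAGCA"))
print(passes_stem_filter("GGCUACG", "AGUAGCA"))
```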
In initial experiments, the well characterized VACV strain GLV-1h68 and three wild-type LIVP isolates were utilized to analyze gene expression in a pair of autologous human melanoma cell lines (888-MEL and 1936 MEL) after infection. Microarray analyses, followed by sequential statistical approaches, characterized human genes whose transcription is affected specifically by VACV infection. In accordance with the literature, those genes were involved in broad cellular functions, such as cell death, protein synthesis and folding, as well as DNA replication, recombination, and repair. In parallel to host gene expression, viral gene expression was evaluated with help of customized VACV array platforms to get better insight over the interplay between VACV and its host. Our main focus was to compare host and viral early events, since virus genome replication occurs early after infection. We observed that viral transcripts segregated in a characteristic time-specific pattern, consistent with the three temporal expression classes of VACV genes, including a group of genes which could be classified as early-stage genes. In this work, comparison of VACV early replication and respective early gene transcription led to the identification of seven viral genes whose expression correlated strictly with replication. We considered the early expression of those seven genes to be representative for VACV replication and we therefore referred to them as viral replication indicators (VRIs). To explore the relationship between host cell transcription and viral replication, we correlated viral (VRI) and human early gene expression. Correlation analysis revealed a subset of 114 human transcripts whose early expression tightly correlated with early VRI expression and thus early viral replication. These 114 human molecules represented an involvement in broad cellular functions. We found at least six out of 114 correlates to be involved in protein ubiquitination or proteasomal function. 
Another molecule of interest was the serine-threonine protein kinase WNK lysine-deficient protein kinase 1 (WNK1). We discovered that WNK1 features differences on several molecular biological levels associated with permissiveness to VACV infection. In addition to that, a set of human genes was identified with possible predictive value for viral replication in an independent dataset. A further objective of this work was to explore baseline molecular biological variances associated with permissiveness which could help identifying cellular components that contribute to the formation of a permissive phenotype. Therefore, in a subsequent approach, we screened a set of 15 melanoma cell lines (15-MEL) regarding their permissiveness to GLV-1h68, evaluated by GFP expression levels, and classified the top four and lowest four cell lines into high and low permissive group, respectively. Baseline gene transcriptional data, comparing low and highly permissive group, suggest that differences between the two groups are at least in part due to variances in global cellular functions, such as cell cycle, cell growth and proliferation, as well as cell death and survival. We also observed differences in the ubiquitination pathway, which is consistent with our previous results and underlines the importance of this pathway in VACV replication and permissiveness. Moreover, baseline microRNA (miRNA) expression between low and highly permissive group was considered to provide valuable information regarding virus-host co-existence. In our data set, we identified six miRNAs that featured varying baseline expression between low and highly permissive group. Finally, copy number variations (CNVs) between low and highly permissive group were evaluated. 
In this study, when investigating differences in the chromosomal aberration patterns between low and highly permissive group, we observed frequent segmental amplifications within the low permissive group, whereas the same regions were mostly unchanged in the high group. Taken together, our results highlight a probable correlation between viral replication, early gene expression, and the respective host response and thus a possible involvement of human host factors in viral early replication. Furthermore, we revealed the importance of cellular baseline composition for permissiveness to VACV infection on different molecular biological levels, including mRNA expression, miRNA expression, as well as copy number variations. The characterization of human target genes that influence viral replication could help answering the question of host cell response to oncolytic virotherapy and provide important information for the development of novel recombinant vaccinia viruses with improved features to enhance replication rate and hence trigger therapeutic outcome.
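The correlation step described in this abstract, relating early host transcript levels to the expression of the viral replication indicators (VRIs), could look roughly like the Python sketch below. The abstract does not state which correlation measure or cut-off was used, so the Pearson coefficient, the threshold `min_r`, the restriction to positive correlations, and the gene names and values in the example are all assumptions made for illustration.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        return 0.0  # a constant profile carries no correlation information
    return cov / (sx * sy)


def correlated_transcripts(host_expr, vri_profile, min_r=0.9):
    """Names of host transcripts whose profile tracks the VRI profile.

    host_expr maps a transcript name to its expression values across the
    same samples that vri_profile was measured in.
    """
    return sorted(gene for gene, values in host_expr.items()
                  if pearson(values, vri_profile) >= min_r)


# Hypothetical values: mean VRI expression in four samples and three
# host transcripts measured in the same samples.
vri = [1.0, 2.0, 3.0, 4.0]
host = {
    "GENE_A": [2.0, 4.0, 6.0, 8.0],
    "GENE_B": [4.0, 3.0, 2.0, 1.0],
    "GENE_C": [1.0, 3.0, 2.0, 1.0],
}
print(correlated_transcripts(host, vri))
```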
The Nuclear Factors of Activated T cells (NFATs) are critical transcription factors that direct gene expression in immune and non-immune cells. Interaction of T cells with antigen-presenting cells results in the clustering of T-cell antigen receptor (TCR), co-receptors and integrins. Subsequent signal transduction resulting in NFAT activation leads to cytokine gene expression. Among the NFATs expressed in T cells, NFATc1 shows a unique induction property, which is essential for T cell differentiation and activation. It was revealed before that 3 major isoforms of NFATc1 are generated in activated T cells: the inducible short NFATc1/A, and the longer isoforms NFATc1/B and C. However, due to alternative splicing events and the existence of two different promoters and two alternative polyadenylation sites, we show here that 6 isoforms are synthesized in T cells which differ in their N-terminal and C-terminal peptides. In these experiments, we have identified these 6 isoforms by semi-quantitative long distance RT-PCR in several T cell subsets, and the inducible properties of the 6 isoforms were investigated in those cells. The short NFATc1/A, which is under control of the P1 promoter and the proximal pA1 polyadenylation site, was the most prominent and inducible isoform in T effector cells. The transcription of the longer NFATc1/B and C isoforms is constitutive and even reduced in activated T lymphocytes. In addition to NFATc1 autoregulation, we tried to understand the NFATc1 gene regulation under the control of PKC pathways by microarray analysis. Compared to treatment of T cells with ionomycin alone (which enhances Ca2+ flux), treatment of cells with the phorbol ester TPA (leading to PKC activation) enhanced the induction of NFATc1. Microarray analysis revealed that PKC activation increased the transcription of NF-κB1, Fos and JunB, which are important transcription factors binding to the regulatory regions of the NFATc1 gene.
Besides the promoting effect of these transcription factors, we provided evidence that p53 and its target gene, Gadd45, exerted a negative effect on NFATc1 gene transcription. Summarizing all these results, we drew novel conclusions on NFATc1 expression, which provide a more detailed view on the regulatory mechanisms of NFATc1 transcription. Considering the high transcription and strong expression of NFATc1 in various human lymphomas, we propose that, similar to NF-κB, NFATc1/A plays a pivotal role in lymphomagenesis.
Immune-mediated polyneuropathies like chronic inflammatory demyelinating polyradiculoneuropathy or Guillain-Barré syndrome are rare diseases of the peripheral nervous system. A subgroup of patients harbors autoantibodies against nodal or paranodal antigens, associated with a distinct phenotype and treatment response. In a part of patients with pathologic paranodal or nodal immunoreactivity the autoantigens remain difficult or impossible to determine owing to limitations of the used detection approach - usually ELISAs (enzyme-linked-immunosorbent-assays) - and incomplete knowledge of the possible autoantigens. Due to their high-throughput, low sample consumption and high sensitivity as well as the possibility to display many putative nodal and paranodal autoantigens simultaneously, peptide microarray-based approaches are prime candidates for the discovery of novel autoantigens, point-of-care diagnostics and, in addition, monitoring of pathologic autoimmune response. Current applications of peptide microarrays are however limited by high false-positive rates and the associated need for detailed follow-up studies and validation. Here, robust peptide microarray-based detection of antibodies and the efficient validation of binding signals by on-chip neutralization is demonstrated. First, autoantigens were displayed as overlapping peptide libraries in microarray format. Copies of the biochips were used for the fine mapping of antibody epitopes. Next, binding signals were validated by antibody neutralization in solution. Since neutralizing peptides are obtained in the process of microarray fabrications, neither throughput nor costs are significantly altered. Similar in-situ validation approaches could contribute to future autoantibody characterization and detection methods as well as to therapeutic research. Areas of application could be expanded to any autoimmune-mediated neurological disease as a long-term vision.
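The display of autoantigens as overlapping peptide libraries, as described in this abstract, amounts to sliding a fixed-length window along the protein sequence. The Python sketch below illustrates the idea; the function name is hypothetical, the peptide length and offset are arbitrary defaults (the abstract gives neither), and the example sequence is a made-up string rather than a real antigen.

```python
def overlapping_peptides(sequence, length=15, offset=3):
    """Cut a protein sequence into overlapping peptides.

    Consecutive peptides start `offset` residues apart, so neighbours
    share `length - offset` residues. A final peptide is added when needed
    so that the C-terminus is always covered.
    """
    if length <= 0 or offset <= 0:
        raise ValueError("length and offset must be positive")
    if len(sequence) <= length:
        return [sequence]
    last_start = len(sequence) - length
    starts = list(range(0, last_start + 1, offset))
    if starts[-1] != last_start:
        starts.append(last_start)
    return [sequence[s:s + length] for s in starts]


# Toy example with short peptides to keep the output readable.
print(overlapping_peptides("ABCDEFGHI", length=4, offset=3))
```

A smaller offset gives finer epitope mapping at the cost of more peptides per antigen, which is the trade-off such a library design has to settle.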
Recent progress and developments in molecular biology provide a wealth of new but insufficiently characterised data. This pool of data comprises, amongst others, genomic DNA, protein sequences, 3-dimensional protein structures as well as profiles of gene expression. In the present work, this information is used to develop new methods for the characterisation and classification of organisms and whole groups of organisms as well as to enhance the automated gain and transfer of information. The first two presented approaches (chapters 4 and 5) focus on the medically and scientifically important enterobacteria. Their impact in medicine and molecular biology is founded on their versatile mechanisms of infection, their fundamental function as commensal inhabitants of the intestinal tract and their use as model organisms, as they are easy to cultivate. Despite many studies on single pathogroups with clinically distinguishable pathologies, the genotypic factors that contribute to their diversity are still partially unknown. The comprehensive genome comparison described in chapter 4 was conducted with numerous enterobacterial strains, which cover nearly the whole range of clinically relevant diversity. The genome comparison constitutes the basis of a characterisation of the enterobacterial gene pool, of a reconstruction of evolutionary processes and of a comprehensive analysis of specific protein families in enterobacterial subgroups. Correspondence analysis, which is applied for the first time in this context, yields qualitative statements about bacterial subgroups and the respective, exclusively present protein families. By applying statistical tests, specific protein families were identified for the three major subgroups of enterobacteria, namely the genera Yersinia and Salmonella as well as the group of Shigella and E. coli.
In conclusion, the genome comparison-based methods provide new starting points to infer specific genotypic traits of bacterial groups from the transfer of functional annotation. Due to the high medical importance of enterobacterial isolates their classification according to pathogenicity has been in focus of many studies. The microarray technology offers a fast, reproducible and standardisable means of bacterial typing and has been proved in bacterial diagnostics, risk assessment and surveillance. The design of the diagnostic microarray of enterobacteria described in chapter 5 is based on the availability of numerous enterobacterial genome sequences. A novel probe selection strategy based on the highly efficient algorithm of string search, which considers both coding and non-coding regions of genomic DNA, enhances pathogroup detection. This principle reduces the risk of incorrect typing due to restrictions to virulence-associated capture probes. Additional capture probes extend the spectrum of applications of the microarray to simultaneous diagnostic or surveillance of antimicrobial resistance. Comprehensive test hybridisations largely confirm the reliability of the selected capture probes and its ability to robustly classify enterobacterial strains according to pathogenicity. Moreover, the tests constitute the basis of the training of a regression model for the classification of pathogroups and hybridised amounts of DNA. The regression model features a continuous learning capacity leading to an enhancement of the prediction accuracy in the process of its application. A fraction of the capture probes represents intergenic DNA and hence confirms the relevance of the underlying strategy. Interestingly, a large part of the capture probes represents poorly annotated genes suggesting the existence of yet unconsidered factors with importance to the formation of respective virulence phenotypes. Another major field of microarray applications is gene expression analysis. 
The size of gene expression databases rapidly increased in recent years. Although they provide a wealth of expression data, it remains challenging to integrate results from different studies. In chapter 6 the methodology of an unsupervised meta-analysis of genome-wide A. thaliana gene expression data sets is presented, which yields novel insights into the function and regulation of genes. The application of kernel-based principal component analysis in combination with hierarchical clustering identified three major groups of contrasts each sharing overlapping expression profiles. Genes associated with two groups are known to play important roles in indole-3-acetic acid (IAA)-mediated plant growth and development as well as in pathogen defence. As yet uncharacterised serine-threonine kinases could be assigned to novel functions in pathogen defence by meta-analysis. In general, hidden interrelations between genes regulated under different conditions could be unravelled by the described approach. Hidden Markov models (HMMs) are applied to the functional characterisation of proteins or the detection of genes in genome sequences. Although HMMs are technically mature and widely applied in computational biology, I demonstrate the methodical optimisation with respect to the modelling accuracy on biological data with various distributions of sequence lengths. The subunits of these models, the states, are associated with a certain holding time being the link to length distributions of represented sequences. An adaptation of simple HMM topologies to bell-shaped length distributions described in chapter 7 was achieved by serial chain-linking of single states, while residing in the class of conventional HMMs. The impact of an optimisation of HMM topologies was underlined by performance evaluations with differently adjusted HMM topologies.
In summary, a general methodology was introduced to improve the modelling behaviour of HMMs by topological optimisation with maximum likelihood and a fast and easily implementable moment estimator. Chapter 8 describes the application of HMMs to the prediction of interaction sites in protein domains. As previously demonstrated, these sites are not trivial to predict because of varying degree in conservation of their location and type within the domain family. The prediction of interaction sites in protein domains is achieved by a newly defined HMM topology, which incorporates both sequence and structure information. Posterior decoding is applied to the prediction of interaction sites providing additional information of the probability of an interaction for all sequence positions. The implementation of interaction profile HMMs (ipHMMs) is based on the well established profile HMMs and inherits its known efficiency and sensitivity. The large-scale prediction of interaction sites by ipHMMs explained protein dysfunctions caused by mutations that are associated to inheritable diseases like different types of cancer or muscular dystrophy. As already demonstrated by profile HMMs, the ipHMMs are suitable for large-scale applications. Overall, the HMM-based method enhances the prediction quality of interaction sites and improves the understanding of the molecular background of inheritable diseases. With respect to current and future requirements I provide large-scale solutions for the characterisation of biological data in this work. All described methods feature a highly portable character, which allows for the transfer to related topics or organisms, respectively. Special emphasis was put on the knowledge transfer facilitated by a steadily increasing wealth of biological information. The applied and developed statistical methods largely provide learning capacities and hence benefit from the gain of knowledge resulting in increased prediction accuracies and reliability.
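To make the idea of chain-linking HMM states and fitting them with a moment estimator more concrete, here is a small Python sketch. It is an assumed textbook formulation, not necessarily the estimator developed in the thesis. A single state with self-transition probability p has a geometric holding time with mean 1/(1-p) and variance p/(1-p)^2; a serial chain of n identical states therefore has mean n/(1-p) and variance n·p/(1-p)^2. Solving these two equations for the observed mean m and variance v gives p = v/(m+v) and n = m^2/(m+v). The function names are hypothetical.

```python
from statistics import mean, pvariance


def chain_from_moments(m, v):
    """Moment estimate (n_states, p_stay) for a serial chain of HMM states.

    m and v are the mean and variance of the observed sequence lengths.
    """
    if m <= 0 or v < 0:
        raise ValueError("mean must be positive and variance non-negative")
    p_stay = v / (m + v)                     # from v / m = p / (1 - p)
    n_states = max(1, round(m * (1 - p_stay)))  # from m = n / (1 - p)
    # Re-fit p for the rounded number of states so the chain still
    # reproduces the observed mean length; clamp to a valid probability.
    p_stay = max(0.0, 1 - n_states / m)
    return n_states, p_stay


def chain_from_lengths(lengths):
    """Same estimate, computed directly from a list of sequence lengths."""
    return chain_from_moments(mean(lengths), pvariance(lengths))


# Four states with p = 0.8 give mean 20 and variance 80, so these
# moments should map back to the same topology.
print(chain_from_moments(20, 80))
```

With n = 1 the model reduces to the usual single state with a geometric length distribution; larger n gives the bell-shaped holding-time distribution the abstract refers to while staying within conventional HMMs.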
In this thesis, the development of a phylogenetic DNA microarray, the analysis of several gene expression microarray datasets and new approaches for improved data analysis and interpretation are described. In the first publication, the development and analysis of a phylogenetic microarray is presented. I could show that species detection with phylogenetic DNA microarrays can be significantly improved when the microarray data is analyzed with a linear regression modeling approach. Standard methods have so far relied on pure signal intensities of the array spots and a simple cutoff criterion was applied to call a species present or absent. This procedure is not applicable to very closely related species with high sequence similarity because cross-hybridization of non-target DNA renders species detection impossible based on signal intensities alone. By modeling hybridization and cross-hybridization with linear regression, as I have presented in this thesis, even species with a sequence similarity of 97% in the marker gene can be detected and distinguished from related species. Another advantage of the modeling approach over existing methods is that the model also performs well on mixtures of different species. In principle, also quantitative predictions can be made. To make better use of the large amounts of microarray data stored in public databases, meta-analysis approaches need to be developed. In the second publication, an explorative meta-analysis exemplified on Arabidopsis thaliana gene expression datasets is presented. Integrating datasets studying effects such as the influence of plant hormones, pathogens and different mutations on gene expression levels, clusters of similarly treated datasets could be found. From the clusters of pathogen-treated and indole-3-acetic acid (IAA) treated datasets, representative genes were selected which pointed to functions which had been associated with pathogen attack or IAA effects previously. 
Additionally, hypotheses about the functions of so far uncharacterized genes could be set up. Thus, this kind of meta-analysis could be used to propose gene functions and their regulation under different conditions. In this work, also primary data analysis of Arabidopsis thaliana datasets is presented. In the third publication, an experiment which was conducted to find out if microwave irradiation has an effect on the gene expression of a plant cell culture is described. During the first steps, the data analysis was carried out blinded and exploratory analysis methods were applied to find out if the irradiation had an effect on gene expression of plant cells. Small but statistically significant changes in a few genes were found and could be experimentally confirmed. From the functions of the regulated genes and a meta-analysis with publicly available microarray data, it could be suspected that the plant cell culture somehow perceived the irradiation as energy, similar to perceiving light rays. The fourth publication describes the functional analysis of another Arabidopsis thaliana gene expression dataset. The gene expression data of the plant tumor dataset pointed to a switch from a mainly aerobic, auxotrophic to an anaerobic and heterotrophic metabolism in the plant tumor. Genes involved in photosynthesis were found to be repressed in tumors; genes of amino acid and lipid metabolism, cell wall and solute transporters were regulated in a way that sustains tumor growth and development. Furthermore, in the fifth publication, GEPAT (Genome Expression Pathway Analysis Tool), a tool for the analysis and integration of microarray data with other data types, is described. It consists of a web application and database which allows comfortable data upload and data analysis. In later chapters of this thesis (publication 6 and publication 7), GEPAT is used to analyze human microarray datasets and to integrate results from gene expression analysis with other datatypes. 
Gene expression and comparative genomic hybridization (CGH) data from 71 mantle cell lymphoma (MCL) patients were analyzed, which led to the proposal of a seven-gene predictor that facilitates survival prediction for patients compared with existing predictors. This study showed that CGH data can be used for survival predictions. For the dataset of diffuse large B-cell lymphoma (DLBCL) patients, an improved survival predictor could be found based on the gene expression data. From the genes differentially expressed between long- and short-surviving MCL patients, as well as from the regulated genes of DLBCL patients, interaction networks could be set up. They point to differences in the regulation of cell cycle and proliferation genes between patients with good and bad prognosis.
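The regression-based species detection summarized in this abstract can be illustrated with a minimal sketch. It assumes a hypothetical design matrix whose entries give the expected signal of each probe for each species, cross-hybridization included, and fits ordinary least squares in plain Python. The matrix values, the function names and the presence threshold of 0.5 are illustrative assumptions, not the model or parameters used in the thesis.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Work on an augmented copy so the inputs stay untouched.
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("design matrix is rank deficient")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for k in range(col, n + 1):
                m[r][k] -= factor * m[col][k]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_species_contributions(design, signals):
    """Ordinary least squares via the normal equations.

    design[p][s] is the assumed signal of probe p for species s,
    signals[p] is the measured intensity of probe p.
    """
    n_species = len(design[0])
    xtx = [[sum(row[i] * row[j] for row in design) for j in range(n_species)]
           for i in range(n_species)]
    xty = [sum(row[i] * y for row, y in zip(design, signals))
           for i in range(n_species)]
    return solve_linear_system(xtx, xty)


def call_species(design, signals, names, threshold=0.5):
    """Call a species present if its fitted coefficient exceeds the threshold."""
    coeffs = fit_species_contributions(design, signals)
    return {name: c > threshold for name, c in zip(names, coeffs)}


# Hypothetical example: four probes, two closely related species.
# Probes 1 and 2 cross-hybridize strongly with both species.
DESIGN = [[1.0, 0.8], [0.9, 1.0], [1.0, 0.2], [0.1, 1.0]]
SIGNALS_ONLY_A = [1.0, 0.9, 1.0, 0.1]

if __name__ == "__main__":
    print(call_species(DESIGN, SIGNALS_ONLY_A, ["species_A", "species_B"]))
```

In this toy example, the second probe is bright even though only the first species is present, so a per-probe intensity cutoff would be misleading, whereas the fitted coefficients attribute the signal to a single species.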
Recent studies have shown aberrant expression of SOX11 in various types of aggressive B-cell neoplasms. To elucidate the molecular mechanisms leading to such deregulation, we performed a comprehensive SOX11 gene expression and epigenetic study in stem cells, normal hematopoietic cells and different lymphoid neoplasms. We observed that SOX11 expression is associated with unmethylated DNA and presence of activating histone marks (H3K9/14Ac and H3K4me3) in embryonic stem cells and some aggressive B-cell neoplasms. In contrast, adult stem cells, normal hematopoietic cells and other lymphoid neoplasms do not express SOX11. Such repression was associated with silencing histone marks H3K9me2 and H3K27me3. The SOX11 promoter of non-malignant cells was consistently unmethylated whereas lymphoid neoplasms with silenced SOX11 tended to acquire DNA hypermethylation. SOX11 silencing in cell lines was reversed by the histone deacetylase inhibitor SAHA but not by the DNA methyltransferase inhibitor AZA. These data indicate that, although DNA hypermethylation of SOX11 is frequent in lymphoid neoplasms, it seems to be functionally inert, as SOX11 is already silenced in the hematopoietic system. In contrast, the pathogenic role of SOX11 is associated with its de novo expression in some aggressive lymphoid malignancies, which is mediated by a shift from inactivating to activating histone modifications.
DNA microarrays have become a standard technique to assess the mRNA levels for complete genomes. To identify significantly regulated genes from these large amounts of data, a wealth of methods has been developed. Despite this, the functional interpretation (i.e. deducing biological hypotheses from the data) still remains a major bottleneck in microarray data analysis. Most available methods display the set of significant genes in long lists, from which common functional properties have to be extracted. This is not only a tedious and time-consuming task, which becomes less and less feasible with increasing numbers of experimental conditions, but is also prone to errors, since it is commonly done by eye. In the course of this work, methods have been developed and tested that allow for a computer-based analysis of the functional properties relevant in the given experimental setting. To this end, the Gene Ontology (GO) was chosen as an appropriate source of annotation data, because it combines human readability with computer accessibility of the annotation terms and thus allows for a statistical analysis of functional properties. Here, the gene annotations are integrated into a Correspondence Analysis, which allows genes, hybridizations and functional categories to be visualized in a single plot. Because the amount of available annotations keeps increasing and in most settings only a few functional processes are differentially regulated, several filter criteria have been developed to reduce the number of displayed annotations to a set relevant in the given experimental setting. The applicability of the presented visualization and filtering has been validated on datasets of varying complexity, ranging from the well-studied glucose pathway in S. cerevisiae to the comparison of different tumor types in humans.
In both settings, the method generated readily interpretable plots, which allowed for an immediate identification of the major functional differences between the experimental conditions [90]. While the integration of annotation data like GO facilitates functional interpretation, it lacks the capability to identify key regulatory elements. To facilitate such an analysis, the occurrence of transcription factor binding sites in the upstream regions of genes has been integrated into the analysis as well. Again, this methodology was biologically validated on S. cerevisiae as well as human cancer datasets. In both settings, transcription factors (TFs) known to play central roles in the observed transcriptional changes were plotted in marked positions and could thus be immediately identified [206]. In essence, integration of supplementary information in Correspondence Analysis visualizes genes, hybridizations and annotation data in a single, readily interpretable plot. This allows for an intuitive identification of relevant annotations even in complex experimental settings. The presented approach is not limited to the types of data shown, but can be generalized to most of the available annotation data.
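As a rough illustration of the Correspondence Analysis underlying this approach, the following sketch computes the standardized residuals of a small non-negative table (rows standing in for genes, columns for hybridizations), its total inertia, and the coordinates on the first axis by power iteration. It is a generic textbook formulation in plain Python, assuming all row and column sums are positive; the function names and the toy table are hypothetical, and the projection of annotation data as supplementary points described in the abstract is not shown.

```python
import math


def standardized_residuals(table):
    """Return the matrix (p_ij - r_i c_j) / sqrt(r_i c_j) with row and column masses."""
    total = float(sum(sum(row) for row in table))
    probs = [[value / total for value in row] for row in table]
    row_mass = [sum(row) for row in probs]
    col_mass = [sum(row[j] for row in probs) for j in range(len(probs[0]))]
    resid = [
        [(probs[i][j] - row_mass[i] * col_mass[j])
         / math.sqrt(row_mass[i] * col_mass[j])
         for j in range(len(col_mass))]
        for i in range(len(row_mass))
    ]
    return resid, row_mass, col_mass


def total_inertia(table):
    """Sum of squared standardized residuals (chi-square statistic / grand total)."""
    resid, _, _ = standardized_residuals(table)
    return sum(v * v for row in resid for v in row)


def first_axis(table, iterations=200):
    """Principal coordinates of rows and columns on the first axis."""
    resid, row_mass, col_mass = standardized_residuals(table)
    n_rows, n_cols = len(row_mass), len(col_mass)
    # Power iteration on S^T S for the leading right singular vector.
    v = [1.0 / (j + 1) for j in range(n_cols)]  # deterministic start vector
    for _ in range(iterations):
        u = [sum(resid[i][j] * v[j] for j in range(n_cols)) for i in range(n_rows)]
        w = [sum(resid[i][j] * u[i] for i in range(n_rows)) for j in range(n_cols)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm < 1e-15:  # table is (numerically) independent, nothing to display
            break
        v = [x / norm for x in w]
    # u = S v equals the left singular vector scaled by the singular value.
    u = [sum(resid[i][j] * v[j] for j in range(n_cols)) for i in range(n_rows)]
    sigma = math.sqrt(sum(x * x for x in u))
    row_coords = [u[i] / math.sqrt(row_mass[i]) for i in range(n_rows)]
    col_coords = [sigma * v[j] / math.sqrt(col_mass[j]) for j in range(n_cols)]
    return row_coords, col_coords


# Hypothetical toy table: two genes measured in two hybridizations.
TOY_TABLE = [[20, 5], [5, 20]]

if __name__ == "__main__":
    rows, cols = first_axis(TOY_TABLE)
    print("row coordinates:", rows)
    print("column coordinates:", cols)
```

Rows and columns that end up on the same side of the axis are associated with each other, which is the property that lets genes, hybridizations and, in the approach described above, annotation terms be read off a single plot.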