TY  - JOUR
A1  - Staiger, Christine
A1  - Cadot, Sidney
A1  - Kooter, Raul
A1  - Dittrich, Marcus
A1  - Müller, Tobias
A1  - Klau, Gunnar W.
A1  - Wessels, Lodewyk F. A.
T1  - A Critical Evaluation of Network and Pathway-Based Classifiers for Outcome Prediction in Breast Cancer
JF  - PLoS One
N2  - Recently, several classifiers that combine primary tumor data, like gene expression data, and secondary data sources, such as protein-protein interaction networks, have been proposed for predicting outcome in breast cancer. In these approaches, new composite features are typically constructed by aggregating the expression levels of several genes. The secondary data sources are employed to guide this aggregation. Although many studies claim that these approaches improve classification performance over single-gene classifiers, the gain in performance is difficult to assess. This stems mainly from the fact that different breast cancer data sets and validation procedures are employed to assess the performance. Here we address these issues by employing a large cohort of six breast cancer data sets as a benchmark set and by performing an unbiased evaluation of the classification accuracies of the different approaches. Contrary to previous claims, we find that composite feature classifiers do not outperform simple single-gene classifiers. We investigate the effect of (1) the number of selected features; (2) the specific gene set from which features are selected; (3) the size of the training set and (4) the heterogeneity of the data set on the performance of composite feature and single-gene classifiers. Strikingly, we find that randomization of secondary data sources, which destroys all biological information in these sources, does not result in a deterioration in performance of composite feature classifiers. Finally, we show that when a proper correction for gene set size is performed, the stability of single-gene sets is similar to the stability of composite feature sets. Based on these results there is currently no reason to prefer prognostic classifiers based on composite features over single-gene classifiers for predicting outcome in breast cancer.
KW  - modules
KW  - protein-interaction networks
KW  - expression signature
KW  - classification
KW  - set
KW  - metastasis
KW  - stability
KW  - survival
KW  - database
KW  - markers
Y1  - 2012
U6  - http://nbn-resolving.de/urn/resolver.pl?urn:nbn:de:bvb:20-opus-131323
VL  - 7
IS  - 4
ER  - 
TY  - JOUR
A1  - Merget, Benjamin
A1  - Koetschan, Christian
A1  - Hackl, Thomas
A1  - Förster, Frank
A1  - Dandekar, Thomas
A1  - Müller, Tobias
A1  - Schultz, Jörg
A1  - Wolf, Matthias
T1  - The ITS2 Database
JF  - Journal of Visualized Experiments
N2  - The internal transcribed spacer 2 (ITS2) has been used as a phylogenetic marker for more than two decades. As ITS2 research mainly focused on the very variable ITS2 sequence, it confined this marker to low-level phylogenetics only. However, the combination of the ITS2 sequence and its highly conserved secondary structure improves the phylogenetic resolution and allows phylogenetic inference at multiple taxonomic ranks, including species delimitation.

The ITS2 Database presents an exhaustive dataset of internal transcribed spacer 2 sequences from NCBI GenBank, accurately reannotated. Following annotation by profile Hidden Markov Models (HMMs), the secondary structure of each sequence is predicted. First, it is tested whether a minimum-energy-based fold (direct fold) results in a correct four-helix conformation. If this is not the case, the structure is predicted by homology modeling. In homology modeling, an already known secondary structure is transferred to another ITS2 sequence that could not be folded correctly by direct folding.

The ITS2 Database is not only a database for storage and retrieval of ITS2 sequence-structures. It also provides several tools to process your own ITS2 sequences, including annotation, structural prediction, motif detection and BLAST search on the combined sequence-structure information. Moreover, it integrates trimmed versions of 4SALE and ProfDistS for multiple sequence-structure alignment calculation and Neighbor Joining tree reconstruction. Together they form a coherent analysis pipeline from an initial set of sequences to a phylogeny based on sequence and secondary structure.

In a nutshell, this workbench simplifies first phylogenetic analyses to only a few mouse-clicks, while additionally providing tools and data for comprehensive large-scale analyses.
KW  - homology modeling
KW  - molecular systematics
KW  - internal transcribed spacer 2
KW  - alignment
KW  - genetics
KW  - secondary structure
KW  - ribosomal RNA
KW  - phylogenetic tree
KW  - phylogeny
Y1  - 2012
U6  - http://nbn-resolving.de/urn/resolver.pl?urn:nbn:de:bvb:20-opus-124600
VL  - 61
IS  - e3806
ER  -