TY  - JOUR
A1  - Merget, Benjamin
A1  - Koetschan, Christian
A1  - Hackl, Thomas
A1  - Förster, Frank
A1  - Dandekar, Thomas
A1  - Müller, Tobias
A1  - Schultz, Jörg
A1  - Wolf, Matthias
T1  - The ITS2 Database
JF  - Journal of Visual Expression
N2  - The internal transcribed spacer 2 (ITS2) has been used as a phylogenetic marker for more than two decades. As ITS2 research mainly focused on the very variable ITS2 sequence, it confined this marker to low-level phylogenetics only. However, the combination of the ITS2 sequence and its highly conserved secondary structure improves the phylogenetic resolution and allows phylogenetic inference at multiple taxonomic ranks, including species delimitation. The ITS2 Database presents an exhaustive dataset of accurately reannotated internal transcribed spacer 2 sequences from NCBI GenBank. Following an annotation by profile Hidden Markov Models (HMMs), the secondary structure of each sequence is predicted. First, it is tested whether a minimum-energy-based fold (direct fold) results in a correct, four-helix conformation. If this is not the case, the structure is predicted by homology modeling. In homology modeling, an already known secondary structure is transferred to another ITS2 sequence that did not fold correctly in a direct fold. The ITS2 Database is not only a database for storage and retrieval of ITS2 sequence-structures. It also provides several tools to process your own ITS2 sequences, including annotation, structural prediction, motif detection and BLAST search on the combined sequence-structure information. Moreover, it integrates trimmed versions of 4SALE and ProfDistS for multiple sequence-structure alignment calculation and Neighbor Joining tree reconstruction. Together they form a coherent analysis pipeline from an initial set of sequences to a phylogeny based on sequence and secondary structure. In a nutshell, this workbench simplifies first phylogenetic analyses to only a few mouse-clicks, while additionally providing tools and data for comprehensive large-scale analyses.
KW  - homology modeling
KW  - molecular systematics
KW  - internal transcribed spacer 2
KW  - alignment
KW  - genetics
KW  - secondary structure
KW  - ribosomal RNA
KW  - phylogenetic tree
KW  - phylogeny
Y1  - 2012
U6  - http://nbn-resolving.de/urn/resolver.pl?urn:nbn:de:bvb:20-opus-124600
VL  - 61
IS  - e3806
ER  -

TY  - JOUR
A1  - Pawellek, Ruben
A1  - Krmar, Jovana
A1  - Leistner, Adrian
A1  - Djajić, Nevena
A1  - Otašević, Biljana
A1  - Protić, Ana
A1  - Holzgrabe, Ulrike
T1  - Charged aerosol detector response modeling for fatty acids based on experimental settings and molecular features: a machine learning approach
JF  - Journal of Cheminformatics
N2  - The charged aerosol detector (CAD) is the latest representative of aerosol-based detectors that generate a response independent of the analytes' chemical structure. This study was aimed at accurately predicting the CAD response of homologous fatty acids under varying experimental conditions. Fatty acids from C12 to C18 were used as model substances due to semivolatile characteristics that caused non-uniform CAD behaviour. Considering both experimental conditions and molecular descriptors, mixed quantitative structure-property relationship (QSPR) modeling was performed using Gradient Boosted Trees (GBT). The ensemble of 10 decision trees (learning rate set at 0.55, the maximal depth set at 5, and the sample rate set at 1.0) was able to explain approximately 99% (Q\(^2\): 0.987, RMSE: 0.051) of the observed variance in CAD responses. Validation using an external test compound confirmed the high predictive ability of the model established (R\(^2\): 0.990, RMSEP: 0.050). With respect to the intrinsic attribute selection strategy, GBT used almost all independent variables during model building. Finally, it attributed the highest importance to the power function value, the flow rate of the mobile phase, evaporation temperature, the content of the organic solvent in the mobile phase and the molecular descriptors such as molecular weight (MW), Radial Distribution Function-080/weighted by mass (RDF080m) and average coefficient of the last eigenvector from distance/detour matrix (Ve2_D/Dt). The identification of the factors most relevant to the CAD responsiveness has contributed to a better understanding of the underlying mechanisms of signal generation. An increased CAD response that was obtained for acetone as organic modifier demonstrated its potential to replace the more expensive and environmentally harmful acetonitrile.
KW  - High-performance liquid chromatography (HPLC)
KW  - Charged aerosol detector (CAD)
KW  - Gradient boosted trees (GBT)
KW  - Quantitative structure-property relationship modeling (QSPR)
KW  - Fatty acids
Y1  - 2021
U6  - http://nbn-resolving.de/urn/resolver.pl?urn:nbn:de:bvb:20-opus-261618
VL  - 13
IS  - 1
ER  -