The ITS2 Database
(2012)
The internal transcribed spacer 2 (ITS2) has been used as a phylogenetic marker for more than two decades. Because ITS2 research focused mainly on the highly variable ITS2 sequence, the marker was long confined to low-level phylogenetics. However, combining the ITS2 sequence with its highly conserved secondary structure improves the phylogenetic resolution and allows phylogenetic inference at multiple taxonomic ranks, including species delimitation.
The ITS2 Database presents an exhaustive dataset of accurately reannotated internal transcribed spacer 2 sequences from NCBI GenBank. Following annotation by profile Hidden Markov Models (HMMs), the secondary structure of each sequence is predicted. First, it is tested whether a minimum-energy-based fold (direct fold) results in a correct four-helix conformation. If this is not the case, the structure is predicted by homology modeling, in which an already known secondary structure is transferred to another ITS2 sequence that did not fold correctly in a direct fold.
The ITS2 Database is not only a database for storage and retrieval of ITS2 sequence-structures. It also provides several tools to process your own ITS2 sequences, including annotation, structural prediction, motif detection and BLAST search on the combined sequence-structure information. Moreover, it integrates trimmed versions of 4SALE and ProfDistS for multiple sequence-structure alignment calculation and Neighbor Joining tree reconstruction. Together they form a coherent analysis pipeline from an initial set of sequences to a phylogeny based on sequence and secondary structure.
In a nutshell, this workbench simplifies first phylogenetic analyses to only a few mouse-clicks, while additionally providing tools and data for comprehensive large-scale analyses.
The Venus flytrap, \textit{Dionaea muscipula}, with its carnivorous lifestyle and its highly
specialized snap-traps has fascinated biologists since the days of Charles Darwin. The
goal of the \textit{D. muscipula} genome project is to gain comprehensive insights into the
genomic landscape of this remarkable plant.
The genome of the diploid Venus flytrap, with an estimated size between 2.6 Gbp and
3.0 Gbp, is comparatively large and comprises more than 70 % repetitive regions.
Sequencing and assembly of genomes of this scale are challenging even with
state-of-the-art technology and software. Initial sequencing and assembly of the genome
was performed by the BGI (Beijing Genomics Institute) in 2011, resulting in a 3.7 Gbp
draft assembly. I started my work with a thorough assessment of the delivered assembly
and data. My analysis showed that the BGI assembly is highly fragmented and
at the same time artificially inflated due to overassembly of repetitive sequences.
Furthermore, it only comprises about one third of the expected genes in full length,
rendering it inadequate for downstream analysis.
In the following I sought to optimize the sequencing and assembly strategy to obtain
an assembly of higher completeness and contiguity by improving data quality and
assembly procedure and by developing tailored bioinformatics tools. Issues with
technical biases and high levels of heterogeneity in the original data set were solved
by sequencing additional short read libraries from high quality non-polymorphic DNA
samples. To address contiguity and heterozygosity, I examined numerous alternative
assembly software packages and strategies and eventually identified ALLPATHS-LG
as the most suitable program for assembling the data at hand. Moreover, by utilizing
digital normalization to reduce repetitive reads, I was able to substantially reduce
computational demands while at the same time significantly increasing contiguity of
the assembly.
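The idea behind digital normalization can be illustrated with a minimal sketch. It follows the commonly described median k-mer abundance scheme: a read is kept only while the median count of its k-mers, among reads kept so far, is below a cutoff. The function names, the tiny `k`, and the `cutoff` value are illustrative assumptions and not the settings or software used in the thesis; real tools also use memory-efficient counting structures and canonical (strand-independent) k-mers.

```python
from collections import Counter
from statistics import median


def kmers(seq, k):
    """Yield all overlapping k-mers of a sequence."""
    for i in range(len(seq) - k + 1):
        yield seq[i:i + k]


def digital_normalization(reads, k=5, cutoff=3):
    """Keep a read only if the median count of its k-mers is below cutoff.

    Counts are taken over the reads accepted so far, so redundant reads
    from highly covered (e.g. repetitive) regions are discarded once the
    region has reached the target coverage. k and cutoff are toy values.
    """
    counts = Counter()
    kept = []
    for read in reads:
        read_kmers = list(kmers(read, k))
        if not read_kmers:
            continue  # read is shorter than k, nothing to evaluate
        if median(counts[km] for km in read_kmers) < cutoff:
            kept.append(read)
            counts.update(read_kmers)
    return kept


if __name__ == "__main__":
    reads = ["ACGTTGCAAG"] * 10 + ["TTTTTGGGGG"]
    print(len(digital_normalization(reads)))
```

Because reads are evaluated in a single streaming pass, memory and run time of the subsequent assembly scale with the retained reads rather than with the full data set.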
To improve repeat resolution and scaffolding, I started to explore the novel PacBio
long read sequencing technology. Raw PacBio reads exhibit high error rates of 15 %,
impeding their use for assembly. To overcome this issue, I developed the PacBio
hybrid correction pipeline proovread (Hackl et al., 2014). proovread uses high
coverage Illumina read data in an iterative mapping-based consensus procedure to
identify and remove errors present in raw PacBio reads. In terms of sensitivity and
accuracy, proovread outperforms existing software. In contrast to other correction
programs, which are incapable of handling data sets of the size of the D. muscipula
project, proovread’s flexible design allows for the efficient distribution of workload
on high-performance computing clusters, thus enabling the correction of the Venus
flytrap PacBio data set.
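The principle of a mapping-based consensus can be sketched in a strongly simplified form. The example below is not proovread's algorithm: it assumes that short reads have already been placed on the long read at known offsets, handles substitutions only (raw PacBio errors are dominated by insertions and deletions, which require gapped alignments), and performs a single pass rather than several iterations. The function name and the `min_support` parameter are hypothetical.

```python
from collections import Counter


def consensus_correct(long_read, alignments, min_support=2):
    """Replace each base by the majority base of the short reads covering it.

    alignments: list of (offset, short_read) pairs, ungapped placements of
    short reads on the long read (assumed to come from a prior mapping step).
    min_support: minimum number of agreeing short reads needed to overrule
    the original base (illustrative value).
    """
    columns = [Counter() for _ in long_read]
    for offset, short_read in alignments:
        for i, base in enumerate(short_read):
            pos = offset + i
            if 0 <= pos < len(long_read):
                columns[pos][base] += 1

    corrected = []
    for original, column in zip(long_read, columns):
        if column:
            base, support = column.most_common(1)[0]
            corrected.append(base if support >= min_support else original)
        else:
            corrected.append(original)  # no short-read coverage here
    return "".join(corrected)


if __name__ == "__main__":
    noisy = "ACGTAAGT"  # position 5 carries a substitution error
    placed = [(2, "GTACG"), (3, "TACGT"), (0, "ACGTAC")]
    print(consensus_correct(noisy, placed))
```

In an iterative scheme, the corrected read would serve as the reference for the next round of mapping, so that regions that were initially too erroneous to attract short reads can be reached in later passes.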
Besides the assembly process itself, the assessment of large de novo draft
assemblies, particularly with respect to coverage by available sequencing data, is
also difficult. While typical evaluation procedures rely on computationally intensive
mapping approaches, I developed and implemented a set of tools that utilize k-mer
coverage and derived values to efficiently compute coverage landscapes of large-scale
assemblies and, in addition, allow for automated visualization of the obtained
information in comprehensive plots.
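How k-mer counts can stand in for read mapping may be illustrated with a small sketch: k-mers are counted once in the reads, and the coverage at each assembly position is then looked up as the count of the k-mer starting there. This is a toy illustration under simplifying assumptions (exact in-memory counting, no reverse complements, no smoothing), not the tools developed in the thesis; all function names are hypothetical.

```python
from collections import Counter


def count_kmers(reads, k):
    """Count every overlapping k-mer across all reads."""
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts


def kmer_coverage_profile(contig, counts, k):
    """Return, for each contig position, the read count of the k-mer starting there.

    Positions whose k-mer never occurs in the reads get a coverage of 0,
    which flags regions of the assembly unsupported by the read data.
    """
    return [counts[contig[i:i + k]] for i in range(len(contig) - k + 1)]


if __name__ == "__main__":
    reads = ["ACGTAC", "CGTACG", "GTACGT"]
    counts = count_kmers(reads, k=4)
    print(kmer_coverage_profile("ACGTACGT", counts, k=4))
```

Since each position requires only a table lookup, the profile is computed in time linear in the assembly length, and unusually high values point to collapsed repeats while values near half the expected coverage hint at separately assembled heterozygous regions.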
Using the developed tools to analyze preliminary assemblies and by combining my
findings regarding optimizations of the assembly process, I was ultimately able to
generate a high quality draft assembly for D. muscipula. I further refined the assembly
by removal of redundant contigs resulting from separate assembly of heterozygous
regions and additional scaffolding and gap closing using corrected PacBio data. The
final draft assembly comprises $86 \times 10^{3}$ scaffolds and has a total size of 1.45 Gbp.
The difference from the estimated genome size is well explained by collapsed repeats.
At the same time, the assembly exhibits a high fraction of full-length gene models,
corroborating the interpretation that the obtained draft assembly provides a complete
and comprehensive reference for further exploration of the fascinating biology of the
Venus flytrap.
Due to the lack of data on asymptomatic SARS-CoV-2-positive persons in healthcare institutions, they represent an inestimable risk. Therefore, the aim of the current study was to evaluate the first 1,000,000 reported screening tests of asymptomatic staff, patients, residents, and visitors in hospitals and long-term care (LTC) facilities in the State of Bavaria over a period of seven months. Data were drawn from the online database BayCoRei (Bavarian Corona Screening Tests), established in July 2020. Descriptive analyses were performed, describing the temporal pattern of persons that tested positive for SARS-CoV-2 by real-time polymerase chain reaction (RT-PCR) or antigen tests, stratified by facility. Until 15 March 2021, this database had collected 1,038,146 test results of asymptomatic subjects in healthcare facilities (382,240 by RT-PCR and 655,906 by antigen tests). Of the RT-PCR tests, 2.2% (n = 8380) were positive: 3.0% in LTC facilities, 2.2% in hospitals, and 1.2% in rehabilitation institutions. Of the antigen tests, 0.4% (n = 2327) were positive: 0.5% in LTC facilities and 0.3% in both hospitals and rehabilitation institutions. In LTC facilities and hospitals, infection surveillance using RT-PCR tests, or the faster and less expensive but less sensitive antigen tests, could facilitate the long-term management of the healthcare workforce, patients, and residents.
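The overall positivity rates follow directly from the reported counts, as the short check below shows. It uses only the totals given in the abstract; the variable and function names are illustrative, and the facility-level rates cannot be recomputed because the per-facility counts are not given here.

```python
# Test counts and positive results as reported in the abstract.
rt_pcr_tests, rt_pcr_positive = 382_240, 8_380
antigen_tests, antigen_positive = 655_906, 2_327


def positivity_rate(positive, total):
    """Return the share of positive tests in percent."""
    return 100.0 * positive / total


if __name__ == "__main__":
    print("total tests:", rt_pcr_tests + antigen_tests)
    print("RT-PCR positivity (%):",
          round(positivity_rate(rt_pcr_positive, rt_pcr_tests), 1))
    print("antigen positivity (%):",
          round(positivity_rate(antigen_positive, antigen_tests), 1))
```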