The human genome has been available in sequenced form since 2001. Most proteins have now been characterized, and every day more bioinformatic predictions are verified experimentally. A project is underway to sequence one thousand humans. Still, little is known about the evolution of the human proteome itself. Domains and their combinations have been analysed in detail, but not all human domain architectures at once. For the first time, large, high-quality datasets of human protein-protein interactions and complexes are available, which allow us to characterize the human proteome with unmatched accuracy. Advanced clustering algorithms and computing power enable us to gain new information about protein interactions without touching a pipette.
In this work, the human proteome is analysed at three different levels. First, the origin of the different types of proteins is analysed based on their domain architectures. The second part focuses on protein-protein interactions. Finally, in the third part, proteins are clustered based on their interactions and non-interactions.
Most proteins are built of domains, and their function is the sum of their domain functions. Proteins that share the same domain architecture, that is, the same linear order of domains, are homologues and should have originated from one common ancestral protein. This ancestor was calculated for roughly 750,000 proteins from 1313 species. The relations between the species are based on the NCBI Taxonomy and additional molecular data. The resulting data set of 5817 domains and 32,868 domain architectures was used to estimate the origin of these proteins based on their architectures. Only a small fraction of the new domain architectures turned out to be composed of domains that arose at the same taxon.
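One plausible reading of "calculating the ancestor" of a domain architecture is to take the lowest common ancestor, in the species taxonomy, of all species in which that architecture occurs. The sketch below illustrates that idea only; the tiny taxonomy, the architectures and the function names are made up for illustration and are not the thesis data or its actual procedure, which also drew on additional molecular data. It ignores gene loss and horizontal transfer.

```python
# Toy taxonomy as child -> parent (illustrative subset, not the thesis data).
PARENT = {
    "cellular organisms": None,
    "Eukaryota": "cellular organisms",
    "Bacteria": "cellular organisms",
    "Metazoa": "Eukaryota",
    "Fungi": "Eukaryota",
    "Homo sapiens": "Metazoa",
    "Caenorhabditis elegans": "Metazoa",
    "Saccharomyces cerevisiae": "Fungi",
    "Escherichia coli": "Bacteria",
}


def lineage(taxon):
    """Return the path from a taxon up to the root, starting with the taxon."""
    path = []
    while taxon is not None:
        path.append(taxon)
        taxon = PARENT[taxon]
    return path


def taxon_of_origin(species):
    """Lowest common ancestor of all species that carry one architecture."""
    species = list(species)
    common = set(lineage(species[0]))
    for name in species[1:]:
        common &= set(lineage(name))
    # Walking upwards from any member, the first shared node is the deepest one.
    for node in lineage(species[0]):
        if node in common:
            return node


# Hypothetical architectures (ordered domain tuples) and the species they occur in.
ARCHITECTURES = {
    ("SH3", "SH2", "Pkinase"): ["Homo sapiens", "Caenorhabditis elegans"],
    ("Pkinase",): ["Homo sapiens", "Saccharomyces cerevisiae"],
    ("ABC_tran",): ["Homo sapiens", "Escherichia coli"],
}

ORIGINS = {arch: taxon_of_origin(sp) for arch, sp in ARCHITECTURES.items()}

if __name__ == "__main__":
    for arch, origin in ORIGINS.items():
        print("-".join(arch), "->", origin)
```

The same function can be applied to single domains, which makes it possible to compare the taxon at which an architecture first appears with the taxa at which its constituent domains arose.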
It was also found that domain architectures increase in length and complexity in the course of evolution, and that organisms as different as worm and human have nearly the same number of proteins but differ in their number of distinct domain architectures.
The second part of this thesis focuses on protein-protein interactions and addresses the question of how newly evolved proteins form connections within the existing network. The network built of protein-protein interactions has been shown to be scale-free. Scale-free networks, like the internet, consist of a few hubs with many connections and many nodes with few connections. They are thought to arise by two mechanisms. First, newly emerged proteins interact with proteins already in the network. Second, according to the theory of preferential attachment, new proteins have a higher chance of interacting with proteins that are already rich in interactions. The Human Protein Reference Database provides a network for human based on in vivo interaction data. With the data obtained in part one, proteins were labelled with their taxon of origin based on their domain architectures. The ratio of interactions between proteins of the same taxon to all interactions was calculated, and it was higher than in the random model for nearly every taxon. On the other hand, no enrichment with respect to node degree was found for proteins that originated at the taxon of cellular organisms. The node degree is the number of links of a node. According to the theory of preferential attachment, the oldest nodes should have the most interactions, and newly arisen proteins should preferentially attach to them rather than to each other. Neither could be shown in this analysis, so preferential attachment cannot be the only explanation for the formation of the human protein interaction network.
Finally, in part three, proteins and all their interactions in the network are analysed.
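The two quantities used in the network analysis, node degree and the share of interactions that connect proteins of the same taxon of origin, can be illustrated on a toy network. The sketch below is an assumption-laden example: the edge list, the origin labels and the label-shuffling null model are invented for illustration, and the abstract does not say which random model the thesis actually used.

```python
import random
from collections import Counter

# Hypothetical interaction network and taxon-of-origin labels.
EDGES = [("A", "B"), ("A", "C"), ("A", "D"), ("B", "C"), ("D", "E"), ("E", "F")]
ORIGIN = {
    "A": "cellular organisms", "B": "cellular organisms",
    "C": "Eukaryota", "D": "Eukaryota",
    "E": "Metazoa", "F": "Metazoa",
}


def node_degrees(edges):
    """Node degree: the number of links attached to each node."""
    degrees = Counter()
    for u, v in edges:
        degrees[u] += 1
        degrees[v] += 1
    return degrees


def same_taxon_ratio(edges, origin):
    """Fraction of interactions whose two partners share a taxon of origin."""
    same = sum(1 for u, v in edges if origin[u] == origin[v])
    return same / len(edges)


def shuffled_ratio_mean(edges, origin, rounds=1000, seed=0):
    """Mean same-taxon ratio after randomly reassigning labels to proteins.

    Assumed null model: the network stays fixed, only the labels are permuted.
    """
    rng = random.Random(seed)
    proteins = list(origin)
    labels = [origin[p] for p in proteins]
    total = 0.0
    for _ in range(rounds):
        rng.shuffle(labels)
        total += same_taxon_ratio(edges, dict(zip(proteins, labels)))
    return total / rounds


DEGREES = node_degrees(EDGES)
OBSERVED = same_taxon_ratio(EDGES, ORIGIN)

if __name__ == "__main__":
    print("degrees:", dict(DEGREES))
    print("observed same-taxon ratio:", OBSERVED)
    print("mean ratio under shuffled labels:", shuffled_ratio_mean(EDGES, ORIGIN))
```

An observed ratio clearly above the shuffled mean would indicate that proteins of the same age interact with each other more often than expected by chance; comparing degrees between age classes addresses the question of whether the oldest proteins are the hubs.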
Protein networks can be divided into smaller, highly interacting parts that carry out specific functions. This can be done with high statistical significance, but the result does not necessarily reflect biological significance. Proteins were clustered based on their interactions and non-interactions with other proteins. A version with eleven clusters showed high Gene Ontology based ratings and clusters related to specific cell parts. One cluster consists of proteins that have very few interactions with each other but many with proteins of two other clusters. This first cluster is significantly enriched in transport proteins, and the two others are enriched in extracellular and cytoplasm/membrane-located proteins. The algorithm therefore seems well suited to reflect the biological importance behind functional modules. Although we are still far from understanding the origin of species, this work has contributed significantly to a better understanding of evolution at the protein level and has, in particular, shown the relation between protein domains and protein architectures and their preferences for binding partners within interaction networks.
Investigation of gene drive strategies as new intervention strategies for containing malaria
(2008)
In this work, we used bioinformatic methods to develop an innovative strategy for containing malaria. The genetic modification strategy comprises manipulations of both the most dangerous pathogen, Plasmodium falciparum, and the main vector, Anopheles gambiae. A number of new, concrete targets were identified in the genomes of both species. Previously described targets and approaches were also incorporated into the strategy or elaborated further. For the vector mosquitoes, the aim is to spread a genotype that is resistant to Plasmodium. On the one hand, efficient natural and artificial resistance genes are discussed; on the other hand, a known strategy for fixing natural resistance alleles in natural populations is improved. On the Plasmodium side, we extended an eradication approach already described by A. Burt (2003) with additional targets. For ethical and evolutionary reasons, however, we prefer an alternative strategy that aims to establish parasites with attenuated virulence. The attenuated genotype is described, among other things, by complex pathway remodelling (Löwe, Sauerborn, Schirmer, Dandekar, A refined genome engineering strategy against parasites and vectors, manuscript submitted to the journal "Genome Biology"). Because mutants can hardly prevail against wild-type organisms in nature, two drive systems are described that were developed to implement the genetic manipulation strategy. Patent applications were filed for both constructs (patent application U30010 DPMA and file number 102006029354.1, respectively). In addition to the German application, a PCT application was filed for one of the two constructs, which is intended to enable international patent protection in the future.
Calculations are presented that predict how the constructs tend to spread in natural populations. The description of the constructs is not limited to the primary application of this work (malaria) but also covers other fields of application, above all in medicine and molecular biology.
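The abstract does not describe how the spread of the constructs was calculated. As a generic illustration of this kind of calculation, the sketch below iterates a standard deterministic model of a homing-based drive allele under random mating; it is not the model used in the thesis, and the parameter names `homing`, `cost` and `dominance` are assumptions introduced here.

```python
def next_frequency(p, homing=0.9, cost=0.0, dominance=0.5):
    """Drive allele frequency in the next generation.

    Assumptions (illustrative, not from the thesis): random mating,
    infinite population, heterozygotes transmit the drive allele with
    probability (1 + homing) / 2, and fitness is 1 - cost for drive
    homozygotes and 1 - dominance * cost for heterozygotes.
    """
    q = 1.0 - p
    w_dd = 1.0 - cost
    w_dw = 1.0 - dominance * cost
    mean_fitness = p * p * w_dd + 2.0 * p * q * w_dw + q * q
    drive_gametes = p * p * w_dd + p * q * w_dw * (1.0 + homing)
    return drive_gametes / mean_fitness


def trajectory(p0, generations, **params):
    """Allele frequencies from generation 0 to `generations` inclusive."""
    freqs = [p0]
    for _ in range(generations):
        freqs.append(next_frequency(freqs[-1], **params))
    return freqs


if __name__ == "__main__":
    for generation, freq in enumerate(trajectory(0.01, 15, homing=0.9)):
        print(generation, round(freq, 4))
```

With `homing=0.0` and no fitness cost the model reduces to ordinary Mendelian inheritance and the frequency stays constant, which is the situation in which a mutant has no transmission advantage over the wild type. A positive homing rate lets the allele increase from a low starting frequency.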