TY - JOUR
A1 - Pook, Torsten
A1 - Freudenthal, Jan
A1 - Korte, Arthur
A1 - Simianer, Henner
T1 - Using Local Convolutional Neural Networks for Genomic Prediction
JF - Frontiers in Genetics
N2 - The prediction of breeding values and phenotypes is of central importance for both livestock and crop breeding. In this study, we analyze the use of artificial neural networks (ANN) and, in particular, local convolutional neural networks (LCNN) for genomic prediction, as a region-specific filter corresponds much better with our prior knowledge of the genetic architecture of traits than traditional convolutional neural networks. Model performances are evaluated on a simulated maize data panel (n = 10,000; p = 34,595) and real Arabidopsis data (n = 2,039; p = 180,000) for a variety of traits based on their predictive ability. The baseline LCNN, containing one local convolutional layer (kernel size: 10) and two fully connected layers with 64 nodes each, outperforms commonly proposed ANNs (multilayer perceptrons and convolutional neural networks) for essentially all considered traits. For traits with high heritability and a large training population, as present in the simulated data, LCNNs even outperform state-of-the-art methods like genomic best linear unbiased prediction (GBLUP), Bayesian models and extended GBLUP, indicated by an increase in predictive ability of up to 24%. However, for small training populations, these state-of-the-art methods outperform all considered ANNs. Nevertheless, the LCNN still outperforms all other considered ANNs by around 10%. Minor improvements to the tested baseline network architecture of the LCNN were obtained by increasing the kernel size and reducing the stride, whereas the number of subsequent fully connected layers and their node sizes had negligible impact.
Although gains in predictive ability were obtained for large-scale data sets by using LCNNs, the practical use of ANNs comes with additional problems, such as the need to genotype all considered individuals and the lack of heritability and reliability estimates. Furthermore, breeding values are additive by design, whereas ANN-based estimates are not. However, ANNs also come with new opportunities, as networks can easily be extended to account for additional inputs (omics, weather etc.) and outputs (multi-trait models), and computing time increases linearly with the number of individuals. With advances in high-throughput phenotyping and cheaper genotyping, ANNs can become a valid alternative for genomic prediction.
KW - phenotype prediction
KW - Keras
KW - genomic selection
KW - selection
KW - breeding
KW - machine learning
KW - deep learning
Y1 - 2020
U6 - http://nbn-resolving.de/urn/resolver.pl?urn:nbn:de:bvb:20-opus-216436
VL - 11
ER -
TY - THES
A1 - Freudenthal, Jan Alexander
T1 - Quantitative genetics from genome assemblies to neural network aided omics-based prediction of complex traits
T1 - Quantitative Genetik von Genomassemblierungen bis zur genomischen Vorhersage von phänotypischen Merkmalen mit Hilfe von künstlichen neuronalen Netzwerken
N2 - Quantitative genetics is the study of continuously distributed traits and their genetic components. Recent developments in DNA sequencing technologies and computational systems allow researchers to conduct large-scale in silico studies. However, going from raw DNA reads to genomic prediction of quantitative traits with the help of neural networks is a long and error-prone process. In the course of this thesis, many steps involved in this process will be assessed in depth. Chapter 2 will feature a study that compares the landscape of chloroplast genome assembly tools.
Chapter 3 will present software for performing genome-wide association studies using modern tools, which allow GWAS-Flow to outperform current state-of-the-art software packages. Chapter 4 will give an in-depth introduction to machine learning and the nature of quantitative traits, combine the two in genomic prediction with artificial neural networks, and compare the results with those of algorithms based on linear mixed models. Finally, in Chapter 5 the results from the previous chapters are summarized and used to elucidate the complex nature of studies concerning quantitative genetics.
N2 - Quantitative genetics deals with continuously distributed traits and their genetic components. In recent years there have been numerous developments in computing and genomics, in particular in DNA sequencing, which allow researchers to conduct large-scale in silico studies. However, getting from raw sequence data to genomic prediction with the help of neural networks is a complex process. In the course of the present studies, many of the steps involved in this process are examined. Chapter 2 compares a variety of tools for assembling chloroplast genomes. Chapter 3 presents newly developed software for genome-wide association mapping that is superior to existing programs. Chapter 4 introduces machine learning and the genetic components of quantitative traits and brings them together in the context of genomic prediction. Finally, Chapter 5 discusses the preceding results in the overall context of quantitative genetics.
KW - Genetics
KW - GWAS
KW - Genomic Selection
KW - Quantitative Genetics
Y1 - 2020
U6 - http://nbn-resolving.de/urn/resolver.pl?urn:nbn:de:bvb:20-opus-199429
ER -