dc.contributor.advisor | Guyot, Romain | |
dc.contributor.advisor | Isaza Echeverri, Gustavo Adolfo | |
dc.contributor.author | Orozco Arias, Simon | |
dc.date.accessioned | 2022-05-03T22:58:23Z | |
dc.date.available | 2022-05-03T22:58:23Z | |
dc.date.issued | 2022-05-04 | |
dc.identifier.uri | https://repositorio.ucaldas.edu.co/handle/ucaldas/17590 | |
dc.description | Ilustraciones | spa |
dc.description.abstract | spa:Esta tesis doctoral se ha centrado en la aplicación de técnicas de machine learning y deep learning para el estudio de los LTR retrotransposones, con el objetivo de mejorar la comprensión a nivel genómico de plantas de interés agroindustrial como el arroz, el maíz, el café y la caña de azúcar, y que podría aplicarse a cualquier otro genoma vegetal u otros organismos.
Investigaciones recientes han demostrado el impacto de los elementos transponibles en el fenotipo de cultivos de interés, como el color de los granos de maíz, el color y el sabor de las naranjas, el color de la piel de las patatas, el tamaño y la forma de los tomates, y el color y el sabor de las uvas, que se producen por la inserción de estos elementos cerca o dentro de los genes. Aunque existen técnicas y herramientas bioinformáticas para la detección y clasificación de los elementos transponibles, aún no es posible obtener resultados fiables, debido a la gran diversidad de sus estructuras, patrones de replicación y ciclos de vida. Además, estos componentes genómicos tienen características que hacen muy complejo su estudio, como la especificidad de las especies, la alta diversidad a nivel de nucleótidos (baja homología entre secuencias), las largas regiones no codificantes y su naturaleza repetitiva. Por ello, nuevas técnicas como el machine learning y el deep learning podrían mejorar el rendimiento tanto en el tiempo de ejecución como en la precisión de los resultados.
En el desarrollo de este proyecto de investigación se utilizaron los algoritmos de aprendizaje automático más conocidos, así como algunas arquitecturas de redes neuronales profundas que se han generalizado en la comunidad científica en los últimos años. Se extrapolaron los métodos de extracción y selección de características, las técnicas de preprocesamiento, los algoritmos y las arquitecturas que se han utilizado con éxito en conjuntos de datos similares a los elementos transponibles. Asimismo, esta tesis doctoral tendrá un impacto positivo en la comunidad científica en los campos de la bioinformática, la genómica y la agricultura, ya que el software desarrollado aquí y su uso en otros genomas podría servir de base para futuras investigaciones relacionadas con la mejora genética, la comprensión de la evolución de las especies y la relación entre los organismos y el medio ambiente. Además, se generó conocimiento sobre el uso de nuevas técnicas en datos genómicos (especialmente LTR retrotransposones), como la influencia de la naturaleza de los datos en la precisión de los resultados, mejores técnicas de preprocesamiento (selección y extracción de características, reducción de la dimensionalidad, transformación de datos, entre otras), mejores hiperparámetros y métricas que se ajusten mejor a dichos elementos.
Finalmente, esta propuesta de investigación condujo a la creación de un software bioinformático funcional que, gracias a las técnicas seleccionadas, permite la detección y clasificación de LTR retrotransposones en plantas de interés. Este software está disponible para la comunidad científica y puede ser utilizado en el contexto de varios proyectos masivos de secuenciación y ensamblaje de genomas, como el proyecto de los 3.000 genomas del arroz, la secuenciación de 10.000 genomas de plantas o el proyecto de secuenciación de 1,5 millones de especies eucariotas. Todos los códigos y scripts desarrollados durante este proyecto están disponibles en https://github.com/simonorozcoarias/MLinTEs. | spa |
dc.description.abstract | eng:This PhD thesis focused on the application of machine learning and deep learning techniques to the study of LTR retrotransposons, with the aim of improving the genomic-level understanding of plants of agro-industrial interest such as rice, maize, coffee and sugar cane; the approach could be applied to any other plant genome or to other organisms. Recent research has demonstrated the impact of transposable elements on the phenotype of crops of interest, such as the colour of maize kernels, the colour and flavour of oranges, the skin colour of potatoes, the size and shape of tomatoes, and the colour and flavour of grapes, all of which result from the insertion of these elements near or into genes. Although bioinformatics techniques and tools exist for the detection and classification of transposable elements, it is not yet possible to obtain reliable results, due to the great diversity of their structures, replication patterns and life cycles. In addition, these genomic components have characteristics that make their study very complex, such as species specificity, high diversity at the nucleotide level (low homology between sequences), long non-coding regions and their repetitive nature. Therefore, new techniques such as machine learning and deep learning could improve performance in terms of both execution time and accuracy of results. In the development of this research project, the most well-known machine learning algorithms were used, as well as some deep neural network architectures that have become widespread in the scientific community in recent years. Feature extraction and selection methods, pre-processing techniques, algorithms and architectures that have been successfully used on datasets similar to transposable elements were extrapolated. This PhD thesis will also have a positive impact on the scientific community in the fields of bioinformatics, genomics and agriculture, as the software developed here and its use on other genomes could serve as a basis for future research related to genetic improvement, understanding the evolution of species and the relationship between organisms and the environment. In addition, knowledge was generated on the use of new techniques on genomic data (especially LTR retrotransposons), such as the influence of the nature of the data on the accuracy of the results, better pre-processing techniques (feature selection and extraction, dimensionality reduction, data transformation, among others), and hyper-parameters and metrics that better fit such elements. Finally, this research proposal led to the creation of functional bioinformatics software that, thanks to the selected techniques, allows the detection and classification of LTR retrotransposons in plants of interest. This software is available to the scientific community and can be used in the context of several massive genome sequencing and assembly projects, such as the 3,000 rice genomes project, the sequencing of 10,000 plant genomes or the 1.5 million eukaryotic species sequencing project. All the code and scripts developed during this project are available at https://github.com/simonorozcoarias/MLinTEs. | eng |
dc.description.tableofcontents | Contents / Acknowledgements / 1. Introduction / 1.1. Background / 1.2. Research problem / 1.3. Justification / 1.4. Research questions / 1.5. Research hypothesis / 1.6. Organization of this Document / 2. Thesis Objectives / 2.1. General Objective / 2.2. Specific Objectives / 3. The State of the Art / 3.1. Context about retrotransposons and their characteristics / 3.2. Context about machine learning models in TEs / 3.3. Conclusions and perspectives / 4. DNA coding schemes and measuring metrics / 4.1. Context / 4.2. Conclusions and perspectives / 5. InpactorDB / 5.1. Context / 5.2. Conclusions and perspectives / 6. K-mers-based-methods / 6.1. Context / 6.2. Conclusions and perspectives / 7. Neural Network to curate LTR retrotransposons libraries / 7.1. Context / 7.2. Conclusions and perspectives / 8. Inpactor2: A one-shot software based on deep learning / 8.1. Context / 8.2. Conclusions and perspectives / 9. Application of a DL-based tool to the identification and classification of LTR retrotransposons in the genus Coffea / 9.1. Abstract / 9.2. Introduction / 9.3. Materials and methods / 9.3.1. Coffea sequencing resources available / 9.3.2. Creation of coffee dataset for re-training Inpactor2 / 9.3.3. Library of LTR-RTs in Coffea genus and its annotation / 9.3.4. Data analysis and visualization / 9.3.5. Raw Illumina reads mapping results / 9.4. Results / 9.4.1. Re-training of the model for the Coffea genus / 9.4.2. Construction of a LTR-RT library for the Coffea genus / 9.4.3. Utilization of a Coffea LTR-RT library for the annotation of assemblies in the Coffea genus / 9.4.4. Relationship between the LTR-RT proportion and the genome size assembly / 9.5. Discussion / 9.6. Conclusion / Appendices / A. Appendix A / B. Appendix B / 10. Discussions, conclusions, and contributions / 10.1. Discussions / 10.1.1. DNA coding schemes and available datasets / 10.1.2. The detection problem / 10.1.3. Integration of ML models in a one-shot tool / 10.2. Conclusions / 10.3. Contributions / Bibliography | eng |
dc.format.mimetype | application/pdf | spa |
dc.language.iso | eng | spa |
dc.language.iso | spa | spa |
dc.title | A computational architecture to identify and classify LTR retrotransposons in plant genomes | eng |
dc.type | Trabajo de grado - Doctorado | spa |
dc.contributor.researchgroup | GITIR Grupo de Investigación en Tecnologías de la Información y Redes (Categoría A) | spa |
dc.description.degreelevel | Doctorado | spa |
dc.identifier.instname | Universidad de Caldas | spa |
dc.identifier.reponame | Repositorio Institucional Universidad de Caldas | spa |
dc.identifier.repourl | https://repositorio.ucaldas.edu.co/ | spa |
dc.publisher.faculty | Facultad de Ingeniería | spa |
dc.publisher.place | Manizales | spa |
dc.relation.references | F. Choulet, A. Alberti, S. Theil, N. Glover, V. Barbe, J. Daron, L. Pingault, P. Sourdille, A. Couloux, E. Paux, and Others, “Structural and functional partitioning of bread wheat chromosome 3B,” Science, vol. 345, no. 6194, p. 1249721, 2014. | spa |
dc.relation.references | E. Ibarra-Laclette and E. Lyons, “Architecture and evolution of a minute plant genome,” Nature, vol. 498, no. 7452, pp. 1–6, 2013. | spa |
dc.relation.references | M. I. Tenaillon, J. D. Hollister, and B. S. Gaut, “A triptych of the evolution of plant transposable elements,” Trends in Plant Science, vol. 15, no. 8, pp. 471–478, 2010. | spa |
dc.relation.references | I. Makarevitch, A. J. Waters, P. T. West, M. Stitzer, C. N. Hirsch, J. Ross-Ibarra, and N. M. Springer, “Transposable Elements Contribute to Activation of Maize Genes in Response to Abiotic Stress,” PLoS Genetics, vol. 11, no. 1, 2015. | spa |
dc.relation.references | E. Todorovska, “Retrotransposons and their Role in Plant-Genome Evolution,” Biotechnology & Biotechnological Equipment, vol. 2818, no. August, pp. 294–305, 2014. | spa |
dc.relation.references | E. Casacuberta and J. González, “The impact of transposable elements in environmental adaptation,” Molecular Ecology, vol. 22, no. 6, pp. 1503–1517, 2013. | spa |
dc.relation.references | G. Bonchev and C. Parisod, “Transposable elements and microevolutionary changes in natural populations,” MOLECULAR ECOLOGY RESOURCES, vol. 13, pp. 765–775, sep 2013. | spa |
dc.relation.references | S.-F. Li, T. Su, G.-Q. Cheng, B.-X. Wang, X. Li, C.-L. Deng, and W.-J. Gao, “Chromosome Evolution in Connection with Repetitive Sequences and Epigenetics in Plants,” GENES, vol. 8, oct 2017. | spa |
dc.relation.references | S. Ou, J. Chen, and N. Jiang, “Assessing genome assembly quality using the LTR Assembly Index (LAI),” Nucleic Acids Research, no. August, pp. 1–11, 2018. | spa |
dc.relation.references | D. Hermann, F. Egue, E. Tastard, D.-H. Nguyen, N. Casse, A. Caruso, S. Hiard, J. Marchand, B. Chenais, A. Morant-Manceau, and J. D. Rouault, “An introduction to the vast world of transposable elements - what about the diatoms?,” DIATOM RESEARCH, vol. 29, pp. 91–104, jan 2014 | spa |
dc.relation.references | F. Mascagni, A. Vangelisti, T. Giordani, A. Cavallini, and L. Natali, “Specific LTR-Retrotransposons Show Copy Number Variations between Wild and Cultivated Sunflowers,” Genes, vol. 9, p. 433, aug 2018. | spa |
dc.relation.references | T. Wicker, F. Sabot, A. Hua-Van, J. L. Bennetzen, P. Capy, B. Chalhoub, A. Flavell, P. Leroy, M. Morgante, O. Panaud, E. Paux, P. SanMiguel, and A. H. Schulman, “A unified classification system for eukaryotic transposable elements,” Nature Reviews Genetics, vol. 8, no. 12, pp. 973–982, 2007. | spa |
dc.relation.references | P. S. Schnable, D. Ware, R. S. Fulton, J. C. Stein, F. Wei, S. Pasternak, C. Liang, J. Zhang, L. Fulton, T. A. Graves, P. Minx, A. D. Reily, L. Courtney, S. S. Kruchowski, C. Tomlinson, C. Strong, K. Delehaunty, C. Fronick, B. Courtney, S. M. Rock, E. Belter, F. Du, K. Kim, R. M. Abbott, M. Cotton, A. Levy, P. Marchetto, K. Ochoa, S. M. Jackson, B. Gillam, W. Chen, L. Yan, J. Higginbotham, M. Cardenas, J. Waligorski, E. Applebaum, L. Phelps, J. Falcone, K. Kanchi, T. Thane, A. Scimone, N. Thane, J. Henke, T. Wang, J. Ruppert, N. Shah, K. Rotter, J. Hodges, E. Ingenthron, M. Cordes, S. Kohlberg, J. Sgro, B. Delgado, K. Mead, A. Chinwalla, S. Leonard, K. Crouse, K. Collura, D. Kudrna, J. Currie, R. He, A. Angelova, S. Rajasekar, T. Mueller, R. Lomeli, G. Scara, A. Ko, K. Delaney, M. Wissotski, G. Lopez, D. Campos, M. Braidotti, E. Ashley, W. Golser, H. Kim, S. Lee, J. Lin, Z. Dujmic, W. Kim, J. Talag, A. Zuccolo, C. Fan, A. Sebastian, M. Kramer, L. Spiegel, L. Nascimento, T. Zutavern, B. Miller, C. Ambroise, S. Muller, W. Spooner, A. Narechania, L. Ren, S. Wei, S. Kumari, B. Faga, M. J. Levy, L. McMahan, P. Van Buren, M. W. Vaughn, K. Ying, C.-T. Yeh, S. J. Emrich, Y. Jia, A. Kalyanaraman, A.-P. Hsia, W. B. Barbazuk, R. S. Baucom, T. P. Brutnell, N. C. Carpita, C. Chaparro, J.-M. Chia, J.-M. Deragon, J. C. Estill, Y. Fu, J. A. Jeddeloh, Y. Han, H. Lee, P. Li, D. R. Lisch, S. Liu, Z. Liu, D. H. Nagel, M. C. McCann, P. SanMiguel, A. M. Myers, D. Nettleton, J. Nguyen, B. W. Penning, L. Ponnala, K. L. Schneider, D. C. Schwartz, A. Sharma, C. Soderlund, N. M. Springer, Q. Sun, H. Wang, M. Waterman, R. Westerman, T. K. Wolfgruber, L. Yang, Y. Yu, L. Zhang, S. Zhou, Q. Zhu, J. L. Bennetzen, R. K. Dawe, J. Jiang, N. Jiang, G. G. Presting, S. R. Wessler, S. Aluru, R. A. Martienssen, S. W. Clifton, W. R. McCombie, R. A. Wing, and R. K. Wilson, “The B73 Maize Genome: Complexity, Diversity, and Dynamics,” Science, vol. 326, no. 5956, pp. 1112–1115, 2009. | spa |
dc.relation.references | A. H. Paterson, J. E. Bowers, R. Bruggmann, I. Dubchak, J. Grimwood, H. Gundlach, G. Haberer, U. Hellsten, T. Mitros, A. Poliakov, J. Schmutz, M. Spannagl, H. Tang, X. Wang, T. Wicker, A. K. Bharti, J. Chapman, F. A. Feltus, U. Gowik, I. V. Grigoriev, E. Lyons, C. A. Maher, M. Martis, A. Narechania, R. P. Otillar, B. W. Penning, A. A. Salamov, Y. Wang, L. Zhang, N. C. Carpita, M. Freeling, A. R. Gingle, C. T. Hash, B. Keller, P. Klein, S. Kresovich, M. C. McCann, R. Ming, D. G. Peterson, M. ur Rahman, D. Ware, P. Westhoff, K. F. X. Mayer, J. Messing, and D. S. Rokhsar, “The Sorghum bicolor genome and the diversification of grasses,” Nature, vol. 457, no. 7229, pp. 551–556, 2009. | spa |
dc.relation.references | F. Denoeud, L. Carretero-Paulet, A. Dereeper, G. Droc, R. Guyot, M. Pietrella, C. Zheng, A. Alberti, F. Anthony, G. Aprea, J.-M. Aury, P. Bento, M. Bernard, S. Bocs, C. Campa, A. Cenci, M.-C. Combes, D. Crouzillat, C. Da Silva, L. Daddiego, F. De Bellis, S. Dussert, O. Garsmeur, T. Gayraud, V. Guignon, K. Jahn, V. Jamilloux, T. Joet, K. Labadie, T. Lan, J. Leclercq, M. Lepelley, T. Leroy, L.-T. Li, P. Librado, L. Lopez, A. Muñoz, B. Noel, A. Pallavicini, G. Perrotta, V. Poncet, D. Pot, Priyono, M. Rigoreau, M. Rouard, J. Rozas, C. Tranchant-Dubreuil, R. VanBuren, Q. Zhang, A. C. Andrade, X. Argout, B. Bertrand, A. de Kochko, G. Graziosi, R. J. Henry, Jayarama, R. Ming, C. Nagai, S. Rounsley, D. Sankoff, G. Giuliano, V. A. Albert, P. Wincker, P. Lashermes, and Others, “The coffee genome provides insight into the convergent evolution of caffeine biosynthesis,” Science, vol. 345, no. 6201, pp. 1181–4, 2014. | spa |
dc.relation.references | R. de Castro Nunes, S. Orozco-Arias, D. Crouzillat, L. A. Mueller, S. R. Strickler, P. Descombes, C. Fournier, D. Moine, A. de Kochko, P. M. Yuyama, A. L. L. Vanzela, and R. Guyot, “Structure and Distribution of Centromeric Retrotransposons at Diploid and Allotetraploid Coffea Centromeric and Pericentromeric Regions,” Frontiers in Plant Science, 2018. | spa |
dc.relation.references | C. M. Vicient and J. M. Casacuberta, “Impact of transposable elements on polyploid plant genomes,” ANNALS OF BOTANY, vol. 120, pp. 195–207, aug 2017. | spa |
dc.relation.references | P. This, T. Lacombe, M. Cadle-Davidson, and C. L. Owens, “Wine grape (Vitis vinifera L.) color associates with allelic variation in the domestication gene VvmybA1,” Theoretical and Applied Genetics, vol. 114, no. 4, pp. 723–730, 2007. | spa |
dc.relation.references | H. Xiao, N. Jiang, E. Schaffner, E. J. Stockinger, and E. Van Der Knaap, “A retrotransposon-mediated gene duplication underlies morphological variation of tomato fruit,” Science, vol. 319, no. 5869, pp. 1527–1530, 2008. | spa |
dc.relation.references | M. Momose, Y. Abe, and Y. Ozeki, “Miniature inverted-repeat transposable elements of stowaway are active in potato,” Genetics, vol. 186, no. 1, pp. 59–66, 2010. | spa |
dc.relation.references | E. Butelli, C. Licciardello, Y. Zhang, J. Liu, S. Mackay, P. Bailey, G. Reforgiato-Recupero, and C. Martin, “Retrotransposons control fruit-specific, cold-dependent accumulation of anthocyanins in blood oranges,” The Plant Cell, vol. 24, pp. 1242–55, mar 2012. | spa |
dc.relation.references | L. Wei and X. Cao, “The effect of transposable elements on phenotypic variation: insights from plants to humans,” Science China Life Sciences, vol. 59, pp. 24–37, jan 2016. | spa |
dc.relation.references | C. Vitte, M.-A. Fustier, K. Alix, and M. I. Tenaillon, “The bright side of transposons in crop evolution,” Briefings in Functional Genomics, vol. 13, no. 4, pp. 276–295, 2014. | spa |
dc.relation.references | P. Baduel and V. Colot, “The epiallelic potential of transposable elements and its evolutionary significance in plants,” Philosophical Transactions of the Royal Society B, vol. 376, no. 1826, p. 20200123, 2021. | spa |
dc.relation.references | J. Arango-López, S. Orozco-Arias, J. A. Salazar, and R. Guyot, “Application of Data Mining Algorithms to Classify Biological Data: The Coffea canephora Genome Case,” in Advances in Computing, vol. 735, pp. 156–170, Springer, 2017. | spa |
dc.relation.references | L. Schietgat, C. Vens, R. Cerri, C. N. Fischer, E. Costa, J. Ramon, C. M. A. Carareto, and H. Blockeel, “A machine learning based framework to identify and classify long terminal repeat retrotransposons.,” PLoS computational biology, vol. 14, p. e1006097, apr 2018. | spa |
dc.relation.references | T. Loureiro, N. Fonseca, and R. Camacho, Application of Machine Learning techniques on the Discovery and annotation of Transposons in genomes. M.Sc. thesis, Faculdade de Engenharia, Universidade do Porto, 2012. | spa |
dc.relation.references | M. Dupeyron, Dynamique et évolution de deux lignées remarquables de rétrotransposons à LTR dans le genre Coffea (famille des Rubiacées). PhD thesis, Montpellier, 2017. | spa |
dc.relation.references | K. Rawal and R. Ramaswamy, “Genome-wide analysis of mobile genetic element insertion sites,” Nucleic Acids Research, vol. 39, no. 16, pp. 6864–6878, 2011. | spa |
dc.relation.references | R. N. Mustafin and E. K. Khusnutdinova, “The Role of Transposons in Epigenetic Regulation of Ontogenesis,” Russian Journal of Developmental Biology, vol. 49, pp. 61–78, mar 2018. | spa |
dc.relation.references | W. Bao, K. K. Kojima, and O. Kohany, “Repbase Update, a database of repetitive elements in eukaryotic genomes,” Mobile DNA, vol. 6, no. 1, pp. 4–9, 2015. | spa |
dc.relation.references | J. Amselem, G. Cornut, N. Choisne, M. Alaux, F. Alfama-Depauw, V. Jamilloux, F. Maumus, T. Letellier, I. Luyten, C. Pommier, A. F. Adam-Blondon, and H. Quesneville, “RepetDB: A unified resource for transposable element references,” Mobile DNA, vol. 10, no. 1, pp. 1–9, 2019. | spa |
dc.relation.references | M. Spannagl, T. Nussbaumer, K. C. Bader, M. M. Martis, M. Seidel, K. G. Kugler, H. Gundlach, and K. F. Mayer, “PGSB plantsDB: Updates to the database framework for comparative plant genome research,” Nucleic Acids Research, vol. 44, no. D1, pp. D1141–D1147, 2016. | spa |
dc.relation.references | S. Orozco-Arias, P. A. Jaimes, M. S. Candamil, C. F. Jiménez-Varón, R. Tabares-Soto, G. Isaza, and R. Guyot, “InpactorDB: A classified lineage-level plant LTR retrotransposon reference library for free-alignment methods based on machine learning,” Genes, vol. 12, no. 2, pp. 1–17, 2021. | spa |
dc.relation.references | T. Loureiro, R. Camacho, J. Vieira, and N. A. Fonseca, “Improving the performance of Transposable Elements detection tools.,” Journal of integrative bioinformatics, vol. 10, no. 3, p. 231, 2013. | spa |
dc.relation.references | S. Orozco-Arias, R. Tabares-Soto, D. Ceballos, and R. Guyot, “Parallel Programming in Biological Sciences, Taking Advantage of Supercomputing in Genomics,” in Advances in Computing (A. Solano and H. Ordonez, eds.), vol. 735, pp. 627–643, Zurich: Springer, 2017. | spa |
dc.relation.references | S. Orozco-Arias, G. Isaza, R. Guyot, and R. Tabares-Soto, “A systematic review of the application of machine learning in the detection and classification of transposable elements,” PeerJ, vol. 7, p. e8311, 2019. | spa |
dc.relation.references | R. Tabares-Soto, S. Orozco-Arias, V. Romero-Cano, V. S. Bucheli, J. L. Rodríguez-Sotelo, and C. F. Jiménez-Varón, “A comparative study of machine learning and deep learning algorithms to classify cancer types based on microarray gene expression data,” PeerJ Computer Science, vol. 6, p. e270, 2020. | spa |
dc.relation.references | M. W. Libbrecht and W. S. Noble, “Machine learning applications in genetics and genomics,” Nature Reviews Genetics, vol. 16, no. 6, pp. 321–332, 2015. | spa |
dc.relation.references | P. Larrañaga, B. Calvo, R. Santana, C. Bielza, J. Galdiano, I. Inza, J. A. Lozano, R. Armañanzas, G. Santafé, A. Pérez, and V. Robles, “Machine learning in bioinformatics,” Briefings in Bioinformatics, vol. 7, no. 1, pp. 86–112, 2006. | spa |
dc.relation.references | F. K. Nakano, S. M. Mastelini, S. Barbon, and R. Cerri, “Improving Hierarchical Classification of Transposable Elements using Deep Neural Networks,” in Proceedings of the International Joint Conference on Neural Networks, vol. 8-13 July, (Rio de Janeiro), IEEE, 2018. | spa |
dc.relation.references | O. A. Montesinos-López, A. Montesinos-López, P. Pérez-Rodríguez, J. A. Barrón-López, J. W. Martini, S. B. Fajardo-Flores, L. S. Gaytan-Lugo, P. C. Santana-Mancilla, and J. Crossa, “A review of deep learning applications for genomic selection,” BMC Genomics, vol. 22, no. 1, pp. 1–23, 2021. | spa |
dc.relation.references | M. H. P. da Cruz, P. T. M. Saito, A. R. Paschoal, and P. H. Bugatti, “Classification of Transposable Elements by Convolutional Neural Networks,” in Lecture Notes in Computer Science, vol. 11509, pp. 157–168, Springer International Publishing, 2019. | spa |
dc.relation.references | FAO, FIDA, OMS, PMA, and UNICEF, “LA SEGURIDAD ALIMENTARIA Y LA NUTRICION EN EL MUNDO,” tech. rep., ONU, Roma, 2020. | spa |
dc.relation.references | ONU, “Alimentacion,” 2018. | spa |
dc.relation.references | C. A. Deutsch, J. J. Tewksbury, M. Tigchelaar, D. S. Battisti, S. C. Merrill, R. B. Huey, and R. L. Naylor, “Increase in crop losses to insect pests in a warming climate,” Science, vol. 361, no. 6405, pp. 916–919, 2018. | spa |
dc.relation.references | R. Tito, H. L. Vasconcelos, and K. J. Feeley, “Global Climate Change Increases Risk of Crop Yield Losses and Food Insecurity in the Tropical Andes,” Global Change Biology, vol. 24, no. 2, 2017 | spa |
dc.relation.references | N. Jiang, “Overview of Repeat Annotation and De Novo Repeat Identification,” in Plant Transposable Elements, pp. 275–287, Springer, 2013. | spa |
dc.relation.references | G. Abrusán, N. Grundmann, L. Demester, and W. Makalowski, “TEclass - A tool for automated classification of unknown eukaryotic transposable elements,” Bioinformatics, vol. 25, no. 10, pp. 1329–1330, 2009. | spa |
dc.relation.references | G. Eraslan, Ž. Avsec, J. Gagneur, and F. J. Theis, “Deep learning: new computational modelling techniques for genomics,” Nature Reviews Genetics, vol. 20, no. 7, pp. 389–403, 2019. | spa |
dc.relation.references | T. Yue and H. Wang, “Deep learning for genomics: A concise overview,” arXiv preprint arXiv:1802.00810, 2018. | spa |
dc.relation.references | J. Zou, M. Huss, A. Abid, P. Mohammadi, A. Torkamani, and A. Telenti, “A primer on deep learning in genomics,” Nature Genetics, vol. 51, no. 1, pp. 12–18, 2019. | spa |
dc.relation.references | L. Koumakis, “Deep learning models in genomics; are we there yet?,” Computational and Structural Biotechnology Journal, vol. 18, pp. 1466–1473, 2020. | spa |
dc.relation.references | M. H. P. da Cruz, D. S. Domingues, P. T. M. Saito, A. R. Paschoal, and P. H. Bugatti, “TERL: classification of transposable elements by convolutional neural networks,” Briefings in Bioinformatics, vol. 22, may 2021. | spa |
dc.relation.references | H. Yan, A. Bombarely, and S. Li, “DeepTE: a computational method for de novo classification of transposons with convolutional neural network,” Bioinformatics (Oxford, England), 2020. | spa |
dc.relation.references | S. Orozco-Arias, M. S. Candamil-Cortes, P. A. Jaimes, E. Valencia-Castrillon, R. Tabares-Soto, R. Guyot, and G. Isaza, “Deep neural network to curate LTR retrotransposon libraries from plant genomes,” in International Conference on Practical Applications of Computational Biology & Bioinformatics, pp. 85–94, Springer, 2021. | spa |
dc.relation.references | N.-S. Kim, “The genomes and transposable elements in plants: are they friends or foes?,” GENES & GENOMICS, vol. 39, pp. 359–370, apr 2017. | spa |
dc.relation.references | G. Usai, F. Mascagni, L. Natali, T. Giordani, and A. Cavallini, “Comparative genome-wide analysis of repetitive DNA in the genus Populus L.,” Tree Genetics & Genomes, vol. 13, p. 96, oct 2017 | spa |
dc.relation.references | C. R. L. Huang, K. H. Burns, and J. D. Boeke, “Active transposition in genomes.,” Annual review of genetics, vol. 46, pp. 651–75, dec 2012. | spa |
dc.relation.references | A. Testori, L. Caizzi, S. Cutrupi, O. Friard, M. De Bortoli, D. Cora, and M. Caselle, “The role of transposable elements in shaping the combinatorial interaction of transcription factors,” BMC genomics, vol. 13, no. 1, pp. 1–16, 2012. | spa |
dc.relation.references | M.-A. A. Grandbastien, “LTR retrotransposons, handy hitchhikers of plant regulation and stress response,” Biochimica et Biophysica Acta - Gene Regulatory Mechanisms, vol. 1849, pp. 403–416, apr 2015 | spa |
dc.relation.references | N. Krom and W. Ramakrishna, “Retrotransposon insertions in rice gene pairs associated with reduced conservation of gene pairs in grass genomes.,” Genomics, vol. 99, pp. 308–14, may 2012. | spa |
dc.relation.references | J. Lee, N. E. Waminal, H.-I. Choi, S. Perumal, S.-C. Lee, V. B. Nguyen, W. Jang, N.-H. Kim, L.-Z. Gao, and T.-J. Yang, “Rapid amplification of four retrotransposon families promoted speciation and genome size expansion in the genus Panax,” Scientific Reports, vol. 7, p. 9045, aug 2017. | spa |
dc.relation.references | M. Elbaidouri and O. Panaud, “Genome-Wide Analysis of Transposition Using Next Generation Sequencing Technologies,” in Plant Transposable Elements, pp. 59–70, Springer, 2012. | spa |
dc.relation.references | L. Wang, Y. He, H. Qiu, J. Guo, M. Han, J. Zhou, Q. Sun, and J. Sun, “Mdoryco1-1, a bidirectionally transcriptional Ty1-copia retrotransposon from Malus x domestica,” SCIENTIA HORTICULTURAE, vol. 220, pp. 283–290, jun 2017. | spa |
dc.relation.references | R. C. Paz, M. E. Kozaczek, H. G. Rosli, N. P. Andino, and M. V. Sanchez-Puerta, “Diversity, distribution and dynamics of full-length Copia and Gypsy LTR retroelements in Solanum lycopersicum.,” Genetica, vol. 145, pp. 417–430, oct 2017. | spa |
dc.relation.references | M. Iquebal, S. Jaiswal, C. Mukhopadhyay, C. Sarkar, A. Rai, and D. Kumar, “Applications of bioinformatics in plant and agriculture,” in PlantOmics: The Omics of Plant Science, pp. 755–789, Springer, 2015. | spa |
dc.relation.references | H. Z. Girgis, “Red: An intelligent, rapid, accurate tool for detecting repeats de-novo on the genomic scale,” BMC Bioinformatics, vol. 16, no. 1, pp. 1–19, 2015. | spa |
dc.relation.references | G. I. Arabidopsis, S. Kaul, H. L. Koo, J. Jenkins, M. Rizzo, T. Rooney, L. J. Tallon, T. Feldblyum, W. Nierman, M. I. Benito, X. Lin, and Others, “Analysis of the genome sequence of the flowering plant Arabidopsis thaliana,” Nature, vol. 408, no. December, pp. 796–815, 2000. | spa |
dc.relation.references | J. Yu, S. Hu, J. Wang, G. K. Wong, S. Li, B. Liu, Y. Deng, L. Dai, Y. Zhou, X. Zhang, M. Cao, J. Liu, J. Sun, J. Tang, Y. Chen, X. Huang, W. Lin, C. Ye, W. Tong, L. Cong, J. Geng, Y. Han, L. Li, W. Li, G. Hu, J. Li, Z. Liu, Q. Qi, T. Li, X. Wang, H. Lu, T. Wu, M. Zhu, P. Ni, H. Han, W. Dong, X. Ren, X. Feng, P. Cui, X. Li, H. Wang, X. Xu, W. Zhai, Z. Xu, J. Zhang, S. He, J. Xu, K. Zhang, X. Zheng, J. Dong, W. Zeng, L. Tao, J. Ye, J. Tan, X. Chen, J. He, D. Liu, W. Tian, C. Tian, H. Xia, Q. Bao, G. Li, H. Gao, T. Cao, W. Zhao, P. Li, W. Chen, Y. Zhang, J. Hu, S. Liu, J. Yang, G. Zhang, Y. Xiong, Z. Li, L. Mao, C. Zhou, Z. Zhu, R. Chen, B. Hao, W. Zheng, S. Chen, W. Guo, M. Tao, L. Zhu, L. Yuan, and H. Yang, “A draft sequence of the rice genome (Oryza sativa L. ssp. indica),” Science, vol. 296, no. 5565, pp. 79–92, 2002. | spa |
dc.relation.references | R. Akakpo, M.-C. Carpentier, Y. Ie Hsing, and O. Panaud, “The impact of transposable elements on the structure, evolution and function of the rice genome,” New Phytologist, vol. 226, no. 1, pp. 44–49, 2020. | spa |
dc.relation.references | M. Domínguez, E. Dugas, M. Benchouaia, B. Leduque, J. M. Jiménez-Gómez, V. Colot, and L. Quadrana, “The impact of transposable elements on tomato diversity,” Nature Communications, vol. 11, no. 1, pp. 1–11, 2020. | spa |
dc.relation.references | D. Almojil, Y. Bourgeois, M. Falis, I. Hariyani, J. Wilcox, and S. Boissinot, “The structural, functional and evolutionary impact of transposable elements in eukaryotes,” Genes, vol. 12, no. 6, p. 918, 2021. | spa |
dc.relation.references | L. Sun, Y. Jing, X. Liu, Q. Li, Z. Xue, Z. Cheng, D. Wang, H. He, and W. Qian, “Heat stress-induced transposon activation correlates with 3D chromatin organization rearrangement in Arabidopsis,” Nature Communications, vol. 11, no. 1, pp. 1–13, 2020. | spa |
dc.relation.references | S. A. Montgomery, Y. Tanizawa, B. Galik, N. Wang, T. Ito, T. Mochizuki, S. Akimcheva, J. L. Bowman, V. Cognat, L. Maréchal-Drouard, et al., “Chromatin organization in early land plants reveals an ancestral association between H3K27me3, transposons, and constitutive heterochromatin,” Current Biology, vol. 30, no. 4, pp. 573–588, 2020. | spa |
dc.relation.references | S. Alseekh, F. Scossa, and A. R. Fernie, “Mobile transposable elements shape plant genome diversity,” Trends in Plant Science, vol. 25, no. 11, pp. 1062–1064, 2020. | spa |
dc.relation.references | S. Pimpinelli and L. Piacentini, “Environmental change and the evolution of genomes: Transposable elements as translators of phenotypic plasticity into genotypic variability,” Functional Ecology, vol. 34, no. 2, pp. 428–441, 2020. | spa |
dc.relation.references | S. Orozco-Arias, J. Liu, R. T.-s. Id, D. Ceballos, D. Silva, D. Id, R. Ming, and R. Guyot, “Inpactor, Integrated and Parallel Analyzer and Classifier of LTR Retrotransposons and Its Application for Pineapple LTR Retrotransposons Diversity and Dynamics,” Biology, 2018. | spa |
dc.relation.references | L. van Dorp, C. J. Houldcroft, D. Richard, and F. Balloux, “Covid-19, the first pandemic in the post-genomic era,” Current Opinion in Virology, 2021. | spa |
dc.relation.references | T. Flutre, E. Duprat, C. Feuillet, and H. Quesneville, “Considering transposable element diversification in de novo annotation approaches,” PloS one, vol. 6, no. 1, p. e16526, 2011. | spa |
dc.relation.references | S. Orozco-Arias, G. Isaza, and R. Guyot, “Retrotransposons in plant genomes: structure, identification, and classification through bioinformatics and machine learning,” International journal of molecular sciences, vol. 20, no. 15, p. 3837, 2019. | spa |
dc.relation.references | S. Ou, W. Su, Y. Liao, K. Chougule, J. R. Agda, A. J. Hellinga, C. S. B. Lugo, T. A. Elliott, D. Ware, T. Peterson, N. Jiang, C. N. Hirsch, and M. B. Hufford, “Benchmarking transposable element annotation methods for creation of a streamlined, comprehensive pipeline,” Genome Biology, vol. 20, no. 1, pp. 1–18, 2019. | spa |
dc.relation.references | D. R. Hoen, G. Hickey, G. Bourque, J. Casacuberta, R. Cordaux, C. Feschotte, A.-S. Fiston-Lavier, A. Hua-Van, R. Hubley, A. Kapusta, et al., “A call for benchmarking transposable element annotation methods,” Mobile DNA, vol. 6, no. 1, pp. 1–9, 2015. | spa |
dc.relation.references | K. A. Shastry and H. Sanjay, “Machine learning for bioinformatics,” in Statistical modelling and machine learning principles for bioinformatics techniques, tools, and applications, pp. 25–39, Springer, 2020. | spa |
dc.relation.references | E. Naresh, B. V. Kumar, S. P. Shankar, et al., “Impact of machine learning in bioinformatics research,” in Statistical Modelling and Machine Learning Principles for Bioinformatics Techniques, Tools, and Applications, pp. 41–62, Springer, 2020. | spa |
dc.relation.references | I.-C. Giassa and P. Alexiou, “Bioinformatics and machine learning approaches to understand the regulation of mobile genetic elements,” Biology, vol. 10, no. 9, p. 896, 2021. | spa |
dc.relation.references | E. Routhier, A. Bin Kamruddin, and J. Mozziconacci, “keras_dna: a wrapper for fast implementation of deep learning models in genomics,” Bioinformatics, vol. 37, no. 11, pp. 1593–1594, 2021. | spa |
dc.relation.references | W. Kopp, R. Monti, A. Tamburrini, U. Ohler, and A. Akalin, “Deep learning for genomics using janggu,” Nature communications, vol. 11, no. 1, pp. 1–7, 2020. | spa |
dc.relation.references | A. Kashfeen and L. McMillan, “Frontier: finding the boundaries of novel transposable element insertions in genomes,” in Proceedings of the 12th ACM Conference on Bioinformatics, Computational Biology, and Health Informatics, pp. 1–10, 2021. | spa |
dc.relation.references | M. Panta, A. Mishra, M. T. Hoque, and J. Atallah, “Classifyte: a stacking-based prediction of hierarchical classification of transposable elements,” Bioinformatics, 2021. | spa |
dc.relation.references | K. Riehl, C. Riccio, E. A. Miska, and M. Hemberg, “Transposonultimate: software for transposon classification, annotation and detection,” bioRxiv, 2021. | spa |
dc.relation.references | S. Orozco-Arias, G. Isaza, R. Guyot, and R. Tabares-Soto, “A systematic review of the application of machine learning in the detection and classification of transposable elements,” PeerJ, vol. 7, p. 18311, 2019. | spa |
dc.relation.references | C. Ma, H. H. Zhang, and X. Wang, “Machine learning for Big Data analytics in plants,” Trends in Plant Science, vol. 19, no. 12, pp. 798–808, 2014. | spa |
dc.relation.references | F. K. Nakano, W. J. Pinto, G. L. Pappa, and R. Cerri, “Top-down strategies for hierarchical classification of transposable elements with neural networks,” in Proceedings of the International Joint Conference on Neural Networks, vol. 2017-May, (Anchorage, AK, United States), pp. 2539–2546, 2017. | spa |
dc.relation.references | E. A. Bell, C. L. Butler, C. Oliveira, S. Marburger, L. Yant, and M. I. Taylor, “Transposable element annotation in non-model species: The benefits of species-specific repeat libraries using semi-automated edta and deepte de novo pipelines,” Molecular Ecology Resources, 2021. | spa |
dc.relation.references | T. Flutre, E. Permal, and H. Quesneville, “Transposable element annotation in completely sequenced eukaryote genomes,” in Plant Transposable Elements, pp. 17–39, Springer, 2012. | spa |
dc.relation.references | C. Feschotte, N. Jiang, and S. R. Wessler, “Plant transposable elements: Where genetics meets genomics,” Nature Reviews Genetics, vol. 3, pp. 329–341, May 2002. | spa |
dc.relation.references | J. F. Pereira and P. R. Ryan, “The role of transposable elements in the evolution of aluminium resistance in plants,” Journal of Experimental Botany, vol. 70, pp. 41–54, 10 2018. | spa |
dc.relation.references | M. Sahebi, M. M. Hanafi, A. J. van Wijnen, D. Rice, M. Y. Rafii, P. Azizi, M. Osman, S. Taheri, M. F. A. Bakar, M. N. M. Isa, et al., “Contribution of transposable elements in the plant’s genome,” Gene, vol. 665, pp. 155–166, 2018. | spa |
dc.relation.references | B. McClintock, “The Significance of Responses of the Genome to Challenge,” Science, vol. 226, no. 4676, pp. 792–801, 1984. | spa |
dc.relation.references | V. Horvath, M. Merenciano, and J. González, “Revisiting the Relationship between Transposable Elements and the Eukaryotic Stress Response,” Trends in Genetics, vol. 33, no. 11, pp. 832–841, 2017. | spa |
dc.relation.references | C. A. Thomas, “THE GENETIC ORGANIZATION OF CHROMOSOMES,” Annual Review of Genetics, vol. 5, no. 1, pp. 237–256, 1971. | spa |
dc.relation.references | T. P. Michael, “Plant genome size variation: bloating and purging DNA,” Briefings in Functional Genomics, vol. 13, pp. 308–317, 03 2014. | spa |
dc.relation.references | X. Dai, H. Wang, H. Zhou, L. Wang, J. Dvořák, J. L. Bennetzen, and H.-G. Müller, “Birth and Death of LTR-Retrotransposons in Aegilops tauschii,” Genetics, vol. 210, pp. 1039–1051, 08 2018. | spa |
dc.relation.references | S.-I. Lee and N.-S. Kim, “Transposable Elements and Genome Size Variations in Plants,” Genomics & Informatics, vol. 12, no. 3, p. 87, 2014. | spa |
dc.relation.references | E. R. Havecker, X. Gao, and D. F. Voytas, “The diversity of LTR retrotransposons,” Genome biology, vol. 5, no. 6, p. 225, 2004. | spa |
dc.relation.references | J. M. Casacuberta, S. Jackson, O. Panaud, M. Purugganan, and J. Wendel, “Evolution of Plant Phenotypes, from Genomes to Traits,” G3 Genes—Genomes—Genetics, vol. 6, pp. 775–778, 04 2016. | spa |
dc.relation.references | C. M. Bergman and H. Quesneville, “Discovering and detecting transposable elements in genome sequences,” Briefings in Bioinformatics, vol. 8, no. 6, pp. 382–392, 2007. | spa |
dc.relation.references | H. Ismail Fawaz, G. Forestier, J. Weber, L. Idoumghar, and P.-A. Muller, “Transfer learning for time series classification,” in 2018 IEEE International Conference on Big Data (Big Data), pp. 1367–1376, 2018. | spa |
dc.relation.references | J.-C. Charr, A. Garavito, C. Guyeux, D. Crouzillat, P. Descombes, C. Fournier, S. N. Ly, E. N. Raharimalala, J.-J. Rakotomalala, P. Stoffelen, S. Janssens, P. Hamon, and R. Guyot, “Complex evolutionary history of coffees revealed by full plastid genomes and 28,800 nuclear SNP analyses, with particular emphasis on Coffea canephora (Robusta coffee),” Molecular Phylogenetics and Evolution, vol. 151, p. 106906, 2020. | spa |
dc.relation.references | R. Guyot, T. Darré, M. Dupeyron, A. de Kochko, S. Hamon, E. Couturon, D. Crouzillat, M. Rigoreau, J.-J. Rakotomalala, N. E. Raharimalala, S. D. Akaffou, and P. Hamon, “Partial sequencing reveals the transposable element composition of Coffea genomes and provides evidence for distinct evolutionary stories,” Molecular genetics and genomics : MGG, vol. 291, pp. 1979–90, Oct 2016. | spa |
dc.relation.references | R. Guyot, P. Hamon, E. Couturon, N. Raharimalala, J.-J. Rakotomalala, S. Lakkanna, S. Sabatier, A. Affouard, and P. Bonnet, “WCSdb: a database of wild Coffea species,” Database, vol. 2020, 11 2020. baaa069. | spa |
dc.relation.references | P. Lashermes, V. Paczek, P. Trouslot, M. Combes, E. Couturon, and A. Charrier, “Brief communication. Single-locus inheritance in the allotetraploid Coffea arabica L. and interspecific hybrid C. arabica X C. canephora,” Journal of Heredity, vol. 91, pp. 81–85, 01 2000. | spa |
dc.relation.references | P. Hamon, C. E. Grover, A. P. Davis, J.-J. Rakotomalala, N. E. Raharimalala, V. A. Albert, H. L. Sreenath, P. Stoffelen, S. E. Mitchell, E. Couturon, S. Hamon, A. de Kochko, D. Crouzillat, M. Rigoreau, U. Sumirat, S. Akaffou, and R. Guyot, “Genotyping-by-sequencing provides the first well-resolved phylogeny for coffee (Coffea) and insights into the evolution of caffeine content in its species: GBS coffee phylogeny and the evolution of caffeine content,” Molecular Phylogenetics and Evolution, vol. 109, pp. 351–361, 2017. | spa |
dc.relation.references | N. Raharimalala, S. Rombauts, A. McCarthy, A. Garavito, S. Orozco-Arias, L. Bellanger, A. Y. Morales-Correa, S. Froger, S. Michaux, V. Berry, S. Metairon, C. Fournier, M. Lepelley, L. Mueller, E. Couturon, P. Hamon, J.-J. Rakotomalala, P. Descombes, R. Guyot, and D. Crouzillat, “The absence of the caffeine synthase gene is involved in the naturally decaffeinated status of Coffea humblotiana, a wild species from Comoro archipelago,” Scientific Reports, vol. 11, no. 1, pp. 1–14, 2021. | spa |
dc.relation.references | J.-C. Charr, A. Garavito, C. Guyeux, D. Crouzillat, P. Descombes, C. Fournier, S. N. Ly, E. N. Raharimalala, J.-J. Rakotomalala, P. Stoffelen, et al., “Complex evolutionary history of coffees revealed by full plastid genomes and 28,800 nuclear snp analyses, with particular emphasis on coffea canephora (robusta coffee),” Molecular Phylogenetics and Evolution, vol. 151, p. 106906, 2020. | spa |
dc.relation.references | P. Hamon, P. O. Duroy, C. Dubreuil-Tranchant, P. Mafra D’Almeida Costa, C. Duret, N. J. Razafinarivo, E. Couturon, S. Hamon, A. De Kochko, V. Poncet, and R. Guyot, “Two novel Ty1-copia retrotransposons isolated from coffee trees can effectively reveal evolutionary relationships in the Coffea genus (Rubiaceae),” Molecular Genetics and Genomics, vol. 285, no. 6, pp. 447–460, 2011. | spa |
dc.relation.references | M. Dupeyron, R. F. de Souza, P. Hamon, A. de Kochko, D. Crouzillat, E. Couturon, D. S. Domingues, and R. Guyot, “Distribution of Divo in Coffea genomes, a poorly described family of angiosperm LTR-Retrotransposons,” Molecular Genetics and Genomics, vol. 292, pp. 741–754, Aug 2017. | spa |
dc.relation.references | A. V. Zimin, G. Marçais, D. Puiu, M. Roberts, S. L. Salzberg, and J. A. Yorke, “The MaSuRCA genome assembler,” Bioinformatics, vol. 29, pp. 2669–2677, 08 2013. | spa |
dc.relation.references | M. Seppey, M. Manni, and E. M. Zdobnov, “BUSCO: Assessing genome assembly and annotation completeness,” in Methods in Molecular Biology, vol. 1962, pp. 227–245, Humana Press Inc., 2019. | spa |
dc.relation.references | E. M. McCarthy and J. F. McDonald, “LTR_STRUC: A novel search and identification program for LTR retrotransposons,” Bioinformatics, vol. 19, no. 3, pp. 362–367, 2003. | spa |
dc.relation.references | S. Orozco-Arias, M. S. Candamil-Cortés, P. A. Jaimes, J. S. Piña, R. Tabares-Soto, R. Guyot, and G. Isaza, “K-mer-based machine learning method to classify LTR-retrotransposons in plant genomes,” PeerJ, vol. 9, p. e11456, May 2021. | spa |
dc.relation.references | N. Chen, “Using RepeatMasker to Identify Repetitive Elements in Genomic Sequences,” Current Protocols in Bioinformatics, vol. 5, no. 1, pp. 4.10.1–4.10.14, 2004. | spa |
dc.relation.references | R Core Team, “R: a language and environment for statistical computing,” Vienna: R Foundation, 2016. | spa |
dc.relation.references | I. Letunic and P. Bork, “Interactive Tree Of Life (iTOL) v5: an online tool for phylogenetic tree display and annotation,” Nucleic Acids Research, vol. 49, pp. W293–W296, 04 2021. | spa |
dc.relation.references | B. Langmead and S. L. Salzberg, “Fast gapped-read alignment with Bowtie 2,” Nature methods, vol. 9, no. 4, p. 357, 2012. | spa |
dc.relation.references | N. J. Razafinarivo, J. J. Rakotomalala, S. C. Brown, M. Bourge, S. Hamon, A. de Kochko, V. Poncet, C. Dubreuil-Tranchant, E. Couturon, R. Guyot, and P. Hamon, “Geographical gradients in the genome size variation of wild coffee trees (Coffea) native to Africa and Indian Ocean islands,” Tree Genetics and Genomes, vol. 8, no. 6, pp. 1345–1358, 2012. | spa |
dc.relation.references | C. E. Grover and J. F. Wendel, “Recent Insights into Mechanisms of Genome Size Change in Plants,” Journal of Botany, vol. 2010, pp. 1–8, 2010. | spa |
dc.relation.references | R. J. Schley, J. Pellicer, X.-J. Ge, C. Barrett, S. Bellot, M. S. Guignard, P. Novák, J. Suda, D. Fraser, W. J. Baker, S. Dodsworth, J. Macas, A. R. Leitch, and I. J. Leitch, “The Ecology of Palm Genomes: Repeat-associated genome size expansion is constrained by aridity,” bioRxiv, 2021. | spa |
dc.relation.references | K. Y. Yip, C. Cheng, and M. Gerstein, “Machine learning and genome annotation: a match meant to be?,” Genome biology, vol. 14, no. 5, pp. 1–10, 2013. | spa |
dc.relation.references | C. Xu and S. A. Jackson, “Machine learning and complex biological data,” 2019. | spa |
dc.relation.references | N. Yu, X. Guo, F. Gu, and Y. Pan, “Dna as x: An information-coding-based model to improve the sensitivity in comparative gene analysis,” in International Symposium on Bioinformatics Research and Applications, pp. 366–377, Springer, 2015. | spa |
dc.relation.references | M. Akhtar, J. Epps, and E. Ambikairajah, “Signal processing in sequence analysis: advances in eukaryotic gene prediction,” IEEE journal of selected topics in signal processing, vol. 2, no. 3, pp. 310–321, 2008. | spa |
dc.relation.references | G. Kauer and H. Blöcker, “Applying signal theory to the analysis of biomolecules,” Bioinformatics, vol. 19, no. 16, pp. 2016–2021, 2003. | spa |
dc.relation.references | G. L. Rosen, Signal processing for biologically-inspired gradient source localization and DNA sequence analysis. Georgia Institute of Technology, 2006. | spa |
dc.relation.references | A. C. H. Choong and N. K. Lee, “Evaluation of convolutionary neural networks modeling of dna sequences using ordinal versus one-hot encoding method,” in 2017 International Conference on Computer and Drone Applications (IConDA), pp. 60–65, IEEE, 2017. | spa |
dc.relation.references | D. Ceballos, D. López-Álvarez, G. Isaza, R. Tabares-Soto, S. Orozco-Arias, and C. D. Ferrin, “A machine learning-based pipeline for the classification of ctx-m in metagenomics samples,” Processes, vol. 7, no. 4, p. 235, 2019. | spa |
dc.relation.references | Z. Lv, H. Ding, L. Wang, and Q. Zou, “A convolutional neural network using dinucleotide one-hot encoder for identifying dna n6-methyladenine sites in the rice genome,” Neurocomputing, vol. 422, pp. 214–221, 2021. | spa |
dc.relation.references | F. Wang, P. Chainani, T. White, J. Yang, Y. Liu, and B. Soibam, “Deep learning identifies genome-wide dna binding sites of long noncoding rnas,” RNA biology, vol. 15, no. 12, pp. 1468–1476, 2018. | spa |
dc.relation.references | D. R. Kelley, Y. A. Reshef, M. Bileschi, D. Belanger, C. Y. McLean, and J. Snoek, “Sequential regulatory activity prediction across chromosomes with convolutional neural networks,” Genome research, vol. 28, no. 5, pp. 739–750, 2018. | spa |
dc.relation.references | D. Mapleson, G. Garcia Accinelli, G. Kettleborough, J. Wright, and B. J. Clavijo, “Kat: a k-mer analysis toolkit to quality control ngs datasets and genome assemblies,” Bioinformatics, vol. 33, no. 4, pp. 574–576, 2017. | spa |
dc.relation.references | F. P. Breitwieser, D. Baker, and S. L. Salzberg, “Krakenuniq: confident and fast metagenomics classification using unique k-mer counts,” Genome biology, vol. 19, no. 1, pp. 1–10, 2018. | spa |
dc.relation.references | D. R. Zerbino and E. Birney, “Velvet: algorithms for de novo short read assembly using de bruijn graphs,” Genome research, vol. 18, no. 5, pp. 821–829, 2008. | spa |
dc.relation.references | J. T. Simpson, K. Wong, S. D. Jackman, J. E. Schein, S. J. Jones, and I. Birol, “Abyss: a parallel assembler for short read sequence data,” Genome research, vol. 19, no. 6, pp. 1117–1123, 2009. | spa |
dc.relation.references | H. Sun, J. Ding, M. Piednoël, and K. Schneeberger, “findgse: estimating genome size variation within human and arabidopsis using k-mer frequencies,” Bioinformatics, vol. 34, no. 4, pp. 550–557, 2018. | spa |
dc.relation.references | A. L. Price, N. C. Jones, and P. A. Pevzner, “De novo identification of repeat families in large genomes,” Bioinformatics, vol. 21, no. suppl 1, pp. i351–i358, 2005. | spa |
dc.relation.references | B. Z. Santos, G. T. Pereira, F. K. Nakano, and R. Cerri, “Strategies for selection of positive and negative instances in the hierarchical classification of transposable elements,” in 2018 7th Brazilian Conference on Intelligent Systems (BRACIS), pp. 420–425, IEEE, 2018. | spa |
dc.relation.references | W. Ashlock and S. Datta, “Distinguishing endogenous retroviral ltrs from sine elements using features extracted from evolved side effect machines,” IEEE/ACM Transactions on Computational Biology and Bioinformatics, vol. 9, no. 6, pp. 1676–1689, 2012. | spa |
dc.relation.references | F. Liu, H. Li, C. Ren, X. Bo, and W. Shu, “Pedla: predicting enhancers with a deep learning-based algorithmic framework,” Scientific reports, vol. 6, no. 1, pp. 1–14, 2016. | spa |
dc.relation.references | J. T. Cuperus, B. Groves, A. Kuchina, A. B. Rosenberg, N. Jojic, S. Fields, and G. Seelig, “Deep learning of the regulatory grammar of yeast 5’ untranslated regions from 500,000 random sequences,” Genome research, vol. 27, no. 12, pp. 2015–2024, 2017. | spa |
dc.relation.references | R. S. Roy, D. Bhattacharya, and A. Schliep, “Turtle: Identifying frequent k-mers with cache-efficient algorithms,” Bioinformatics, vol. 30, no. 14, pp. 1950–1957, 2014. | spa |
dc.relation.references | L. Pellegrina, C. Pizzi, and F. Vandin, “Fast approximation of frequent k-mers and applications to metagenomics,” Journal of Computational Biology, vol. 27, no. 4, pp. 534–549, 2020. | spa |
dc.relation.references | P. Melsted and J. K. Pritchard, “Efficient counting of k-mers in dna sequences using a bloom filter,” BMC bioinformatics, vol. 12, no. 1, pp. 1–7, 2011. | spa |
dc.relation.references | F. Doshi-Velez and B. Kim, “Considerations for evaluation and generalization in interpretable machine learning,” in Explainable and interpretable models in computer vision and machine learning, pp. 3–17, Springer, 2018. | spa |
dc.relation.references | M. Mohri, A. Rostamizadeh, and A. Talwalkar, Foundations of machine learning. MIT press, 2018. | spa |
dc.relation.references | S.-S. Zhou, X.-M. Yan, K.-F. Zhang, H. Liu, J. Xu, S. Nie, K.-H. Jia, S.-Q. Jiao, W. Zhao, Y.-J. Zhao, et al., “A comprehensive annotation dataset of intact ltr retrotransposons of 300 plant genomes,” Scientific Data, vol. 8, no. 1, pp. 1–9, 2021. | spa |
dc.relation.references | S. Ou and N. Jiang, “Ltr_retriever: a highly accurate and sensitive program for identification of long terminal repeat retrotransposons,” Plant physiology, vol. 176, no. 2, pp. 1410–1422, 2018. | spa |
dc.relation.references | E. Lerat, “Identifying repeats and transposable elements in sequenced genomes: how to find your way through the dense forest of programs,” Heredity, vol. 104, no. 6, pp. 520–533, 2010. | spa |
dc.relation.references | F. M. You, S. Cloutier, Y. Shan, and R. Ragupathy, “Ltr annotator: automated identification and annotation of ltr retrotransposons in plant genomes,” International Journal of Bioscience, Biochemistry and Bioinformatics, vol. 5, no. 3, p. 165, 2015. | spa |
dc.relation.references | A. C. Wacholder, C. Cox, T. J. Meyer, R. P. Ruggiero, V. Vemulapalli, A. Damert, L. Carbone, and D. D. Pollock, “Inference of transposable element ancestry,” PLoS genetics, vol. 10, no. 8, p. e1004482, 2014. | spa |
dc.relation.references | P. Neumann, P. Novák, N. Hoštáková, and J. Macas, “Systematic survey of plant ltr-retrotransposons elucidates phylogenetic relationships of their polyprotein domains and provides a reference for element classification,” Mobile DNA, vol. 10, no. 1, pp. 1–17, 2019. | spa |
dc.relation.references | N. V. Chawla, K. W. Bowyer, L. O. Hall, and W. P. Kegelmeyer, “Smote: synthetic minority over-sampling technique,” Journal of artificial intelligence research, vol. 16, pp. 321–357, 2002. | spa |
dc.relation.references | H. He, Y. Bai, E. A. Garcia, and S. Li, “Adasyn: Adaptive synthetic sampling approach for imbalanced learning,” in 2008 IEEE international joint conference on neural networks (IEEE world congress on computational intelligence), pp. 1322–1328, IEEE, 2008. | spa |
dc.relation.references | Z. Xu and H. Wang, “Ltr_finder: an efficient tool for the prediction of full-length ltr retrotransposons,” Nucleic acids research, vol. 35, no. suppl 2, pp. W265–W268, 2007. | spa |
dc.relation.references | S. Ou and N. Jiang, “Ltr_finder_parallel: parallelization of ltr_finder enabling rapid identification of long terminal repeat retrotransposons,” Mobile DNA, vol. 10, no. 1, pp. 1–3, 2019. | spa |
dc.relation.references | G. Chandan, A. Jain, H. Jain, et al., “Real time object detection and tracking using deep learning and opencv,” in 2018 International Conference on inventive research in computing applications (ICIRCA), pp. 1305–1308, IEEE, 2018. | spa |
dc.relation.references | A. E. Wahabi, I. H. Baraka, S. Hamdoune, and K. E. Mokhtari, “Detection and control system for automotive products applications by artificial vision using deep learning,” in International Conference on Advanced Intelligent Systems for Sustainable Development, pp. 224–241, Springer, 2019. | spa |
dc.relation.references | A. Raghunandan, P. Raghav, H. R. Aradhya, et al., “Object detection algorithms for video surveillance applications,” in 2018 International Conference on Communication and Signal Processing (ICCSP), pp. 0563–0568, IEEE, 2018. | spa |
dc.relation.references | J. Redmon, S. Divvala, R. Girshick, and A. Farhadi, “You only look once: Unified, real-time object detection,” in Proceedings of the IEEE conference on computer vision and pattern recognition, pp. 779–788, 2016. | spa |
dc.relation.references | D. Ellinghaus, S. Kurtz, and U. Willhoeft, “Ltrharvest, an efficient and flexible software for de novo detection of ltr retrotransposons,” BMC bioinformatics, vol. 9, no. 1, pp. 1–14, 2008. | spa |
dc.relation.references | J. D. Valencia and H. Z. Girgis, “Ltrdetector: a tool-suite for detecting long terminal repeat retrotransposons de-novo,” BMC genomics, vol. 20, no. 1, pp. 1–14, 2019. | spa |
dc.relation.references | M. Biryukov and K. Ustyantsev, “Darts: An algorithm for domain-associated retrotransposon search in genome assemblies,” Genes, vol. 13, no. 1, 2022. | spa |
dc.relation.references | H. Jung, M.-S. Jeon, M. Hodgett, P. Waterhouse, and S.-i. Eyun, “Comparative evaluation of genome assemblers from long-read sequencing for plants and crops,” Journal of Agricultural and Food Chemistry, vol. 68, no. 29, pp. 7670–7677, 2020. PMID: 32530283. | spa |
dc.relation.references | Y. Chernyavskaya, X. Zhang, J. Liu, and J. Blackburn, “Long-read sequencing of the zebrafish genome reorganizes genomic architecture,” BMC Genomics, vol. 23, no. 1, pp. 1–13, 2022. | spa |
dc.relation.references | Y. Suzuki and S. Morishita, “The time is ripe to investigate human centromeres by long-read sequencing,” DNA Research, vol. 28, 10 2021. dsab021. | spa |
dc.relation.references | Y. Jiang, Repetitive DNA sequence assembly. PhD thesis, Deakin University, 2017. | spa |
dc.relation.references | T. J. Treangen and S. L. Salzberg, “Repetitive DNA and next-generation sequencing: Computational challenges and solutions,” Nature Reviews Genetics, vol. 13, no. 1, pp. 36–46, 2012. | spa |
dc.relation.references | S. Lian, Y. Tu, Y. Wang, X. Chen, and L. Wang, “A repetitive sequence assembler based on next-generation sequencing,” Genetics and Molecular Research, vol. 15, no. 3, pp. 1–13, 2016. | spa |
dc.relation.references | M. Zytnicki, E. Akhunov, and H. Quesneville, “Tedna: a transposable element de novo assembler,” Bioinformatics, vol. 30, pp. 2656–2658, 06 2014. | spa |
dc.relation.references | C. Chu, R. Nielsen, and Y. Wu, “REPdenovo: Inferring De Novo Repeat Motifs from Short Sequence Reads,” PLOS ONE, vol. 11, no. 3, pp. 1–17, 2016. | spa |
dc.relation.references | R. M. Nowak, “Genome Assembler for Repetitive Sequences,” in Information Technologies in Biomedicine (E. Piętka and J. Kawa, eds.), (Berlin, Heidelberg), pp. 422–429, Springer Berlin Heidelberg, 2012. | spa |
dc.relation.references | E. Bao, F. Xie, C. Song, and D. Song, “FLAS: fast and high-throughput algorithm for PacBio long-read self-correction,” Bioinformatics, vol. 35, pp. 3953–3960, 03 2019. | spa |
dc.relation.references | E. L. van Dijk, Y. Jaszczyszyn, D. Naquin, and C. Thermes, “The Third Revolution in Sequencing Technology,” Trends in Genetics, vol. 34, no. 9, pp. 666–681, 2018. | spa |
dc.relation.references | H. Jung, C. Winefield, A. Bombarely, P. Prentis, and P. Waterhouse, “Tools and Strategies for Long-Read Sequencing and De Novo Assembly of Plant Genomes,” Trends in Plant Science, vol. 24, no. 8, pp. 700–724, 2019. | spa |
dc.relation.references | S. Shahid and R. K. Slotkin, “The current revolution in transposable element biology enabled by long reads,” Current Opinion in Plant Biology, vol. 54, pp. 49–56, 2020. | spa |
dc.relation.references | R.-G. Zhang, Z.-X. Wang, S. Ou, and G.-Y. Li, “TEsorter: lineage-level classification of transposable elements using conserved protein domains,” bioRxiv, 2019. | spa |
dc.rights.accessrights | info:eu-repo/semantics/openAccess | spa |
dc.subject.proposal | LTR retrotransposons | eng |
dc.subject.proposal | Machine Learning | eng |
dc.subject.proposal | Detection | eng |
dc.subject.proposal | Classification | eng |
dc.subject.proposal | Genomic Object Detection | eng |
dc.subject.proposal | K-mer based method | eng |
dc.subject.proposal | Neural networks | eng |
dc.subject.unesco | Agriculture | |
dc.subject.unesco | Artificial intelligence | |
dc.type.coar | http://purl.org/coar/resource_type/c_db06 | spa |
dc.type.content | Text | spa |
dc.type.driver | info:eu-repo/semantics/doctoralThesis | spa |
dc.type.version | info:eu-repo/semantics/publishedVersion | spa |
oaire.version | http://purl.org/coar/version/c_ab4af688f83e57aa | spa |
oaire.accessrights | http://purl.org/coar/access_right/c_abf2 | spa |
dc.description.degreename | Doctor(a) en Ingeniería | spa |
dc.publisher.program | Doctorado en Ingeniería | spa |
dc.description.researchgroup | Línea de Investigación en modelos biocomputacionales y bioinformática | spa |
dc.rights.coar | http://purl.org/coar/access_right/c_abf2 | spa |