Integrating genetic markers and adiabatic quantum machine learning to improve disease resistance-based marker assisted plant selection

Authors

  • Enow Takang Achuo Albert, Department of Plant Biology, Faculty of Science, University of Yaoundé I, P.O. Box 812, Yaoundé, Center Region, Cameroon
  • Ngalle Hermine Bille, Department of Plant Biology, Faculty of Science, University of Yaoundé I, P.O. Box 812, Yaoundé, Center Region, Cameroon
  • Bell Joseph Martin, Department of Plant Biology, Faculty of Science, University of Yaoundé I, P.O. Box 812, Yaoundé, Center Region, Cameroon
  • Ngonkeu Mangaptche Eddy Leonard, Department of Plant Biology, Faculty of Science, University of Yaoundé I, P.O. Box 812, Yaoundé, Center Region, Cameroon

DOI:

https://doi.org/10.25081/jsa.2023.v7.8556

Keywords:

Plant disease resistance, Marker-assisted plant selection, Genetic markers, Adiabatic quantum computing

Abstract

This research aimed to develop a more accurate and efficient method for selecting disease-resistant plants by combining genetic markers with advanced machine learning algorithms. A multidisciplinary approach incorporating genomic data, machine learning algorithms, and high-performance computing was employed. First, genetic markers highly associated with disease resistance were identified using next-generation sequencing data and statistical analysis. Then, an adiabatic quantum machine learning algorithm was developed to integrate these markers into a single predictor of disease susceptibility. The results demonstrate that the integrative use of genetic markers and adiabatic quantum machine learning significantly improved the accuracy and efficiency of disease resistance-based marker-assisted plant selection. Combining adiabatic quantum computing with genetic markers can therefore support more effective and efficient strategies for disease resistance-based marker-assisted plant selection.

Published

11-09-2023

How to Cite

Albert, E. T. A., Bille, N. H., Martin, B. J., & Leonard, N. M. E. (2023). Integrating genetic markers and adiabatic quantum machine learning to improve disease resistance-based marker assisted plant selection. Journal of Scientific Agriculture, 7, 63–76. https://doi.org/10.25081/jsa.2023.v7.8556
