High-Performance Computing for Breeding Better Crops

Ashley Robinson Technology, Top Posts, VSCNews magazine

By Tong Geon Lee

Vegetable and specialty crop yields have steadily increased throughout the years as a result of advances in genetics and agronomic practices. Nevertheless, a significant improvement in crop performance remains necessary, particularly given the predicted shifts in climate, pathogen outbreaks and other challenges. The continued genetic improvement of crops is, therefore, a focus for crop researchers.

During crop breeding, it is important to satisfy market demands in a timely manner to maintain thriving agriculture. Approaches to improving crops through breeding and genetics are undergoing a rapid transformation, and the disciplines involved have expanded to include in-depth genome information, contemporary biology tools and recent advances in computing technologies. (A genome is the collection of genes in an organism; a gene is a DNA segment that can contribute to characteristics such as disease resistance.)

This article will address high-performance computing to increase awareness of this area for a non-scientific audience. Note that “computing” in this article includes a broad spectrum of disciplines not only in computer equipment, data storage, data sciences and computational programs but also in applications made possible through computer control. This article will briefly discuss 1) crop improvement through high-performance computing and 2) automation on farms through computing.


Modern crop breeding is a big data science. More than 150 years ago, Gregor Mendel, the father of modern genetics, discovered that plants follow rules when they pass their appearance traits (e.g., flower color) to their offspring. He addressed a handful of genes responsible for a few traits. Currently, however, tomato breeders are evaluating approximately 35,000 genes individually and over 610,000,000 cases of gene-to-gene interaction systemically to develop superior cultivars. If we extracted the informative data from such genes, we would have over 900,000,000 bases of DNA in hand, and each DNA base might affect a tomato’s appearance.
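To see where a number like 610,000,000 could come from, here is a minimal sketch in Python. It assumes the figure counts every unordered pair of genes exactly once; the article does not state how the figure was derived, so treat this as one plausible reading. The function name is made up for illustration.

```python
from math import comb

# Approximate tomato gene count quoted in the article.
GENES = 35_000


def pairwise_interactions(n_genes: int) -> int:
    """Count unordered gene pairs (assumption: one 'interaction' per pair)."""
    return comb(n_genes, 2)


pairs = pairwise_interactions(GENES)
print(f"{pairs:,}")  # 612,482,500
```

Under that assumption, 35,000 genes give about 612 million pairs, which is consistent with "over 610,000,000." Adding even a few thousand more genes would push the count far higher, because the number of pairs grows with the square of the number of genes.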

This situation far exceeds the capability of manual calculation. In addition, most, if not all, breeders and geneticists now wish to use whole-genome sequencing technology to obtain full genome information and to select for superior gene combinations. Because the volume of data is so large that it slows processing on ordinary computers, such technology demands a specially designed facility to analyze and store the data. A high-performance computing system makes research on such large-scale data possible.

If you Google the term “HiPerGator,” you should see images of black metal cabinets that resemble stacks of pizza boxes. Each pizza box is actually a small computer controller or data storage unit, and thousands of these have been clustered to provide optimal computing performance. University of Florida (UF) researchers currently have access to this high-performance computing system. With it, researchers have been able to complete numerous tasks they could not perform previously.

For example, geneticists can use bioinformatics tools to greatly increase the quantity and quality of data on plant genes, allowing the screening of larger populations to increase breeding gains per cycle. Likewise, they can use the cutting-edge biology tool CRISPR to guide plant gene modification to create new traits of interest. The CRISPR tool must be sensitive and accurate so that it changes only the sequence of interest and not similar DNA sequences elsewhere in the genome, and checking for such look-alike sequences requires whole-genome-scale sequence information. Computing can greatly enhance the speed and accuracy of this approach, resulting in the effective targeting of a single gene within a few hours.
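The idea of checking a genome for look-alike sequences can be sketched in a few lines of Python. This is a deliberately simplified illustration, not how production CRISPR design tools work: it scans only one DNA strand, ignores the additional sequence requirements real tools enforce, and uses a tiny made-up "genome" and target sequence. All names and sequences here are hypothetical.

```python
def count_mismatches(a: str, b: str) -> int:
    """Number of positions at which two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b))


def find_near_matches(genome: str, target: str, max_mismatches: int = 1):
    """Slide along the genome and report every site that matches the target
    with at most max_mismatches differences.

    Returns a list of (position, site_sequence, mismatches) tuples.
    """
    k = len(target)
    hits = []
    for i in range(len(genome) - k + 1):
        site = genome[i:i + k]
        d = count_mismatches(site, target)
        if d <= max_mismatches:
            hits.append((i, site, d))
    return hits


# Toy data (hypothetical): a 36-base "genome" and a 9-base target.
toy_genome = "TTGACCGTAAGCTTGACCGTTAGCAAGACCGTAAGG"
target = "GACCGTAAG"

for position, site, mismatches in find_near_matches(toy_genome, target):
    print(position, site, mismatches)
```

In this toy example the scan reports exact copies of the target as well as a site that differs by a single base, which is the kind of look-alike that could be edited by mistake. A real tomato genome has roughly 900,000,000 bases, so the same sliding comparison has to be repeated hundreds of millions of times per target, which is why this kind of screening benefits from high-performance computing.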


One of the major changes in the horticultural industries is a shift toward lower labor inputs and increased mechanization to achieve higher levels of productivity at a decreased cost. Successful technical innovations that involve computer control have further enabled plant breeding and management to achieve more effective agricultural outcomes through automation. Though agricultural equipment is increasingly automated, the potential of new technologies made possible through computer control has been explored to a very limited extent.

For example, computer vision can analyze images and support decisions in near real time, aiding both breeding (for instance, through phenotyping, the measurement of plant traits) and farm management practices, including disease detection, weeding and harvesting. Likewise, data-collection technologies such as drones, combined with low-energy computing and large-scale storage of image data, can drive the development of new autonomous capabilities that improve field data collection and analysis. Improved wireless or satellite connections between computer devices will also increase the speed and stability of transferring data derived from farm advisory systems. This will allow data-driven decision making for a wide range of tasks, such as the use of chemicals.


Continued success in breeding better crops and computing depends on large interdisciplinary research teams of computer scientists, engineers, geneticists and breeders interested in developing novel algorithmic and computing system approaches and evaluating these methods in agricultural settings. Above all, growers are a crucial component, serving as both stewards and beneficiaries of the outcomes by using these approaches. Therefore, strong collaborative opportunities exist with stakeholders interested in research and development. Such collaboration will provide clear guidelines for new research and its deployment in crop production.

Tong Geon Lee is an assistant professor at the University of Florida Institute of Food and Agricultural Sciences Gulf Coast Research and Education Center in Wimauma.
