Even if a single-origin Colombian coffee bean harvest was roasted and ground in the same way as a batch from Ethiopia, a coffee connoisseur would be able to tell the difference. But the differences in the flavour notes and other characteristics of coffee varieties from around the world are not down to variations in individual genes, a study has found. Rather, they seem to be mainly the result of wholesale swapping, deletion and rearrangement of chromosomes.
The most complete sequencing yet of the genome of Coffea arabica has revealed that the ‘letters’ in the DNA strands differ only slightly between varieties. “If you look at single nucleotide variations, the levels are anywhere from 10 to 100 times lower than any other species,” says Michele Morgante, a plant geneticist at the University of Udine, Italy, and an author of the study.
Morgante and his team used next-generation sequencing technology that can read DNA strands up to hundreds of thousands of base pairs in length without interruption and with greater accuracy than earlier technologies. The results are published today in Nature Communications (ref. 1).
“With those type of technologies, it becomes much easier to assemble the genome and you can also assemble regions that were previously inaccessible,” says Morgante.
What’s in a brew?
Coffee’s genetic make-up is no trivial concern; 10 million tonnes of the crop were grown and sold in 2022–23. The coffee that we drink comes from two species: Coffea canephora, which is also known as robusta, and Coffea arabica, known as arabica. In many cases, beans from the two species are blended to make a brew. But the beans of a single species are also roasted and sold. Overall, arabica beans represent around 56% of all coffee sold.
Most genetic variation in living organisms comes from hybridization with other species. However, that is a relatively rare event for C. arabica because it is polyploid: it has more than two copies of each chromosome. Whereas Coffea canephora has two copies of each chromosome, C. arabica has four, which makes it much more difficult for arabica to interbreed with other species.
As a result, C. arabica’s main source of single nucleotide variation is mutation, which occurs at a steady rate over time. But the species is relatively young, so few mutations have had time to accumulate: it formed as a hybrid of robusta and Coffea eugenioides, another coffee species that is not widely cultivated, within the past 50,000 years.
“From that single plant, which has basically no variation, you create the whole species, and then the variation that you have is only the novel mutations that have occurred since that event,” Morgante says.
Despite this, there is substantial variation in the physical characteristics of the arabica coffee plant, including different flavour profiles in the beans and variations in disease resistance, says emeritus geneticist Juan Medrano at the UC Davis Coffee Center at the University of California, Davis. “We’re always talking about low variability at the DNA level, but there is variability at the structural level, at the chromosomal level, at the level of deletions … and insertions,” Medrano says.
The study found evidence of significant chromosomal rearrangements, especially in a varietal of C. arabica called Bourbon. There were deletions, in which fragments of chromosomes were missing — in some cases large chunks — and even instances in which entire chromosomes were absent. “In some varieties, you had either only three [chromosome] copies — let’s say two canephora, one eugenioides — in another one you had five copies — two eugenioides, three canephora,” Morgante says.
A valuable resource
In addition to shedding light on C. arabica’s phenotypic variation, the sequenced genome will be a valuable resource for coffee breeders, particularly as disease and climate change challenge the long-term sustainability of coffee, says Kassahun Tesfaye, a plant geneticist at the Institute of Biotechnology at Addis Ababa University.
“Getting a proper in-depth understanding of the genome would basically help us to understand how the crop evolved and also understand the genomes of coffee in line with its parents,” Tesfaye says.
The work will also inform coffee breeding programmes — and possibly even genetic modification — to select for favourable characteristics, such as resistance to a fungus called coffee rust or low caffeine levels, Tesfaye says. The challenge now is to translate this understanding of the genome into practical outcomes for coffee breeders.
“We need to equip breeders, mostly in the developing countries, with the toolkits to breed for low caffeine, to breed for specific disease [resistance], to breed for high productivity,” Tesfaye says.