How scientists in Africa – the most genetically diverse continent on earth – looked deep into the data in 2020 to make headway in understanding diseases that affect millions.
Dr Ambroise Wonkam carries hundreds of thousands of years of history in his blood. The impacts of pathogens, migration, environment and geography are written into the braids of DNA he inherited from tens of thousands of generations of ancestors.
This richness, he remarks, is not unique. It is a shared legacy for more than one billion people: in South Africa, where he works; in Cameroon, where he is from; and across the African continent – the most genetically diverse landmass on earth. With such wealth all around him, how could he not spend his life studying it?
Wonkam is one of several scientists from Africa who made significant gains in 2020 towards understanding diseases that affect millions – and that are not caused by a novel pandemic virus. He and others lead large experiments that analyse whole-genome data from thousands of African volunteers in South Africa, Ghana, Nigeria and beyond.
The knowledge these scientists are collecting is broadening humanity’s reference genetic archives, the libraries scientists around the world use to identify and compare genomic information. It is filling in blind spots in a field that has been dominated for a generation by institutions in the Global North and by data that has historically not included African participants. Despite lockdowns and working from home, 2020 was a productive year for Wonkam and his peers.
The power of place
Humans diverged from chimpanzees 5 to 6 million years ago in East Africa. Modern humans appeared just 300,000 to 200,000 years ago, and some time after that (about 100,000 years ago, though the evidence is still being interpreted), our species started migrating beyond Africa, probably over several attempts. As we walked, humans homogenised, sharing traits, discarding others and becoming more similar the further we went.
“We are all African,” Wonkam said from Cape Town. “But ancestral Africans, like me, who have stayed on the continent for a very long time, have at least 300,000 years of human genetic history in their blood, which makes the variation in the African population thousands of times higher than in any other populations in the world.”
When presented on a chart, the data shows the exodus effect: a plot of genetic diversity among humans against distance from East Africa looks like a slender, almost 45-degree line sloping downwards, through the Middle East, Europe, Asia and, finally at the tail end, the Americas, where populations tend to be the most genetically similar.
Africa’s rich, varied dataset is attractive to scientists, and also to companies and health institutes abroad. Over the years, several have exploited lax patient privacy laws and left scandal and mistrust in their wake. A drug company settled a high-profile lawsuit after running trials that were poorly explained to its African participants; one health technology company shut down a product after a whistle-blower revealed it would be sold against the agreements of volunteers whose data was used to produce it; and during and after the West Africa Ebola epidemic, certain European governments were accused of “biological colonialism” after allegations that blood samples extracted from survivors could not be requested back by their home countries for research purposes. The legacies of these missteps persist, and investigators today must work hard to instil trust and accountability into their process.
Solving sickle cell
Dr Wonkam’s lab at the University of Cape Town studies the genetics of people living with sickle cell disease, an inherited misshaping of the red blood cells that affects how effectively they transport oxygen. The disease is prevalent in Africans and some people of African ancestry.
Carrying one copy of the gene that causes sickle cell disease confers some natural resistance to malaria, a disease caused by a parasite that targets the blood and is also often fatal. This was a nudge by evolution to relieve pressure from the parasite on early humans. But when the gene is inherited from both parents, the resulting condition is brutal. Sickle cell disease can cause anaemia, frequent pain, more frequent illness due to a weakened immune system, and vision and growth problems. It can lead to a life of complications that ends too early, before the age of 50 in developed countries. Across Africa, 2 to 3 percent of babies are born with it, and more than 50 percent of them die before the age of five.
“[Sickle cell] mortality for adults has not changed for 40 years” in the United States, where the best treatments are readily available, Wonkam said. “If we put that in context, the disease was described 110 years ago, but there has been only one [primary] medication over the last 110 years and some new ones that were approved [just a few] years ago.”
The US is a leader in medical research, but also a place of well-documented racial disparities in access to healthcare. Sickle cell disease, which affects Black Americans almost exclusively, received far lower federal research funding per patient than cystic fibrosis, which mostly affects white children, a 2020 study showed. The study also cited a similar inquiry from 40 years ago, which showed the same disparities.
At the start of his career, Wonkam saw the inequity in medical research and vowed to correct it from within Africa. “There is a lack of interest by researchers, or by funders, and someone had to take the battle somewhere. Fortunately, we have a network on the continent that is getting stronger and stronger.”
Wonkam’s laboratory sequences entire exomes (the parts of DNA that code for proteins) from hundreds of Africans who have lived past the age of 50 with sickle cell disease, so-called “long survivors”, and compares the results with those from patients who have experienced stroke or shock, as well as a control group. Volunteers in the five-year-long study were mainly from Cameroon and the Democratic Republic of the Congo, where sickle cell is more prevalent; fewer people in South Africa are born with the disease.
Each volunteer signed a consent form that offered options: allow their genetic data to be studied solely within the context of the experiment, for other experiments as well, or more broadly (anonymously) as part of a global dataset. Wonkam’s lab drew blood and extracted the DNA in Africa, sent the condensed samples to a sequencing facility in the US, and returned the data to Cape Town, where his team did a sort of mathematical “brute-force analysis”. They looked as widely as they could, without bias, at the activity of every gene in the genome, trying to glimpse unique differences that might be keeping these long survivors alive. They found much more than they were looking for.
The results were curious. Some confirmed pathways that were known to be implicated in sickle cell disease, such as the body’s production lines for vitamin B, anticoagulants or nitric oxide. But other findings surprised the researchers. “The pathway we could not anticipate at all was [linked to causing] low blood pressure. Most sickle cell disease patients tend to have lower blood pressure already,” Wonkam said, so why would an otherwise unhealthy tendency for lower blood pressure be associated with rare long-term survival? That is a question the researchers are trying to answer. Additionally, some genes associated with insulin, which regulates blood sugar, were mutated in the long-term survivor group, suggesting that this pathway is connected to their overall longevity. Wonkam’s findings have kicked open the door for future analysis.
“What we found is that if you look at what nature [has] evolved over time, you can inspire yourself to design treatments. Those will likely be the most successful because it’s the way nature managed to get some of the patients to [live longer],” he said.
The study, published in June 2020, revealed a dozen mutated genes that were shared across the volunteer group: genes that differed, often by only one letter, from those in the general population and, as a result, built abnormal proteins. “Basically, every single discovery we found in that paper might be a route for a new treatment for sickle cell disease.”
Wonkam intends to build a cohort of patients and follow them for life. His data will add to large repositories of other human genomic data held in facilities in Europe, the US and, now, in Africa. This, he said, is vital.
“Without an African population database, at least 10 percent of variation is not present” in global archives, he said. He was referring to a 2018 study carried out with Johns Hopkins University that showed up to 300 million base pairs – or a tenth of the human genome – appearing in unique forms in Africans compared with the references from the Human Genome Project, which largely excluded Africans. “Every variation discovered in Africa [adds] more genomic sequencing in the public database, and that has a value for all studies – not just African studies, but all studies.”
In 2016, while filming and producing a report on DNA research for Al Jazeera, I visited the stroke clinic at the Korle Bu Teaching Hospital in Accra, Ghana. At the clinic, a man was learning to move again. Slowly, he pulled himself onto a low rung of a wooden ladder while a physical therapist held his arm. He had lost basic motor skills in an instant when a stroke hit deep in his brain, and his rehabilitation was expected to take weeks or years. He was not yet 50.
Dr Albert Akpalu was doing rounds through the clinic and told us about a future in which this patient’s treatment could be improved by tailoring medications to his genetic profile. Akpalu was running one part of the SIREN stroke study, a collaboration with colleagues in Nigeria within a consortium called Human Heredity and Health in Africa (H3Africa), which stretches across the continent and is jointly supported by the UK’s Wellcome Trust and the US’s National Institutes of Health.
Were it not for COVID, SIREN might have published its most comprehensive findings yet in 2020. “To make a genetic inference, you need a sample power of more than 3,000 [samples],” Akpalu explained from Accra in December. “We’re getting close to that.”
Akpalu works closely with Dr Rufus Akinyemi, a stroke specialist in Ibadan, Nigeria. In addition to running experiments, Akinyemi oversees several biobanks – storage centres that keep hundreds of thousands of samples of patient blood, serum, DNA and tissue at subfreezing temperatures – in Nigeria and Ghana.
A stroke happens when blood flow to the brain is blocked, suddenly cutting off the supply of oxygen to neurons, which often leads to loss of function, dementia or death. Stroke is the leading cause of neurological emergency in Africa, Akinyemi said, though granular detail is not as well recorded as in other regions.
Africa has a higher incidence of stroke than the US, according to overviews from 2015 and 2016, and strokes probably kill at least 300,000 people in sub-Saharan Africa every year, according to one review.
Akinyemi began his career studying the cognitive impairments, such as dementia, that so often follow a stroke. “I realised that if I study this, the risk factors and the genomics, I would be tackling one of the greatest disease burdens in Africa,” he said.
Over four years, SIREN colleagues collected samples from stroke survivors and from stroke-free control volunteers in West Africa. Now they have nearly 4,000 of each – enough to run a genome-wide association study (GWAS) to identify genes linked to the condition, its risks and recovery. “The initial findings,” Akinyemi said, “are looking exciting.”
Akinyemi, Akpalu and their colleagues hope to build bedside tools. “We want to develop Afrocentric [genetic] risk scores that can help us predict stroke. We have these for populations of European descent, but for people of African ancestry, they do not give very accurate results. We need these for populations in Africa to correctly predict the chances of stroke.”
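In its simplest form, a genetic risk score of the kind Akinyemi describes is a weighted sum: each risk variant a person carries is counted and multiplied by an effect size estimated in a reference population. The Python sketch below uses entirely hypothetical variant names and weights, not figures from SIREN or any real study. It illustrates one plausible reason, among several, why a score built from one population can mislead in another: a variant missing from the reference data contributes nothing to the score.

```python
def risk_score(genotype, weights):
    """Weighted sum of risk-allele counts (0, 1 or 2 copies per variant).

    Variants with no known weight are ignored, i.e. treated as weight 0.
    """
    return sum(weights.get(variant, 0.0) * count
               for variant, count in genotype.items())


# Hypothetical effect sizes, as if estimated in two reference populations.
weights_european = {"variant_a": 0.25, "variant_b": 0.5}
weights_african = {"variant_a": 0.25, "variant_b": 0.5, "variant_c": 0.75}

# Hypothetical patient carrying two copies of a variant that is
# absent from the European-derived weights.
patient = {"variant_a": 1, "variant_b": 0, "variant_c": 2}

score_eur = risk_score(patient, weights_european)
score_afr = risk_score(patient, weights_african)

print("European-derived score:", score_eur)
print("African-derived score:", score_afr)
```

The same patient gets a much lower score from the European-derived weights simply because the third variant is invisible to them. Real scores involve thousands of variants and careful statistical calibration.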
The genomics of stroke have been studied before. Previous analyses have described the genetics of the disease, but are often based on data gathered from volunteers of mostly European descent. A June 2020 study in the US, Canada and Europe looked at the genomes of 22,000 people of African descent, identifying genes implicated in the disease. The SIREN study focuses on Africans in Africa, where it will be the first of its kind and the largest yet.
“One thing the consortium has done is develop a very unique chip that is enriched in African content, derived from African populations,” Akinyemi explained. Built by the H3Africa network in partnership with Illumina, the American genomics company, the card-sized gene chip is an array of tiny wells that hold thousands of genes and genetic variations that are more prevalent in Africa. Introduced just a few years ago as a not-for-profit tool, and developed from early H3Africa studies, it allows researchers to capture a broad snapshot of the genetics at work in a chosen sample, tailored for populations on the continent.
H3Africa ensures that genomic data collected will be added to accessible global databases. “We do hope that the findings from this study will unmask some novel variants,” Akinyemi concluded, “perhaps genetic variants that are associated with ischaemic stroke that have not been previously reported in other populations … which will benefit not just African patients, but all global stroke patients in terms of prevention, early detection, treatment and rehabilitation.”
Three million unique variants
Dr Nicola Mulder thumbs through her notes when I ask how many human genome samples from Africa have been added to global databases over recent years. She is a scientist, so it is an unfair prompt on a Zoom call, but she accepts, tabulating aloud: “…We put in 348 sequences, additional exome sequences, 10,000 samples on the genotyping array, exomes from Botswana, shotgun metagenomic studies, deep-sequencing of neurological diseases … we’re getting a flood of data…” She makes an estimate: “If I look briefly, I would say maybe 11 or 12 thousand samples for which we have genomic data in the repository, which is significantly different to what we had before. This is probably over the past three years.”
Mulder is the principal investigator for H3ABioNet at the University of Cape Town, a bioinformaticist overseeing a team of 50, working with a network of 28 institutions spread across the continent. “We do all the support: data collection, some analysis, processing, submission, and training to analyse the data,” she said from Cape Town. Although Mulder works with Dr Wonkam, his sickle cell study was conducted independently of H3Africa.
In October, Mulder co-authored a paper in the journal Nature that combined some of the most intriguing genomic findings from across the continent.
Mulder and colleagues looked for unique single-letter changes in genes and compared these anomalies to public datasets. “In other studies [like GWAS] you’re looking for signals that jump from the noise. In this study, we’re looking for something that doesn’t exist elsewhere,” she explained. The work produced 300 terabytes of genetic sequence data, enough to make the problem of sending it all from the US to South Africa a three-month project.
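A rough calculation puts that three-month figure in perspective. The Python sketch below assumes decimal terabytes, takes "three months" to mean 90 days and assumes a continuous network transfer. None of this is stated by the researchers, and the data may well have been moved another way.

```python
TERABYTE = 10**12  # assumption: decimal terabytes (10^12 bytes)

data_bytes = 300 * TERABYTE
transfer_seconds = 90 * 24 * 60 * 60  # assumption: "three months" = 90 days

# Average sustained rate needed to move everything in that window.
average_mbit_per_s = data_bytes * 8 / transfer_seconds / 10**6

print(f"Average rate needed: {average_mbit_per_s:.0f} megabits per second")
```

Under those assumptions, the link would have to sustain roughly 300 megabits per second, around the clock, for the entire three months.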
The analysis revealed the undocumented richness of genetics in Africa in one bold headline: three million unique genetic variants, letters along DNA’s thread that had never before been recorded at those positions; each one found, named and recorded. More striking still was the small size of the volunteer group they studied. The team had analysed just over 400 people from 13 countries, covering 50 ethnolinguistic groups – Africa is not only the most genetically diverse continent, it also hosts the highest number of languages spoken on earth.
“Three million is a lot,” Mulder said when asked about what that number meant to her as a bioinformaticist, the type of specialist who deals with large datasets. “We never hit a plateau, we just keep finding novel variants. These are variants that have never been seen, but some populations are walking around with many people having it. The number is quite significant.”
Within the data, more links. Some of the unique variants revealed evidence of the pressures that diseases have exerted on populations since the dawn of the species. Others indicated tendencies towards decreased mortality for certain infectious diseases like flu. There were dozens of genetic locations that were undergoing “strong selection”, areas that are still actively evolving today. A wider look at the dataset showed the history of migration between populations, revealing evidence of movements and geographical stopovers that were lost to archaeology, but etched into the genome.
The study pointed to a wide horizon yet to be explored, and piqued the curiosity of institutions around the world.
Where to now?
Some of the genomics hardware these researchers rely on was repurposed during the coronavirus pandemic to identify new variants of the virus that causes COVID-19 as they appeared. In South Africa, Wonkam’s lab was working from home for several months before instituting an office-return schedule. Patient recruitment was delayed in Ghana but has begun again.
Now colleagues, funders and startup genomics companies are watching closely, designing the experiments they will embark on as the continent opens up again.