In the fight against epidemics, including the current Covid-19 coronavirus, medical staff are on the front line, risking their own lives to save the lives of others. But behind the lines the war is fought – or should be fought – by authorities, medical researchers, statisticians and computer scientists using an array of artificial intelligence (AI) and data science technologies.
The SARS-CoV2 virus, which causes Covid-19, surpassed 500,000 confirmed cases and 23,000 deaths within three months of first detection (WHO statistics). It appears to have taken most national governments by surprise – but it shouldn’t have.
Since 2000, there has been warning after warning: SARS (SARS-CoV) in 2003, H1N1 “swine flu” influenza in 2009, MERS (MERS-CoV) in 2012, West African Ebola in 2014, Zika in 2015, and numerous re-emergences of diseases such as cholera, dengue, yellow fever and even plague.
Global leaders have ignored repeated warnings from experts and organisations, such as Dennis Carroll (in the early 2000s) and Bill Gates. The head of the World Health Organisation (WHO), Tedros Ghebreyesus, warned in 2018: “A devastating epidemic can start in any country at any time, and kill millions of people, because we are not prepared.”
Artificial intelligence can help governments prepare for the next epidemic through computer modelling and simulation, in the same way that it helps nations prepare for war through military simulation and readiness testing.
In a 2015 TED Talk titled The next outbreak? We’re not ready, Bill Gates used computer models to predict that a pathogen as virulent as the 1918 Spanish flu would kill 33 million people worldwide in just nine months. Gates laments that governments regularly conduct war simulations to test their preparedness, “war games”, but not pandemic simulations, “germ games”.
The international community has belatedly started assessing countries’ readiness for coping with pandemics. The first Global Health Security Index was published in October 2019. Data collection was largely manual, with researchers asking yes or no questions. Countries were scored between zero and 100, with higher scores denoting better preparedness. The US came top, with a score of 83.5, and the UK second, scoring 77.9. Retrospective evaluation of each country’s readiness for Covid-19 will show whether pandemic readiness testing needs to be more sophisticated than this in the future.
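As a rough illustration of how yes or no answers can be turned into a score between zero and 100, here is a minimal Python sketch. The question names and the equal weighting are assumptions made for the example; they are not taken from the index itself, whose actual methodology is not described here.

```python
def readiness_score(answers, weights=None):
    """Return a 0-100 score from yes/no answers (True means yes).

    `answers` maps a question name to True/False. `weights` optionally maps
    the same names to a relative importance; equal weights are assumed if omitted.
    """
    if not answers:
        raise ValueError("at least one answer is required")
    if weights is None:
        weights = {question: 1.0 for question in answers}
    total = sum(weights[question] for question in answers)
    earned = sum(weights[question] for question, yes in answers.items() if yes)
    return round(100 * earned / total, 1)


# Hypothetical questions, invented for illustration only.
example_answers = {
    "has_national_pandemic_plan": True,
    "runs_outbreak_simulations": False,
    "has_lab_surveillance_network": True,
    "stockpiles_protective_equipment": True,
}
score = readiness_score(example_answers)  # three of four answers are yes
```

A scheme this simple cannot capture how well a plan works in practice, which is one reason a more sophisticated approach may be needed.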
In the past 50 years, more than 1,500 new pathogens have been discovered, 70% of which have proved to be of animal origin, according to WHO (2018) statistics. Virus or bacterial infections that “spillover” from animals to humans are called “zoonotic”. Spillover might occur when an infected animal is eaten, trafficked, farmed or bites a human, and where human activity encroaches on or destroys habitats.
Artificial intelligence can help predict the conditions and locations where spillovers of known and unknown pathogens might occur. This allows governments and agencies to plan ahead and ban or educate against high-risk activities.
The leading force in identifying zoonotic threats was Predict, set up by Dennis Carroll in 2009. It estimated that there are 1.6 million unknown viral species in animals, of which 700,000 could infect humans. Predict’s funding was withdrawn by the US government in October 2019.
Prompted by the emergence of the Zika virus in 2015, Predict started developing machine learning to help predict possible hosts for emerging flaviviruses (the group containing Zika, dengue and yellow fever), says Pranav Pandit, a researcher at the One Health Institute at the University of California, Davis, who helped develop the tools for Predict.
Spillover is very rare, stresses Kate Jones, professor of ecology and biodiversity at UCL. It takes a unique cocktail of bad luck for a human to interact with a particular animal that is contagious with a virus that is capable of infecting a human and being passed human to human.
This is what makes AI useful for predicting what, when, why and where these rare events might occur.
Jones’s team has built machine learning models to predict where animals pinpointed as likely carriers of Ebola are likely to exist, where human behaviour, such as deforestation, brings animals and humans into dangerous proximity, and where population density and mobility risk greater spread. Jones is also experimenting with AI-enabled sensors and cameras that can detect the presence of animals – including potential hosts or “reservoirs” of zoonotic diseases – in close proximity to humans.
The first stage of outbreak analytics is detection. Quick detection is crucial because it enables early intervention – including patient isolation, contact tracing, treatment and vaccination (if available) – and the delivery of local and global alerts to prevent spread.
In a perfect world of ubiquitous, connected, affordable global healthcare – as advocated by the WHO – an infected person quickly receives medical attention and details of the illness are shared into a global, AI-enabled data system that can provide advice, summon assistance and issue warnings in real time.
The handling of the outbreak of SARS-CoV2 in Wuhan in December 2019 was a long way off this scenario. Chinese authorities – despite their developed health system – were too slow to detect, recognise or publicise the threat.
Despite the secrecy, news of the new pathogen emerged. Several AI systems picked up on internet chatter about a cluster of unidentified pneumonia cases in Wuhan and issued alerts while the Chinese authorities remained silent. Dataminr claims to have been first to issue an alert (only to its clients) on 30 December 2019, having picked up chatter on social media, including an image of deep cleaning taking place at the now-notorious Wuhan market. Other paid-for services such as BlueDot and Metabiota also claim that their natural language processing (NLP) algorithms were quick to pick up on the news, according to reports.
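To give a feel for the idea of spotting a cluster in online chatter, the Python sketch below flags a place when enough posts mention it alongside a symptom-related word. It is a deliberately crude keyword filter, not the method used by any of the services named above, whose algorithms are proprietary; the term list, the threshold and the sample posts are all invented for illustration.

```python
import re

# Hypothetical watch-list of terms; real systems rely on far richer signals.
SYMPTOM_TERMS = {"pneumonia", "fever", "outbreak", "quarantine"}


def words(text):
    """Lower-case a post and split it into a set of alphabetic words."""
    return set(re.findall(r"[a-z]+", text.lower()))


def find_cluster(posts, place, min_posts=3):
    """Return (alert, matching_posts) for posts naming the place and a symptom term."""
    place = place.lower()
    matching = []
    for post in posts:
        tokens = words(post)
        if place in tokens and tokens & SYMPTOM_TERMS:
            matching.append(post)
    return len(matching) >= min_posts, matching


# Invented sample posts, not real social media data.
sample_posts = [
    "Unexplained pneumonia cases reported in Wuhan",
    "Deep cleaning at a Wuhan market after fever cluster",
    "Wuhan hospital sees more pneumonia patients",
    "Great noodles in Wuhan last night",
]
alert, matched = find_cluster(sample_posts, "Wuhan")
```

In this toy data the first three posts match and the fourth does not, so the alert fires at the default threshold of three.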
The first public alerts were also issued on 30 December, according to Associated Press. The first was an automated alert from HealthMap, based at Boston Children’s Hospital, which mines numerous feeds for information. The other was a more considered alert issued by ProMED, after New York epidemiologist Marjorie Pollack had been notified of the “unexplained pneumonia” cases via old-school email from China.
HealthMap and BlueDot helped to predict the spread of the virus internationally by mining data on flights leaving Wuhan during the crucial period after the outbreak and before travel restrictions were brought in.
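One simple way to think about flight data is that destinations receiving the largest share of travellers from an outbreak city are the most likely to see imported cases first. The Python sketch below ranks destinations on that basis alone. It is an assumption-laden simplification rather than a description of how HealthMap or BlueDot work, and the passenger counts are made up.

```python
def rank_destinations(passengers_by_destination):
    """Return (destination, share_of_travellers) pairs, highest share first."""
    total = sum(passengers_by_destination.values())
    if total <= 0:
        return []
    shares = [
        (destination, count / total)
        for destination, count in passengers_by_destination.items()
    ]
    return sorted(shares, key=lambda pair: pair[1], reverse=True)


# Made-up passenger counts for illustration only.
example_traffic = {"City A": 5000, "City B": 3000, "City C": 2000}
ranking = rank_destinations(example_traffic)
```

A real forecast would also need to account for connecting flights, infection rates at the origin and the timing of travel restrictions.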
A great deal of focus has been given to forecasts of spread, rates of infection, incubation, recovery and death, and peaks and decline of the Covid-19 coronavirus. Notably, predictions by the team at Imperial College, London, are credited for rapidly changing the UK government’s strategy from “wait and see” to introducing interventions, such as social distancing. These models have traditionally been mathematical and do not tend to use AI.
However, researchers from Fudan University in Shanghai have used the Covid-19 outbreak in China as a case study to test and prove that AI makes better real-time predictions for transmission than traditional epidemiological forecasting models. Their first study used a stacked auto-encoder for modelling the transmission dynamics of the epidemic in China. A second paper used AI to predict how the spread of the virus is affected when governments delay making interventions.
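For readers unfamiliar with the traditional mathematical models referred to above, a classic textbook example is the SIR model, which moves a population between susceptible, infected and recovered groups. The Python sketch below is a minimal daily-step version. It is not the Imperial College or Fudan model, and the population size, transmission rate (beta) and recovery rate (gamma) are illustrative assumptions, not estimates for Covid-19.

```python
def simulate_sir(population, initial_infected, beta, gamma, days):
    """Simulate a basic SIR epidemic in one-day steps.

    beta  - average number of infectious contacts per infected person per day
    gamma - fraction of infected people who recover each day
    Returns a list of (susceptible, infected, recovered) tuples, one per day.
    """
    susceptible = float(population - initial_infected)
    infected = float(initial_infected)
    recovered = 0.0
    history = [(susceptible, infected, recovered)]
    for _ in range(days):
        new_infections = beta * susceptible * infected / population
        new_recoveries = gamma * infected
        susceptible -= new_infections
        infected += new_infections - new_recoveries
        recovered += new_recoveries
        history.append((susceptible, infected, recovered))
    return history


def peak_infected(history):
    """Return the largest number of people infected on any single day."""
    return max(infected for _, infected, _ in history)


# Illustrative parameters only: halving beta stands in for an intervention
# such as social distancing that reduces contact between people.
no_intervention = simulate_sir(1_000_000, 10, beta=0.30, gamma=0.10, days=300)
with_distancing = simulate_sir(1_000_000, 10, beta=0.15, gamma=0.10, days=300)
```

Comparing `peak_infected(no_intervention)` with `peak_infected(with_distancing)` shows the familiar "flattening the curve" effect: the lower transmission rate produces a smaller peak.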
The SARS-CoV2 genome was sequenced rapidly by Chinese researchers and published in draft on 10 January 2020. It has since been sequenced many times from samples around the world.
Deep learning is used in genomic sequencing and diagnostic testing, to process large datasets and to spot variations in the code, as outlined in this November 2019 research paper, but it isn’t clear how extensively AI was used in sequencing the SARS-CoV2 genome.
There are many reasons why fast genome sequencing is important. The first is that most tests for the SARS-CoV2 virus in patients rely on identifying part of the virus genome in a nose or throat swab.
The second reason is to allow researchers – see this research paper, for example – to compare genomes, including looking for similarities with previous coronavirus pathogens such as SARS and MERS and with animal coronaviruses found in suspected host species such as bats and pangolins. Also, studying the tiny mutations that occur in the virus genome every two to three weeks helps to track when and where it emerged.
Finally, the viral genome is key to tracking viral spread. Nextstrain is an open source project that analyses virus genomes from around the world and uses the tell-tale mutations or “phylogeny” to track the spread of the epidemic. It has collated 1,500 genomes for SARS-CoV2, producing impressive maps showing how the colour-coded strains have spread locally and globally. Analysis of the US shows how different strains have been criss-crossing the country.
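The underlying intuition is that two samples sharing the same mutations are probably closely related. The Python sketch below counts the positions at which two aligned sequences differ and finds the nearest match for a new sample. It is only a toy: Nextstrain builds full phylogenetic trees, which this does not attempt, and the short sequences and sample names here are invented rather than real SARS-CoV2 data.

```python
def count_mutations(seq_a, seq_b):
    """Count positions where two aligned sequences of equal length differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(1 for base_a, base_b in zip(seq_a, seq_b) if base_a != base_b)


def closest_sample(query, samples):
    """Return the name of the known sample with the fewest differences from query."""
    return min(samples, key=lambda name: count_mutations(query, samples[name]))


# Invented ten-letter sequences; a real genome is roughly 30,000 letters long.
known_samples = {
    "reference": "ACGTACGTAC",
    "sample_b": "ACGTTCGTAC",
    "sample_c": "ACGAACGTAT",
}
new_sample = "ACGTTCGTAT"
nearest = closest_sample(new_sample, known_samples)
```

Here the new sample differs from `sample_b` at one position and from the other two at two positions, so `sample_b` is reported as its nearest relative.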
While Nextstrain looks like a poster child for AI and big data, it isn’t today, says Richard Neher, a professor at Biozentrum, University of Basel, and one of the founders of Nextstrain. “Some of the algorithms involved in genome sequencing do use AI – various neural network architectures,” he says. “But there’s none currently at our end.”
Shortages of tests – particularly in the west – have highlighted issues with genome-based testing. Two very different examples of AI-enabled testing have emerged from China.
The Chinese authorities in Beijing have introduced AI-enabled thermal-imaging cameras, developed by Megvii, in crowded places such as train stations and airports to help identify people with a high temperature, a symptom of Covid-19. Even at a distance of more than three metres in a crowded location, with people wearing masks or hats, the system can rapidly identify the forehead and recognise if a person is giving off too much heat. It then uses image recognition to flag them to an official, who can check their temperature manually. It’s no substitute for a full test, but it works quickly and at a distance.
At the other end of the spectrum, a deep learning model has been used to accurately identify cases of Covid-19 from CT scans of patients’ chests. In a study published in March 2020, a neural network, known as COVNet, was able to examine 4,300 CT scans and accurately distinguish between patients with Covid-19 and other community-acquired pneumonia and lung diseases.
Several Chinese companies have developed similar CT scan recognition technologies, honed in Wuhan. These include Infervision, which has recently been deployed in an Italian hospital. A Canadian startup, DarwinAI, recently made its CT scan reading technology open source.
Prior to testing and even to symptoms, there is a period when the patient is unknowingly contagious and can pass on the virus. The standard way to deal with this is through contact tracing, to establish to whom the infected person could have passed the disease and alert, test, treat and/or isolate those contacts.
Contact tracing was a big part of the containment strategy in Singapore. Once a person tests positive, Singapore interviews the infected person and attempts to track every person they have interacted with in the one to two weeks prior to testing positive. Initially, this appears to have been a largely manual process, but in March 2020 the country introduced a phone app called TraceTogether (now available open source) which uses Bluetooth to log all close interactions with other app users. If one app user develops Covid-19, all at-risk individuals and the authorities can be alerted.
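The logic of app-based tracing can be sketched in a few lines of Python: keep a log of close encounters and, when someone tests positive, look up everyone they met in the preceding days. This is a simplified illustration, not TraceTogether’s actual design; the log format, user names and 14-day window are assumptions made for the example.

```python
from datetime import datetime, timedelta


def at_risk_contacts(encounters, infected_user, tested_on, window_days=14):
    """Return the set of users who met infected_user within the window.

    `encounters` is a list of (user_a, user_b, seen_at) tuples, standing in
    for the close interactions an app might log over Bluetooth.
    """
    window_start = tested_on - timedelta(days=window_days)
    contacts = set()
    for user_a, user_b, seen_at in encounters:
        if not (window_start <= seen_at <= tested_on):
            continue  # encounter falls outside the period of interest
        if user_a == infected_user:
            contacts.add(user_b)
        elif user_b == infected_user:
            contacts.add(user_a)
    return contacts


# Invented encounter log for illustration only.
encounter_log = [
    ("alice", "bob", datetime(2020, 3, 20)),
    ("carol", "alice", datetime(2020, 3, 25)),
    ("alice", "dave", datetime(2020, 2, 1)),   # too long ago to count
    ("bob", "carol", datetime(2020, 3, 22)),   # does not involve alice
]
to_alert = at_risk_contacts(encounter_log, "alice", datetime(2020, 3, 28))
```

In this example only `bob` and `carol` would be alerted, because the meeting with `dave` took place before the 14-day window.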
The extent to which Korea’s comprehensive contact tracing system uses AI is also unclear. However, this paper shows that Korea uses personal data records, including hospital and pharmacy visits, GPS data, credit card transactions and CCTV, which is allowed under special rules enacted following the MERS outbreak in 2015.
From early February 2020, China rapidly rolled out a Close Contact Detector app across the country to control Covid-19. It works on a traffic light system. At railway stations, venues, and so on, you have to scan the app or officials check the app and only allow people in if the app shows a green light. The system behind the app is shrouded in secrecy, but it appears to rely on some sophisticated AI.
An expat resident tells Computer Weekly that on returning to Shanghai Airport from abroad in February, he and his companion had to download the app. They then took a taxi home. Just 15 minutes after returning to their apartment, health officials and police knocked on the door. They took their temperatures and politely explained they must self-quarantine for two weeks and asked them to sign documents saying they understood.
“In the morning, I went to buy a coffee nearby,” he explains, not having understood the strictness of rules. “Within an hour, the police and health officials were knocking on the door again. They knew exactly where I had been from the phone.
“They didn’t fine me. They explained it was my civic duty and warned me not to do it again. I apologised and thanked them. It was a bit scary, but I fully support it and I’m happy they did it. This is how they track the virus. It works. It has helped China win the battle against Covid-19,” he adds.
Information and control of misinformation
In any disaster it is essential to get the correct information to citizens, data to organisations, and curtail fake news and scams. Bad information can kill, as demonstrated by the hundreds of people who died unnecessarily in Iran from drinking methanol, believing it to be a coronavirus cure.
AI can help provide correct information and curtail the dissemination of the bad. Google, Facebook and other search and social media giants have tweaked their algorithms and pumped up the lie detectors on their platforms in an effort to promote legitimate information and eliminate misinformation. For searches related to Covid-19, Google surfaces data from national governments and health organisations, rather than the usual popular posts and paid-for messages from advertisers.
An interesting example of many new information services is the WhatsApp Health Alert developed by Praekelt.org for South Africa and now rolled out by the WHO. It is a multi-language service using machine learning and natural language understanding to answer users’ questions and steer them to the best resources. The WHO service attracted 12 million users in the first week.
The level of data sharing by governments, agencies, hospitals, research institutions and all manner of organisations is unprecedented. This enables the build of innovative data-led services, including the king of Covid-19 stats, Worldometer. It also supports the researchers who are modelling the outbreak projections, the medical researchers who are striving to devise and test new treatments, and the computer scientists who are building the AI tools that will facilitate them all.
One of the key and cutting-edge ways that AI is used in healthcare is computational drug repurposing. In this process, researchers use deep learning technologies to search through huge databases of existing drugs – such as Drugbank – many approved by the US Food and Drug Administration, to find potential remedies to new problems. The hope is that AI can be trained to find viral inhibitors – either vaccines or treatments – in the same way that researchers at MIT used deep neural networks to find a potential new antibiotic to fight bacterial infections such as E. coli.
AI can help predict if potential drugs will prevent the virus binding with human cells, if the drug is likely to be toxic to human cells, and if it could cause a dangerous interaction with other common drugs, thereby helping to pre-screen potential drugs before lab testing.
Many labs globally are working on developing AI or using AI to investigate and test potential drugs. These include the MIT lab behind the aforementioned antibiotic. There are many pre-published papers – that is, with no peer review – where researchers have claimed drug discoveries using AI such as this one from Insilico and this one from Michigan State University.
Being prepared for next time
In the years to come, analysis of the Covid-19 outbreak and national and global responses will be extensive and possibly damning. One positive thing to come from this will undoubtedly be the recognition of the role that AI plays and should play in preparedness for and dealing with global epidemics.