How AI Analytics helps fight COVID-19

The onset of a global pandemic has compelled policy-makers to push the envelope in technology adoption as a way to navigate uncharted waters.

In an interconnected and highly mobile world, public health organizations are forced to accelerate reaction speeds, rethink the scope of international coordination and leverage progressive technologies in an effort to contain infection and limit its devastating impact.

Through a combination of AI and high-resolution data, government bodies, municipalities and health organizations can rely on advanced analytics to get a stronger sense of what drives the spread of the virus, which policies are proving effective, and how the return to normal can be introduced safely and responsibly.

Through SparkBeyond’s collaboration with government bodies in multiple countries, we’ve learned that data-driven policy and action fall mainly into three categories:

  1. Prediction of geospatial risk: An attempt to determine which regions bear a higher risk of infection based on known cases.
  2. Deployment of resources: An effort to prioritize the deployment of sanitization resources, pop-up testing and police presence.
  3. The ‘Return to Normal’: Addressing the challenge of allowing citizens out of lockdown and back to work, safely and responsibly, in an effort to help minimize the impact upon the economy.

The role that AI and data analytics play in these domains is becoming increasingly decisive, as governments transition from ‘blanket’ restrictions as a first attempt to ‘flatten the curve’, to a more nuanced approach that considers risk and opportunity in multiple dimensions.

We can list several clear use cases for advanced analytics in that regard, along with thoughts on the biggest challenge we are facing: getting our hands on the right data for the task.

Understanding the spread of infection by region

The spread of a pandemic is exponential by nature, yet different areas under different regulations may show different growth patterns. It’s imperative to understand these patterns in order to gain insights into the disease, predict its spread, and assess the effectiveness of social distancing measures.

What data do we need for this?

The most critical aspect of this approach is the number of confirmed cases by area. The more granular the data, the better. Using advanced analytics, we can combine this high-resolution data with census, demographics, weather, and mobility data from ad-based cell tracking. Using this information, we can then try to predict the exponential growth factor for each area. It’s likely that the time frame where the prediction is most accurate will be the typical time from infection to diagnosis, which is in itself a valuable insight.
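As a rough illustration of what estimating a per-area growth factor could look like, the sketch below fits a straight line to the logarithm of cumulative case counts and converts the slope into a daily growth factor. The area names and case counts are made up for the example, and the function name `daily_growth_factor` is hypothetical; a real model would also bring in the census, weather and mobility data mentioned above.

```python
import math

def daily_growth_factor(cumulative_cases):
    """Least-squares fit of log(cases) against the day index.

    Returns exp(slope), i.e. the factor by which cases multiply per day
    under a simple exponential-growth assumption.
    """
    points = [(day, math.log(c)) for day, c in enumerate(cumulative_cases) if c > 0]
    if len(points) < 2:
        raise ValueError("need at least two days with non-zero counts")
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    return math.exp(sxy / sxx)

# Illustrative (made-up) cumulative confirmed cases per area, one value per day.
cases_by_area = {
    "area_a": [10, 20, 40, 80, 160],
    "area_b": [10, 11, 12, 13, 15],
}

growth = {area: daily_growth_factor(series) for area, series in cases_by_area.items()}

for area, factor in growth.items():
    doubling_days = math.log(2) / math.log(factor)
    print(f"{area}: x{factor:.2f} per day, doubling every {doubling_days:.1f} days")
```

Comparing the fitted factors across areas gives a first, crude view of where growth is fastest; it says nothing yet about why.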

Predicting complications

Most people infected by COVID-19 will experience mild illness or possibly no symptoms at all. Understanding who is at risk is important in order to better isolate them, prioritize treatment, and decide on how closely they need to be monitored if diagnosed with a mild case of COVID-19. This insight could allow healthcare professionals to better allocate medical support and redirect patients across hospital networks, ensuring that no one hospital is suddenly overrun by multiple high-risk patients.

What data do we need for this?

Comprehensive medical history data, using a combination of current COVID-19 patients and historical data from other respiratory illnesses, can help accurately predict the potential for complications. The goal is to predict the risk of serious complications, such as the need to intubate. Advanced analytics allows us to take old data, reweight it to match new data, and build models which demonstrate accuracy on both old and new data.
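One simple way to reweight old data so that it resembles new data is stratified importance weighting: each historical record gets a weight equal to the share of its stratum in the new data divided by its share in the old data. The sketch below assumes age bands as the only stratum and uses made-up counts; it is an illustration of the general idea, not a description of SparkBeyond’s method.

```python
from collections import Counter

def stratum_weights(old_strata, new_strata):
    """Weight per stratum = share in new data / share in old data."""
    old_counts = Counter(old_strata)
    new_counts = Counter(new_strata)
    n_old, n_new = len(old_strata), len(new_strata)
    return {
        stratum: (new_counts.get(stratum, 0) / n_new) / (count / n_old)
        for stratum, count in old_counts.items()
    }

# Made-up age bands: historical respiratory patients vs. current COVID-19 patients.
old = ["<40"] * 6 + ["40-64"] * 3 + ["65+"] * 1
new = ["<40"] * 2 + ["40-64"] * 3 + ["65+"] * 5

weights = stratum_weights(old, new)

# One weight per historical record, ready to pass to a model as sample weights.
record_weights = [weights[band] for band in old]
print(weights)
```

Here the single historical patient aged 65+ is weighted up, because that age band is far more common among the new patients than in the historical data.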

In this case, SparkBeyond’s time-series capabilities not only find novel features, but pinpoint the best form of typically ‘obvious’ features.

Identifying ‘Super Spreaders’

Social graphs (i.e., visualizations of our interactions with places and each other) tend to lead to power-law distributions, revealing surprising correlations between disparate factors. This is also true for the spread of diseases. In these cases, the few individuals who infect a large number of people are called “super spreaders”. In order to truly reduce the spread of COVID-19, we need to limit the movement and contagion of these super spreaders.

Taking this one step further, when deciding on testing the general population, it isn’t enough to predict who may fall ill. Advanced analytics empowers us to think ahead and consider who may be a potential super spreader. Then, if data is made available on confirmed or probable infected patients, we can build models to map and understand their general mobility.

What data do we need for this?

In order to identify super spreaders and the probable infection chain, granular mobility data (e.g., widespread cell-phone tracking) is required. It is important to note that privacy can be maintained with automatic anonymization.
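Once a probable infection chain is available, a first check for super-spreader behaviour is to count secondary cases per person and see how concentrated they are. The sketch below assumes the chain is given as anonymized (infector, infected) pairs; the identifiers and the helper names are hypothetical and the data is made up.

```python
from collections import Counter

def secondary_case_counts(transmission_pairs):
    """Count how many people each infector is believed to have infected."""
    return Counter(infector for infector, _ in transmission_pairs)

def top_share(counts, top_n):
    """Fraction of all recorded transmissions caused by the top_n infectors."""
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return sum(c for _, c in counts.most_common(top_n)) / total

# Made-up, anonymized infection chain: (infector, infected).
pairs = [
    ("p1", "p2"), ("p1", "p3"), ("p1", "p4"), ("p1", "p5"),
    ("p1", "p6"), ("p1", "p7"), ("p1", "p11"),
    ("p2", "p8"), ("p5", "p9"), ("p9", "p10"),
]

counts = secondary_case_counts(pairs)
share = top_share(counts, top_n=1)
print(counts.most_common(3), f"top infector share: {share:.0%}")
```

A heavily skewed share, where one or two individuals account for most transmissions, is the pattern the power-law discussion above describes.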

Applying and easing restrictions

The effects of policy changes on disease spread only become evident a week or two later. Policy makers need at least partial feedback at a much faster rate, so they can react efficiently and effectively to a changing situation. A key driver of the COVID-19 spread is people’s mobility: Who is moving? How much are they moving? Where are people congregating? Who is staying at home?

While simple geospatial analysis can give you unrefined hot/cold regions of movement, actual human behaviour is much more nuanced. Characterizing the ebb and flow in people’s movement over the last day or two, and providing a readable summary of what is happening can dramatically shorten a policy-maker’s response time to change. Automatic geospatial insight search can be used not only to predict but to summarize the key phenomena in these movements, and prevent decision makers from drowning in the details of location data.

What data do we need for this?

In order to monitor movement changes in real-time, up-to-date location data on a large sample of the population is required. This can be derived from cell-tower information or ad-based tracking, for example.
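To make the idea of a readable movement summary concrete, the sketch below compares the last two days of movement in each region against the preceding days and reports only the regions whose change exceeds a threshold. The region names, trip counts, the 20% threshold and the function names are all assumptions chosen for illustration.

```python
def mobility_change(daily_trips, recent_days=2):
    """Relative change of the recent average versus the earlier baseline average."""
    baseline = daily_trips[:-recent_days]
    recent = daily_trips[-recent_days:]
    if not baseline or not recent:
        raise ValueError("need data for both the baseline and the recent window")
    base_mean = sum(baseline) / len(baseline)
    recent_mean = sum(recent) / len(recent)
    return (recent_mean - base_mean) / base_mean

def summarize(trips_by_region, threshold=0.2):
    """One line per region whose movement changed by at least `threshold`."""
    lines = []
    for region, series in sorted(trips_by_region.items()):
        change = mobility_change(series)
        if abs(change) >= threshold:
            direction = "up" if change > 0 else "down"
            lines.append(f"{region}: movement {direction} {abs(change):.0%} vs. baseline")
    return lines

# Made-up daily trip counts per region (oldest first).
trips_by_region = {
    "north": [100, 100, 100, 100, 50, 50],
    "south": [100, 100, 100, 100, 105, 95],
    "east": [80, 80, 80, 80, 120, 120],
}

summary = summarize(trips_by_region)
print("\n".join(summary))
```

Regions with no meaningful change are left out on purpose, so the reader sees only what has shifted in the last day or two.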



Prioritizing populations for testing

In many countries, only a low percentage of test results are positive, while it’s assumed that just a fraction of SARS-CoV-2 carriers have been identified. Many complain that testing is delayed or simply denied altogether. Prompt testing can place the correct people in isolation earlier and get the sick hospitalized sooner, or at least put under close remote monitoring. This can reduce both the spread of the virus and many of its complications.

Current decisions on testing are based on clinical symptoms and the patient’s ability to recall concrete contact with a known carrier. Obtaining more information can allow officials to make data-driven decisions on whom to test, and to confirm more positive patients sooner with the same testing capacity.

What data do we need for this?

There are many potential data sources, but the most critical one is patient data: specifically, who has tested positive for the virus. One simple approach would be to gather data on tested individuals and build a model that predicts test results. Additional supporting data such as residential or work address, occupation, age, medical history and other detailed variables can help form a clearer overall picture.
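As a deliberately simple stand-in for a predictive model, the sketch below estimates a smoothed positivity rate per group from past test results and ranks untested candidates by that rate. Grouping by occupation alone, the smoothing constants, the helper names and all of the data are assumptions for illustration; a real model would use many more of the variables listed above.

```python
from collections import defaultdict

def positivity_by_group(tested, prior_pos=1.0, prior_neg=1.0):
    """Smoothed share of positive tests per group.

    `tested` is an iterable of (group, is_positive) pairs. The priors act as
    pseudo-counts so that small groups are not ranked on one or two results.
    """
    counts = defaultdict(lambda: [0, 0])  # group -> [positives, total tests]
    for group, positive in tested:
        counts[group][0] += int(positive)
        counts[group][1] += 1
    return {
        group: (pos + prior_pos) / (total + prior_pos + prior_neg)
        for group, (pos, total) in counts.items()
    }

def rank_candidates(candidates, rates, default_rate=0.5):
    """Sort (candidate_id, group) pairs by estimated positivity, highest first."""
    return sorted(candidates, key=lambda c: rates.get(c[1], default_rate), reverse=True)

# Made-up past test results, grouped by occupation.
tested = [
    ("healthcare", True), ("healthcare", True), ("healthcare", False),
    ("office", False), ("office", False), ("office", True), ("office", False),
    ("remote", False), ("remote", False),
]
rates = positivity_by_group(tested)

# Made-up candidates waiting for a test.
candidates = [("c1", "remote"), ("c2", "healthcare"), ("c3", "office")]
ranked = rank_candidates(candidates, rates)
print(ranked)
```

With a fixed number of tests per day, working down such a ranking is one way to aim the available capacity at the people most likely to test positive.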

Instead of predicting test results, an alternative approach would be to model a positive diagnosis versus the general population. In this case, individual movement-tracking data can be a valuable resource, although privacy concerns must be taken into consideration. Accordingly, anonymization of such data is key.

Another option would be to obtain data from calls to the national ‘Corona Hotline’; transcribe the conversations or work with existing interaction summaries. Understanding how patients with positive test results describe themselves gives greater context. The potential to accurately predict a positive diagnosis for each caller creates actionable intelligence now. This provides government and healthcare officials with quicker insights into who should get sent to a drive-through testing facility, versus who can afford to wait for elective testing.

Securing lives and livelihoods

As the COVID-19 virus continues to spread globally and impact each and every one of us, the ability to use advanced analytics to understand and predict not only allows policy makers to navigate complex challenges with confidence, but can actually secure the lives and livelihoods of millions.

Learn more about how SparkBeyond is powering the global response to COVID-19 here.
