Using Data To Combat COVID-19

#FlattenTheCurve. #StayHome. #QuarantineLife. They’ve all been trending, and it seems like everybody knows that it’s time to stay at home, indoors, and as far apart (physically) as possible. We’re in the midst of a global pandemic, the worst we’ve seen in a century, yet we still see people out and about, on the streets, congregating in large groups, against the unambiguous advice of experts and health professionals.

Why jeopardize public safety? Is it due to a lack of information? A quick online search will produce results on how the virus is transmitted and how we can prevent its spread. We see live information about current infection rates, the number of mild and severe cases, and even the number of deaths. News media, blogs, and social media are teeming with COVID-19 reports and commentary. Why isn’t all that sinking in?

Let’s examine the communication breakdown from a more human perspective:

Two sides to every coin

Reporting numbers and statistics is only half the battle. The person receiving the information has to decode and interpret it, then decide for themselves what it means to them. That process is complicated by factors like confirmation bias and the inaccessibility of credible information (or, put another way, the relative ease with which one can find misinformation).

Confirmation bias

Confirmation bias is a type of cognitive bias where information that is consistent with our existing beliefs is weighted with greater importance and is more likely to be remembered than information that is inconsistent with our existing beliefs. For example, if you prefer the colour red to blue, and there are equally valid studies published on why red is better and why blue is better, you’re more likely to believe and remember the study that supports your love of the colour red.

Indeed, findings from a reputable medical journal confirm this bias. In a research study, Pascal Geldsetzer asked 3000 US and 3000 UK participants to answer 22 online survey questions. The survey covered the causes, current state, and future development of the COVID-19 epidemic, as well as participants’ knowledge of symptoms, of recommended healthcare-seeking behaviours, and of how to prevent COVID-19 infection. When asked who was most at risk of death if infected with COVID-19, all participants correctly identified that only a small proportion of people who come into contact with the virus will die, and that, if infected, older adults were most at risk. Interestingly, however, a significant proportion of participants (US 53.9%; UK 39.2%) also identified young children as being at high risk of death. This is inconsistent with information published by credible online sources, but in line with our common belief that young children are more vulnerable to infections and illness.

In another question, Geldsetzer asked participants to name the three most common symptoms of COVID-19. All participants were able to correctly identify fever, cough, and shortness of breath. However, when asked what to do if they believed they had come into contact with someone diagnosed with COVID-19, fewer than 65% of participants identified the right course of action. The guidance on COVID-19 tells us to stay home and phone a healthcare professional; but, since we have been taught to see a doctor when we’re sick, we rely on that wisdom first.

Because some of the facts about COVID-19, and some of the actions we need to take to contain it, go directly against our existing beliefs, many of us simply did not absorb the new information presented to us. Gaps in our knowledge may, in part, explain why some people haven’t adapted their behaviours properly.

An overflow of bad information

Another reason that not everybody adheres to the advice of medical professionals could be the imbalance of good and bad information available. While media sources such as the news, Facebook, and Twitter are vital for raising awareness about the pandemic, the information disseminated by these sources is not always accurate.

A study conducted by Pew Research Center found that while most Americans claim to be following the news at least ‘fairly closely,’ a substantial segment of American adults think that humans played a role in the creation of COVID-19. Nearly 29% of Americans think that COVID-19 was developed by humans, with 23% saying that it was developed intentionally and 6% saying that it was developed by accident. But you won’t find that in the news, because it’s totally unfounded. Where do these conclusions come from when there is no evidence to support them?

It is easy for people with little to no medical knowledge, or people with an ulterior agenda, to misunderstand or miscommunicate ideas and information. When researchers from Pew Research Center asked Americans when they thought a vaccine would be available, at least one in five said that one would be available in the next couple of months.

The ratio of reliable COVID-19 information to unreliable information is low. Misinformation is dangerous because it can influence people to behave in ways that are inconsistent with expert advice and that pose a risk to society at large.

Fighting bad info with good data

So how can we use data and data science to improve people’s knowledge and minimize their misconceptions about COVID-19?

Our team here at ThinkData Works has collected a massive repository of COVID-19 data and made it accessible through Namara. We are constantly updating and expanding the data available there to help do our part in curbing this pandemic.

Here are some proposed ways in which data and data science can be used to tackle COVID-19, and perhaps help us be more prepared for any future outbreaks:

  1. Use clustering algorithms for effective and targeted communications
    We’re better able now than ever to reach specific audiences. Every brand with wide reach has a responsibility to disseminate safety messages to its followers, tailored to specific demographic segments, and clustering algorithms can group an audience into such segments. In fact, many businesses already do this to optimize their marketing efforts; only in this case, the goal of segmentation is NOT to increase sales, but to encourage uptake of responsible behaviours to prevent the further spread of COVID-19.

  2. Use graph analysis to identify individuals closest to infected individuals
    Humans are social beings. Some would argue that our gregarious nature and our ability to work collaboratively in society are what have made the human race so successful. This, however, is exactly the type of behaviour we must avoid during the COVID-19 pandemic. Using graph analysis on social media data to identify friendship groups, we can find individuals who are “close” to an infected individual in terms of frequency and recency of communication, and invite them to take extra precautions. This early detection allows potentially infected individuals to take responsible actions like self-quarantine.

  3. Use natural language processing (NLP) to identify fake news
    By applying natural language processing to social media posts, researchers can separate fact from fiction and help reduce the spread of fake news related to the coronavirus.

  4. Use social media data mining to analyze people’s migratory patterns
    There’s a plethora of information that will allow us to predict people’s movements in the near future, including travel plans, destinations, and the number of travellers. Algorithms of this kind are typically used by financial institutions as anti-fraud measures; repurposing them may help medical professionals identify areas with higher population movement, and therefore regions at high risk of an outbreak.

  5. Use AI to model data from search engines to map outbreaks
    COVID-19 is just the latest in a growing list of modern diseases. To help medical professionals be proactive in containing deadly outbreaks, we can apply AI to model search engine data and map where outbreaks are occurring. This not only helps scientists understand the movement pattern of COVID-19, but also adds to the body of knowledge that will help scientists contain future outbreaks.
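
To make the second idea in the list above a little more concrete, here is a minimal Python sketch of how “closeness” by frequency and recency of communication might be scored on a small graph. Everything in it is an assumption made for illustration: the interaction log, the names, the seven-day half-life, and the cost threshold are hypothetical, and this is not a description of how Namara or any particular platform works. Each tie is scored as message count discounted by how long ago the last contact was, and Dijkstra’s shortest-path algorithm then finds everyone within a chosen “distance” of the infected individual, including friends of close friends.

```python
import heapq
from collections import defaultdict

# Hypothetical interaction log (an assumption for illustration only):
# (person_a, person_b, messages_exchanged, days_since_last_contact)
INTERACTIONS = [
    ("ana", "ben", 40, 1),
    ("ana", "cho", 5, 20),
    ("ben", "dee", 30, 2),
    ("cho", "eli", 2, 30),
]


def tie_strength(messages, days_since, half_life_days=7.0):
    """Frequency discounted by recency: strength halves every half_life_days."""
    return messages * 0.5 ** (days_since / half_life_days)


def build_graph(interactions):
    """Build an undirected graph whose edge cost is the inverse of tie strength."""
    graph = defaultdict(dict)
    for a, b, messages, days_since in interactions:
        cost = 1.0 / tie_strength(messages, days_since)
        graph[a][b] = cost
        graph[b][a] = cost
    return graph


def close_contacts(graph, source, max_cost):
    """Dijkstra's algorithm, pruned at max_cost.

    Returns {person: cost} for everyone reachable from source with a total
    path cost of at most max_cost, ordered from closest to farthest.
    """
    best = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        cost, person = heapq.heappop(heap)
        if cost > best.get(person, float("inf")):
            continue  # stale heap entry; a cheaper path was already found
        for neighbour, edge_cost in graph[person].items():
            new_cost = cost + edge_cost
            if new_cost <= max_cost and new_cost < best.get(neighbour, float("inf")):
                best[neighbour] = new_cost
                heapq.heappush(heap, (new_cost, neighbour))
    best.pop(source)
    return dict(sorted(best.items(), key=lambda item: item[1]))


if __name__ == "__main__":
    graph = build_graph(INTERACTIONS)
    # People to notify if "ana" tests positive, closest first.
    for person, cost in close_contacts(graph, "ana", max_cost=0.1).items():
        print(person, round(cost, 3))
```

With these made-up numbers and a threshold of 0.1, “ben” (frequent, recent contact) and “dee” (a close contact of ben) are flagged, while “cho”, who exchanged only a few messages weeks ago, is not. Inverting tie strength into a cost is one design choice among many; it simply lets a standard shortest-path search treat strong, recent ties as short distances.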

In summary, credible information about COVID-19 is widely available, but credible sources are not always the easiest to access. Misinformation and gaps in knowledge may in part be responsible for why people aren’t taking the necessary precautions during the COVID-19 pandemic. At ThinkData Works, we are collecting, validating, and processing COVID-19 data into one central repository so that researchers can spend more time running models and building predictions. We’re providing free access to anyone who can benefit from this data. We urge you to share this information and data with your network: the more people who have access to this data, the bigger the impact we can make.


Sources of COVID-19 data available on Namara:


We have built a repository for refined COVID-19 data that is constantly updated and accessible through Namara. This has been developed with Roche, and is free for any team to use, so that we can advance the global community’s efforts against the pandemic. For more information please read the press release here.