Epidemiologists swim in data every day, but the value of that work shines brightest during times of crisis.
The COVID-19 pandemic spotlighted the role of data in protecting public health — and what is possible with more resources dedicated to data modernization.
In some communities, state and city officials, school districts, and employers could access near real-time dashboards to see how the virus was spreading and make informed decisions to protect residents. In many others, health departments, government leaders, and other decision-makers were stymied by outdated technologies, inefficient systems, and data collection processes that didn’t scale to the enormity of the pandemic. The experience opened many eyes to the importance of modernized data systems and the possibilities of using data to make a real difference in public health outcomes for countless diseases and conditions.
That’s why most public health departments know it’s time to boost their data science capabilities. Using data science, epidemiologists can evaluate and translate qualitative public health issues into quantitative problems, analyze them using machine learning models, and take action. Through this data-driven approach, they create solutions that improve individual and community outcomes, enhance operational efficiency, and advance health research.
These are laudable goals, but how do you set up your data science program to achieve them? For the data scientists at Ruvos, getting started always begins with the first step in the scientific process: asking a question.
Getting Started: Define the Problem
One of the most common misperceptions about data science is that you need to start with the right technology tools. However, a cutting-edge artificial intelligence (AI) model or sleek dashboard is only useful if it produces the actionable insights you need to solve your specific challenges.
A good data scientist starts by defining the problem that needs to be solved. Ruvos’ data scientists took this approach when they launched an initiative to study PASC/long COVID. They wanted to answer the question, “How do we predict a patient’s risk of developing long COVID?” With that question in mind, they determined how to quantify the risk and then built a model to estimate it.
Only after developing answerable questions can you determine what data you need to test your hypothesis — and what technology and tools you need to collect, analyze, and interpret that data.
Creating a Well-Oiled Data Machine
Collecting and analyzing data can feel overwhelming when you start thinking about the common issues surrounding public health data. Too often, data is incomplete, outdated, siloed, or reported in inconsistent formats, making it difficult to draw valid conclusions and take action. The problems are so pervasive that some epidemiologists report spending 80% of their time wrangling data before they can begin analyzing it.
Luckily, today’s technology tools can drastically reduce the time epidemiologists spend collecting and cleaning up data. It’s possible to automate most of the mundane tasks associated with data collection and integration, freeing up time to focus on the jobs only humans can do: deriving actionable insights and developing strategies to improve public health. That’s what one state discovered when it set out to clean up its COVID-19 data.
Data Science in Action: How a State Department of Health Improved Lab Data Accuracy
During the pandemic, public health officials in a US state began noticing irregularities in the COVID-19 test results they were receiving from testing labs. The number of test results they received from each lab on a given day could vary between 10,000 and 100,000. Naturally, that vast range raised questions about the accuracy of the data. Were labs resubmitting the same data, making case numbers appear too high? Or were test results getting “clogged” somewhere, meaning the outbreak was actually more severe than it seemed?
The agency needed to get to the bottom of its data issue before it could identify where outbreaks were actually happening and develop mitigation strategies.
Combing through hundreds of thousands of lab reports daily to identify irregularities wasn’t an option for the state’s overextended epidemiologists. Instead, they turned to Ruvos to create an algorithm that would detect anomalies in the data flows. If a lab sent more or fewer results than were expected, the system flagged its data for closer inspection.
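The article does not say which method Ruvos used, so the following is only a minimal sketch of one common way to flag unexpected volumes. It assumes each lab has a short history of daily result counts and compares today's count against that history using a modified z-score built on the median and the median absolute deviation (MAD). The function names, the threshold of 3.5, the minimum history length, and the sample numbers are all hypothetical choices made for illustration.

```python
from statistics import median


def is_anomalous(history, today_count, threshold=3.5, min_days=5):
    """Return True if today's count is far from the lab's recent typical volume.

    Hypothetical helper: uses a modified z-score based on the median and the
    median absolute deviation (MAD), which a single extreme day distorts less
    than a mean and standard deviation would.
    """
    if len(history) < min_days:
        return False  # too little history to judge either way
    typical = median(history)
    mad = median([abs(count - typical) for count in history])
    if mad == 0:
        # History is essentially constant; any different count stands out.
        return today_count != typical
    score = 0.6745 * (today_count - typical) / mad
    return abs(score) > threshold


def flag_labs(history_by_lab, today_by_lab, threshold=3.5):
    """Return the sorted names of labs whose volume today needs a closer look.

    A lab with no submissions today is treated as having sent zero results.
    """
    flagged = []
    for lab, history in history_by_lab.items():
        today_count = today_by_lab.get(lab, 0)
        if is_anomalous(history, today_count, threshold):
            flagged.append(lab)
    return sorted(flagged)


# Illustrative data only, not real lab volumes.
history_by_lab = {
    "lab_a": [10000, 10500, 9800, 10200, 9900, 10100, 10300],
    "lab_b": [500, 520, 480, 510, 495, 505, 490],
}
today_by_lab = {"lab_a": 100000, "lab_b": 500}

print(flag_labs(history_by_lab, today_by_lab))  # prints ['lab_a']
```

A check like this catches both directions described above: a sudden spike that may point to resubmitted data and a sudden drop that may point to results stuck upstream. A production system would likely also need to account for weekday patterns and real changes in testing demand.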
Automating this simple but time-consuming task gave epidemiologists greater confidence in the data and let them focus on determining where and how to allocate resources. Improving the quality of this data pipeline is the first step in unlocking the power of data science to aid epidemiologists and state health officials.
Ruvos is now building automated analytical models and dashboards to enable the state’s Department of Health to forecast disease spread. With thoughtful leadership and continued investment in modernizing data systems, building analytic tools, and training public health practitioners to use and interpret data models, Ruvos envisions a future where predicting outbreaks and disease spread becomes as reliable as predicting the weather, so that pandemics like COVID-19 can be prevented early.
Embarking on Your Data Science Journey
While most public health agencies embrace the potential of data science, funding and talent shortages have kept many organizations from building their capabilities. Limited technology coursework in traditional epidemiology programs and a scarcity of computer science graduates interested in public health have led to a weak talent pipeline. Lacking staff with the right skill sets, organizations struggle to evaluate and implement the tools needed to break through data roadblocks, especially with today’s rapid pace of technological change.
The right partner can make all the difference. The Ruvos data science team brings together computer science, mathematics, and epidemiology experts to partner with public health agencies in building tools and capacities to get the most from their data. We work with public health agencies nationwide to understand and solve their data challenges, focusing on making data actionable and finding efficient, cost-effective solutions.
Ready to embark on your journey? To discuss how your organization can harness data science to drive actions that enhance public health outcomes, contact Ruvos today.