Racial differences in medical testing could introduce bias to AI models

But tweaking the models could help overcome biased data sets


Author | Derek Smith

Image credit: Justine Ross, Michigan Medicine

Black patients are less likely than white patients to receive certain medical tests that doctors use to diagnose severe diseases such as sepsis, researchers at the University of Michigan have shown.

Because of the differences in testing rates, some sick Black patients are assumed to be healthy in data used to train AI, and the resulting models likely underestimate illness in Black patients. But that doesn’t mean the data is unusable—the same group developed a way to correct for this bias in data sets used to train AI.

These new insights are reported in a pair of studies: one published today in PLOS Global Public Health and the other presented at the International Conference on Machine Learning in Vienna, Austria, in July 2024.

In the PLOS study, the researchers found that medical testing rates for white patients are up to 4.5% higher than for Black patients with the same age, sex, medical complaints and emergency department triage score, a measure of the urgency of a patient’s medical needs. 

The difference is partially explained by hospital admission rates, as white patients were more likely to be assessed as ill and admitted to the hospital than Black patients.

“If there are subgroups of patients who are systematically undertested, then you are baking this bias into your model,” said Jenna Wiens, U-M associate professor of computer science and engineering and corresponding author of the study.

“Adjusting for such confounding factors is a standard statistical technique, but it’s typically not done prior to training AI models. When training AI, it’s really important to acknowledge flaws in the available data and think about their downstream implications.”

The researchers found this bias in medical testing records from two sources: Michigan Medicine in Ann Arbor, Michigan, and the Medical Information Mart for Intensive Care (MIMIC), one of the most widely used clinical datasets for training AI.

The MIMIC dataset contains records of patients who visited the emergency department at Beth Israel Deaconess Medical Center in Boston.

“This research highlights the risks of using health data to train AI models without a comprehensive understanding of the data,” said Michael Sjoding, M.D., associate professor of pulmonary and critical care medicine at Michigan Medicine.

“Because of these apparent testing differences, an AI model might infer that Black patients are less sick than white patients and make predictions that are potentially biased.”

Computer scientists need to account for these biases so that AI can make accurate and equitable predictions of patient illness. 

One option is to train the AI model with a less biased dataset, such as one that only includes records for patients who have received diagnostic medical tests.

A model trained on such data might be inaccurate for less ill patients, however.

To correct the bias without omitting patient records, the researchers developed a computer algorithm that identifies whether untested patients were likely ill based on their race and vital signs, such as blood pressure. 

The algorithm accounts for race because the recorded health statuses of patients identified as Black are more likely to be affected by the testing bias.
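The article does not reproduce the authors' published algorithm, but one plausible reading of the idea it describes is sketched below: for each subgroup, untested patients whose vitals look at least as abnormal as those of tested, ill peers have their "healthy" label replaced with an illness probability estimated from those tested peers. All names, thresholds, and the similarity rule are invented for illustration.

```python
import numpy as np

def soft_relabel(vitals, tested, recorded_ill, group):
    """Hypothetical sketch of group-aware label correction.

    Within each subgroup, untested patients keep their recorded label
    unless tested patients with vitals at least as abnormal exist, in
    which case the untested patient's label becomes the illness rate
    observed among those tested peers. This mirrors the idea described
    in the article, not the authors' actual algorithm.
    """
    labels = recorded_ill.astype(float)
    for g in np.unique(group):
        tested_in_group = (group == g) & tested
        if not tested_in_group.any():
            continue
        for i in np.flatnonzero((group == g) & ~tested):
            # Tested patients in the same subgroup with vitals as
            # abnormal as (or more abnormal than) patient i's.
            similar = tested_in_group & (vitals >= vitals[i])
            if similar.any():
                labels[i] = recorded_ill[similar].mean()
    return labels
```

The soft labels could then be used directly as training targets, so an untested patient with alarming vitals contributes to the model as "probably ill" rather than "definitely healthy."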

The researchers tested the algorithm with simulated data, in which they introduced a known bias by relabeling patients identified as ill as “untested and healthy.” 
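That relabeling step can be sketched with synthetic data. This is a hypothetical illustration: the prevalence, vital-sign distributions, subgroup split, and 40% relabeling rate are invented, not taken from the study.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 10_000

# Ground truth: roughly 30% of patients are ill, and illness raises heart rate.
sick = rng.random(n) < 0.3
heart_rate = np.where(sick, rng.normal(110, 15, n), rng.normal(80, 15, n))

# Split patients into two subgroups; one is systematically undertested.
undertested_group = rng.random(n) < 0.5

# Researcher-imposed bias: 40% of truly ill patients in the undertested
# subgroup are never tested and therefore recorded as healthy.
untested = sick & undertested_group & (rng.random(n) < 0.4)
recorded_ill = sick & ~untested

print(f"true prevalence:     {sick.mean():.3f}")
print(f"recorded prevalence: {recorded_ill.mean():.3f}")
```

A model trained on `recorded_ill` sees only the biased labels, so it learns to call some genuinely ill patients in the undertested subgroup healthy, while the other subgroup's labels remain accurate.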

The researchers then used this dataset to train a machine learning model, the results of which were presented at the International Conference on Machine Learning. 

When the researcher-imposed bias was corrected with the algorithm, a textbook machine-learning model could accurately differentiate between patients with and without sepsis around 60% of the time. 

Without the algorithm, the biased data made the model’s performance worse than random.

The improved accuracy was on par with a textbook model that was trained on unbiased, simulated data in which everyone was equitably tested. 

Such unbiased datasets are unlikely to exist in the real world, but the researchers' approach allowed the AI to work about as accurately as in the idealized scenario despite being stuck with biased data.

“Approaches that account for systematic bias in data are an important step towards correcting some inequities in healthcare delivery, especially as more clinics turn toward AI-based solutions,” said Trenton Chang, a doctoral student in computer science and engineering and the first author of both studies.

Additional authors: Mark Nuppnau, Ying He, Keith E. Kocher and Thomas S. Valley.

Funding/disclosures: This work was supported by the National Heart, Lung, and Blood Institute (NHLBI R01 HL158626).

Paper cited: "Racial differences in laboratory testing as a potential mechanism for bias in AI: A matched cohort analysis in emergency department visits," PLOS Glob Public Health. DOI: 10.1371/journal.pgph.0003555


Media Contact: Public Relations, Department of Communication at Michigan Medicine

[email protected]

734-764-2220
