Bias in Medical AI Must Be Addressed to Prevent Exacerbation of Health Disparities

The application of artificial intelligence (AI) in health care holds immense promise but, if not carefully implemented, carries significant risk of perpetuating and exacerbating existing health disparities, according to an article published in PLOS Digital Health.

“Here, we highlight how biases in medical AI—especially in applications that involve clinical decisions—occur and compound throughout the AI development pipeline,” explained James Cross, of the Yale School of Medicine in New Haven, Connecticut, and coauthors.

Biases often originate from imbalanced or missing data during AI training, particularly the underrepresentation of racial and ethnic groups. For instance, AI models predicting melanoma from skin images often fail for darker skin tones because training datasets predominantly contain light-skinned patients. Similarly, AI trained on fragmented patient records, where data are split across health care systems, may systematically underestimate risks for certain populations, such as those with lower socioeconomic status. Publication bias further amplifies the problem: medical AI studies disproportionately feature data from wealthier nations, focus on narrow specialties such as radiology, and favor positive results, overlooking failures that might improve future tools. These biases diminish the accuracy and equity of AI-driven health care recommendations, leaving underserved groups vulnerable to worse clinical outcomes.
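
To make the training-data problem concrete, the minimal sketch below audits a classifier's accuracy separately for each subgroup. The arrays, skin-tone group labels (Fitzpatrick-style bands), and error rates are entirely hypothetical illustrations, not data or methods from the paper; the point is that a single overall accuracy figure can mask much worse performance on an underrepresented group.

```python
# Hypothetical subgroup performance audit (not the authors' method):
# overall accuracy can look strong while an underrepresented group fares poorly.
import numpy as np

def accuracy_by_group(y_true, y_pred, groups):
    """Report sample count and accuracy for each subgroup separately."""
    results = {}
    for g in np.unique(groups):
        mask = groups == g
        results[str(g)] = {
            "n": int(mask.sum()),
            "accuracy": float((y_true[mask] == y_pred[mask]).mean()),
        }
    return results

# Toy data: 1,000 images, heavily skewed toward lighter skin tones.
rng = np.random.default_rng(0)
groups = rng.choice(["I-II", "III-IV", "V-VI"], size=1000, p=[0.80, 0.15, 0.05])
y_true = rng.integers(0, 2, size=1000)

# Simulate a model that errs far more often on the underrepresented group.
error_rate = np.where(groups == "V-VI", 0.35, 0.08)
flip = rng.random(1000) < error_rate
y_pred = np.where(flip, 1 - y_true, y_true)

print("overall accuracy:", float((y_true == y_pred).mean()))
for group, stats in accuracy_by_group(y_true, y_pred, groups).items():
    print(group, stats)
```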

Beyond training, bias persists through model implementation. Developers who optimize models solely for overall performance may inadvertently worsen predictions for underrepresented groups. In real-world use, models can deteriorate when deployed in different hospital settings or regions, as seen with sepsis prediction tools such as the Epic Sepsis Model. Even after rigorous model development, bias can emerge in how clinicians interact with AI tools: physicians may hesitate to trust AI recommendations that lack clear interpretability, or may apply recommendations inconsistently when interfaces are convoluted or add to their workflow burden.
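
One way to catch the kind of cross-site deterioration described above is routine post-deployment monitoring. The sketch below is a simplified, hypothetical calibration check: it compares each site's mean predicted risk with its observed event rate and flags divergence. The site names, tolerance, and data are illustrative assumptions, not details from the article or the Epic Sepsis Model.

```python
# Hypothetical post-deployment drift check: compare predicted vs. observed
# event rates per site (a simple calibration-in-the-large test).
import numpy as np

def site_drift_report(scores_by_site, outcomes_by_site, tolerance=0.05):
    """Flag sites where mean predicted risk diverges from the observed
    event rate by more than `tolerance`."""
    report = {}
    for site in scores_by_site:
        predicted = float(np.mean(scores_by_site[site]))
        observed = float(np.mean(outcomes_by_site[site]))
        report[site] = {
            "predicted_rate": round(predicted, 3),
            "observed_rate": round(observed, 3),
            "drifted": abs(predicted - observed) > tolerance,
        }
    return report

rng = np.random.default_rng(1)
# Site B's score distribution has shifted upward relative to actual outcomes.
scores = {"hospital_A": rng.uniform(0.0, 0.3, 500),
          "hospital_B": rng.uniform(0.2, 0.6, 500)}
outcomes = {"hospital_A": rng.random(500) < 0.15,
            "hospital_B": rng.random(500) < 0.15}
print(site_drift_report(scores, outcomes))
```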

To address these challenges, the researchers emphasize collecting diverse, representative datasets, applying statistical debiasing techniques, and improving AI transparency through interpretability tools. Real-world validation and feedback mechanisms are also critical to ensure ongoing monitoring of performance and fairness across patient groups. Additionally, emerging technologies such as large language models (LLMs) require careful oversight to prevent them from perpetuating misinformation or producing inconsistent clinical recommendations.
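
As one concrete instance of a statistical debiasing technique, the sketch below implements inverse-frequency sample reweighting, so that underrepresented groups contribute proportionally to a model's training loss. The article discusses debiasing in general terms; this particular method and the group labels are assumptions chosen for illustration.

```python
# Hypothetical inverse-frequency reweighting: samples from rare groups
# receive proportionally larger weights, normalized to mean 1.
import numpy as np

def inverse_frequency_weights(groups):
    """Weight each sample by 1 / (fraction of its group)."""
    values, counts = np.unique(groups, return_counts=True)
    freq = dict(zip(values, counts / len(groups)))
    weights = np.array([1.0 / freq[g] for g in groups])
    return weights / weights.mean()

groups = np.array(["A"] * 900 + ["B"] * 100)
w = inverse_frequency_weights(groups)
print(w[groups == "A"][0], w[groups == "B"][0])  # minority samples weigh more
```

Weights like these can typically be supplied through a sample_weight argument, which many training APIs (most scikit-learn estimators, for example) accept at fit time.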

“Debiasing medical AI models will prove crucial in preventing the perpetuation and exacerbation of health disparities and ensuring all patients benefit equally from the future of medical AI,” the study authors concluded.

Reference

Cross JL, Choma MA, Onofrey JA. Bias in medical AI: implications for clinical decision-making. PLOS Digit Health. 2024;3(11):e0000651. doi:10.1371/journal.pdig.0000651