Researchers at Massachusetts General Hospital say that spotting bias in artificial intelligence and machine learning requires a holistic evaluation – and that models can be biased against certain groups while simultaneously performing better for others.
“Despite the eminent work in other fields, bias often remains unmeasured or partially measured in healthcare domains,” observed the researchers in the study, which was published this week in the Journal of the American Medical Informatics Association.
“Most published research articles only provide information about very few performance metrics,” they added. “The few studies that officially aim at addressing bias usually utilize single measures … that do not portray a full picture of the story on bias.”
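To illustrate why a single measure can hide bias, here is a minimal Python sketch that compares several error rates for each subgroup against the overall population. It is not the study's method: the function names `confusion_rates` and `subgroup_gaps`, the group labels "A" and "B", and the eight toy records are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def confusion_rates(pairs):
    """Compute several rates from a list of (y_true, y_pred) binary pairs."""
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)

    def ratio(num, den):
        # Return NaN instead of raising when a group has no such cases.
        return num / den if den else float("nan")

    return {
        "accuracy": ratio(tp + tn, len(pairs)),
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "false_positive_rate": ratio(fp, fp + tn),
        "false_negative_rate": ratio(fn, fn + tp),
    }


def subgroup_gaps(records):
    """For (group, y_true, y_pred) records, return each group's metric minus the overall metric."""
    records = list(records)
    overall = confusion_rates([(t, p) for _, t, p in records])
    by_group = defaultdict(list)
    for group, t, p in records:
        by_group[group].append((t, p))
    return {
        group: {name: value - overall[name]
                for name, value in confusion_rates(pairs).items()}
        for group, pairs in by_group.items()
    }


# Hypothetical toy data (not from the study): (group, true outcome, predicted outcome).
toy_records = [
    ("A", 1, 1), ("A", 1, 1), ("A", 0, 0), ("A", 0, 1),
    ("B", 1, 0), ("B", 1, 1), ("B", 0, 0), ("B", 0, 0),
]

gaps = subgroup_gaps(toy_records)
for group, metrics in sorted(gaps.items()):
    print(group, {name: round(delta, 2) for name, delta in metrics.items()})
```

In this toy data both groups have the same accuracy as the overall population, yet group B has lower sensitivity and a higher false negative rate while group A has a higher false positive rate. A report that listed accuracy alone would miss both differences.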
WHY IT MATTERS
For this particular study, the researchers examined four validated prediction models of COVID-19 outcomes in an effort to investigate whether they were biased when developed or whether the bias changed over time.
Analyzing data over time from 56,590 patients who tested positive for COVID-19, the researchers did not find consistently biased behavior against all underrepresented groups, although they did find higher individual-level error rates for older patients.
“Compared to the overall population, retrospectively and prospectively across time, the models marginally performed worse for male patients and better for Latinx and female patients,” read the study. For the remaining groups, performance was more mixed, they said.
They also noted a decline in performance around July 2021.
“The models that were developed with data from March to September 2020 provided relatively stable predictive performance prospectively up until May–June 2021,” they said. They noted this could be due to increased vaccination rates or increased capacity compared to the beginning of the pandemic.
Still, “despite the increased variability, the prospective modeling performance remained high for predicting hospitalization and the need for mechanical ventilators,” they said.
Overall, they concluded that medical AI bias is multifaceted and requires multiple perspectives to be practically addressed. “Nevertheless, the first step for addressing the bias in medical AI is to identify bias in a way that can be traced back to its root,” they said.
THE LARGER TREND
Merely observing that bias exists isn’t sufficient, they say. Steps must be taken to address it, and that can be a complicated endeavor. Some say hiring diverse teams is key, while others point to an “algorithmic nutritional label.”
Ultimately, when bias is effectively mitigated, said UC Berkeley’s Dr. Ziad Obermeyer in a 2021 interview, “we turn algorithms from tools that reinforce all of these ugly things about our healthcare system into tools that are just and equitable and do what we want them to do, which is help sick people.”
ON THE RECORD
“Addressing bias in medical AI requires a framework for a holistic search for bias, which can invigorate follow-up investigations to identify the underlying roots of bias,” said the researchers.