Not So Fast: AI-Driven Medical Research Needs Guardrails
Whether you’re a physician embracing artificial intelligence (AI) or one with a wait-and-see attitude, generative AI is helping with research, literature reviews, data analysis, and even the creation of clinical trial protocols. Scary or thrilling?
In a recent Medscape report on AI, 24% of respondents reported using AI to review and analyze medical literature and data. And, while there’s a lot of potential to do robust research using OpenAI’s ChatGPT bot — it can help with written sections and assist in tabulating and organizing data — this technology isn’t quite the research assistant of a doctor’s dreams — yet.
That’s because when ChatGPT is asked a question, it essentially makes guesses based on information gleaned from the internet.
“The real problem right now is to assume that the information delivered by ChatGPT is correct given that the information is so well-written,” said Corey Maas, MD, a facial plastic surgeon, founder of The Maas Clinic in San Francisco, and the former president of the American Academy of Facial Plastic and Reconstructive Surgery. “That’s not always the case.”
Clinical references, which are key to medical research, are another vital component often missing when using AI.
“It’s so important to have the background research and references for all of the facts that are stated in a study or in an article,” Maas said. “In ChatGPT, there aren’t any references unless someone looks those up and puts them in. That’s scary, especially because you may be getting bad information.”
Specificity Matters
That said, the more clearly defined your research question is, the more success you’ll have using AI, Gary Franklin, MD, MPH, a research professor of environmental and occupational health sciences and neurology at the University of Washington, Seattle, and medical director of the state’s workers’ compensation system, told Medscape Medical News.
“If you’re doing bench research and ask it [AI] a question like, ‘please tell me what the latest scientific literature says on the impact of a certain drug on mice,’ it probably could do a good job of pulling the articles that are appropriate to that question,” Franklin said. “The fact that you’ve said ‘mice’ and specified the question means that you’ve specified the exposure you’re looking for, and it might even specify what outcome you’re trying to find, such as evidence that this exposure causes mice to die.”
Be Skeptical — For Now
Still, experts believe that AI-guided research should be approached with caution because of the inherent flaws in these bots.
“There is a significant amount of risk involved with using AI for medical research,” said Warren D’Souza, PhD, co-director of the University of Maryland Institute for Health Computing, Baltimore. “These AI tools are nowhere close to providing sound feedback for the research the provider may be seeking.”
That’s because these bots are highly susceptible to “hallucinations,” meaning they invent information that isn’t necessarily accurate.
“These AI models are trained, especially the large language models, on all kinds of information,” D’Souza added. “It’s hard for them to discern what’s good information and what’s bad information since it’s ingesting all of it. That’s why there should be a level of caution providers should exert when accessing such tools for research.”
Ultimately, just because an AI algorithm gives you an answer doesn’t mean it’s the right answer — though advancements are constantly being made.
“The algorithms themselves are improving by leaps and bounds every year,” D’Souza said. “There are additional mechanisms being put into place that allow some of these large language models to prioritize and glean info from reputable sources, such as peer-reviewed articles, versus things that are posted on Reddit or X.”
Stick With Reliable Research Sources
For the time being, use AI for some research-related tasks only if you do the extra work to validate the information it provides.
Researchers should still use established resources like PubMed or research generated by professional organizations for most of their work.
“PubMed is the Library of Congress for medicine,” Maas said. “It has every peer-reviewed journal that has been approved. It’s a big deal to be there, as it means you’re referenced and indexed, and your study is available for anyone to see online. It provides you with the most extensive database of background information for the research you’re doing.”
The same can’t be said for ChatGPT, which draws on far less data than is available on those platforms.
“To have the most accurate background research, you have to go to PubMed,” he added. “Everything is about proving a hypothesis, and PubMed is carefully vetted. To be referenced in PubMed means your study has been validated, and you can’t say the same thing about the information contained in AI sources right now.”
The same thing goes for using AI to access basic facts.
“If you’re looking at a fact, such as the side effects of a particular drug, AI is not the right way to go,” said Elmer Bernstam, MD, professor of biomedical informatics and internal medicine at UTHealth Houston. “You’ll want to use another resource, whether that’s an online textbook or information on the drug as per the FDA.”
In this moment of rapid technological change, it’s still best to access research-supporting data via published papers, searches on venerable platforms, and lectures.
“The way that AI can help is to try to summarize and filter the enormous amount of data and information out there that’s useful to you right now, given your specific situation and question,” Bernstam said. “How reliable that is remains an open question and is likely to change. The answer it [AI] gives today may not be the answer it gives tomorrow, and you have no way of knowing that.”
Because AI may not yet be the game changer it seems on the surface, old-fashioned data searching — and deep dives into prior research — is still the best way to proceed, our experts agree.
“I recently read an article that said that medical librarians are better than AI,” said Franklin. “The lesson is: Lean on trusted resources. We’re not there yet with a technology that outputs information in milliseconds.”
Lambeth Hochwald is a New York City–based journalist who covers health, relationships, trends, and issues of importance to women. She’s also a longtime professor at NYU’s Arthur L. Carter Journalism Institute.