A technological breakthrough could help address the problem of artificial intelligence 'hallucination', in which AI models, including chatbots, sometimes make things up or present false answers.
AI is beginning to be used widely in scientific projects – but models sometimes present misleading or downright false results.
A new technique called prediction-powered inference (PPI) uses a small amount of real-world data to correct the output of large, general models.
The technique can 'check facts' in AI models such as AlphaFold, which predicts protein structures, in the context of specific scientific questions.
Michael Jordan, the Pehong Chen distinguished professor of electrical engineering and computer science and of statistics at the University of California, Berkeley, said: "These models are meant to be general – they can answer many questions, but we don't know which questions they answer well and which questions they answer badly. And if you use them naively, without knowing which case you're in, you can get a bad answer.
"With PPI, you're able to use the model, but correct for possible errors, even when you don't know the nature of those errors at the outset."
What is AI hallucination?
'Hallucination' is when an AI model, such as ChatGPT, outputs false information based on perceiving patterns that are not there.
Famously, Google's Bard chatbot falsely claimed that the James Webb Space Telescope was the first to capture an image of a planet outside the solar system.
Meta withdrew its science-focused AI model Galactica within days of its 2022 launch because of its habit of fabricating information.
What can AI do in science?
Over the past decade, AI has permeated nearly every corner of science.
Machine-learning models have been used to predict protein structures, estimate the fraction of the Amazon rain forest that has been lost to deforestation and even classify faraway galaxies that might be home to exoplanets.
How does this breakthrough help?
The problem is that machine-learning systems carry many hidden biases that can skew their results.
These biases arise from the data on which the systems are trained – generally existing scientific research, which may not have had the same focus as the current study.
Jordan said: "In scientific problems, we're often interested in phenomena which are at the edge between the known and the unknown.
"Very often, there isn't much data from the past that are at that edge, and that makes generative AI models even more likely to 'hallucinate', producing output that is unrealistic.
"There's really no limit on the type of questions that this approach could be applied to. We think that PPI is a much-needed component of modern data-intensive, model-intensive and collaborative science."