Artificial intelligence can generate misinformation when asked to answer medical questions, but it can be adapted to help doctors, a new study finds.
Researchers at Google tested the performance of a large language model, similar to ChatGPT, on its responses to multiple-choice exam questions and commonly asked medical questions.
They found that the model incorporated bias about patients, which could exacerbate health disparities and produce inaccurate answers to medical questions.
However, the version of the model Google developed specifically for the medical field avoided some of these negative effects, recording levels of accuracy and bias closer to those of a group of clinicians assessed alongside it.
The researchers think artificial intelligence could expand capabilities in the medical field by supporting clinicians in making decisions and accessing information faster, but say more development is needed before it can be used effectively.
Only 61.9% of the answers provided by the general-purpose model met scientific consensus, as judged by the clinician panel, compared with 92.6% of the answers provided by the medicine-focused model.
The latter figure was close to the 92.9% recorded for answers given by the clinicians themselves.
The general-purpose model was also more likely to produce answers rated as likely to lead to harmful outcomes (29.7%), compared with 5.8% for the medicine-focused model and 6.5% for clinician-generated answers.
Large language models are often trained on Internet texts, books, articles, websites, and other sources to develop a broad understanding of human language.
James Davenport, professor of information technology at the University of Bath, said the “elephant in the room” was the difference between answering medical questions and practicing medicine.
“Practicing medicine doesn’t involve answering medical questions – if it was purely about medical questions, we wouldn’t need teaching hospitals and doctors wouldn’t need years of training after their academic courses,” he said.
Anthony Cohen, professor of automated reasoning at the University of Leeds, said there was always a risk of models producing false information due to their statistical nature.
“Therefore [large language models] should always be seen as an aid rather than a final decision-maker, especially in a critical field such as medicine; indeed, ethical considerations make this especially true in medicine, where there are always issues of legal liability as well,” he said.
Professor Cohen added: “Another issue is that best medical practice is constantly changing, and how [large language models] can adapt to such new knowledge remains a challenging problem, especially when they require such a large amount of time and money to train.”