Mount Sinai medical researchers say ChatGPT is ready to practice medicine

A team of medical researchers from the Icahn School of Medicine at Mount Sinai recently conducted a study on AI chatbots and determined that "generative large language models are autonomous practitioners of evidence-based medicine."

How did the test go?

According to pre-print research published on arXiv, the Mount Sinai team tested various consumer-facing large language models (LLMs), including ChatGPT 3.5, ChatGPT 4, and Gemini Pro, as well as the open-source models LLaMA v2 and Mixtral-8x7B.

The models were given prompts engineered with information such as "you are a medical professor" and then asked to follow evidence-based medical protocols to suggest the proper course of treatment for a series of test cases.

Once given a case, models were tasked with suggesting the next action — such as ordering tests or starting a treatment protocol. Then, they got the results of the action and were prompted to integrate this new information and suggest the next action, and so on.
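To make that loop concrete, here is a minimal sketch of what such an interaction could look like using OpenAI's Python client. This is not the researchers' actual code: the system prompt wording, the test case, the step limit, and the lookup_result helper that stands in for real clinical results are all assumptions for illustration.

```python
# Illustrative sketch only — NOT the Mount Sinai team's code. It mirrors the
# general shape of the loop described above: the model proposes one action,
# receives the result, and is asked for the next action.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment


def lookup_result(action: str) -> str:
    """Hypothetical stand-in for the clinical environment that returns the
    outcome of an ordered test or treatment (labs, imaging reports, etc.)."""
    return f"Result of '{action}': within normal limits."  # placeholder


# Role-priming system prompt, similar in spirit to "you are a medical professor".
messages = [
    {"role": "system", "content": (
        "You are a medical professor. Follow evidence-based guidelines and "
        "suggest exactly one next action per turn.")},
    {"role": "user", "content": (
        "Case: 58-year-old with acute chest pain. "  # hypothetical test case
        "What is the next action?")},
]

for _ in range(5):  # cap the number of management steps
    reply = client.chat.completions.create(model="gpt-4", messages=messages)
    action = reply.choices[0].message.content
    print("Model suggests:", action)
    # Feed the outcome of the suggested action back in and ask for the next step.
    messages.append({"role": "assistant", "content": action})
    messages.append({"role": "user",
                     "content": lookup_result(action) + " What is the next action?"})
```

In the actual study the models also had access to tooling for interacting with clinical infrastructure, so this sketch only captures the prompt-and-feedback cycle, not the full setup.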

According to the team, ChatGPT 4 was the most successful, reaching an accuracy of 74% across all cases and outperforming the next-best model (ChatGPT 3.5) by a margin of approximately 10%.

This performance led the team to the conclusion that such models can practice medicine.

"LLMs can be made to function as autonomous practitioners of evidence-based medicine," the paper says. "Their ability to utilize tooling can be harnessed to interact with the infrastructure of a real-world healthcare system and perform the tasks of patient management in a guideline directed manner."

Automating evidence-based medicine

Evidence-based medicine (EBM) uses the lessons learned from previous cases to dictate the trajectory of treatment for similar cases.

While EBM works somewhat like a flowchart in this way, the number of complications, permutations, and overall decisions can make the process unwieldy.

"Clinicians often face the challenge of information overload with the sheer number of possible interactions and treatment paths exceeding what they can feasibly manage or keep track of," the researchers write, adding that LLMs can mitigate this overload by performing tasks usually handled by human medical experts — such as "ordering and interpreting investigations, or issuing alarms," while humans focus on physical care.

"LLMs are versatile tools capable of understanding clinical context and generating possible downstream actions," the researchers add.

It works, but...

The researchers argue that the capacity of LLMs to reason is a profound ability with "implications far beyond treating such models as databases that can be queried using natural language."

On the other hand, there's no general consensus among computer scientists that LLMs have any capacity to reason.

The paper doesn't mention the ethical considerations involving the insertion of an unpredictable automated system into existing clinical workflows.

The issue with LLMs such as ChatGPT is that they generate new text every time they are queried, and in a clinical setting there is no reliable way to prevent them from occasionally fabricating plausible-sounding nonsense, a phenomenon referred to as "hallucination."

According to researchers from the Icahn School of Medicine at Mount Sinai, the hallucinations were minimal during their testing.

We can only hope that the technology keeps getting better and perhaps, one day, replaces doctors in treating common illnesses. That feat alone is worth pursuing.

