We already knew that ChatGPT was not reliable, especially when it comes to our health. But a new study has just shown that OpenAI’s famous chatbot is particularly bad at diagnosing diseases in children. Researchers put it to the test, and it failed in more than 80% of cases.
The new study was conducted by a team at Cohen Children’s Medical Center in New York. The researchers asked the most recent version of ChatGPT to diagnose 100 pediatric cases published between 2013 and 2023 in JAMA Pediatrics and NEJM, two major medical journals in the United States.
The method was simple. The researchers pasted in the text of each case study and gave ChatGPT an instruction: “List the differential diagnosis and the final diagnosis.” A differential diagnosis is a list of one or more preliminary diagnoses based on the patient’s medical history and physical examination. The final diagnosis refers to the definitive cause of the symptoms.
The answers provided by the artificial intelligence were assessed by two other pediatricians who were kept separate from the rest of the study. There were three possible ratings: “correct,” “incorrect,” and “does not fully reflect the diagnosis.”
In the end, ChatGPT gave correct answers in only 17 of the 100 pediatric diagnostic cases. In 11 cases, its answer did not fully capture the diagnosis. In the remaining 72 cases, the artificial intelligence failed. Counting both the erroneous and the incomplete results, the chatbot failed 83% of the time. “This study highlights the invaluable role that clinical experience plays,” the authors emphasize.
Pediatricians cannot rely on ChatGPT to diagnose children
The researchers stressed that diagnosis in children is especially difficult because, in addition to taking all symptoms into account, it is necessary to consider how age affects them. In the case of ChatGPT, the team found that it had difficulty recognizing known relationships between conditions, relationships that an experienced doctor would identify.
The chatbot, for example, was unable to establish a link between autism and scurvy, a vitamin C deficiency. Neuropsychiatric conditions such as autism can lead to restricted diets, which in turn can cause vitamin deficiencies. But ChatGPT did not notice this and, in one case, diagnosed a rare autoimmune disease instead.
The World Health Organization (WHO) already warned last year that “caution” was needed when using artificial intelligence tools such as ChatGPT in medical care. It warned that the data used to train these systems could be “biased” and generate misleading information that could harm patients.
Another study, from Long Island University in New York, warns that ChatGPT is also very poor at answering medication questions. These researchers asked the chatbot to answer 39 questions related to drug use. OpenAI’s artificial intelligence failed 75% of the time.
ChatGPT is clearly not ready for use as a diagnostic tool in either children or adults. But the team at Cohen Children’s Medical Center believes that more selective training could improve results. In the meantime, they say such systems could be useful for administrative tasks or writing patient instructions. Nothing more, for now.