Sunday, July 09, 2023

Researchers test AI-powered chatbot's medical diagnostic ability

by Jacqueline Mitchell, Beth Israel Deaconess Medical Center

In a recent experiment published in JAMA, physician-researchers at Beth Israel Deaconess Medical Center (BIDMC) tested one well-known publicly available chatbot's ability to make accurate diagnoses in challenging medical cases. The team found that the generative AI model, GPT-4, selected the correct diagnosis as its top diagnosis nearly 40 percent of the time and included the correct diagnosis in its list of potential diagnoses in nearly two-thirds of challenging cases.

09 Jul 2023--Generative AI refers to a type of artificial intelligence that uses patterns and information it has been trained on to create new content, rather than simply processing and analyzing existing data. Some of the most well-known examples of generative AI are so-called chatbots, which use a branch of artificial intelligence called natural language processing (NLP) that allows computers to understand, interpret and generate human-like language.

Generative AI chatbots are powerful tools poised to revolutionize creative industries, education, customer service and more. However, little is known about their potential performance in the clinical setting, such as complex diagnostic reasoning.

"Recent advances in artificial intelligence have led to generative AI models that are capable of detailed text-based responses that score highly in standardized medical examinations," said Adam Rodman, MD, MPH, co-director of the Innovations in Media and Education Delivery (iMED) Initiative at BIDMC and an instructor in medicine at Harvard Medical School.

"We wanted to know if such a generative model could 'think' like a doctor, so we asked one to solve standardized complex diagnostic cases used for educational purposes. It did really, really well."

To assess the chatbot's diagnostic skills, Rodman and colleagues used clinicopathological case conferences (CPCs), a series of complex and challenging patient cases including relevant clinical and laboratory data, imaging studies, and histopathological findings published in the New England Journal of Medicine for educational purposes.

Across 70 CPC cases, the AI exactly matched the final CPC diagnosis in 27 (39 percent). In 64 percent of the cases, the final CPC diagnosis was included in the AI's differential, a list of possible conditions that could account for a patient's symptoms, medical history, clinical findings and laboratory or imaging results.
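As a quick check on those figures, the percentages can be reproduced from the case counts. The short Python sketch below uses only the numbers reported above; the count of cases in which the diagnosis appeared in the differential is not stated in this article, so the value derived here is an approximation implied by the rounded 64 percent figure, and the variable names are illustrative.

```python
# Figures reported in the article.
total_cases = 70
exact_top_matches = 27
differential_share = 0.64  # rounded share reported in the article

# Share of cases in which the top diagnosis matched the final CPC diagnosis.
top_match_percent = round(100 * exact_top_matches / total_cases)

# Approximate number of cases implied by the rounded 64 percent figure
# (an inference, not a count given in the article).
implied_differential_cases = round(differential_share * total_cases)

print(top_match_percent)           # 39
print(implied_differential_cases)  # 45
```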

"While Chatbots cannot replace the expertise and knowledge of a trained medical professional, generative AI is a promising potential adjunct to human cognition in diagnosis," said first author Zahir Kanjee, MD, MPH, a hospitalist at BIDMC and assistant professor of medicine at Harvard Medical School.

"It has the potential to help physicians make sense of complex medical data and broaden or refine our diagnostic thinking. We need more research on the optimal uses, benefits and limits of this technology, and a lot of privacy issues need sorting out, but these are exciting findings for the future of diagnosis and patient care."

"Our study adds to a growing body of literature demonstrating the promising capabilities of AI technology," said co-author Byron Crowe, MD, an internal medicine physician at BIDMC and an instructor in medicine at Harvard Medical School.

"Further investigation will help us better understand how these new AI models might transform health care delivery."

More information: Zahir Kanjee et al, Accuracy of a Generative Artificial Intelligence Model in a Complex Diagnostic Challenge, JAMA (2023). DOI: 10.1001/jama.2023.8288

Popular 'low T' treatment is safe for men with heart disease, but doctors warn it's no youth serum

by Laura Ungar

Testosterone replacement therapy is safe for men with "low T" who have heart disease or are at high risk for it, a new study suggests. But doctors warn the popular treatment is no "anti-aging tonic."

09 Jul 2023--The research, published in The New England Journal of Medicine, found that heart attacks, strokes and other major cardiac issues were no more common among those using testosterone gel than those using a placebo.

That implies the gel is also safe for men without cardiovascular problems who have low T, said Dr. Steven Nissen, a cardiologist at the Cleveland Clinic and senior author of the study. But, he added, it doesn't mean the treatment should be used by men without low T—a condition also known as hypogonadism that's measured by levels of the sex hormone in the blood.

"What we've shown here is that for a very specific group of men, testosterone can be given safely," Nissen said. "But it is not to be given as an anti-aging tonic for widespread use in men who are aging."

More than 5,000 men ages 45-80 at 316 trial sites throughout the U.S. were randomly assigned to get the testosterone gel or the placebo, which they rubbed on their skin daily for an average of about 22 months. "Major cardiac events" occurred in 182 patients in the testosterone group and 190 patients in the placebo group.

The testosterone group did have a higher incidence of less severe problems, such as atrial fibrillation, acute kidney injury and issues from blood clots in veins.

The large study helps address "a gap of understanding" about how testosterone treatment affects cardiovascular outcomes for men with true low T, said Dr. Alan Baik, a cardiologist at the University of California-San Francisco who was not involved in the research.

But he'd like to see more research, he said, on whether testosterone therapy can actually reduce cardiovascular risk factors in men with low T, who seem more likely to have conditions like high blood pressure and diabetes.

Treating low T has been a big business for many years, largely driven by advertisements for pills, patches, gels and injections. Online sites and clinics across the nation offer the treatment, and many tie low T to common issues such as fatigue and weight gain.

The new study, led by the Cleveland Clinic and funded by a consortium of drug companies, was done in response to a 2015 mandate by the Food and Drug Administration for makers of testosterone products to carefully examine the risk of heart attack or stroke. A previous FDA review had shown that many men got low T treatment even though their testosterone levels hadn't been checked.

Nissen said that while low T is a "very common disorder," aging men also want to feel like they're 18 again and "have the sexual performance they had when they were young."

But the treatment, he added, "should not be used by bodybuilders. It should not be used by athletes. The concerns about the misuse of testosterone are quite high. And I think we have to be very cautious."

More information: Cardiovascular Safety of Testosterone-Replacement Therapy, The New England Journal of Medicine (2023). DOI: 10.1056/NEJMoa2215025 www.nejm.org/doi/full/10.1056/NEJMoa2215025

Fighting loneliness by finding purpose

by Washington University in St. Louis

A new study published in Psychology and Aging co-authored by Patrick Hill, associate professor of psychological and brain sciences, offers an important message for our times: A sense of purpose in life—whether it's a high-minded quest to make a difference or a simple hobby with personal meaning—can offer potent protection against loneliness.

09 Jul 2023--"Loneliness is known to be one of the biggest psychological predictors for health problems, cognitive decline, and early mortality," Hill said. "Studies show that it can be as harmful for health as smoking or having a poor diet."

The new study, based on surveys of more than 2,300 adults in Switzerland, found that feelings of loneliness were less common in people who reported a purposeful life, regardless of their age. It was co-authored by Mathias Allemand of the University of Zurich in Switzerland and Gabriel Olaru of Tilburg University in the Netherlands.

Respondents were asked to score their feelings on a lack of companionship, isolation from other people, and a sense of being "left out or passed over" during a four-week period. Participants also filled out the six-item Life Engagement Test, which asked them to rate statements such as "there is not enough purpose in my life" and "I value my activities a lot."

"A sense of purpose is this general perception that you have something leading and directing you from one day to the next," Hill said. "It can be something like gardening, supporting your family, or achieving success at work."

Many of the activities that can provide a sense of purpose—joining a club, volunteering at a school, playing in a sports league—involve interaction with others, which is one reason why a purpose-filled life tends to be less lonely. In the study, people who said they received or provided social support were especially likely to report feelings of purpose.

But Hill noted that there's more to fighting loneliness than simply being around others. "We've all had time in our lives when we've felt lonely even though we weren't actually alone." There's something about having a sense of purpose that seems to fight loneliness regardless of how many other people are involved, he said.

The study found a slight uptick in reports of loneliness for people in their 70s and beyond, an age when a sense of purpose can be especially important. "We're trying to dispel the myth from previous generations that this is simply a time for retiring and resting," Hill said. "There are no downsides to finding something meaningful later in life."

Still, it's important to keep in mind that a quest for purpose can be somewhat self-defeating if taken too seriously. "Feeling like you need to save the world can lead to existential dread and distress," Hill said. When it comes to purpose and meaning, even small things can matter. "It's OK if someone else thinks that your purpose is trivial, as long as it's meaningful to you."

More information: Patrick L. Hill et al, Do associations between sense of purpose, social support, and loneliness differ across the adult lifespan?, Psychology and Aging (2023). DOI: 10.1037/pag0000733

Vitamin D supplements may reduce risk of serious cardiovascular events in older people

by British Medical Journal

Vitamin D supplements may reduce the risk of major cardiovascular events such as heart attacks among people aged over 60, finds a clinical trial published by The BMJ.

09 Jul 2023--The researchers stress that the absolute risk difference was small, but say this is the largest trial of its kind to date, and further evaluation is warranted, particularly in people taking statins or other cardiovascular disease drugs.

Cardiovascular disease (CVD) is a general term for conditions affecting the heart or blood vessels and is one of the main causes of death globally. CVD events such as heart attacks and strokes are set to increase as populations continue to age and chronic diseases become more common.

Observational studies have consistently shown a link between vitamin D levels and CVD risk, but randomized controlled trials have found no evidence that vitamin D supplements prevent cardiovascular events, possibly due to differences in trial design that can affect results.

To address this uncertainty, researchers in Australia set out to investigate whether supplementing older adults with monthly doses of vitamin D alters the rate of major cardiovascular events.

Their D-Health Trial was carried out from 2014 to 2020 and involved 21,315 Australians aged 60-84 who randomly received one capsule of either 60,000 IU vitamin D (10,662 participants) or placebo (10,653 participants) taken orally at the beginning of each month for up to five years.

Participants with a history of high calcium levels (hypercalcemia), overactive parathyroid glands (hyperparathyroidism), kidney stones, soft bones (osteomalacia) or sarcoidosis (an inflammatory disease), and those already taking more than 500 IU/day vitamin D, were excluded.

Data on hospital admissions and deaths were then used to identify major cardiovascular events, including heart attacks, strokes, and coronary revascularization (treatment to restore normal blood flow to the heart).

The average treatment duration was five years and more than 80% of participants reported taking at least 80% of the study tablets.

During the trial, 1,336 participants experienced a major cardiovascular event (6.6% in the placebo group and 6% in the vitamin D group).

The rate of major cardiovascular events was 9% lower in the vitamin D group than in the placebo group (equivalent to 5.8 fewer events per 1,000 participants).

The rate of heart attack was 19% lower and the rate of coronary revascularization was 11% lower in the vitamin D group, but there was no difference in the rate of stroke between the two groups.

There was some indication of a stronger effect in those who were using statins or other cardiovascular drugs at the start of the trial, but the researchers say these results were not statistically significant.

Overall, the researchers calculate that 172 people would need to take monthly vitamin D supplements to prevent one major cardiovascular event.
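The figure of 172 follows from the absolute risk difference reported above: the number needed to treat is the reciprocal of the absolute risk reduction. The Python sketch below reproduces that arithmetic from the article's numbers. The function and variable names are illustrative, and the relative-reduction line is only a rough consistency check on the rounded percentages, not the method the trial used to arrive at its 9% figure.

```python
def number_needed_to_treat(fewer_events_per_1000):
    """Return 1 / absolute risk reduction, given events avoided per 1,000 people."""
    absolute_risk_reduction = fewer_events_per_1000 / 1000
    return 1 / absolute_risk_reduction

# 5.8 fewer major cardiovascular events per 1,000 participants (from the article).
nnt = number_needed_to_treat(5.8)
print(round(nnt))  # 172

# Rough check using the rounded group percentages (6.6% placebo, 6% vitamin D).
relative_reduction_percent = 100 * (6.6 - 6.0) / 6.6
print(round(relative_reduction_percent))  # 9
```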

The researchers acknowledge that there may be a small underestimate of events and say the findings may not apply to other populations, particularly those where a higher proportion of people are vitamin D deficient. However, this was a large trial with extremely high retention and adherence, and almost complete data on cardiovascular events and mortality outcomes.

As such, they say their findings suggest that vitamin D supplementation may reduce the risk of major cardiovascular events. "This protective effect could be more marked in those taking statins or other cardiovascular drugs at baseline," they add, and they suggest further evaluation is needed to help to clarify this issue.

"In the meantime, these findings suggest that conclusions that vitamin D supplementation does not alter risk of cardiovascular disease are premature," they conclude.

More information: Vitamin D supplementation and major cardiovascular events: D-Health randomised controlled trial, The BMJ (2023). DOI: 10.1136/bmj-2023-075230, www.bmj.com/content/381/bmj-2023-075230

Benzodiazepine use associated with brain injury, job loss and suicide

by CU Anschutz Medical Campus

Benzodiazepine use and discontinuation is associated with nervous system injury and negative life effects that continue after discontinuation, according to a new study from researchers at the University of Colorado Anschutz Medical Campus.

09 Jul 2023--"Despite the fact that benzodiazepines have been widely prescribed for decades, this survey presents significant new evidence that a subset of patients experience long-term neurological complications," said Alexis Ritvo, M.D., M.P.H., an assistant professor in psychiatry at the University of Colorado School of Medicine and medical director of the nonprofit Alliance for Benzodiazepine Best Practices. "This should change how we think about benzodiazepines and how they are prescribed."

"Patients have been reporting long-term effects from benzodiazepines for over 60 years. I am one of those patients. Even though I took my medication as prescribed, I still experience symptoms on a daily basis at four years off benzodiazepines. Our survey and the new term BIND [benzodiazepine-induced neurological dysfunction] give a voice to the patient experience and point to the need for further investigations," said Christy Huff, M.D., one of the paper's co-authors and a cardiologist and director of the Benzodiazepine Information Coalition.

The survey was a collaborative effort among CU Anschutz, Vanderbilt University Medical Center, and several patient-led advocacy organizations that educate on benzodiazepine harms. Several members of the research team have lived experience with benzodiazepines, which informed the survey questions.

Symptoms were long-lasting, with 76.6% of all affirmative answers to symptom questions reporting the duration to be months or more than a year. The following ten symptoms persisted for more than a year in more than half of respondents: low energy, difficulty focusing, memory loss, anxiety, insomnia, sensitivity to light and sounds, digestive problems, symptoms triggered by food and drink, muscle weakness and body pain.

Particularly alarming, these symptoms were often reported as new and distinct from the symptoms for which benzodiazepines were originally prescribed. In addition, a majority of respondents reported prolonged negative life impacts in all areas, such as significantly damaged relationships, job loss and increased medical costs. Notably, 54.4% of the respondents reported suicidal thoughts or attempted suicide.

BIND is thought to be a result of brain changes resulting from benzodiazepine exposure. A general review of the literature suggests that it occurs in roughly one in five long-term users. The risk factors for BIND are not known, and more research is needed to further define the condition, along with treatment options.

Previous studies had described this injury with various terminologies, perhaps the most well-known being protracted withdrawal. As part of the study, a scientific review board unified these names under the term benzodiazepine-induced neurological dysfunction (BIND) to more accurately describe the condition.

To better characterize BIND, Ritvo and colleagues analyzed data from a 2022 survey, published in Therapeutic Advances in Psychopharmacology, of current and former benzodiazepine users that asked about their symptoms and adverse life effects attributed to benzodiazepine use.

The survey of 1,207 benzodiazepine users from benzodiazepine support groups along with health and wellness sites is the largest of its kind. Respondents included those taking benzodiazepines (63.2%), in the process of tapering (24.4%) or fully discontinued (11.3%). Nearly all respondents had a prescription for benzodiazepines (98.6%) and 91% took them mostly as prescribed.

More information: PLOS ONE (2023), DOI: 10.1371/journal.pone.0285584

Alistair J. Reid Finlayson et al, Experiences with benzodiazepine use, tapering, and discontinuation: an Internet survey, Therapeutic Advances in Psychopharmacology (2022). DOI: 10.1177/20451253221082386

Surgical stabilization of odontoid fractures shown to improve outcomes

by Wolters Kluwer Health

Surgical stabilization of odontoid fractures improves outcomes: two illustrative cases. Case A is that of a middle-aged woman injured in a motor vehicle collision, found on trauma workup to have (A1) a nondisplaced type II odontoid fracture that was treated with a rigid cervical orthosis and showed progressive healing at (A2) 4 months and (A3) 7 months after the injury. Case B is that of a middle-aged man injured in a cycling accident who had neck pain with movement and was neurologically intact on examination. CT revealed (B1) a mildly displaced and angulated type II odontoid fracture treated with (B2) C1-C2 posterior fixation, with (B3) excellent bony healing on follow-up CT. Credit: *Neurosurgery* (2023). DOI: 10.1227/neu.0000000000002557

Odontoid fractures—those occurring in the second cervical vertebra—are common in elderly patients after a low-energy fall. However, whether the initial treatment should be surgical or non-operative still isn't known. Previous studies haven't accounted for differences in injury severity, or the presence or absence of neurologic impairment, which can affect patients' results.

09 Jul 2023--On this topic, an article titled "Surgery Decreases Nonunion, Myelopathy, and Mortality for Patients with Traumatic Odontoid Fractures: A Propensity Score Matched Analysis" is published in the journal Neurosurgery.

Michael B. Cloney, MD, MPH, of the Department of Neurological Surgery at Northwestern University in Chicago, and colleagues have published evidence that surgery should be considered as the initial approach for many patients. Compared with non-operative approaches to treatment, surgical stabilization of the fracture was associated with less myelopathy (mobility impairment due to spinal cord damage) and with lower rates of fracture non-union, 30-day mortality, and one-year mortality.

"Given the increasing incidence of odontoid fractures with the aging population, we believe our findings could assist with neurosurgical decision-making for an increasingly common and complex problem," the researchers say.

Propensity score matching: A way to account for nonrandomized patient groups

Dr. Cloney and his colleagues reviewed initial treatment data on 296 patients who were cared for at Northwestern Memorial Hospital between January 1, 2010, and December 31, 2020, because of an odontoid fracture. Their average age was 73. During the hospitalization, 22% had surgery and 78% had non-operative treatment (5% were immobilized in a halo-vest and 73% received a cervical collar).

Since the patients weren't randomized to these treatments, the research team used a type of analysis called propensity score adjustment. They calculated "propensity scores" for each individual—the probability that the patient would have been assigned to receive one of the two treatment approaches based on certain characteristics.

For example, to study the effect of surgery on mortality rates, patients were matched on age, sex, Injury Severity Score, Nurick score (a measure of myelopathy), their number of chronic diseases and chronic conditions such as smoking, and whether they had to be admitted to the intensive care unit.
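The general idea can be sketched in a few lines of Python. This is a minimal illustration on synthetic data, not the study's analysis: the two standardized covariates, the logistic model fitted by gradient descent, the greedy one-to-one matching and the 0.05 caliper are all assumptions chosen for the example, and every function and variable name is hypothetical.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def propensity(w, x):
    """Estimated probability of receiving surgery given covariates x."""
    return sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], x)))

def fit_logistic(X, y, lr=0.5, epochs=500):
    """Batch gradient descent on the logistic loss; returns [bias, w1, w2, ...]."""
    n, d = len(X), len(X[0])
    w = [0.0] * (d + 1)
    for _ in range(epochs):
        grad = [0.0] * (d + 1)
        for xi, yi in zip(X, y):
            err = propensity(w, xi) - yi
            grad[0] += err
            for j in range(d):
                grad[j + 1] += err * xi[j]
        w = [wj - lr * gj / n for wj, gj in zip(w, grad)]
    return w

def greedy_match(scores, treated, caliper=0.05):
    """Pair each treated patient with the closest unused control within the caliper."""
    available = {i for i, t in enumerate(treated) if not t}
    pairs = []
    for i, t in enumerate(treated):
        if not t or not available:
            continue
        j = min(available, key=lambda c: abs(scores[c] - scores[i]))
        if abs(scores[j] - scores[i]) <= caliper:
            pairs.append((i, j))
            available.remove(j)
    return pairs

# Synthetic cohort (assumption): older patients are less likely to get surgery.
random.seed(0)
X, treated = [], []
for _ in range(200):
    age_z = random.gauss(0, 1)       # standardized age
    severity_z = random.gauss(0, 1)  # standardized injury severity
    p_surgery = sigmoid(-1.0 - 0.8 * age_z + 0.5 * severity_z)
    X.append([age_z, severity_z])
    treated.append(1 if random.random() < p_surgery else 0)

weights = fit_logistic(X, treated)
scores = [propensity(weights, x) for x in X]
pairs = greedy_match(scores, treated)
print(f"{sum(treated)} treated patients, {len(pairs)} matched pairs")
```

Outcomes would then be compared only within the matched pairs, so that the surgical and non-operative groups resemble each other on the measured characteristics. Matching cannot correct for characteristics that were never measured.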

Surgical stabilization leads to better results

Follow-up with patients lasted an average of 45 weeks. On the propensity score-matched analyses, the group that underwent surgery showed significantly better outcomes than the non-operative group:

Lower rate of fracture non-union: 39.7% vs. 57.3%; treatment effect, 15% less risk of non-union
Lower 30-day mortality rate: 1.7% vs. 13.8%; treatment effect, 10% less risk of death
Lower one-year mortality rate: 7.0% vs. 23.7%; treatment effect, 10% less risk of death

Other analyses showed patients in the surgery group were 52% less likely than those in the non-operative group to have poor Nurick scores at the 26-week postoperative follow-up visit and were 41% less likely to die during the overall follow-up period. Both differences were statistically significant.

"The mortality benefit calculated in the existing literature typically represents an unadjusted mortality rate between two potentially different populations, which leaves it liable to confounding," the authors note. "Our study represents a relatively large institutional series that suggests a benefit from surgical stabilization in this population while controlling for confounding factors more thoroughly than existing literature."

More information: Michael Cloney et al, Surgery Decreases Nonunion, Myelopathy, and Mortality for Patients With Traumatic Odontoid Fractures: A Propensity Score Matched Analysis, Neurosurgery (2023). DOI: 10.1227/neu.0000000000002557

Patients with Alzheimer's disease, dementia face twice the risk of dying after ICU discharge

by American Association of Critical-Care Nurses (AACN)

Older patients with Alzheimer's disease and related dementia (ADRD) have almost twice the risk of dying soon after they are discharged from an intensive care unit (ICU) and within the 12 months afterward, according to research published in the American Journal of Critical Care.

09 Jul 2023--The study, "Mortality and Discharge Location of Intensive Care Patients With Alzheimer Disease and Related Dementia," examines data from a large, geographically diverse sample of patients enrolled in Medicare Advantage (MA) plans. The authors believe it is the only published study that examines ICU outcomes among MA enrollees with ADRD, and one of the few that focus on patients with ADRD covered by MA plans.

The study found that older adults with ADRD who were admitted to an ICU were much less likely to be discharged home and faced almost twice the risk of death in the same calendar month as discharge and the 12 months after discharge when compared with patients who did not have an ADRD diagnosis.

Co-author Mary Lynn Davis-Ajami, Ph.D., FNP, RN, is a health services researcher with expertise using national databases to focus on cost and quality outcomes in complex chronic disease, often with policy implications. Currently, she is transitioning to join the faculty at Michigan State College of Nursing, in East Lansing, as associate dean for academic affairs. She worked with colleagues from Indiana University and other institutions to conduct this study.

"Patients with ADRD often have a limited life expectancy, which can be further shortened after an ICU admission or other acute event," she said. "Our findings raise questions about proactive strategies to diminish the likelihood of an ICU admission or early discussions with families and caregivers about palliative care."

Deaths in the ADRD cohort were almost twice as common within the same calendar month as discharge, as well as within the following 12-month period, compared with deaths in the non-ADRD cohort.

In addition to short-term and long-term mortality, the analysis revealed that a little more than one-third (37.6%) of patients with ADRD went home after hospital discharge, compared with more than two-thirds (68.6%) of non-ADRD patients.

Being dual-eligible for Medicare and Medicaid further raised patients' risk of not being discharged home from the ICU, as well as of dying within the same calendar month as discharge and within the 12 months following their discharge.

The observational study used Optum's de-identified Clinformatics Data Mart Database version 8.1, which covers the period from 2016 to 2019. The analysis included adults 67 years of age or older with continuous MA coverage who were first admitted to an ICU in 2018. ADRD and comorbid conditions were identified from claims.

After applying exclusion criteria, the final study population included 145,342 patients with a first-time admission to the ICU in 2018 and who were discharged from the ICU. Among this group, 10.5% (15,289) had a diagnosis of ADRD.

The analysis did not examine the reasons for the initial ICU admission or the causes of death. It also did not differentiate between types of ADRD or between mild and severe dementia, or account for other elements that might influence outcomes.

More information: Mary Lynn Davis-Ajami et al, Mortality and Discharge Location of Intensive Care Patients With Alzheimer Disease and Related Dementia, American Journal of Critical Care (2023). DOI: 10.4037/ajcc2023328

Smart watches could detect Parkinson's up to seven years before hallmark symptoms appear

by Cardiff University

Smart watches could identify Parkinson's disease up to seven years before hallmark symptoms appear and a clinical diagnosis can be made, new research reveals.

09 Jul 2023--In the new study, scientists analyzed data collected by smart watches over a 7-day period measuring participants' speed of movement. They found that they could accurately predict, using artificial intelligence (AI), those who would go on to later develop Parkinson's disease.

Researchers say this could be used as a new screening tool for Parkinson's disease, which would enable detection of the disorder at a much earlier stage than current methods allow.

The study was led by scientists at the UK Dementia Research Institute and Neuroscience and Mental Health Innovation Institute (NMHII) at Cardiff University. It was published on July 3 in the journal Nature Medicine.

Parkinson's affects cells in the brain called dopaminergic neurons, located in an area of the brain known as the substantia nigra. It causes motor symptoms such as tremor, rigidity (stiffness), and slowness of movement. By the time these hallmark symptoms of Parkinson's begin to show, and a clinical diagnosis can be made, more than half of the cells in the substantia nigra will already have died.

Therefore, there is a need for cheap, reliable and easily accessible methods to detect early changes so that intervention can be made before the disease causes extensive damage to the brain.

The researchers analyzed data collected from 103,712 UK Biobank participants who wore a medical-grade smart watch for a 7-day period in 2013–2016. The devices measured average acceleration, meaning speed of movement, continuously over the week-long period.

They compared data from a subset of participants who had already been diagnosed with Parkinson's disease, to another group who received a diagnosis up to seven years after the smart watch data was collected. These groups were also compared to age- and sex-matched healthy people.

The researchers showed that, using AI, it is possible to identify participants who would later go on to develop Parkinson's disease, from their smart watch data. Not only could these participants be distinguished from healthy controls in the study, but the researchers then extended this to show that the AI could be used to identify individuals who would later develop Parkinson's in the general population. They found that this was more accurate than any other risk factor or other recognized early sign of the disease in predicting whether someone would develop Parkinson's disease. The model was also able to predict time to diagnosis.

A limitation to the study is the lack of replication using another data source, as there are currently no other comparable data sets that would allow for similar analysis. However, extensive evaluation was performed to mitigate any biases.

Study leader Dr. Cynthia Sandor, emerging leader at the UK Dementia Research Institute at Cardiff University, said, "Smart watch data is easily accessible and low-cost. As of 2020, around 30% of the UK population wear smart watches. By using this type of data, we would potentially be able to identify individuals in the very early stages of Parkinson's disease within the general population.

"We have shown here that a single week of data captured can predict events up to seven years in the future. With these results we could develop a valuable screening tool to aid in the early detection of Parkinson's. This has implications both for research, in improving recruitment into clinical trials, and in clinical practice, in allowing patients to access treatments at an earlier stage, in future when such treatments become available."

Dr. Kathryn Peall, clinical senior lecturer in the NMHII at Cardiff University, said, "For most people with Parkinson's disease, by the time they start to experience symptoms, many of the affected brain cells have already been lost. This means that diagnosing the condition early is challenging. Though our findings here are not intended to replace existing methods of diagnosis, smart watch data could provide a useful screening tool to aid in the early detection of the disease. This means that as new treatments hopefully begin to emerge, people will be able to access them before the disease causes extensive damage to the brain."

More information: Cynthia Sandor, Wearable movement-tracking data identify Parkinson's disease years before clinical diagnosis, Nature Medicine (2023). DOI: 10.1038/s41591-023-02440-2. www.nature.com/articles/s41591-023-02440-2

ChatGPT generates 'convincing' fake scientific article

by JMIR Publications

AI unleashes a Pandora's box: ChatGPT generates convincingly fake scientific article — AI-generated image, in response to the request "pandoras box opened with a physician standing next to it. Oil painting Henry Matisse style", (Generator: DALL-E2/OpenAI, March 9, 2023, Requestor: Martin Májovský). Credit: Created with DALL-E2, an AI system by OpenAI

A new study published in the Journal of Medical Internet Research by Dr. Martin Májovský and colleagues has revealed that artificial intelligence (AI) language models such as ChatGPT (Chat Generative Pre-trained Transformer) can generate fraudulent scientific articles that appear remarkably authentic. This discovery raises critical concerns about the integrity of scientific research and the trustworthiness of published papers.

09 Jul 2023--Researchers from Charles University, Czech Republic, aimed to investigate the capabilities of current AI language models in creating high-quality fraudulent medical articles. The team used the popular AI chatbot ChatGPT, which runs on the GPT-3 language model developed by OpenAI, to generate a completely fabricated scientific article in the field of neurosurgery. Questions and prompts were refined as ChatGPT generated responses, allowing the quality of the output to be iteratively improved.

The results of this proof-of-concept study were striking—the AI language model successfully produced a fraudulent article that closely resembled a genuine scientific paper in terms of word usage, sentence structure, and overall composition. The article included standard sections such as an abstract, introduction, methods, results, and discussion, as well as tables and other data. Surprisingly, the entire process of article creation took just one hour without any special training of the human user.

While the AI-generated article appeared sophisticated and flawless, on closer examination expert readers were able to identify semantic inaccuracies and errors, particularly in the references: some references were incorrect, while others were non-existent. This underscores the need for increased vigilance and enhanced detection methods to combat the potential misuse of AI in scientific research.

This study's findings emphasize the importance of developing ethical guidelines and best practices for the use of AI language models in genuine scientific writing and research. Models like ChatGPT have the potential to enhance the efficiency and accuracy of document creation, result analysis, and language editing. By using these tools with care and responsibility, researchers can harness their power while minimizing the risk of misuse or abuse.

In a commentary on Dr. Májovský's article, Dr. Pedro Ballester discusses the need to prioritize the reproducibility and visibility of scientific works, as they serve as essential safeguards against the flourishing of fraudulent research.

As AI continues to advance, it becomes crucial for the scientific community to verify the accuracy and authenticity of content generated by these tools and to implement mechanisms for detecting and preventing fraud and misconduct. While both articles agree that there needs to be a better way to verify the accuracy and authenticity of AI-generated content, how this could be achieved is less clear.

"We should at least declare the extent to which AI has assisted the writing and analysis of a paper," suggests Dr. Ballester as a starting point. Another possible solution proposed by Majovsky and colleagues is making the submission of data sets mandatory.

The article "Artificial Intelligence Can Generate Fraudulent but Authentic-Looking Scientific Medical Articles: Pandora's Box Has Been Opened" was published in the Journal of Medical Internet Research.

More information: Martin Májovský et al, Artificial Intelligence Can Generate Fraudulent but Authentic-Looking Scientific Medical Articles: Pandora's Box Has Been Opened, Journal of Medical Internet Research (2023). DOI: 10.2196/46924

Pedro L Ballester, Open Science and Software Assistance: Commentary on "Artificial Intelligence Can Generate Fraudulent but Authentic-Looking Scientific Medical Articles: Pandora's Box Has Been Opened", Journal of Medical Internet Research (2023). DOI: 10.2196/49323

HONcode Web 2.0 rules are as follows

Principle 1 - Authority

all commenters are by default considered as non-medical professionals.

commenters must behave at all times with respect and honesty.

Principle 3 - Confidentiality

posts on Weblog do Fraga are visible to everyone, and commenters are not able to modify or erase their posts.

Principle 4 - Attribution

commenters could give references (links, for example) to the health/medical information they provide.

What we mean by personal experience is anything the person has undergone himself/herself.

Principle 5 - Justifiability

platform users must post information that is true and correct to the best of their knowledge.

Principle 8 - Honesty in advertising & editorial policy

advertising from commenters (links, banners, content, etc.) is not permitted on the blog.

The information contained above is intended for general reference purposes only.

It is not a substitute for professional medical advice or a medical exam. Always seek the advice of your physician or other qualified health professional before starting any new treatment.

Medical information changes rapidly, so some information may be out of date. The information should not be used to diagnose, treat, cure or prevent any disease without the supervision of a medical doctor.

This blog does not carry any advertising.

"The Web site owners undertake to honour or exceed the legal requirements of medical/health information privacy that apply in Brazil and will not share any of the visitors information."

WEBLOG DO FRAGA