How Can ChatGPT Fool Researchers With Fake Abstracts?

JAKARTA - ChatGPT, the artificial intelligence (AI) chatbot made by OpenAI, has reportedly deceived researchers with fake scientific abstracts, leading them to believe they were reading real text written by humans.

A team of researchers led from Northwestern University used the text-generation tool to produce 50 abstracts based on real scientific paper titles, written in the style of five different medical journals.

An abstract is a concise summary placed at the top of a research paper to give an overall picture of what was studied.

The generated abstracts were modeled on a selection of papers from The British Medical Journal (BMJ) and Nature Medicine. Four researchers then signed up for the test and were divided into two groups of two.

An electronic coin flip decided whether the original or the fake AI-generated abstract was given to the first reviewer in each group; if one researcher received the original abstract, the other received the fake, and vice versa. Each reviewer assessed 25 scientific abstracts.

As a result, not only did the computer-generated text make it past anti-plagiarism detectors, but real researchers often failed to spot the fakes.

Human reviewers correctly identified only 68 percent of the ChatGPT abstracts and 86 percent of the original abstracts. In other words, the medical reviewers mistook 32 percent of the AI-generated abstracts for real ones and 14 percent of the real abstracts for fakes.

"Our reviewers commented that it was very difficult to distinguish between real and fake abstracts. The abstracts generated by ChatGPT were so convincing, it even knew how large the patient cohort should be when finding the numbers," said study leader Catherine Gao at Northwestern University in Chicago, Illinois. Metro, Tuesday, January 17th.

"ChatGPT writes scientific abstracts that are credible, even with fully generated data, they explain in their study preprint," he added.

Gao added that the abstracts ChatGPT creates are original, with no detectable plagiarism, but can often be identified by AI output detectors and by skeptical human reviewers.

"Abstract evaluation for medical journals and conferences must adapt policies and practices to maintain strict scientific standards. We recommend the inclusion of AI output detectors in the editorial process and clear disclosure if this technology is used," said Gao.

"The ethical and acceptable limits of using large language models to aid scientific writing have yet to be determined," he added.

Large language models such as ChatGPT are trained on vast amounts of text retrieved from the internet. They learn to construct text by predicting which words are most likely to come next in a given sentence, and as a result they can produce grammatically correct prose.
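To illustrate next-word prediction, here is a minimal sketch using the small, publicly available GPT-2 model from the Hugging Face transformers library as a stand-in; ChatGPT's own model is far larger and not publicly available, so the prompt and numbers below are illustrative only.

```python
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

# Load a small public language model as a stand-in for much larger systems like ChatGPT.
tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
model.eval()

# An unfinished sentence in the style of a medical abstract (illustrative prompt).
prompt = "The patients in the treatment group were randomly assigned to"
inputs = tokenizer(prompt, return_tensors="pt")

with torch.no_grad():
    logits = model(**inputs).logits  # a score for every vocabulary token at each position

# Turn the scores at the final position into probabilities for the next word.
next_token_probs = torch.softmax(logits[0, -1], dim=-1)
top = torch.topk(next_token_probs, k=5)

# Print the five most likely continuations and their probabilities.
for prob, token_id in zip(top.values, top.indices):
    print(f"{tokenizer.decode(int(token_id))!r}: {prob:.3f}")
```

Generating a whole abstract is essentially this step repeated: the model keeps appending one of the words it judges most likely until the text is complete.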

Unsurprisingly, then, even academics can be fooled into believing AI-generated abstracts are real. Large language models are good at producing text with clear structure and recognizable patterns, and scientific abstracts often follow a similar format and can be quite vague.

Gao believes tools like ChatGPT will make it easier for paper mills, which profit from publishing studies, to produce fake scientific papers.

"If other people try to build their science off of these faulty studies, it could be very dangerous," said Gao.