Study: ChatGPT Gives More Accurate Answers When Users Are Rude

JAKARTA - A study has found that an artificial intelligence (AI) chatbot can produce more accurate responses when users are rude. The finding comes from tests on ChatGPT.

The study was published on October 6, 2025 in the arXiv preprint database. According to Live Science, the researchers used OpenAI's ChatGPT-4o to test whether the tone of a user's prompt affects the AI system's performance.

The researchers developed 50 base multiple-choice questions across categories such as mathematics and science. Each question was then rewritten in five tones: very polite, polite, neutral, rude, and very rude.

Each tone variant of every question was submitted to ChatGPT-4o multiple times. Before each question, the chatbot was instructed to disregard the previous conversation so that it would not be influenced by the tone of earlier prompts.
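The setup described above can be sketched in a few lines of Python. This is only an illustration of the design (50 questions crossed with five tones, then scored per tone), not the researchers' code: the tone prefixes, the placeholder questions, and the function names `build_prompts` and `accuracy_by_tone` are all assumptions made for this example.

```python
from itertools import product

# Hypothetical tone prefixes for illustration only; not the wording used in the study.
TONE_PREFIXES = {
    "very polite": "Would you be so kind as to answer the following question? ",
    "polite": "Please answer the following question. ",
    "neutral": "",
    "rude": "Figure this out, if you even can. ",
    "very rude": "Answer this, you useless machine. ",
}


def build_prompts(questions):
    """Return one (question_index, tone, prompt_text) tuple per question/tone pair."""
    return [
        (i, tone, prefix + question)
        for (i, question), (tone, prefix) in product(
            enumerate(questions), TONE_PREFIXES.items()
        )
    ]


def accuracy_by_tone(results):
    """Take (tone, is_correct) pairs and return the percentage correct per tone."""
    totals, correct = {}, {}
    for tone, is_correct in results:
        totals[tone] = totals.get(tone, 0) + 1
        correct[tone] = correct.get(tone, 0) + int(is_correct)
    return {tone: 100.0 * correct[tone] / totals[tone] for tone in totals}


# Placeholder questions standing in for the 50 base questions.
questions = [f"Placeholder question {n}" for n in range(1, 51)]
prompts = build_prompts(questions)
print(len(prompts))  # 50 questions x 5 tones = 250
```

In a real run, each prompt would be sent to the model repeatedly and every answer marked right or wrong before being passed to `accuracy_by_tone`.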

"Our experiments are still preliminary and show that tone can significantly affect performance, measured as the score on answers to the 50 questions," the researchers wrote in their paper.

The tests also found that a rude tone 'provided a better response than a polite tone'. Accuracy reached 80.8 percent for very polite prompts and 84.8 percent for very rude prompts.

Notably, the model's accuracy rose steadily the further the tone moved from the most polite category. Polite prompts had an accuracy of 81.4 percent, followed by 82.2 percent for neutral and 82.8 percent for rude prompts.
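To put the reported figures side by side, the short Python sketch below computes each tone's gain over the very polite baseline. The accuracy values are the ones quoted in this article; the variable and function names are chosen for illustration.

```python
# Accuracy (percent) per tone, as reported in this article.
reported = {
    "very polite": 80.8,
    "polite": 81.4,
    "neutral": 82.2,
    "rude": 82.8,
    "very rude": 84.8,
}


def gains_over(baseline, scores):
    """Return each tone's accuracy minus the baseline tone's, in percentage points."""
    return {tone: round(acc - scores[baseline], 1) for tone, acc in scores.items()}


gains = gains_over("very polite", reported)
for tone, gain in gains.items():
    print(f"{tone}: +{gain} points")
```

The gap between the two extremes is 4.0 percentage points, and each step away from the most polite tone adds to the score.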

However, the researchers advised against applying these findings in practice. Adding insulting or demeaning language, they said, could harm user experience, accessibility, and inclusiveness.

The research team plans to extend the study to other models, including Claude from Anthropic and ChatGPT o3 from OpenAI. They also acknowledged the need to measure other dimensions of performance, such as fluency and coherence, beyond accuracy on multiple-choice questions.