Polite Warnings to Twitter Users Can Reduce Hate Speech

JAKARTA - A research team recently ran an experiment on Twitter and found that when users were warned of punitive action for using hateful terms, the likelihood of them posting hateful content dropped by up to 20 percent.

Twitter has been battling hate speech for some time and has recently stepped up its efforts to make the platform less dangerous. One of the most prominent users to fall foul of Twitter's content rules was former US President Donald Trump, who was permanently banned from the platform.

Over the past few quarters, Twitter has taken several steps to address toxic interactions and misinformation. In May this year, it began rolling out prompts asking users to reconsider before posting anything that could offend or hurt an individual or group.

To ensure that users have context and do not help spread misinformation or other harmful content, Twitter also began asking users to read articles before retweeting them. However, the problem persists, and addressing it is proving difficult for Twitter, especially in markets outside the US.

A team of researchers from New York University conducted a test that involved warning users that their accounts could be suspended if they posted hateful content.

Published by Cambridge University Press, the paper, titled "Short of Suspension: How Suspension Warnings Can Reduce Hate Speech on Twitter," examines how effective warning users about suspension is compared with suspending them outright, and how each approach affects the likelihood of hateful posts.

The team concluded that the warnings worked, and that the more politely a warning was worded, the lower the chances of malicious content being posted. As part of the research, the team collected more than 600,000 hateful tweets posted in the week before July 21 last year and identified a total of 4,300 followers of accounts that were suspended for violating the platform's content policies.

The team then issued a warning message that started with the line, "The user [@account] you are following has been suspended, and I suspect this is due to hateful language."

The wording that followed varied, however. It could be as straightforward as "If you continue to use hate speech, you may be temporarily suspended," or it could spell out the consequences more fully, such as "If you continue to use hate speech, you may lose your posts, friends, and followers, and your account won't come back."

The goal was to deliver an effective warning message, one that appears legitimate and also conveys that punitive action will be taken if the recipient posts problematic content.

The team noted that sending a warning prompt reduced the ratio of hateful tweets by 0.007, or up to 10 percent, a week later. In scenarios where the message was worded more politely, the reduction in hateful tweets was even larger, ranging between 15 and 20 percent.

While the effects of such a warning last only up to a month, even a temporary reprieve from toxic tweets is an encouraging sign, and one that Twitter could build on.

