Context reduces racial bias in hate speech detection algorithms
When it comes to accurately flagging hate speech on social media, context matters, says a new study aimed at reducing errors that could amplify racial bias.
Understanding what makes something harmful or offensive can be hard enough for humans, never mind artificial intelligence systems.

So perhaps it's no surprise that social media hate speech detection algorithms, designed to stop the spread of hate speech, can actually amplify racial bias by blocking inoffensive tweets by black people or other minority group members.

In fact, one previous study showed that AI models were 1.5 times more likely to flag tweets written by African Americans as "offensive" (in other words, a false positive) than other tweets.

Why? Because current automatic detection models miss something vital: context. Specifically, hate speech classifiers are oversensitive to group identifiers like "black," "gay," or "transgender," which indicate hate speech only when used in certain settings.

Now, a team of USC researchers has created a hate speech classifier that is more context-sensitive and less likely to mistake a post containing a group identifier for hate speech.

To achieve this, the researchers programmed the algorithm to consider two additional factors: the context in which the group identifier is used, and whether specific features of hate speech are also present, such as dehumanizing and insulting language.
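
To make the idea concrete, here is a deliberately simplified sketch of the difference between flagging on a group identifier alone and flagging only when hateful language is also present. It is not the researchers' model: the word lists, function names, and the simple keyword matching are all assumptions made for illustration, and a real classifier would learn these signals from data rather than from hand-written lists.

```python
# Toy illustration only: hypothetical word lists, not the researchers' model.
GROUP_IDENTIFIERS = {"black", "gay", "transgender"}  # examples quoted in the article
HATEFUL_CUES = {"vermin", "subhuman"}  # hypothetical stand-ins for dehumanizing language


def tokenize(text):
    """Lowercase, split on whitespace, and strip surrounding punctuation."""
    return [word.strip(".,!?\"'").lower() for word in text.split()]


def naive_flag(text):
    """Oversensitive baseline: flags any post that mentions a group identifier."""
    return any(token in GROUP_IDENTIFIERS for token in tokenize(text))


def context_aware_flag(text):
    """Flags a post only when a group identifier co-occurs with a hateful cue."""
    tokens = tokenize(text)
    has_identifier = any(token in GROUP_IDENTIFIERS for token in tokens)
    has_hateful_cue = any(token in HATEFUL_CUES for token in tokens)
    return has_identifier and has_hateful_cue
```

Under these assumptions, a post such as "I am a proud black woman." is flagged by `naive_flag` but not by `context_aware_flag`, which is the kind of false positive the article describes.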


"We want to move hate speech detection closer to being ready for real-world application," said Brendan Kennedy, a…