Yahoo fights hate speech with abuse algorithm

The automated process can identify abuse in 90 per cent of cases

Yahoo has developed an algorithm that can detect whether people are using hate speech online. In tests, it identified abuse in 90 per cent of cases.

The algorithm is built upon machine learning technologies and crowdsourced abuse detection, analysing parameters such as comment length, number of insult words and punctuation on thousands of articles on Yahoo News and Finance to work out what can be classed as abusive.
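To make the parameters above concrete, here is a minimal sketch of how such signals might be computed for a single comment. It is illustrative only and is not Yahoo's implementation: the function name `extract_features`, the tiny `INSULT_WORDS` lexicon and the exact definitions of each feature are assumptions made for this example.

```python
import string

# Hypothetical insult lexicon, for illustration only (not Yahoo's list).
INSULT_WORDS = {"idiot", "moron", "stupid"}


def extract_features(comment: str) -> dict:
    """Compute three simple surface features of a comment.

    Assumed definitions: length in characters, number of tokens found
    in the insult lexicon, and number of ASCII punctuation characters.
    """
    # Split on whitespace, then strip surrounding punctuation and lowercase.
    tokens = [t.strip(string.punctuation).lower() for t in comment.split()]
    tokens = [t for t in tokens if t]
    return {
        "length": len(comment),
        "insult_count": sum(t in INSULT_WORDS for t in tokens),
        "punctuation_count": sum(ch in string.punctuation for ch in comment),
    }


features = extract_features("You are an idiot!!!")
```

In a real system, features like these would be fed, alongside many others, into a classifier trained on labelled comments; the sketch stops at the feature step because the article does not describe the model itself.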


Trained human reviewers also analysed the same pages, entering their findings into the database to further train the algorithm to recognise abuse.

The third element was Amazon's Mechanical Turk website, which lets companies pay people to carry out tasks that need human judgement, in this case sorting comments into abusive and non-abusive. Yahoo paid each worker $0.02 for every comment they categorised.

However, this second group of humans was much less successful at identifying abusive comments, Yahoo said, which suggests people need training to tell which comments count as hate speech and which do not.

Nonetheless, combining these processes uncovered the majority of abusive comments on the sites, making it one of the most accurate uses of machine learning to date.


Now Yahoo is releasing the database of hate speech to the wider world, hoping to stop the problem spreading.

Yahoo said it has not yet tested the technology outside its own sites, but the work still represents a significant step forward in the field of natural language processing.
