Alphabet fights 'toxic' comments with machine learning
- Author: Carolyn Briggs Фев 23, 2017,
Фев 23, 2017, 22:52
Cohen emphasized that Perspective was a work in progress and would only improve if people contributed to it.
The software is called Perspective, and it works by applying a score to comments based on similarities to other comments classified by humans as "toxic".
Once Perspective scores a comment, that information is passed along to the publisher to use as they wish.
With Perspective running, publishers are free to choose what it does. Perspective uses machine learning models to scan online content and determine a comment's "toxicity" level. Publishers of news sites can also use the tool to monitor their comment boards, he said. The company also mined the comment sections of Wikipedia and collected data from victims of online harassment who had kept a record of their experiences.
Google noted that 72 percent of American internet users have witnessed harassment online and almost have personally experienced it. With almost half of internet users surveyed reporting being harassed online and one-third of surveyed users reporting that they're censoring themselves to avoid starting a fight online, Perspective is hitting the internet at an opportune time, and will likely learn quite a lot in a short time. We'll be releasing more machine learning models later in the year, but our first model identifies whether a comment could be perceived as "toxic" to a discussion. Some may choose to show the scores to readers and use crowdsourcing methods that are similar to current systems where readers can flag offensive language to human moderators.
It's up to publishers to decide how to use the tool.
Asked whether the site could result in censoring free speech, Cohen said that the software tool wasn't meant to bypass human judgment, but to flag "low-hanging fruit" that could then be passed on to human moderators. Alternatively, Perspective could merely act as a filter, pushing what might be negative comments to a human team. People were asked to rate internet comments on a scale from "Very toxic" to "Very healthy".
'We've been testing a version of this technology with The New York Times, where an entire team sifts through and moderates each comment before it's posted-reviewing an average of 11,000 comments every day, ' explained Cohen. For publications that still have comments, manual curation takes money, labor, and time.
Perspective, now an "early-stage technology", was developed by Jigsaw in partnership with Wikipedia, The New York Times, The Economist and The Guardian.
Perspective is the latest in a host of machine learning technologies that Google has been making available to developers, alongside such tools as the TensorFlow library and the Cloud Machine Learning Platform.
'When Perspective is in the hands of publishers, it will be exposed to more comments and develop a better understanding of what makes certain comments toxic, ' Cohen shared.