Tech companies should do something about harassment, but not this

doctorow · December 10, 2014, 7:00pm

fuzzyfungus · December 10, 2014, 7:22pm

It is unlikely to help that ‘Content ID’ and friends are notoriously opaque and twitchy, so any ‘anti harassment’ mechanism based on similar concepts will take about 4 seconds to become a playground for trolls to lead victims into triggering it and being silenced (which is, after all, half the fun).

JessicaValenti · December 10, 2014, 9:50pm

This would be interesting if I had ever written that we should use a system like Content ID to stop online harassment. But I never suggested that - I just used it as an example of how when money is on the line, companies make things happen fast. I get the feeling that maybe you just read Jeong’s article and not mine (which isn’t linked to as far as I can see so people can see my argument in full). Honestly, I think a retraction is in order.

JessicaValenti · December 10, 2014, 9:53pm

Here’s my original article - you can see I never made any such argument.

Ambiguity · December 10, 2014, 10:12pm

Yea, the article is a little different than portrayed.

A couple of questions/comments, though:

And it shouldn’t need to rise to being a question of constitutional law.

Seems to me that deciding whether on-line speech is or isn’t the same as speech IRL in terms of harassment is an entirely valid use of SCOTUS’s time.

And one other point:

If these companies are so willing to protect intellectual property, why not protect the people using your services?

Just to be clear: blocking someone from threatening someone else isn’t the same as protecting them, any more than talking Craigslist out of accepting sex worker ads ends sex work.

Tech companies can and should provide better tools for their users to block harassing speech, but I don’t think that should be cast as providing any real protection.

On the other hand, if you want to somehow create a monitoring system to automatically report things to LEO, a) Cory’s comments are germane, and b) that probably needs the aforementioned noodling by SCOTUS.

Greg_Sheppard · December 11, 2014, 12:40am

The most ludicrous thing is you call the guardian’s favourite clickbait writer (you know this cause of just how often the guardian tweet and facebook an article of hers) ‘normally sensible’.

I’m very pro feminist but I have my suspicions that Jessica Valenti is actually a shill for the patriarchy because she is so utterly absurd, badly thought out and almost every article clearly written to just provoke response by being so stupid and un-nuanced. She is far and away the worst regular feminist writer for the guardian, she would be the worst regular writer but they do employ Owen Jones who seems to also be on a crusade to discredit the progressive left.

Her latest one is a joke - she complains that we shouldn’t buy into the consumerism of Christmas at the start then blames her husband for not buying into or caring about the consumerism of Christmas and gifts and nonsense as she does. She is projecting her issues onto a supposed wider issue because her husband has different values to her.

SoItBegins · December 11, 2014, 1:42am

[quote=“Greg_Sheppard, post:6, topic:47973, full:true”]
…[/quote]
[citation needed]

Boundegar · December 11, 2014, 3:13am

That sure seems like your intent, although you can argue that you only meant that somebody like YouTube should use something like ContentID. That would be missing the point.

Those tools you use as examples work terribly. They cause pain and suffering every day. The task of finding popular songs is infinitely easier than the task of accurately identifying harassment. And the pain and suffering caused by censoring speech is vastly greater than the pain of censoring music.

No algorithm will ever understand the nuances that separate humor from harassment; even humans are terrible at the task. And what happens when the next President decides that political criticism is a form of harassment? What if Richard Nixon had had this technology at his command?

Shane_Simmons · December 11, 2014, 4:19am

I just read the article, and I’m inclined to agree. Real shame I have zero pull here.

EDIT: I mean…here we are, on a forum, built by someone who’s trying to build a platform for civilized discourse. And IIRC, it already has auto-banning for certain types of behavior, though I’m not sure if that’s relevant…anyway…I’m wondering why Cory read this piece, thought, “hey, she’s proposing a Content ID-style system,” and felt the need to write it up that way. When I think of people going off half-cocked, I don’t think of Cory Doctorow.

Good golly, that’s some weapons-grade hyperbole.

Boundegar · December 11, 2014, 4:43am

Is it? One hundred hours of video are uploaded per minute now. If an average video is ten minutes long, that’s 600 uploads per minute, or about a million uploads per day.

Even if Content ID was 99% accurate, there would be 10,000 false positives every day - 10,000 perfectly legal videos blocked for no reason. Maybe a few hundred legitimate, non-infringing users permanently banned every day.

You think there’s any pain and suffering involved?
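The back-of-envelope numbers above can be checked with a few lines of Python. This is only a sketch of the post's own assumptions (100 hours uploaded per minute, ten-minute average videos, a hypothetical 99% accuracy, every error counted as a false positive); the variable names are made up for illustration. The exact figures come out to 864,000 uploads and 8,640 false positives per day, which the post rounds up to a million and 10,000.

```python
# Figures taken from the post; all of them are rough assumptions.
HOURS_UPLOADED_PER_MINUTE = 100   # video uploaded to the site each minute
AVERAGE_VIDEO_MINUTES = 10        # assumed average video length
ERROR_RATE_PERCENT = 1            # i.e. a hypothetical 99% accuracy

# 100 hours = 6,000 minutes of video, split into ten-minute uploads.
uploads_per_minute = HOURS_UPLOADED_PER_MINUTE * 60 // AVERAGE_VIDEO_MINUTES
uploads_per_day = uploads_per_minute * 60 * 24

# Treat every misclassification as a wrongly blocked video (an upper bound).
false_positives_per_day = uploads_per_day * ERROR_RATE_PERCENT // 100

print(uploads_per_minute, uploads_per_day, false_positives_per_day)
```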

Willondon · December 11, 2014, 4:55am

I agree totally. It just conjured up a loop of “Don’t Fear the Reaper” in my head, which is going to take a little while to fade out. I’m certainly not complaining, though.

doctorow · December 11, 2014, 6:00am

Jessica, I did read your article when it was first published, and I didn’t write it up at the time because I thought you’d made some really significant errors that I didn’t have time to go into, so when I saw that Sarah had addressed those errors in depth, I linked to her piece.

Here is a closer set of reactions to your piece:

When money is on the line

It wasn’t money, it was an existential threat to the system itself. The context for the current Content ID rules is Viacom’s $1B lawsuit against Google through which the company intended to have control of Youtube transferred to it (internal emails released in discovery reveal Viacom execs arguing over who would head up Youtube when it was a Viacom division).

The distinction matters, because the context that created the extremely problematic Content ID system is a belief that anyone who creates a platform where anyone can speak should have to police all speech or have their platform shuttered.

Content ID was an attempt to “show willing” for judges and the court of public opinion, but it’s obvious at a cursory glance that Content ID makes no serious dent in copyright infringement.

internet companies somehow magically find ways to remove content and block repeat offenders.

No, they don’t. Youtube can disconnect your account for repeat offenses, but not you – indeed, the number one rightsholder complaint about automated copyright takedown is that people just sign up under different accounts and start over (the exact same complaint that is fielded about online harassment).

just try to bootleg music videos or watch unofficial versions of Daily Show clips and see how quickly they get taken down.

As Sarah points out, Youtube is full of Daily Show clips and music videos that haven’t been taken down – but the analogy is a false one, since the whole set of “works that Youtube is policing for” can be contained in a database, while “harassment” is not a set of works or words, but nuanced behaviors and contexts.

Interestingly, Content ID is asked to police within these sorts of boundaries in the case of fair use. When your material is taken down because of a Content ID match, but you believe that the use is fair, Content ID has a process to address this.

And this process is easily the least-functional, most broken, least effective part of Content ID.

In other words, the part of content monitoring that most closely resembles an anti-harassment system is the part that works worst.

But a look at the comments under any video and it’s clear there’s no real screening system for even the most abusive language.

Again, that sounds right to me. A system that just blocked swears would not make much of a dent in actual harassment (as we’ve seen since AOL’s chatrooms and their infamous anti-profanity filters, it is trivial to circumvent a system like this).

Meanwhile, swearing – even highly gendered and misogynist swearing – isn’t the same thing as harassment. For starters, people who have been harassed often have need to directly quote that harassment, and an automated system that blocks “the most abusive language” would prevent that.

There is also the problem of substring matching (“Scunthorpe”), discussion about words themselves (“Here is my video on the etymology of the word ‘whore’”) etc.
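A small Python sketch can illustrate the filtering problems described above. The blocklist, function names and sample sentences are hypothetical, chosen only for illustration, and this is not how any real platform's filter is known to work. It compares a naive substring filter with a whole-word filter.

```python
import re

# Hypothetical blocklist, for illustration only.
BLOCKLIST = ("ass", "hell")

def naive_filter(text: str) -> bool:
    """Return True if any blocklisted string occurs anywhere in the text."""
    lowered = text.lower()
    return any(word in lowered for word in BLOCKLIST)

def word_filter(text: str) -> bool:
    """Return True only when a blocklisted string appears as a whole word."""
    pattern = r"\b(" + "|".join(map(re.escape, BLOCKLIST)) + r")\b"
    return re.search(pattern, text.lower()) is not None

samples = [
    "A classic shell script",     # innocent, but contains both substrings
    "go to h3ll",                 # hostile, trivially obfuscated
    "He told me to go to hell",   # a target quoting the abuse they received
]
for s in samples:
    print(f"{s!r}: naive={naive_filter(s)} word={word_filter(s)}")
```

The substring filter flags the innocent sentence, neither filter catches the obfuscated one, and both block the person quoting their harasser. Matching on whole words fixes only the first of those three problems.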

“If Silicon Valley can invent a driverless car, they can address online harassment on their platforms.”

As Sarah points out, Silicon Valley can’t invent driverless cars.

But while Twitter’s rules include a ban on violent threats and “targeted abuse”, they do not, I was told, “proactively monitor content on the platform.”

This sounds right to me. How would you “proactively monitor” more than 1,000 tweets/second? (Unless, of course, you were using something like Content ID, which you say you’re not advocating).

==

To sum up (and to reiterate my original post): there are things that Silicon Valley can do to fight harassment, but none of the suggestions in your column:

Adapting Content ID for harassment
Blocking “abusive” language
Pro-actively monitoring tweets

are things that would be good for this, and all pose significant threats to free speech.

Further, all the examples in your column of hard things that Silicon Valley has done that are similar in scope to blocking harassment are not things that they’ve actually done:

Terminating repeat offenders
Blocking music videos and Daily Show clips
Making a self-driving car

These three problems are actually canonical examples of the kinds of problem that no one has figured out how to do on unbounded, open systems:

Build a judgement system that can cope with rapidly changing contexts and a largely unknown landscape (Google’s cars are a conjuror’s trick: http://boingboing.net/?p=339976)
Uniquely identify and interdict humans (rather than users or IP addresses)
Prevent people from re-posting material that they believe should be aired in public

There are good and bad reasons to perfect all these technologies (yes to self-driving cars, no to better military drones; yes to detecting corporate astroturfers, no to unmasking whistleblowers; yes to blocking harassment, no to blocking evidence of human rights abuses). They are all under active research at universities and corporations and governments.

But none are actually the kinds of things that we can expect to do much about harassment today, and today is when we need things done about harassment.

doctorow · December 15, 2014, 7:02pm

This topic was automatically closed after 5 days. New replies are no longer allowed.
