Thursday, June 11, 2020

How Google Is Being Used To Erase Web Searches - By Bad Actors Manipulating A 1998 Copyright Law


Graphic showing how bad actors are able to mislead Google into extirpating valid news and links from the internet. (From The Wall Street Journal, May 15, p. A1)

I confess that my first reaction on reading a recent WSJ article from May 15th:

Google Hides News, Tricked by Fake Claims - WSJ



was stark disbelief. I simply couldn't buy that Google - a tech company that uses a quasi-IQ test to recruit - was being played into scrubbing allegedly copyright-infringing material from Google Search. If true, it meant that no Google search could be fully trusted, because any given result might be missing on account of the nefarious work of a bad actor or actors - vipers misusing a particular law known as the Digital Millennium Copyright Act (DMCA). This 1998 law grants tech companies immunity from copyright claims so long as they act quickly, once alerted, to take down material alleged to infringe.

While the law sounds reasonable, it turns out it is now being misused to go after content the enemies of truth would rather not have around or accessible to people seeking genuine information. To be sure, most takedown requests to Google are legitimate - as the WSJ piece notes - e.g. requests that pirated copies of a movie or album be removed from search results.

However, the Journal cites the case of one Colorado bad actor, Dak Steiert, which can definitely be regarded as a cautionary tale:

"When a Colorado man, Dak Steiert, faced state court charges of running a fake law firm in 2018, he sent Google a series of copyright claims against blogs and a law firm website that discussed his case, claiming they had copied the posts from his own website. That wasn't true, the Journal determined, but Google erased the pages from its search engine anyway."


That, of course, is beyond reprehensible, especially as last year:

"Steiert pleaded guilty to Colorado state court to one count of false advertising in his business. The Colorado  Supreme Court closed his practice.  But the articles about him in other blogs, website remained invisible in Google searches until the Journal flagged the cases to Google, which reinstated the links."

How could such a travesty occur? Because the DMCA ditches judicious process for speed, penalizing tech companies for responding too slowly to copyright violation claims, when what's really needed is a "pause" button and an investigation of each claim. Make no mistake, this is serious, given that Google "has struggled to keep up with the flood of claims". Worse, one UC Berkeley law professor (Jennifer Urban) "found that in 2017 as many as one third of takedown requests to Google made claims that might not stand up in court."

According to Daphne Keller, former Google lawyer and now a program director at Stanford's Cyber Policy Center:

"If people can manipulate the gatekeepers to make important and lawful information disappear, that's a big deal."

Indeed it is, and it isn't helped by the fact that "Google has automated much of the process of reviewing takedown requests, relying on techniques that don't require human reviews to enable removals at a large scale."

In other words, DMCA-honed algorithms and 'techniques' are employed which forgo critical thinking and reasoned judgment by humans. The situation is almost guaranteed to unleash mass removals where these aren't truly justified. How can it be otherwise when a dumb machine or program with no capacity for critical, analytical judgment is allowed to bludgeon its way through claims?

Then there is the other aspect of the law, which weighs inordinately toward simply removing material without a fight. According to one website (techdirt.com) that referenced the WSJ piece:

"While the WSJ article is very well researched and reported, and highlights this huge problem, my one complaint with it is that it barely acknowledges how the real problem here is the structure of the DMCA 512's notice and takedown structure -- which is heavily one-sided. Under the rules of 512, if you receive a takedown notice, you don't have to remove the content, but the legal pressure and liability is heavily weighted in that direction. If you remove the contested content, you are then (mostly) free from any copyright liability. If you refuse to remove the content, you are not. And while you might still prevail should a lawsuit be brought and you can make use of other defenses, the 512 safe harbor means that you'll get out of the case faster and easier if you just remove the content (and you're much less likely to be sued).   

What's odd, however, is how little attention people seem to be paying in most of these discussions to whether or not we need to fix the DMCA in the other direction -- to fix for the fact that the notice-and-takedown provisions of the DMCA are regularly used for censorship, even of news. Late last week, the Wall Street Journal had a very thorough article (possibly paywalled) detailing how they found hundreds of news articles that were taken out of Google's search due to what appears to be bogus DMCA takedowns. After contacting Google about this, the company said that it had found approximately 52,000 news articles that had been deleted from its index via bogus copyright notices, i.e.:


After the Journal shared its findings with Google, the company conducted a review and restored more than 52,000 links it determined it had improperly removed, she said. Google said its review identified more than 100 new abusive submitters, declining to discuss individual cases."

Obviously this is not good enough, and one can reason that the law itself is totally defective if it can be manipulated to the degree the WSJ describes, e.g.

"Someone wanting Google to hide a webpage will find a little-trafficked blog and post a copy of the content from the legitimate webpage. After backdating the plagiarized post, the complainant will file an electronic notice with Google claiming the real article is a copyright violation."

Spurious 'backdating', indeed, appears to be a general method used to trick Google into purging search results. But even that barely touches the ubiquitous treachery and sketchy ploys used to defeat perceived enemies. For example, in one of its investigations of sender information tied to phony complaints, the Journal identified: "names of people that couldn't be confirmed, photos of ostensible senders that were cribbed from the internet, blogs claiming to be violated that had only a few posts, and phony claims to ties with legitimate publications."

"It's a jungle out there!"  as my nephew would put it.  Fortunately, the era of scamming and pranking and punking Google to erase searches may soon be at an end.  According to the WSJ piece the U.S. Copyright Office "will soon conclude a years long study into how well DMCA is working".

Let us hope the Copyright Office demands better oversight that is less wedded to rapid takedowns, along with more input from human critical thinkers rather than A.I. robots and ham-fisted algorithms.
