Internet Archive to ignore robots.txt directives

smut_clyde · April 24, 2017, 8:25pm

Ah, thanks.
I tend to ignore the Wayback Machine’s web-crawling aspect, using it in cases of Someone being Wrong on the Internet to record the craziness of the scam or whatever it is, when there is a likelihood that the culprit will later scrub the page. For that purpose, I am disappointed by the Archive and its adherence to the no-robots tag even for manual archiving.

Daaksyde · April 25, 2017, 4:41am

I don’t know if it will go that far; I think people will still care about privacy. But they will learn that once you’ve published something, it’s public, and you can’t put that genie back in the bottle. And people will reach a point where we’ve all published something we shouldn’t have or that we no longer agree with, and we’ll learn not to judge each other so harshly based on what we wrote in our livejournal/tumblr/whatever during our angsty teen years.

If anything, that seems to be the direction - things that were taboo a couple of generations ago, younger people are more than willing to talk about openly and accept in each other.

Knackfloh · April 25, 2017, 9:58am

or we get data standards with inherent (configurable) expiration dates.

