I’ve noticed that a number of news sites are bundling their articles up as JSON inside JavaScript. When the page loads, a script inserts the JSON data into the DOM to make it readable. The script itself isn’t part of the page, and is sometimes hosted on another site. (If only we had some standard like HTML5…)
At the moment, it’s not a big deal to reach inside the script black box and pull out the article text, but I wonder where this is heading. There’s a whiff of impending DRM in the air, which would make archiving news articles useless.
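
To give a sense of what "reaching inside" can look like today, here is a minimal Python sketch using only the standard library. It is an illustration under assumptions, not a description of any particular site: the sample page, the `window.__ARTICLE__` variable, and the `body` key are all made up, and it only handles JSON that sits inline in a `<script>` element.

```python
import json
from html.parser import HTMLParser

# Hypothetical page: the variable name and the keys are invented for this sketch.
SAMPLE_PAGE = """
<html><body>
<div id="article"></div>
<script>window.__ARTICLE__ = {"headline": "Example", "body": "First paragraph."};</script>
</body></html>
"""


class ScriptCollector(HTMLParser):
    """Collect the text content of every <script> element."""

    def __init__(self):
        super().__init__()
        self.scripts = []
        self._in_script = False

    def handle_starttag(self, tag, attrs):
        if tag == "script":
            self._in_script = True
            self.scripts.append("")

    def handle_endtag(self, tag):
        if tag == "script":
            self._in_script = False

    def handle_data(self, data):
        if self._in_script:
            self.scripts[-1] += data


def embedded_json_objects(html):
    """Yield every JSON object that can be decoded from inline scripts."""
    collector = ScriptCollector()
    collector.feed(html)
    collector.close()
    decoder = json.JSONDecoder()
    for script in collector.scripts:
        pos = script.find("{")
        while pos != -1:
            try:
                # raw_decode parses one JSON value starting at pos and
                # returns the value plus the index where it ended.
                obj, end = decoder.raw_decode(script, pos)
            except json.JSONDecodeError:
                # Not JSON at this brace (e.g. a function body); try the next one.
                pos = script.find("{", pos + 1)
                continue
            yield obj
            pos = script.find("{", end)


def article_text(html, key="body"):
    """Return the first value stored under `key` (an assumed field name)."""
    for obj in embedded_json_objects(html):
        if isinstance(obj, dict) and key in obj:
            return obj[key]
    return None


if __name__ == "__main__":
    print(article_text(SAMPLE_PAGE))
```

When the data lives in a separate file or on another host, the same decoding step would apply, but you would first have to find and fetch that file. If the payload were encrypted or otherwise locked down, this approach would stop working, which is the worry.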