Paranoid Browsing: anti-profiling plugin seeks feedback


#7

Ironically, Chrome only. I mean, he could have made it for a browser not made by Google...


#8

Is that a hard requirement, or do de-Googled, but otherwise similar, Chromium-based browsers work?

A hard requirement would be foolish; but getting a defanged version of Chrome isn't exactly difficult.


#9

Hmm, the plugin leads off talking about "Advertisers and government agencies" - color me skeptical.

Agree w/ weissritter that this sort of "chaff" really just raises surface area for false positives. Say you're searching for backpacks and the background browser does a search for pressure cookers, for example...

Beyond that, if tools like XKeyscore work as stream filters as the leaked docs imply, then once you get targeted, you don't really get much protection at all as finer-grained filters are applied (say at a site or keyword level).

For commercial third-party data mining, it seems like Disconnect would be fine, or using private browsing for things that you don't want first parties to track. If you want more privacy you could combine that with VPNs and Tor (this, of course, doesn't work against state actors: that'll just get you flagged for extra scrutiny, and as soon as anything leaks you're hosed).


#10

I assume it will work with Chromium. There is no reason why it wouldn't.

It would be more in line with their ideals to use Firefox though.


#11

Ideologically that is probably true; but (just another little detail that makes their problem more difficult), determining which browser somebody is using by examining traffic on the wire or getting logs from a cooperative host is generally not that difficult.

By default, the browser will send an honest UA string, in the clear, with all sorts of requests. Even if you configure the browser to lie, individual sites often sniff more carefully (so they know what ghastly javascript tricks will or won't work, etc.), sometimes with different resources requested from the server depending on the result.

Unless the system is to be trivially obvious, it is probably necessary to have the spoofed traffic generated by the same browser that the user typically uses for real activity.

I don't know how the stats are these days, so I don't know if Chrome is the best starting point or not; but if you want it to have a chance of working, supporting just one browser, any one, is going to be an issue. And let's not even think about mobile browsers, which are power/data constrained, sometimes locked down, and should be easy to correlate with subscriber data provided by the notoriously privacy-friendly cell companies. People don't necessarily do most of their browsing on phones, because it's unpleasant, but they often hit authenticated services from both phone and PC. It is not an easy problem.


#12

I was thinking the same thing, do security minded people still use Chrome?


#13

Chrome has very good security, actually. I just prefer an open source project and Mozilla is run by a non-profit foundation.

(Disclosure: I work for the Mozilla Corporation.)


#14

Since the type of background browsing this plugin does is configurable, is there a chance a malicious agency could use it on a target (someone who isn't tech savvy, the kind of person with a dozen random toolbars installed, for example) to give them a false profile of browsing terrorist-related or illegal content?

Or am I just being super paranoid?


#15

AFAIK it won't work. Given a long enough history it's easy to filter out the noise from the signal. At least that's what I was told by security experts when suggesting similar techniques to hide other data. A quick google brings up this paper.

http://www.csee.umbc.edu/~hillol/PUBS/kargupta_privacy03a.pdf

I'm sure there's more but basically I was told it's security 101 that adding random noise will not hide the data.


#16

The Kargupta paper is nicely general, but the amount of noise they add is not enormous: they note that for SNR << 1 the method doesn't do that well.

A more directly relevant paper is
http://www.informatica.si/PDF/34-2/14_Pozenel%20-%20Separating%20Interleaved%20HTTP%20Sessions.pdf
which shows how to disambiguate interleaved clickstreams using Markov models. This looks harder to spoof: if there is no link from a page on site A to site B and vice versa, then a sequence A1 B1 A2 B2... is pretty easy to separate. Same thing for parts of a site that rarely have transitions between each other. However, this method has problems when users can jump into a site at nearly any point (hard to tell where a session starts) and when sessions overlap in the sequence.
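
To make the "A1 B1 A2 B2..." case concrete, here is a rough sketch of the idea. It is not the method from the paper, just a greedy first-order simplification: each click goes to whichever open session is most likely to lead to it, and a click nobody links to opens a new session. The `TRANSITIONS` table, the `separate` function and the `min_prob` cutoff are all made up for illustration.

```python
from typing import Dict, List, Tuple

# Hypothetical transition probabilities P(next | current), e.g. estimated
# from a link graph or from past logs. Missing pairs are treated as 0.
TRANSITIONS: Dict[Tuple[str, str], float] = {
    ("A1", "A2"): 0.6,
    ("A2", "A3"): 0.5,
    ("B1", "B2"): 0.7,
    ("B2", "B3"): 0.4,
}


def separate(clicks: List[str], min_prob: float = 0.05) -> List[List[str]]:
    """Greedily assign each click to the open session most likely to lead to it."""
    sessions: List[List[str]] = []
    for page in clicks:
        best, best_p = None, min_prob
        for session in sessions:
            # Probability of moving from the session's last page to this one.
            p = TRANSITIONS.get((session[-1], page), 0.0)
            if p > best_p:
                best, best_p = session, p
        if best is None:
            # No open session plausibly leads here: treat it as a new session.
            sessions.append([page])
        else:
            best.append(page)
    return sessions


interleaved = ["A1", "B1", "A2", "B2", "A3", "B3"]
print(separate(interleaved))
```

With no links between A and B the two streams fall apart cleanly. Once pages are reachable from several open sessions, or sessions can start anywhere, the greedy choice becomes ambiguous, which is the weakness noted above.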

So my suggestion for the plugin is to have the option of using URLs in the current browsing history to create more overlapping clickstreams, and maybe find a few more ways of jumping straight into sites (googling some keywords gleaned from past pages and then jumping?)


#17

I think because the extension clicks on the actual links in the page, the method described in Pozenel et al. won't work.


#18

But can I get a plugin to automatically connect me through seven proxies?

http://knowyourmeme.com/memes/good-luck-im-behind-7-proxies


#19

Yes, but by tending towards the more popular websites, all it does is make your browsing history more average than it was. And that's the same as not knowing anything about you.

Why? Because if they didn't know anything about you, they would assume you were average.

Imagine the nyancat video were trending, and everyone was clicking on it. In the absence of any information, YouTube would have to assume you liked nyancat. If you get annoyed at it because it's not as relevant to you, that actually means that you secretly like the fact that YouTube knows enough about you to show you videos about makers, or about backpacks and pressure cookers, or whatever it is you tend to search for, and not show you nyancat videos.

If you don't actually like that fact, then you can install this and, sure, you'll start seeing more ads for beer and football. But that's exactly what you would have been seeing in the absence of other information.
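
A toy way to see the "more average" effect, assuming a profile is just a distribution over topics and a fraction `chaff` of the observed traffic follows the population average. The topic names, the numbers, and the `mix` and `l1_distance` helpers are invented for illustration and say nothing about how a real profiler works.

```python
from typing import Dict


def mix(user: Dict[str, float], average: Dict[str, float], chaff: float) -> Dict[str, float]:
    """Observed profile when a fraction `chaff` of traffic follows the average."""
    topics = set(user) | set(average)
    return {
        t: (1 - chaff) * user.get(t, 0.0) + chaff * average.get(t, 0.0)
        for t in topics
    }


def l1_distance(p: Dict[str, float], q: Dict[str, float]) -> float:
    """Sum of absolute differences between two topic distributions."""
    topics = set(p) | set(q)
    return sum(abs(p.get(t, 0.0) - q.get(t, 0.0)) for t in topics)


# Made-up numbers: what this user really browses vs. what "everyone" browses.
user = {"makers": 0.7, "backpacks": 0.3}
average = {"beer": 0.5, "football": 0.5}

for chaff in (0.0, 0.5, 0.9):
    observed = mix(user, average, chaff)
    print(chaff, round(l1_distance(observed, average), 3))
```

Under this model the distance to the average shrinks in proportion to (1 - chaff): the more average-looking traffic is mixed in, the less the observed profile says about the real one.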


#20

Unless it downloads movies, you're unlikely to hit any reasonable* data-cap using this.

* Yeah, okay. Poor choice of word. No data-cap is reasonable.

#21

I'm confused as to the value of this type of obfuscation.

If the scenario is a watchdog agency looking for users who search for flagged keywords/content/URLs, this strategy will not stop the flags from being triggered. That activity will still be in there amongst the automated activity noise.

Or is the thought that, for some profiler algorithms, the percentage of flagged activity is a primary factor in whether the user becomes a person of interest, so that adding a bunch of average activity noise keeps that user from being flagged by the watchdog?


#23

Ok, you just totally cracked me up there .. (good point too)


#24

#25

To take one example: any person browsing the web will encounter a few mentions of the word "censorship" in the pages they read. What would flag them as being unusually interested in "censorship" is if a high fraction of their pages contained this word.

Obviously, if you spend a lot of time looking at super-ultra-trigger words this extension can't help, because even one google search for "how to build a bomb so I can overthrow the US government" will probably put you on a watch list. But it can help if you are a "mild" person of interest.
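
Rough arithmetic for the two cases, assuming a profiler that either flags on the fraction of pages mentioning a word or flags on any single hit. The page counts, the 2% baseline rate for chaff pages and the 10% threshold are invented numbers; `flagged_fraction` is a hypothetical helper.

```python
def flagged_fraction(real_hits: int, real_pages: int,
                     chaff_pages: int, chaff_hit_rate: float = 0.0) -> float:
    """Fraction of all observed pages that contain the flagged word."""
    total_pages = real_pages + chaff_pages
    total_hits = real_hits + chaff_hit_rate * chaff_pages
    return total_hits / total_pages


THRESHOLD = 0.10  # hypothetical "unusually interested" cutoff

# 30 of 100 real pages mention the word.
before = flagged_fraction(real_hits=30, real_pages=100, chaff_pages=0)
# Add 900 chaff pages that mention it at an assumed 2% baseline rate.
after = flagged_fraction(real_hits=30, real_pages=100,
                         chaff_pages=900, chaff_hit_rate=0.02)

print("fraction rule:", before > THRESHOLD, "->", after > THRESHOLD)
# An "any hit at all" rule is unaffected by chaff: the real hits are still there.
print("any-hit rule:", 30 > 0, "->", 30 > 0)
```

Chaff only helps against the fraction-style rule. A rule that fires on a single match sees the same real hits no matter how much noise surrounds them.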


#26

I would argue that we are using the wrong logic. If everyone were to start using the keywords that trigger NSA spying, this would overload the system, as everyone would now be a suspect. I am Spartacus!


#27

This is a very old idea. It was implemented in TrackMeNot (http://cs.nyu.edu/trackmenot/) at least 5 years ago. People should do their homework...