ImperialViolet

Future Battles (28 Mar 2005)

I'm sure that everyone who uses Firefox (1.0.2, keeping up with the security releases, right?) has discovered AdBlock. A more useful plugin doesn't exist (well, maybe greasemonkey with some good scripts) but I think we can expect that AdBlock is going to work a whole lot worse quite soon.

We've seen the efforts that some sites put into getting pop{up|under}s past blockers. They don't seem to be doing too well from my point of view, but maybe I just don't go to the right sites. Nonetheless, they are fighting a losing battle. Fundamentally the browser can stop Javascript on random sites from opening new windows - it's not rocket science.

The battle AdBlock is fighting is the other way round. For the moment, many sites are neatly organised with all their adverts in a directory called /ads/, or served from a host called advertising.com. This makes AdBlock work very well with simple pattern matching. But I expect that we'll soon see sites where every image has a random filename.
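To make the point concrete, here is a rough sketch of that kind of pattern matching. It is not AdBlock's actual code; the pattern list and the `is_blocked` helper are made up for illustration, and the example URLs are hypothetical.

```python
import re

# Hypothetical blocklist: a path component and an advertising host,
# mirroring the two conventions mentioned above.
BLOCK_PATTERNS = [
    re.compile(r"/ads/"),
    re.compile(r"^https?://([^/]*\.)?advertising\.com(/|$)"),
]


def is_blocked(url):
    """Return True if the URL matches any blocklist pattern."""
    return any(pattern.search(url) for pattern in BLOCK_PATTERNS)


if __name__ == "__main__":
    for url in (
        "http://example.com/ads/banner.gif",    # caught by the /ads/ pattern
        "http://www.advertising.com/x.gif",     # caught by the host pattern
        "http://example.com/img/a8f3e1c2.gif",  # random filename: nothing to match
    ):
        print(url, is_blocked(url))
```

The last URL is the problem case: once advert images are indistinguishable from content images by name, no list of patterns like this can tell them apart.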

What do we do then? We could use greasemonkey scripts to rewrite the webpage as we like, right? We could remove the adverts and we can get rid of non-image adverts too (which AdBlock currently doesn't).

That's going to work for a while; probably a long time after AdBlock stops working due to the amount of effort required to create each script. Someone only needs to create it once, but there are a lot of websites and people will have to download and install these things etc.

I don't expect it will work forever. There's a strange idea amongst people who call themselves "content producers" that it's wrong for you to view their content in any way other than as they intended it. (For examples, see Odeon reacting to an excellent scrape and most of the recent anti-Google-Autolink stuff.)

It's more difficult to imagine how they're going to stop it but, as tools like greasemonkey become better, expect to see DOM obfuscators running in webservers. These will mess up the HTML differently for every GET request. The pages will look the same in a browser, but you won't be able to use nice class names and such to extract the bits you need.
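As a guess at what such an obfuscator might look like, here is a toy sketch that renames every CSS class to a fresh random token on each call. Everything in it (the `PAGE` sample, `obfuscate`, `extract_by_class`) is invented for illustration, and the regex-based rewriting only handles this tiny sample; a real one would need a proper HTML and CSS parser.

```python
import re
import secrets

# Hypothetical page: a stylesheet plus two divs with meaningful class names.
PAGE = """<style>.result { color: black; } .advert { color: red; }</style>
<div class="result">A search result</div>
<div class="advert">Buy things</div>"""


def obfuscate(html):
    """Rename each CSS class to a random token, consistently within one response."""
    mapping = {}

    def fresh(name):
        # Same class name gets the same token inside a single response.
        if name not in mapping:
            mapping[name] = "c" + secrets.token_hex(4)
        return mapping[name]

    # Rewrite class="..." attributes.
    html = re.sub(
        r'class="([^"]+)"',
        lambda m: 'class="%s"' % " ".join(fresh(n) for n in m.group(1).split()),
        html,
    )
    # Rewrite the matching .name selectors in the stylesheet.
    html = re.sub(
        r"\.([A-Za-z_][\w-]*)(?=\s*\{)",
        lambda m: "." + fresh(m.group(1)),
        html,
    )
    return html


def extract_by_class(html, cls):
    """What a simple scraping script does: pull out divs by class name."""
    return re.findall(r'<div class="%s">([^<]*)</div>' % re.escape(cls), html)


if __name__ == "__main__":
    print(extract_by_class(PAGE, "result"))             # works on the plain page
    print(extract_by_class(obfuscate(PAGE), "result"))  # finds nothing after obfuscation
```

The styles still line up with the elements, so the page renders the same, but the names change on every request and a script keyed on "result" or "advert" has nothing stable to hold on to.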

(The greasemonkey script below would be far more complex if Google didn't neatly put search results into their own class.)

Within a few years I expect that we'll have AI-like filters to remove adverts, and obfuscators plus human workers doing their best to defeat them, much as spam filters work today.

But that's a ray of hope because the efforts put into spam filters have paid off. I get 30+ spam messages a day (after simple blacklist filtering which removes a lot) and Gmail's filters have a 0% false-positive rate and maybe a 1-2% false-negative rate. That's very good.