Why The Filter Won't Work, A Technical Story

The proposed filtering technique is based on exact HTTP URLs, not on IP addresses or domain names. A URL (Uniform Resource Locator) is the full address that you might type into your web browser's address bar. For example: http://www.gizmodo.com.au/2010/07/the-evolution-of-labor-internet-filter-policy/

This URL can be broken down into sections thus:

http <- the protocol for accessing the online resource. 'http' is but one way to access online resources. Others you may have seen include "https" (for secure web sites) and "ftp" (for file transfer), but there are many more.

www.gizmodo.com.au <- the name of the web server. (See below for how this is transformed into an IP address.)

:80 <- implied if not specified, 'port 80' is the default doorway through which you can access this content on this web server. There are many alternate ports through which a web server can choose to share content.

The remaining /2010/07/the-evolution-of-labor-internet-filter-policy/ <- is the specific path to your file. Invisible but again implied is a default file name (probably 'index.html' or similar).
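The breakdown above can be reproduced with Python's standard urllib.parse module. This is only a sketch for illustration: the DEFAULT_PORTS table is an assumption added here to show the 'implied' port, and is not something the article or the filter proposal specifies.

```python
from urllib.parse import urlsplit

url = "http://www.gizmodo.com.au/2010/07/the-evolution-of-labor-internet-filter-policy/"
parts = urlsplit(url)

# The port is not written in the URL, so urlsplit reports None;
# we fall back to the conventional default for each protocol.
DEFAULT_PORTS = {"http": 80, "https": 443, "ftp": 21}
port = parts.port or DEFAULT_PORTS[parts.scheme]

print(parts.scheme)    # the protocol
print(parts.hostname)  # the name of the web server
print(port)            # the port (implied here)
print(parts.path)      # the path to the file
```

Changing any one of these four pieces produces a different URL string, even when the web server ends up serving the same page.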

This matters because the government's proposed URL filter matches only the entire URL, not its constituent parts. So if you (as a content publisher) change the protocol, or the name of the web server, or the port it runs on, or the path to the file, or the specific name of the file, or even exploit features of how URLs are accessed, then that URL will no longer match any entry in the 'blocked' list, and a user will be able to access it.

For a simple example for users, try adding a question mark at the end of the URL thus: http://www.gizmodo.com.au/2010/07/the-evolution-of-labor-internet-filter-policy/? This 'new' URL would not match the entry on the blocked list, allowing users to see it.

The government might then choose to add both URLs to avoid this, but then you could add a dummy value to create another URL: http://www.gizmodo.com.au/2010/07/the-evolution-of-labor-internet-filter-policy/?mydog=hasnonose

Now this is a different URL which passes a nonsense value to the web page (which will be ignored by the Gizmodo web server), again allowing the user to see the web page. There are far too many permutations available to the user for a blacklist of 10,000 URLs to capture them all - and this is for one specific web page!
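The effect can be sketched in a few lines of Python. The BLOCKED set and the is_blocked function are hypothetical, and they assume the filter does a plain exact-string comparison against its list, which is how the proposal is described above; a real product may behave differently.

```python
# Hypothetical blocked list holding one exact URL (illustrative only).
BLOCKED = {
    "http://www.gizmodo.com.au/2010/07/the-evolution-of-labor-internet-filter-policy/",
}

def is_blocked(url: str) -> bool:
    """Exact string comparison: the assumed behaviour of a full-URL filter."""
    return url in BLOCKED

base = "http://www.gizmodo.com.au/2010/07/the-evolution-of-labor-internet-filter-policy/"
variants = [base, base + "?", base + "?mydog=hasnonose"]

for u in variants:
    print("BLOCKED" if is_blocked(u) else "allowed", u)
```

Only the first variant matches the list. The other two are different strings that lead to the same page.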

As a content distributor, if you became aware that your URL was blocked but you wanted to help your users access that content, you could easily change the path name or file name on your web server (to, say, '2010/07/the-evolution-of-labor-internet-filter-policy_2/') and relink that from your front page in under five minutes.

And all this is without users even having to consider non-HTTP traffic options or the use of proxies and VPNs.

So if URL filtering won't work, what about IP address filtering? While it's not the government's proposal at this time, it's still worth knowing why that option won't work either:

What is the difference between IP addresses and domain names?

An IP address is simply a string of numbers. You can think of it as analogous to a telephone number, only the number is longer (and frankly, that number may only get you to 'reception').

Now human beings aren't terribly accurate when it comes to remembering very long numbers, so the Domain Name System (DNS) came about so we could remember words instead. To continue with the telephone analogy, DNS is like having directory assistance. You could ask a DNS server for the IP address of 'gizmodo.com.au' and it will respond with something like '114.141.196.60'.

How easy is it to change a site's IP address?

Incredibly easy.

Most people don't type '114.141.196.60' into their web browsers to see your site, but instead type 'http://gizmodo.com.au/'. So long as you or your web hosting partner keep your DNS entry (i.e. your directory entry) up to date, you can change the IP address incredibly often and it would be surprising if anybody noticed.

The only thing that stops you changing your IP address too often is that it may take time for your change to propagate to all relevant DNS servers on the planet. The usual maximum time that sysadmins quote is around 72 hours (because of caching - it makes responses faster but updates slower). But even 72 hours is orders of magnitude quicker than governmental processing of complaints could ever be.
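A toy model shows why an IP-based list goes stale. Everything here is an assumption for illustration: the dns dictionary stands in for the real Domain Name System, blocked_ips is a hypothetical blocked list, the replacement address comes from a range reserved for documentation, and propagation delay is ignored.

```python
# Toy stand-in for DNS: a dictionary from names to IP addresses.
dns = {"gizmodo.com.au": "114.141.196.60"}

# Hypothetical IP-based blocked list built from the address above.
blocked_ips = {"114.141.196.60"}

def can_reach(name: str) -> bool:
    """A user looks the name up, then the filter checks the resulting IP."""
    return dns[name] not in blocked_ips

before = can_reach("gizmodo.com.au")

# The site operator moves to a new address and updates the DNS entry.
# 192.0.2.10 is from a range reserved for documentation, not a real host.
dns["gizmodo.com.au"] = "192.0.2.10"

after = can_reach("gizmodo.com.au")
print(before, after)  # False True
```

The user types the same name both times. Only the directory entry behind it has changed, and the blocked list is now pointing at an address the site no longer uses.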

Why is a blacklist based on IP addresses a problem?

Apart from the ease with which you can change your IP address, it is actually not that common for a single web site to have a server to itself:

Firstly, many sites can co-exist on the same IP address and often do, particularly when a company purchases web hosting space from an external provider. That IP address may only get you to the 'front gate' so to speak. So if you blacklist by IP address, you are likely to block many innocent sites when you choose to block one bad apple.
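The overblocking problem can be sketched as follows. The site names and addresses are made up (the addresses come from a range reserved for documentation), and the collateral function is a hypothetical helper, not part of any real filtering product.

```python
from collections import defaultdict

# Hypothetical shared-hosting customers; names and addresses are made up.
hosting = {
    "bad-apple.example": "192.0.2.20",
    "florist.example": "192.0.2.20",
    "school.example": "192.0.2.20",
    "elsewhere.example": "192.0.2.99",
}

# Group the sites by the IP address they share.
sites_by_ip = defaultdict(set)
for site, ip in hosting.items():
    sites_by_ip[ip].add(site)

def collateral(target: str) -> set:
    """Sites other than the target that vanish when the target's IP is blocked."""
    return sites_by_ip[hosting[target]] - {target}

print(sorted(collateral("bad-apple.example")))
# ['florist.example', 'school.example']
```

Blocking one address to stop one site takes two unrelated sites down with it.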

Secondly, some sites are 'load balanced' across multiple servers on multiple IP addresses.

If you wanted to look at SBS television's World Cup coverage on their website, I am pretty sure you wouldn't be alone at the moment. To handle so many requests at once, and to allow for redundancy in case one server fails, SBS would share that load across multiple servers on many IP addresses. So if a complaint were upheld and the decision were made to block SBS by IP address (because the complainant so despises the sound of the vuvuzela), the block would fail, as more than one IP address can respond. (And conversely, if SBS were aware that, say, Senator Conroy's filter was IP address based and they didn't want anybody to block their coverage, they could simply change that IP address and presto, the filter would no longer work.)

But the reality of video streaming is that many companies choose to delegate that work to external specialists like Akamai. Akamai assists companies like SBS Television to stream data such as the video on the SBS Tour De France website (see http://www.akamai.com/html/solutions/media_delivery.html). The general gist of it is that Akamai's servers are distributed across many locations and many IP addresses, so any given video feed could be coming to you from a large selection of IP addresses - addresses that will be recycled constantly. So blocking such a load-balanced site by IP address not only fails to block the content, as it will remain available on other IP addresses, but also blocks the content of whichever client is later served from that same reused IP address.
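Both failures can be shown with a small model. The address pool, the customer labels and the reachable_ips helper are all hypothetical and say nothing about how Akamai actually assigns addresses; the addresses come from a range reserved for documentation.

```python
# Hypothetical pool of addresses used by a content delivery network.
pool = ["198.51.100.1", "198.51.100.2", "198.51.100.3"]

# Which customer's content each address is serving right now (made up).
serving = {ip: "sbs-video" for ip in pool}

blocked_ips = {"198.51.100.1"}  # the one address named in the complaint

def reachable_ips(content: str) -> list:
    """Addresses still serving the content that the filter does not block."""
    return [ip for ip in pool if serving[ip] == content and ip not in blocked_ips]

still_up = reachable_ips("sbs-video")

# Later the network recycles the blocked address for a different customer.
serving["198.51.100.1"] = "other-client"
overblocked = [serving[ip] for ip in blocked_ips]

print(still_up)     # ['198.51.100.2', '198.51.100.3']
print(overblocked)  # ['other-client']
```

The video stays reachable on two addresses, while the block now lands on a client who had nothing to do with the complaint.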

So this is a fairly lengthy list of reasons why it can't work, and we've only just scratched the surface. There are many more issues and many more workarounds available to both users and content providers. (And that is without even exploring non-technical issues such as censorship and freedom of speech.) Even Enex Labs' commissioned report on this issue to the government listed 37 different methods by which such a filter could be bypassed.

Industry experts (such as SAGE-AU members) are all saying the same thing: that legislating to force ISPs to perform such filtering is a costly exercise in futility.

Andy Leyden serves on the national executive of the System Administrators Guild of Australia. A programmer and system administrator, he works in and around the web every day, seeing the medium as an opportunity - not as a threat.

[Fight The Filter]


Comments

    well yeah... I'm pretty sure the current web filter plan couldn't even block the entirety of gizmodo.

    What tech does the Australian filter use?

    The NZ filter doesn't work that way - the claim is that it can block by domain name, domain name + path or full URL.

    If you're interested, here's the FAQ we wrote about it. http://techliberty.org.nz/issues/internet-filtering/filtering-technical-faq/

      Senator Conroy's May 2010 response to Senator Ludlam refers specifically to a requirement for ISPs to filter 'a defined list of URLs' only. http://parlinfo.aph.gov.au/parlInfo/genpdf/chamber/hansards/2010-05-11/0165/hansard_frag.pdf
      The Enex Labs pilot also only investigated a defined list of URLs (although the report acknowledges that other jurisdictions such as NZ are choosing to block entire domains).

      The technology used in the Enex pilot is not specified, but their methods are described in the report: http://www.dbcde.gov.au/__data/assets/pdf_file/0004/123862/Enex_Testlab_report_into_ISP-level_filtering_-_Full_report_-_Low_res.pdf

    "costly exercise in futility"

    That sounds like the Australian government: spend lots of money, make a big show and tell about it, achieve absolutely nothing.

      ...and end up with fires in your attic

        Or nomads land rights...

      or spams and scams coming through "the" portal.

    Excellent article. Very interesting read. The expense of setting up such a filter must be immense, and knowing it is so unworkable and hence will ultimately fail makes it all the more infuriating when you think of all the worthwhile things our taxpayers' dollars could be spent on instead.

    This is just depressing. Because this filter is just so bad, and easy to work around, and a giant waste of money. God I hate politics.

    Where is the source that says the Aus filter cannot block at the domain level?
    I've heard otherwise: that the filter could work at the TLD, domain level or sub-domain level (.com, gizmodo.com.au or sub-domain.gizmodo.com.au). Is this no longer the case?
    Is the filter also incapable of sub-string matching? (Of course this would be slower, so perhaps it is out.)
    Note that full domain matching would not require sub-string matching (it's in the HTTP request).

      I wasn't intending on implying that blocking at domain level and subdomain level wasn't possible (with all of the associated overblocking that would entail). Only that the proposed requirement as voiced by Senator Conroy is full URLs only.

      However please don't get caught up on this detail. I used that example as an easy to understand insight on the simplicity of bypassing the filter.

      There really are plenty of methods to circumvent the proposed filter as the proposal currently stands (and no penalties for doing so). The point I was making is that this is a very expensive project that would not achieve its desired aims.

    Considering 'protect the children' is the war-cry being used to push this, it's telling that input from Sys Admins working in education hasn't been sought.

    We've been fighting this battle for decades now. We were on the front lines well before this stupid plan was a twinkle in the eye of whoever is ultimately responsible for it.

    Why, then, were the people whose life and career is all about blocking and filtering technologies and child safety not accessed as a resource?

    Simple answer is; because we'd all say the same thing "It doesn't work. We've tried it. It doesn't work." Which isn't an answer that fits with the agenda driving this crap.

      " ‘protect the children’ is the war-cry "

      It is the excuse, not the whole reason.
      Everyone hears filter and assumes it's to block porn from children and that is why everyone is behind it. The FACT is that it is NOT a porn filter, it is simply the government choosing what you can view on the internet.

      This government is not going to block sites like Facebook or YouTube but there is no guarantee that the next one won't feel differently. The way it is set up, they can make that change whenever they wish without any open process to know why. Is this what Australia wants?

        "The state must declare the child to be the most precious treasure of the people. As long as the government is perceived as working for the benefit of the children, the people will happily endure almost any curtailment of liberty and almost any deprivation."

    Great article!

    Andy, there are several ISP/telco grade URL filter outfits that resolved these tactics long ago in their URL filters, and are effective against them. They have taught the filters to recognise the useful part of a URL with superfluous addon characters, resolve IP / hex / whatever rewrites, etc.

    You will always find products that cannot handle these tactics, however these are not the technologies that a halfway smart ISP or telco implements, irrespective of what was tested in the ENEX managed ISP tests.

    The ENEX managed ISP tests were not a test or trial in order to select filtering products for ISPs on behalf of the DBCDE, they were tests for specific functionality and performance metrics, and URL deformation was not one of them.

    Anyone relying on the results of these ISP based tests to define the capabilities or limits of modern, telco grade URL filters will be seriously in error and will be underestimating what is available and being used by telcos today, and what these can achieve. In order to be accurate, you would need to research and analyse systems typically being used for such purposes in various service providers internationally.

    So this is gonna be like the Chinese olympics all over again.
    We are going to look so stupid.

    are you serious? HTTP url based filtering? I seem to have lost faith in the australian government and their technical advisors after reading this article.

    God bless us.

    Lol! I'm sure somewhere someone would be so determined to hack their filtering servers if this goes into place. It would be epic if someone blocked all sites ending with ".gov.au" for one day.

    The problem is, we are not going to have real independent ISPs for much longer. Australia has committed to delivering the National Broadband Network (NBN) **everywhere** in Australia. The "isp" will simply be an "international transit" pipe being resold. So, in short, the government will have no problems, for all practical purposes, routing any byte anywhere they like (or denying them all). They've practiced it before, in Victorian Schools - the VicONE network is a virtual carbon copy of the technique of "deliver private network, ISP can resell a proxy server, and pay for the privilege of connecting to the network" ... It's a big-nasty-non-internet. And unfortunately too many people will accept it as normal and reasonable -- who could want to do something obscure like, I don't know, transmit an IP packet end-end without a proxy in the middle?
    (lots of ISPs already have transparent proxies on their customers, and they're a total PITA. TPG, I'm looking at you)

    If so-called 'IT Experts' have been recommending filtering by URL to our Government, then the Australian IT industry is completely run by idiots.

    That's what you get for voting for Labor... perhaps Australia has learnt its lesson? Well, you would have thought so after that drunk Hawke was elected.

    we should be ashamed of this filter, but not because it's technically stupid and infeasible...

    if you *know* a web address (such that you can put that IP or domain or full URL into an internet filter system) then you can make somebody or some company or even some country accountable for that web address..

    if the government were serious about stopping this from happening in the first place, why not stop it at the source? blocking the sites is sticking your head in the sand..

    oh and the government has a list of websites with inappropriate content? i *hope* that that list does not go *viral*....

    but ignoring for the moment the very real possibility that this list could end up being a child porn collectors' dream and could end up triggering a world wide epidemic of kiddy porn traffic (imagine if a list of a million child porn sites was leaked? wasn't the test list leaked already?), but besides that, what are we doing blocking those web sites?

    you know the addresses, so you should be going after the companies that own the IP addresses and getting them to prosecute or disclose who the people are publishing the material!!

    if the company who owns that address space won't reveal the owner of the web site when it can be clearly demonstrated that it is hosting inappropriate content, then a court order can fix that, can it not?

    if international boundaries are a problem, then involve the government of the hosting country, if said government won't co-operate or their laws don't adequately cater for prosecuting inappropriate internet content and/or exploitation of children etc, then cut diplomatic ties with that country or block every web site hosted in that country or with that company, or drop the routes completely.

    are we serious or not?

    as it stands, even if this proposed filter *did* work and it *did* block 100% of every bit of child porn on the net in some magical way, it still doesn't mean children have been made safe

    international relations on this matter need to improve if it is quicker to add your bad sites to the filter list than it is to get those sites shut down

    Why go to that much effort to prevent it being seen when those resources could go to stopping the people making and hosting it in the first place?

    oh, and if the people actually doing it are caught, you might actually discover the other thousands of web sites they operate or visit that you did not have on your filter, and you might find all the other mechanisms other than via the web that they used to spread their "material" ..

    oh and you might just get a list of *other* people they are sharing "material" with ... ??

    And bottom line is, the people making the "porn" are the people doing the abusing, they don't need to market their "material" to anybody? how does a "filter" stop it from actually happening?
