ImperialViolet

Changing HTTPS (05 Sep 2010)

At Google we're pretty fanatical about speed. We know that when things run faster people aren't just happier to use them, but that it fundamentally changes the way that people use them. That's why we strive to make Chrome the fastest browser possible and include experiments like SPDY.

As part of this, we've been looking at SSL/TLS, the protocol which secures HTTPS. We love SSL/TLS because of the privacy and security that it gives to our users and, like everything, we want to make it faster.

One of the simplest changes that we're experimenting with is called False Start. It can shave a round trip from the setup time of many HTTPS connections. A round trip varies depending on how far away the webserver is from the client. Crossing the continental US and back takes about 70ms. Going from the west coast to Europe and back takes about 150ms.

That might not seem like very much. But these costs are multiplied when loading a complex site. If it takes an extra 100ms to start fetching the contents of the page, then that's 100ms until the browser discovers resources on other websites that will be needed to render its contents. If loading those resources reveals more dependents then they are delayed by three round trips.

And this change disproportionately benefits smaller websites (who aren't multihomed around the world) and mobile users or areas with poorer Internet connectivity (who have longer round trip times).

Most attractively, this change can be made unilaterally. Browsers can implement False Start and benefit without having to update webservers.

However, we are aware that this change will cause issues with about 0.05% of websites on the Internet. There are a number of possible responses to this:

The most common would be to admit defeat. Rightly or wrongly, users assign blame to the last thing to change. Thus, no matter how grievous or damaging their behaviour, anything that worked at some point in the past is regarded as blameless for any future problem. As the Internet becomes larger and more diverse, more and more devices are attached that improperly implement Internet protocols. This means that any practice that isn't common is likely to be non-functional for a significant number of users.

In light of this, many efforts which try to make lower level changes fail. Others move on top of common protocols in the hope that bugs in the network don't reach that high.

Chrome still carries an idealism that means that we're going to try to make low level changes and try to make them work.

The second common response to these problems is to work around them. This means detecting when something has gone wrong and falling back to the old behaviour. In the specific case of False Start, detecting the failure is problematic. However, even if it were not, then we still wouldn't want to work around the issue because we think that it's damaging to the Internet in its own way.

Fallbacks allow problematic sites to continue to function and stems the flow of angry users. That alone makes it very attractive to developers. However, it also removes any motivation for the sites to change. In most cases, it means that sites will never even know about the issue. Fallbacks do nothing to move the world towards a position where the change in question functions correctly.

Additionally, fallbacks add more unwritten rules and complexity. As an example, take the change from SSLv3 to TLS 1.0. This was a change to the same SSL/TLS protocol that False Start changes and it was finialised as a standard nearly 12 years ago. TLS 1.0 was designed so that browsers could talk to older SSLv3 websites without issues. The only slight problem was that some webservers needed to be updated to ignore some extra data that TLS 1.0 clients would send. A very minor change.

In order to make the transition smoother, browsers added a fallback from TLS to SSLv3. The Internet in 1999 was a much smaller and more flexible place. It was assumed that the problematic webservers could be fixed in a few years and the fallback could be removed.

Twelve years later, the fallback is in robust health and still adding complexity. A security update to TLS earlier this year was made much more complex by the need to account for SSLv3 fallback. The operators of the problematic webservers are largely unaware of the problems that they are causing and have no incentive to change in any case. Meanwhile, the cost of SSLv3 fallback continues to accumulate.

Because of these problems, we're going to try a new approach with False Start: blacklists.

Chrome (on the dev channel at first) will contain a list of the 0.05% of problematic sites and we won't use False Start with them. We're generating this list by probing a large list of websites although, of course, we'll miss some which we'll have to address from bug reports.

Blacklisting gives us two advantages. Firstly, it limits the accumulation of new problematic websites. Sites which have never worked are a very different case from sites which used to work.

Secondly, we can contact the problematic sites in question. We already have a good idea of where the problem lies with many of them and we're in contact with the stakeholders to plan a way forward.

Blacklists require effort to maintain and we'll have to be responsive to make it work. But, with our near-weekly dev channel and even more frequently updated Canary channel, we think that we can do it. In the end, success will be measured by whether we manage to make the Internet a safe place to implement False Start and by how much we manage to reduce the blacklist over time.