ChaCha20 and Poly1305 for TLS (07 Oct 2013)

Today, TLS connections predominantly use one of two families of cipher suites: RC4 based or AES-CBC based. However, in recent years both of these families of cipher suites have suffered major problems. TLS's CBC construction was the subject of the BEAST attack (fixed with 1/n-1 record splitting or TLS 1.1) and Lucky13 (fixed with complex implementation tricks). RC4 was found to have key-stream biases that cannot effectively be fixed.

Although AES-CBC is currently believed to be secure when correctly implemented, a correct implementation is so complex that there remains strong motivation to replace it.

Clearly we need something better. An obvious alternative is AES-GCM (AES-GCM is AES in counter mode with a polynomial authenticator over GF(2128)), which is already specified for TLS and has some implementations. Support for it is in the latest versions of OpenSSL and it has been the top preference cipher of Google servers for some time now. Chrome support should be coming in Chrome 31, which is expected in November. (Although we're still fighting to get TLS 1.2 support deployed in Chrome 30 due to buggy servers.)

AES-GCM isn't perfect, however. Firstly, implementing AES and GHASH (the authenticator part of GCM) in software in a way which is fast, secure and has good key agility is very difficult. Both primitives are suited to hardware implementations and good software implementations are worthy of conference papers. The fact that a naive implementation (which is also what's recommended in the standard for GHASH!) leaks timing information is a problem.

AES-GCM also isn't very quick on lower-powered devices such as phones, and phones are now a very important class of device. A standard phone (which is always defined by whatever I happen to have in my pocket; a Galaxy Nexus at the moment) can do AES-128-GCM at only 25MB/s and AES-256-GCM at 20MB/s (both measured with an 8KB block size).

Lastly, if we left things as they are, AES-GCM would be the only good cipher suite in TLS. While there are specifications for AES-CCM and for fixing the AES-CBC construction, they are all AES based and, in the past, having some diversity in cipher suites has proven useful. So we would be looking for an alternative even if AES-GCM were perfect.

In light of this, Google servers and Chrome will soon be supporting cipher suites based around ChaCha20 and Poly1305. These are primitives developed by Dan Bernstein and are fast, secure, have high quality, public domain implementations, are naturally constant time and have nearly perfect key agility.

On the same phone as the AES-GCM speeds were measured, the ChaCha20+Poly1305 cipher suite runs at 92MB/s (which should be compared against the AES-256-GCM speed as ChaCha20 is a 256-bit cipher).

In addition to support in Chrome and on Google's servers, myself and my colleague, Elie Bursztein, are working on patches for NSS and OpenSSL to support this cipher suite. (And I should thank Dan Bernstein, Andrew M, Ted Krovetz and Peter Schwabe for their excellent, public domain implementations of these algorithms. Also Ben Laurie and Wan-Teh Chang for code reviews, suggestions etc.)

But while AES-GCM's hardware orientation is troublesome for software implementations, it's obviously good news for hardware implementations and some systems do have hardware AES-GCM support. Most notably, Intel chips have had such support (which they call AES-NI) since Westmere. Where such support exists, it would be a shame not to use it because it's constant time and very fast (see slide 17). So, once ChaCha20+Poly1305 is running, I hope to have clients change their cipher suite preferences depending on the hardware that they're running on, so that, in cases where both client and server support AES-GCM in hardware, it'll be used.

To wrap all this up, we need to solve a long standing, browser TLS problem: in order to deal with buggy HTTPS servers on the Internet (of which there are many, sadly), browsers will retry failed HTTPS connections with lower TLS version numbers in order to try and find a version that doesn't trigger the problem. As a last attempt, they'll try an SSLv3 connection with no extensions.

Several useful features get jettisoned when this occurs but the important one for security, up until now, has been that elliptic curve support is disabled in SSLv3. For servers that support ECDHE but not DHE that means that a network attacker can trigger version downgrades and remove forward security from a connection. Now that AES-GCM and ChaCha20+Poly1305 are important we have to worry about them too as these cipher suites are only defined for TLS 1.2.

Something needs to be done to fix this so, with Chrome 31, Chrome will no longer downgrade to SSLv3 for Google servers. In this experiment, Google servers are being used as an example of non-buggy servers. The experiment hopes to show that networks are sufficiently transparent that they'll let at least TLS 1.0 through. We know from Chrome's statistics that connections to Google servers do end up getting downgraded to SSLv3 sometimes, but all sorts of random network events can trigger a downgrade. The fear is that there's some common network element that blocks TLS connections by deep-packet inspection, which we'll measure by breaking them and seeing how many bug reports we get.

If that works then, in Chrome 32, no fallbacks will be permitted for Google servers at all. This stage of the experiment tests that the network is transparent to TLS 1.2 by, again, breaking anything that isn't and seeing if it causes bug reports.

If both those experiments work then, great! We can define a way for servers to securely indicate that they don't need version fallback. It would be nice to have used the renegotiation extension for this, but I think that there are likely already too many broken servers that support that, so another SCSV is probably needed.

If we get all of the above working then we're not in too bad a state with respect to TLS cipher suites. At least, once most of the world has upgraded in any case.