ImperialViolet

Maybe Skip SHA-3 (31 May 2017)

In 2005 and 2006, a series of significant results was published against SHA-1 [1][2][3]. These repeated breakthroughs caused something of a crisis of faith as cryptographers questioned whether we knew how to build hash functions at all. After all, many hash functions from the 1990s had not aged well [1][2].

In the wake of this, NIST announced (PDF) a competition to develop SHA-3 in order to hedge the risk of SHA-2 falling. In 2012, Keccak (pronounced “ket-chak”, I believe) won (PDF) and became SHA-3. But the competition itself proved that we do know how to build hash functions: the series of results in 2005 didn't extend to SHA-2 and the SHA-3 process produced a number of hash functions, all of which are secure as far as we can tell. Thus, by the time it existed, it was no longer clear that SHA-3 was needed. Yet there is a natural tendency to assume that SHA-3 must be better than SHA-2 because the number is bigger.

As I've mentioned before, diversity of cryptographic primitives is expensive. It contributes to the exponential number of combinations that need to be tested and hardened; it draws on limited developer resources as multiple platforms typically need separate, optimised code; and it contributes to code-size, which is a worry again in the mobile age. SHA-3 is also slow: it is even slower than SHA-2, which is already a comparative laggard amongst crypto primitives.

SHA-3 did introduce something useful: extendable output functions (XOFs), in the form of the SHAKE algorithms. In an XOF, input is hashed and then an (effectively) unlimited amount of output can be produced from it. It's convenient, although the same effect can be produced for a limited amount of output using HKDF, or by hashing to a key and running ChaCha20 or AES-CTR.
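To make the comparison concrete, here is a minimal Python sketch of both approaches using only the standard library. The function names `shake_output` and `hkdf_sha256` are made up for this example. The HKDF code follows the extract-and-expand construction described in RFC 5869, but treat it as an illustration under those assumptions, not a vetted implementation.

```python
import hashlib
import hmac


def shake_output(data: bytes, length: int) -> bytes:
    """Hash `data` with SHAKE256 and squeeze out `length` bytes."""
    return hashlib.shake_256(data).digest(length)


def hkdf_sha256(ikm: bytes, length: int, salt: bytes = b"", info: bytes = b"") -> bytes:
    """Derive `length` bytes from `ikm` with HKDF over HMAC-SHA-256."""
    hash_len = hashlib.sha256().digest_size
    # HKDF output is capped at 255 blocks: this is the "limited amount".
    if length > 255 * hash_len:
        raise ValueError("HKDF-SHA-256 can produce at most 8160 bytes")
    if not salt:
        salt = bytes(hash_len)  # default salt is a block of zero bytes

    # Extract: condense the input keying material into a pseudorandom key.
    prk = hmac.new(salt, ikm, hashlib.sha256).digest()

    # Expand: chain HMAC calls, each covering the previous block,
    # the context string and a one-byte counter.
    okm = b""
    block = b""
    counter = 1
    while len(okm) < length:
        block = hmac.new(prk, block + info + bytes([counter]), hashlib.sha256).digest()
        okm += block
        counter += 1
    return okm[:length]


if __name__ == "__main__":
    seed = b"some input to expand"
    print(shake_output(seed, 64).hex())
    print(hkdf_sha256(seed, 64, info=b"example context").hex())
```

In both cases, asking for a shorter output yields a prefix of the longer one, so callers who need independent outputs should vary the input (or the `info` string for HKDF) instead of the length.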

Thus I believe that SHA-3 should probably not be used. It offers no compelling advantage over SHA-2 and brings many costs. The only argument that I can credit is that it's nice to have a backup hash function, but both SHA-256 and SHA-512 are commonly supported and have different cores. So we already have two secure hash functions deployed and I don't think we need another.

BLAKE2 is another new, secure hash function, but it at least offers much improved speed over SHA-2. Speed is important. Not only does it mean less CPU time spent on cryptography, it means that cryptography can be economically deployed in places where it couldn't be before. BLAKE2, however, has too many versions: eight at the current count (BLAKE2(X)?[sb](p)?). In response to complaints about speed, the Keccak team now have KangarooTwelve and MarsupilamiFourteen, which have a vector-based design for better performance. (Although a vector-based design can also be used to speed up SHA-2.)
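If you want to check the speed claims on your own machine, a rough timing sketch along these lines will do. The helper name `throughput_mb_per_s` and the buffer sizes are arbitrary choices for this example, and Python's `hashlib` measures whatever implementation your build links against, so the numbers indicate relative speed at best.

```python
import hashlib
import time


def throughput_mb_per_s(name: str, size: int = 1 << 20, rounds: int = 20) -> float:
    """Hash `rounds` buffers of `size` zero bytes and return MB/s."""
    constructor = getattr(hashlib, name)  # e.g. hashlib.sha256
    data = bytes(size)
    start = time.perf_counter()
    for _ in range(rounds):
        constructor(data).digest()
    elapsed = time.perf_counter() - start
    return (size * rounds) / elapsed / 1e6


if __name__ == "__main__":
    for name in ("sha256", "sha512", "sha3_256", "blake2b", "blake2s"):
        print(f"{name:10s} {throughput_mb_per_s(name):10.1f} MB/s")
```

Results vary a lot with CPU features such as SHA extensions and vector units, and with message size, so it is worth measuring with inputs that resemble your real workload.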

So there are some interesting prospects for a future, faster replacement for SHA-2. But SHA-3 itself isn't one of them.

Update: two points came up in discussion about this. Firstly, what about length-extension? SHA-2 has the property that simply hashing a secret with some data is not a secure MAC construction, which is why we have HMAC. SHA-3 does not have this problem.
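The difference can be sketched in a few lines of Python. The function names here are hypothetical, and the SHA-3 variant assumes a fixed-length key so that the boundary between key and message is unambiguous. The sketch only shows the shape of each construction; it does not mount the attack.

```python
import hashlib
import hmac


def naive_prefix_mac(key: bytes, msg: bytes) -> bytes:
    """hash(key || msg) with SHA-256: NOT a secure MAC.

    The digest is the hash's full internal state, so someone who sees
    the tag can extend the message without knowing the key.
    """
    return hashlib.sha256(key + msg).digest()


def hmac_sha256(key: bytes, msg: bytes) -> bytes:
    """HMAC-SHA-256: the standard fix when using SHA-2."""
    return hmac.new(key, msg, hashlib.sha256).digest()


def sha3_prefix_mac(key: bytes, msg: bytes) -> bytes:
    """hash(key || msg) with SHA3-256, which resists length-extension.

    Assumes `key` always has the same length.
    """
    return hashlib.sha3_256(key + msg).digest()


def verify_tag(key: bytes, msg: bytes, tag: bytes) -> bool:
    """Check an HMAC-SHA-256 tag in constant time."""
    return hmac.compare_digest(hmac_sha256(key, msg), tag)
```

The first function is the mistake people make when they don't know about HMAC; the second is what they should write with SHA-2 today.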

That is an advantage of SHA-3 because it means that people who don't know they need to use HMAC (with SHA-2) won't be caught out by it. Hopefully, in time, we end up with a hash function that resists length-extension. But SHA-512/256, BLAKE2, K12, M14 and all the other SHA-3 candidates already do. In fact, it's implausible that any future hash function wouldn't.

Overall, I don't feel that solving length-extension is a sufficiently pressing concern that we should all invest in SHA-3 now, rather than a hash function that hopefully comes with more advantages. If it is a major concern for you now, try SHA-512/256, which is a member of the SHA-2 family.
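For what it's worth, SHA-512/256 is not simply SHA-512 with the output cut in half: it also uses different initial values, so the two give unrelated digests. In Python it is only reachable through `hashlib.new`, and only when the underlying OpenSSL build provides it under the name `sha512_256`, which is an assumption about your environment. The wrapper name below is made up for this sketch.

```python
import hashlib
from typing import Optional


def sha512_256(data: bytes) -> Optional[bytes]:
    """Return the SHA-512/256 digest, or None if this build lacks it."""
    try:
        return hashlib.new("sha512_256", data).digest()
    except ValueError:
        # hashlib raises ValueError for an unsupported algorithm name.
        return None


if __name__ == "__main__":
    digest = sha512_256(b"abc")
    if digest is None:
        print("sha512_256 is not available in this build")
    else:
        truncated = hashlib.sha512(b"abc").digest()[:32]
        print("SHA-512/256:      ", digest.hex())
        print("truncated SHA-512:", truncated.hex())
```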

The second point was that SHA-3 is just the first step towards a permutation-based future: SHA-3 has an elegant foundation that is suitable for implementing the full range of symmetric algorithms. In the future, a single optimised permutation function could be the basis of hashes, MACs, and AEADs, thus saving code size / die area and complexity. (E.g. STROBE.)

But skipping SHA-3 doesn't preclude any of that. SHA-3 is the hash standard itself, and even the Keccak team appear to be pushing K12 rather than SHA-3 now. It seems unlikely that a full set of primitives built around the Keccak permutation would choose to use the SHA-3 parameters at this point.

Indeed, SHA-3 adoption might inhibit that ecosystem by pushing it towards those bad parameters. (There was a thing about NIST tweaking the parameters at the end of the process if you want some background.)

One might argue that SHA-3 should be supported because you believe that it'll result in hardware implementations of the permutation and you hope that they'll be flexible enough to support what you really want to do with it. I'm not sure that would be the best approach even if your goal was to move to a permutation-based world. Instead I would nail down the whole family of primitives as you would like to see it and try to push small chips, where area is a major concern, to adopt it. Even then, the hash function in the family probably wouldn't be exactly SHA-3, but more like K12.