Revocation still doesn't work (29 Apr 2014)
I was hoping to be done with revocation for a bit, but sadly not.
GRC have published a breathless piece attacking a straw man argument: “Google tells us ... that Chrome's unique CRLSet solution provides all the protection we need”. I call it a straw man because I need only quote my own posts that GRC have read to show it.
The original CRLSets announcement contained two points. Firstly that online revocation checking doesn't work and that we were switching it off in Chrome in most cases. Secondly, that we would be using CRLSets to avoid having to do full binary updates in order to respond to emergency incidents.
In the last two paragraphs I mentioned something else. Quoting myself:
“Since we're pushing a list of revoked certificates anyway, we would like to invite CAs to contribute their revoked certificates (CRLs) to the list. We have to be mindful of size, but the vast majority of revocations happen for purely administrative reasons and can be excluded. So, if we can get the details of the more important revocations, we can improve user security. Our criteria for including revocations are:
- The CRL must be crawlable: we must be able to fetch it over HTTP and robots.txt must not exclude GoogleBot.
- The CRL must be valid by RFC 5280 and none of the serial numbers may be negative.
- CRLs that cover EV certificates are taken in preference, while still considering point (4).
- CRLs that include revocation reasons can be filtered to take less space and are preferred.”
In short, since we were pushing revocations anyway, maybe we could get some extra benefit from it. It would be better than nothing, which is what browsers otherwise have with soft-fail revocation checking.
I mentioned it again, last week (emphasis added):
“We compile daily lists of some high-value revocations and use Chrome's auto-update mechanism to push them to Chrome installations. It's called the CRLSet and it's not complete, nor big enough to cope with large numbers of revocations, but it allows us to react quickly to situations like Diginotar and ANSSI. It's certainly not perfect, but it's more than many other browsers do.”
“The original hope with CRLSets was that we could get revocations categorised into important and administrative and push only the important ones. (Administrative revocations occur when a certificate is changed with no reason to suspect that any key compromise occurred.) Sadly, that mostly hasn't happened.”
And yet, GRC managed to write pages (including cartoons!) exposing the fact that it doesn't cover many revocations and attacking Chrome for it.
They also claim that soft-fail revocation checking is effective:
The claim is that a user will download and cache a CRL while not being attacked and then be protected from a local attacker using a certificate that was included on that CRL. (I'm paraphrasing; you can search for “A typical Internet user” in their article for the original.)
There are two protocols used in online revocation checking: OCSP and CRL. The problem is that OCSP only covers a single certificate and OCSP is used in preference because it's so much smaller and thus removes the need to download CRLs. So the user isn't expected to download and cache the CRL anyway. So that doesn't work.
However, it's clear that someone is downloading CRLs because Cloudflare are spending half a million dollars a month to serve CRLs. Possibly it's non-browser clients but the bulk is probably CAPI (the library that IE, Chrome and other software on Windows typically uses - although not Firefox). A very obscure fact about CAPI is that it will download a CRL when it has 50 or more OCSP responses cached for a given CA certificate. But CAs know this and split their CRLs so that they don't get hit with huge bandwidth bills. But a split CRL renders the claimed protection from caching ineffective, because the old certificate for a given site is probably in a different part.
So I think the claim is that doing blocking OCSP lookups is a good idea because, if you use CAPI on Windows, then you might cache 50 OCSP responses for a given CA certificate. Then you'll download and cache a CRL for a while and then, depending on whether the CA splits their CRLs, you might have some revocations cached for a site that you visit.
It seems that argument is actually for something like CRLSets (download and cache revocations) over online checking, it's just a Rube Goldberg machine to, perhaps, implement it!
So, once again, revocation doesn't work. It doesn't work in other browsers and CRLSets aren't going to cover much in Chrome, as I've always said. GRC's conclusions follow from those errors and end up predictably far from the mark.
In order to make this post a little less depressing, let's consider whether its reasonable to aim to cover everything with CRLSets. (Which I mentioned before as well, but I'll omit the quote.) GRC quote numbers from Netcraft claiming 2.85 million revocations, although some are from certificate authorities not trusted by mainstream web browsers. I spent a few minutes harvesting CRLs from the CT log. This only includes certificates that are trusted by reasonable number of people and I only waited for half the log to download. From that half I threw in the CRLs included by CRLSets and got 2356 URLs after discarding LDAP ones.
I tried downloading them and got 2164 files. I parsed them and eliminated duplicates and my quick-and-dirty parser skipped quite a few. None the less, this very casual search found 1,062 issuers and 4.2 million revocations. If they were all dropped into the CRLSet, it would take 39MB.
So that's a ballpark figure, but we need to design for something that will last a few years. I didn't find figures on the growth of HTTPS sites, but Netcraft say that the number of web sites overall is growing at 37% a year. It seems reasonable that the number of HTTPS sites, thus certificates, thus revocations will double a couple of times in the next five years.
There's also the fact that if we subsidise revocation with the bandwidth and computers of users, then demand for revocation will likely increase. Symantec, I believe, have experimented with certificates that are issued for the maximum time (five years at the moment) but sold in single year increments. When it comes to the end of the year, they'll remind you to renew. If you do, you don't need to do anything, otherwise the certificate gets revoked. This seems like a good deal for the CA so, if we make revocation really effective, I'd expect to see a lot more of it. Perhaps factor in another doubling for that.
So we would need to scale to ~35M revocations in five years, which is ~300MB of raw data. That would need to be pushed to all users and delta updated. (I don't have numbers for the daily delta size.)
That would obviously take a lot of bandwidth and a lot of resources on the client. We would really need to aim at mobile as well, because that is likely to be the dominant HTTPS client in a few years, making storage and bandwidth use even more expensive.
Some probabilistic data structure might help. I've written about that before. Then you have to worry about false positives and the number of reads needed per validation. Phones might use Flash for storage, but it's remarkably slow.
Also, would you rather spend those resources on revocation, or on distributing the Safe Browsing set to mobile devices? Or would you spend the memory on improving ASLR on mobile devices? Security is big picture full of trade-offs. It's not clear the revocation warrants all that spending.
So, if you believe that downloading and caching revocation is the way forward, I think those are the parameters that you have to deal with.