ImperialViolet

CRIME (21 Sep 2012)

Last year I happened to worry on the SPDY mailing list about whether sensitive information could be obtained via SPDY's use of zlib for compressing headers. Sadly, I never got the time to follow up and find out whether it was a viable attack. Thankfully there exist security researchers who, independently, wondered the same thing and did the work for me! Today Duong and Rizzo presented that work at ekoparty 2012.

They were also kind enough to let Firefox and ourselves know ahead of time so that we could develop and push security fixes before the public presentation. In order to explain what we did, let's start by looking at how SPDY compressed headers:

(This is inline SVG, if you can't see it, check here.)

? ? ? ? ? ? ? ? : h o s t ? ? ? ? w w w . g o o g l e . c o m ? ? ? ? : m e t h o d ? ? ? ? G E T ? ? ? ? : p a t h ? ? ? ? / ? ? ? ? : s c h e m e ? ? ? ? h t t p s ? ? ? ? : v e r s i o n ? ? ? ? H T T P / 1 . 1 ? ? ? ? a c c e p t ? ? ? ? t e x t / h t m l , a p p l i c a t i o n / x h t m l + x m l , a p p l i c a t i o n / x m l ; q = 0 . 9 , * / * ; q = 0 . 8 ? ? ? ? a c c e p t - c h a r s e t ? ? ? ? I S O - 8 8 5 9 - 1 , u t f - 8 ; q = 0 . 7 , * ; q = 0 . 3 ? ? ? ? a c c e p t - e n c o d i n g ? ? ? ? g z i p , d e f l a t e , s d c h ? ? ? ? a c c e p t - l a n g u a g e ? ? ? ? e n - U S , e n ; q = 0 . 8 ? ? ? ? c o o k i e ? ? ? ? P R E F = I D = 7 0 c b b f e 7 e e 8 8 a 5 e 2 : F F = 0 : T M = 1 3 4 7 4 0 0 4 7 9 : L M = 1 7 8 7 4 1 4 3 9 5 : S = f J q m r k s _ 8 G h 3 i 1 K D ; N I D = 6 3 = k 9 f K d q _ 2 g u F 0 O 1 F 5 g V 6 K 3 C I t F b x d d 2 f D W L g x T H f a Q 3 5 P q 4 S D d d d A i H F 9 G G 9 6 2 3 A J v - W U b U p A h 8 _ 0 Y Z T 6 B H Q N 4 f A B 2 j O 3 _ Y 5 q H 7 w x e d A N v I n J 5 R r h l s i i h M Y q e - s u 1 U 1 O ? ? ? ? u s e r - a g e n t ? ? ? g M o z i l l a / 5 . 0 ( X 1 1 ; L i n u x x 8 6 _ 6 4 ) A p p l e W e b K i t / 5 3 7 . 1 0 ( K H T M L , l i k e G e c k o ) C h r o m e / 2 3 . 0 . 1 2 6 4 . 0 S a f a r i / 5 3 7 . 1 0 ? ? ? ? x - c h r o m e - v a r i a t i o n s ? ? ? 0 C M + 1 y Q E I l b b J A Q i d t s k B C K S 2 y Q E I p 7 b J A Q i 9 t s k B C L u D y g E = ? ? ? ? ? ? ? ? : h o s t ? ? ? ? w w w . g o o g l e . c o m ? ? ? ? : m e t h o d ? ? ? ? G E T ? ? ? ? : p a t h ? ? ? ? / c s i ? v = 3 ? s = w e b h p ? a c t i o n = ? s r t = 2 4 6 ? p = s ? n p n = 1 ? e = 1 7 2 5 9 , 2 3 6 2 8 , 2 3 6 7 0 , 3 2 6 9 0 , 3 5 7 0 4 , 3 7 1 0 2 , 3 8 0 3 4 , 3 8 4 4 9 , 3 8 4 6 6 , 3 9 1 5 4 , 3 9 3 3 2 , 3 9 5 2 3 , 3 9 9 7 8 , 4 0 1 9 5 , 4 0 3 3 3 , 3 3 0 0 0 4 7 , 3 3 0 0 1 1 7 , 3 3 0 0 1 2 5 , 3 3 0 0 1 3 2 , 3 3 0 0 1 3 5 , 3 3 0 0 1 5 7 , 3 3 1 0 0 1 1 , 4 0 0 0 1 1 6 , 4 0 0 0 2 6 0 , 4 0 0 0 2 6 7 , 4 0 0 0 2 7 8 , 4 0 0 0 3 0 8 , 4 0 0 0 3 5 2 , 4 0 0 0 3 5 4 , 4 0 0 0 4 7 2 , 4 0 0 0 4 7 6 , 4 0 0 0 5 1 6 , 4 0 0 0 5 1 9 , 4 0 0 0 5 5 3 , 4 0 0 0 5 9 3 , 4 0 0 0 6 0 5 , 4 0 0 0 6 1 6 , 4 0 0 0 7 6 2 , 4 0 0 0 8 2 5 , 4 0 0 0 8 3 7 , 4 0 0 0 8 4 1 , 4 0 0 0 8 4 9 ? e i = Y r N P U N P t H 8 T C g A f Y 7 I C A C g ? i m c = 2 ? i m n = 2 ? i m p = 2 ? r t =

That's a pretty busy diagram! But I don't think it's too bad with a bit of explanation:

zlib uses a language with basically two statements: “output these literal bytes” and “go back x bytes and duplicate y bytes from there”. In the diagram, red text was included literally and black text came from duplicating previous text.

The duplicated text is underlined. A dark blue underline means that the original text is in the diagram and there will be a gray line pointing to where it came from. (You can hover the mouse over one of those lines to make it darker.)

A light blue underline means that the original text came from a pre-shared dictionary of strings. SPDY defines some common text, for zlib to be able to refer to, that contains strings that we expect to find in the headers. This is most useful at the beginning of compression when there wouldn't otherwise be any text to refer back to.

The problem that CRIME highlights is that sensitive cookie data and an attacker controlled path is compressed together in the same context. Cookie data makes up most of the red, uncompressed bytes in the diagram. If the path contains some cookie data, then the compressed headers will be shorter because zlib will be able to refer back to the path, rather than have to output all the literal bytes of the cookie. If you arrange things so that you can probe the contents of the cookie incrementally, then (assuming that the cookie is base64), you can extract the cookie byte-by-byte by inducing the browser to make requests.

For details of how to get zlib to reveal that information in practice, I'll just refer you to Duong and Rizzo's CRIME presentation. It's good work.

In order to carry out this attack, the attacker needs to be able to observe your network traffic and to be able to cause many arbitrary requests to be sent. An active network attacker can do both by injecting Javascript into any HTTP page load that you make in the same session.

When we learned of this work, we were already in the process of designing the compression for SPDY/4, which avoids this problem. But we still needed to do something about SPDY/2 and SPDY/3 which are currently deployed. To that end, Chrome 21 and Firefox 15 have switched off SPDY header compression because that's a minimal change that easily backports.

Chrome has also switched off TLS compression, through which a very similar attack can be mounted.

But we like SPDY header compression because it saves significant amounts of data on the wire! Since SPDY/4 isn't ready to go yet we have a more complex solution for Chrome 22/23 that compresses data separately while still being backwards compatible.

Most importantly cookie data will only ever be duplicated exactly, and in its entirety, against other cookie data. Each cookie will also be placed in its own Huffman group (Huffman coding is a zlib detail that I skipped over in the explanation above). Finally, in case other headers contain sensitive data (i.e. when set by an XMLHttpRequest), non-standard headers will be compressed in their own Huffman group without any back references.

That's only a brief overview of the rules. The code to follow them and continue to produce a valid zlib stream wasn't one of the cleaner patches ever landed in Chrome and I'll be happy to revert it when SPDY/4 is ready. But it's effective at getting much of the benefit of compression back.

To the right are a couple of images of the same sort of diagram as above, but zoomed out. At this level of zoom, all you can really see are the blocks of red (literal) and blue (duplicated) bytes. The diagram on the right has the new rules enabled and, as you can see, there is certainly more red in there. However that's mostly the result of limited window size. In order to save on server memory, Chrome only uses 2048-byte compression windows and, under the new rules, a previous cookie value has to fit completely within the window in order to be matched. So things are a little less efficient until SPDY/4, although we might choose to trade a little more memory to make up for that.