This isn’t even remotely funny. Every major search crawler sends a different Accept-Encoding header, which makes its requests bypass the cache and always hit the backend. It is easy to hack Squid to disregard spaces between options (IE puts them in the header: gzip, deflate, while Mozilla does not: gzip,deflate), but some of these make caching hell:
- msnbot:
Accept-Encoding: identity;q=1.0
- googlebot:
Accept-Encoding: gzip
- yahoo (slurp):
Accept-Encoding: gzip, x-gzip
Add Opera with its Accept-Encoding: deflate, gzip, x-gzip, identity, *;q=0 and KHTML with Accept-Encoding: x-gzip, x-deflate, gzip, deflate, and you get a hell where bold normalization solutions have to be applied. I guess we just have to treat it as a single-bit ‘gzip’ vs. ‘plain’ difference, and screw everything else.
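To make the single-bit idea concrete, here is a rough sketch in Python of what such a normalization could look like. It is not the Squid patch, just an illustration: the function name `normalize_accept_encoding` and the bucket labels are made up for this example, and treating x-gzip as gzip and honoring q=0 are my own assumptions about what a sane rule would do.

```python
def normalize_accept_encoding(header):
    """Collapse an Accept-Encoding value into one of two cache buckets.

    Hypothetical helper for illustration: returns "gzip" if the client
    accepts gzip (or its alias x-gzip) with a non-zero q-value, and
    "plain" for everything else.
    """
    if not header:
        return "plain"
    for item in header.split(","):
        # Whitespace around options is ignored, so "gzip, deflate"
        # and "gzip,deflate" land in the same bucket.
        parts = item.strip().lower().split(";")
        coding = parts[0].strip()
        if coding not in ("gzip", "x-gzip"):
            continue
        q = 1.0  # default quality when no q parameter is given
        for param in parts[1:]:
            name, _, value = param.strip().partition("=")
            if name.strip() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0  # assumption: unparsable q means "not accepted"
        if q > 0:
            return "gzip"
    return "plain"
```

With this, all of the headers listed above fall into just two cache variants: msnbot ends up in ‘plain’, everyone else in ‘gzip’.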
Update: squid patch :)