By request flood I mean exactly that: sending an insanely high number of requests per unit of time (per second) to the target server to exhaust its resources.
You're right that with HTTP/1.1 we have a single request in flight (or none, in the keep-alive idle state) per connection at any moment. But that doesn't limit the number of simultaneous connections from a single IP address. An attacker could use the whole TCP port space to create 65535 (theoretically) connections to the server and send requests on them in parallel. That is a lot, too. In the pre-HTTP/2 era this could be mitigated by limiting the number of connections per IP address.
In HTTP/2, however, we can have multiple parallel requests on each of multiple parallel connections at any moment, which is many orders of magnitude more than is possible with HTTP/1.x. But the preceding mitigation could still be adapted by applying the limit to the number of requests across all connections per IP address.
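Something like this sketch is what I have in mind (Go, using golang.org/x/time/rate; the perIPRate name and the 100 req/s threshold are made up for illustration, not taken from any real server):

    package main

    import (
        "net"
        "net/http"
        "sync"

        "golang.org/x/time/rate"
    )

    // perIPRate applies one token-bucket limiter per client IP, shared across
    // every connection that IP opens (HTTP/1.1 or HTTP/2 alike).
    type perIPRate struct {
        mu       sync.Mutex
        limiters map[string]*rate.Limiter
    }

    func (p *perIPRate) limiter(ip string) *rate.Limiter {
        p.mu.Lock()
        defer p.mu.Unlock()
        l, ok := p.limiters[ip]
        if !ok {
            l = rate.NewLimiter(100, 200) // 100 req/s, burst 200 -- placeholder numbers
            p.limiters[ip] = l
        }
        return l
    }

    func (p *perIPRate) wrap(next http.Handler) http.Handler {
        return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
            ip, _, err := net.SplitHostPort(r.RemoteAddr)
            if err != nil {
                ip = r.RemoteAddr
            }
            if !p.limiter(ip).Allow() {
                http.Error(w, "too many requests", http.StatusTooManyRequests)
                return
            }
            next.ServeHTTP(w, r)
        })
    }

    func main() {
        p := &perIPRate{limiters: map[string]*rate.Limiter{}}
        http.ListenAndServe(":8080", p.wrap(http.DefaultServeMux))
    }

The catch, as discussed below, is that rapid-reset requests are cancelled almost immediately, so a limit like this has to count requests as the streams are opened (or reset), not only the ones that run to completion.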
I guess such a limit was overlooked in the implementations or in the protocol itself? Or rather, is it more difficult to apply restrictions because the multiplexing happens at L7, entirely in userspace?
Added:
The diagram in the article (the "HTTP/2 Rapid Reset attack" figure) doesn't really explain why this is an attack. In my thinking, as soon as the request is reset, the server resources are expected to be freed, thus not causing their exhaustion. I think this should be possible in modern async servers.
> But that doesn't limit the number of simultaneous connections from a single IP address.
Opening new connections is relatively expensive compared to sending data on an existing connection.
> In my thinking, as soon as the request is reset, the server resources are expected to be freed,
You can't claw back the CPU resources that have already been spent on processing the request before it was cancelled.
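A hypothetical handler makes this concrete: even a fully async, cancellation-aware handler only notices the reset at its next check, so everything it did before that point is already paid for (expensiveSetup below is a stand-in, not real server code):

    package main

    import (
        "net/http"
        "time"
    )

    // expensiveSetup stands in for the work a server does before it ever looks
    // at cancellation: header parsing, auth, logging, dispatching a backend call.
    func expensiveSetup() []byte {
        time.Sleep(5 * time.Millisecond) // placeholder for real CPU work
        return []byte("ok")
    }

    func handler(w http.ResponseWriter, r *http.Request) {
        body := expensiveSetup() // this cost is sunk even if the stream was already reset

        select {
        case <-r.Context().Done():
            // The RST_STREAM is only noticed here; the cycles above are not recovered.
            return
        default:
        }

        w.Write(body)
    }

    func main() {
        http.ListenAndServe(":8080", http.HandlerFunc(handler))
    }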
> By request flood I mean exactly that: sending an insanely high number of requests per unit of time (per second) to the target server to exhaust its resources.
Right. And how do you send an insanely high number of requests? What if you could send more?
Imagine the largest attack you could do by "sending an insanely high number of requests" with HTTP/1.1 with a given set of machine and network resources. With H/2 multiplexing you could do 100x that. With this attack, another 10x on top of that.
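The extra 10x comes from the frame pattern itself: HEADERS opens a stream and an immediate RST_STREAM closes it from the client's side, so SETTINGS_MAX_CONCURRENT_STREAMS never really binds, while the server still starts work on every request. Roughly this, sketched with golang.org/x/net/http2's Framer against a local test server you control (the address and the handful of stream IDs are placeholders):

    package main

    import (
        "bytes"
        "crypto/tls"
        "io"

        "golang.org/x/net/http2"
        "golang.org/x/net/http2/hpack"
    )

    func main() {
        // Talk h2 to a *local* test server; InsecureSkipVerify is only
        // acceptable because this is a throwaway local experiment.
        conn, err := tls.Dial("tcp", "127.0.0.1:8443", &tls.Config{
            NextProtos:         []string{"h2"},
            InsecureSkipVerify: true,
        })
        if err != nil {
            panic(err)
        }
        defer conn.Close()

        io.WriteString(conn, http2.ClientPreface)
        fr := http2.NewFramer(conn, conn)
        fr.WriteSettings()

        // Encode one request's pseudo-headers; the same block is reused below.
        var buf bytes.Buffer
        enc := hpack.NewEncoder(&buf)
        for _, hf := range []hpack.HeaderField{
            {Name: ":method", Value: "GET"},
            {Name: ":path", Value: "/"},
            {Name: ":scheme", Value: "https"},
            {Name: ":authority", Value: "127.0.0.1:8443"},
        } {
            enc.WriteField(hf)
        }
        block := buf.Bytes()

        // Client-initiated stream IDs are odd: 1, 3, 5, ...
        // A handful of streams is enough to see the pattern.
        for streamID := uint32(1); streamID <= 9; streamID += 2 {
            fr.WriteHeaders(http2.HeadersFrameParam{
                StreamID:      streamID,
                BlockFragment: block,
                EndStream:     true,
                EndHeaders:    true,
            })
            // Cancel immediately: the client-side stream slot frees at once,
            // but the server has already begun handling the request.
            fr.WriteRSTStream(streamID, http2.ErrCodeCancel)
        }
    }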
> An attacker could use the whole TCP port space to create 65535 (theoretically) connections to the server and send requests on them in parallel.
This is harder for the client than it is for the server. As a server, it's kind of not great that I'm wasting 64k of my connections on one client, but it's harder for you to make them than it is for me to receive them, so not a huge deal with today's servers.
On this attack, I think the problem comes in if you've got a reverse-proxy h2 frontend and you don't limit backend connections because you were limiting frontend requests. It sounds like HAProxy won't start a new backend request until the number of pending backend requests is under the session limit; but Google's server must not have been limiting based on that. So: cancel the frontend request, try to cancel the backend request, but before you confirm the backend request is canceled, start another one. (Plus what the sibling comment mentioned: the backend may spend a lot of resources handling requests that will be canceled immediately.)
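The HAProxy behaviour I'm thinking of is the per-server maxconn queue, roughly like this sketch (names and numbers are placeholders, not a recommended configuration):

    backend app
        # At most 100 requests are in flight to this server at once; anything
        # beyond that waits in HAProxy's queue instead of reaching the backend.
        server app1 127.0.0.1:8080 maxconn 100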