This isn't remotely a bug; it isn't even really anything to do with browsers. The browser will be doing (roughly) the following:
    while pending_requests():
        send_request()    # put data on the OS's outbound queue
        read_response()   # pull data from the OS's inbound queue
But what send_request and read_response are doing is putting data on the OS's outbound queue and then attempting to get data from the inbound queue. If the data is already in the inbound queue before the request is put on the outbound queue, it doesn't matter: the browser is not aware of this fact. So long as the "responses" don't come in faster than the browser is sending requests (and so overfill the queue), and so long as the responses arrive in the order the browser is sending requests, this technique will work. In general this is just an optimistic strategy.
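To make that concrete, here's a minimal sketch in Python (loopback, blocking sockets; the payload and URL are made up) where the "server" writes its response before it ever reads the request. The client's send-then-read loop is none the wiser:

    import socket
    import threading

    # the "server": respond first, read the request afterwards
    def early_server(srv):
        conn, _ = srv.accept()
        conn.sendall(b"HTTP/1.1 200 OK\r\nContent-Length: 2\r\n\r\nhi")
        conn.recv(65536)   # only now bother to read the request
        conn.close()

    srv = socket.socket()
    srv.bind(("127.0.0.1", 0))
    srv.listen(1)
    threading.Thread(target=early_server, args=(srv,)).start()

    # the "browser": send_request(), then read_response()
    cli = socket.create_connection(srv.getsockname())
    cli.sendall(b"GET / HTTP/1.1\r\nHost: x\r\n\r\n")
    print(cli.recv(65536))  # the response was already waiting in the OS buffer

The client prints the response just as if the timing had been normal; nothing at the socket API level reveals that the bytes arrived early.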
I sure would have been happy to have you on our team in '95; it wasn't obvious to me at all. Afterwards, of course, it was obvious. But I'm always a little bit suspicious of myself when, after I'm presented with how something works, I think it's obvious.
Glad to hear you say that. The part of RFC 1945 you quoted seems pretty "clear" to me that the server can't send a response prior to receiving the request. It's pretty "obvious" to me that if the client validates responses and can't find an associated request you're going to have problems, and relying on undocumented behaviour is a bad idea.
In my mind the fact that it works seems like dumb luck and I would never have thought to try it -- all of which is pretty depressing, seeing as how it evidently made you lots of money, which is certainly something I could do with :) In hindsight, it makes perfect sense to exploit a simple solution with low implementation costs, even if it has an unknown lifespan or potential risks. If it breaks, you're no worse off than you were before (well, maybe not if your clients have come to depend on it and you don't have a backup); if it doesn't, great!
I can't tell you how many nights of sleep I lost whenever a new browser release by one of the larger browser manufacturers was announced. Every time I was sure our house of cards would come tumbling down but it never did!
Yes, I think this is a good example of the fact that "obvious" and "obvious in retrospect" are two very different things. In other words, hindsight is 20/20.
How would it even validate the request was sent first? It'd have to keep looking at the receive buffers and make sure they're always zero until it knows it actually sent the last packet. Right?
It sounds like it'd be hard to implement and for what benefit?
That was the million dollar question and I gambled that I understood enough of the implementation details that it would be impossible to close the hole. That didn't stop me from living in fear of just that :)
Well, coming up with the strategy (and actually implementing it; edge cases abound) is the hard part. However, the mechanism by which it works isn't a bug.
So once the initial request is made you can push anything you like to the browser using this pipelining method? What is stopping the responses from coming in too quickly and "overfilling" the queue? Making it work doesn't seem too hard, but aren't there possibilities for exploits if you're loading unrequested data into memory?
It doesn't work that way. All this buffering happens at the OS level, which is 100% unaware of HTTP. The OS just sees a TCP connection, which is simply a pair of unidirectional streams. It buffers data in both directions to decouple the application from the network; this is necessary to keep data flowing at a reasonable rate.
"Early" responses just sit in this buffer until the application (the browser) gets around to reading them; presumably after it's finished sending the request. The size of this buffer is advertised by the client's OS to the server's OS. A well-behaved server will stop sending data when the client's buffer is full. If the server is not well-behaved, the client just drops future packets (which the server will re-send later). The client is not aware of any of this.
If, on the other hand, the server sends a wholly unsolicited response, it will still sit in the buffer if the client only reads data after sending a request. But if the client is designed to process incoming responses regardless of whether it sent a request, then sure, there could be an exploit there, but that's no different from any other buggy network code.
Potentially, you could imagine a poorly-written web browser that could be fooled by an extra HTTP response, confusing it with a later request to a different web site.
So you could follow up an HTTP response for http://empty.website.with.no.other.files.to.request.com/ with an HTTP response containing malicious javascript. When the user then tries to view a different website (say their Facebook page), they get 'served' your javascript, which now runs in the context of the new site. Cookie stealing and other attacks could be run.
In practice it would be unlikely to happen. If the client doesn't read the 2nd response, it is likely to be sitting in a network buffer assigned to the first connection, which will probably be thrown away when it comes time to open a connection to the new site.
That's not how it would work. The later request would be made on a fresh TCP connection (since it's a different URL). Your previous unsolicited response is sitting in the buffers for the old TCP connection. They would not mingle.
I think it's only checking the incoming buffer for responses that match the request they just sent out. Assuming you know what the request is going to be, you can craft the response packet beforehand and assume that the request will be made before the response actually arrives. I'm not super familiar with the nitty-gritty details, but at the very least I think you'd need to hijack the TCP connection to inject malicious responses.
What happens if you request something and get something different in response? The client has no idea about the buffering so it's just a case of whether it's smart enough to handle a misbehaving server.
I'm not sure which implementation you're referring to; in stevejones's example, incoming data remains buffered by the OS until the client specifically requests one response's worth in read_response(). If the OS's buffer ever gets full, it will signal the server's OS to stop sending data; if it continues to get packets it will simply drop them (thus minimizing resource usage).
It's of course totally possible to make a client that reads responses regardless of whether it sent a request, but that's rather silly, as giving up flow control like that immediately opens your application up to a DoS attack. (Of course, just because it's silly doesn't mean no-one does it!)
Ah, I see, you're right, this assumes 'blocking' code.
I wonder if that's really how browsers work, or if they employ an array of open connections that are periodically polled for responses to outstanding requests.
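The polled variant is easy enough to sketch in Python with select(); whether any given browser actually worked this way is another matter (handle_response_data and the set of in-flight connections are made up here):

    import select

    # conns: open sockets that each have a request in flight (assumption)
    def poll_responses(conns, timeout=0.1):
        readable, _, _ = select.select(conns, [], [], timeout)
        for s in readable:
            data = s.recv(65536)  # whatever the OS has buffered so far
            if data:
                handle_response_data(s, data)  # hypothetical callback

Note that this changes nothing for the trick: early bytes still sit in the OS buffer until select() reports them, so a polling browser is just as oblivious as a blocking one.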
Obviously, responding to something that wasn't requested is a bad idea.
Flow control only works for large responses (which is good, because that is at least one resource you can protect). It makes you wonder what you could do with multiple answers small enough to fit in the same window, and whether that would let you identify HTTP implementations that have taken 'asynchronous' one step too far.
Surely reading the responses opens the browser up to active exploits, whilst simply buffering unrequested "responses" allows the possibility of denial of service by filling the buffer and causing actively requested packets to be refused.
So, presumably, if someone requests anything from your site you can keep bombarding their browser with unrequested content that will get queued. As jacquesm indicates, having an array of connections with reserved queues would avoid this blocking requested content.
In the OS, the buffers are per TCP connection. Only packets destined for the full buffer are dropped. TCP connections can't be opened by a remote attacker unless the browser is actively listening for them.
So effectively all that can happen is a website can DoS the connection to itself.
(Yes, an attacker can try to initiate many, many TCP connections. This uses far fewer resources than an actual HTTP session would, but can still be an effective DoS attack. This is known as a SYN flood.)
An alternative way to pump lots of webcam frames was to use multipart-MIME responses. That way, there was only one HTTP request and the response just streamed JPEG images, one after the other. No need to break any specifications to get full network usage.
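For reference, such a response looks something like the sketch below (using Python's standard http.server purely for illustration; grab_frame is a hypothetical function returning one JPEG as bytes):

    import http.server
    import time

    class MJPEGHandler(http.server.BaseHTTPRequestHandler):
        def do_GET(self):
            # one request, one never-ending multipart response
            self.send_response(200)
            self.send_header("Content-Type",
                             "multipart/x-mixed-replace; boundary=frame")
            self.end_headers()
            while True:
                jpeg = grab_frame()  # hypothetical webcam capture
                self.wfile.write(b"--frame\r\n")
                self.wfile.write(b"Content-Type: image/jpeg\r\n\r\n")
                self.wfile.write(jpeg + b"\r\n")
                time.sleep(0.1)  # ~10 frames per second

The browser replaces each part as the next one arrives, which is why this MIME type was a popular way to stream webcam frames.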
Yes, but we did everything we could to stay away from things like java, downloads and plug-ins. Our mantra was 'it just has to work with whatever the user already has'.
And there always was a way, even if sometimes it required some - for want of a better word ;) - unorthodox methods.
For anyone who was, like me, confused by who "we" is and why I was supposed to know that, this is from the author's "About" page:
>My main occupations are being owner/operator of ww.com, which pioneered streaming webcam technology, and working as a consultant to do technical due diligence.
Reading the article, at first I assumed he was someone who worked on an early browser or something, then maybe a hardware webcam manufacturer. I assume that he's just not used to people showing up at his blog with no context about who he is, so there you go.
I was a bit undecided about that, though I'm leaning towards adding them now. There was a small discussion about that when I first submitted this: https://news.ycombinator.com/item?id=6957005
Some IPSes (Intrusion Prevention Systems) that perform deep-packet inspection won't pass such traffic.
But this isn't really a "bug" per se; the TCP model is a stream is a stream is a stream. There's no notion of time, packets, or correlation between streams. So browsers (and the OS) are acting the only way they can: by treating a TCP connection as two independent streams.
(Though, how could it be otherwise? Assume HTTP over SCTP (sequenced packets). We can't require, or even allow, HTTP clients to ignore response packets that arrive "too early", since it's possible that observers of the client (e.g. Wireshark) may not observe the exact same timing, which would lead to divergent interpretations of the conversation.)
Amazon does this too. Upload APIs will return 4xx errors well before the body is uploaded in the event there's an issue with the headers. Not that (a) most HTTP clients pay attention to this, or that (b) they could do anything about it without closing and reopening the connection.
It is fine to respond early to an HTTP request, e.g. returning a permissions error when a user tries to upload a file. Beware, however, that some clients will go badly wrong if they don't get to send all their data.
What can happen is:
* Client starts to send HTTP request (e.g. POSTing a file). The file is large, so it will take a while to upload.
* Server spots that the user isn't allowed to upload the file and immediately returns a 4xx error of some kind.
* Server then thinks all is finished, closes connection.
* Client, still sending the file, gets an error as the write() fails. Complains about the broken connection to the user but never notices the actual HTTP response.
Many HTTP clients are written in a simple 'send my request, then (and only then) read the response' style. They don't react well to getting an early error message. Often you have to work around this on the server by not closing the connection to the client, and continuing to slurp up any further data received.
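A server-side sketch of that workaround (hypothetical handler; the timeout and buffer size are arbitrary): send the error, then drain whatever the client is still writing, so its write() calls succeed and it eventually gets around to reading the response:

    import socket

    def reject_upload(conn):
        conn.sendall(b"HTTP/1.1 403 Forbidden\r\n"
                     b"Content-Length: 0\r\n"
                     b"Connection: close\r\n\r\n")
        conn.settimeout(5.0)  # don't slurp forever
        try:
            while conn.recv(65536):  # discard the rest of the upload
                pass
        except socket.timeout:
            pass
        conn.close()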
>> Some IPSes (Intrusion Prevention Systems) that perform deep-packet inspection won't pass such traffic.
We had a vendor whose product did this. Everything worked except one feature (the main feature), and a quick glance at the firewall logs showed 'malformed tcp packet' flooding the console.
It was a simple thing to disable (from just their appliance, not the whole network), but I still found it odd that they did that.
IPS (Intrusion Prevention System), not ISP. And yes, they'd drop the connection on such an error response if configured to do so. (Which is likely what you want to do anyway in this case. Thankfully, Amazon's engineers had the foresight not to respond early in the case of a successful upload.)
Ah! Complete reading fail on my end. (thanks for the edit, it is much clearer now, I apparently substituted ISP for IPS).
I don't think it is possible to respond early in case of a successful upload, after all, that means the upload can still fail for a variety of reasons. Success indicates that you can move to the next state, and an 'early success' might still turn into a late failure.
The difference, if anyone's still playing along, is what action the device takes. An IDS (detection system) is a monitoring and alerting device; traffic still gets through. An IPS (prevention system) drops the flagged traffic.
As you alluded to, the distinction between IDS and IPS is largely configuration and mode of operation.
Years ago, IDS and IPS were separate products, where the IDS was the earlier, more primitive version of the two. Nowadays you are buying an IPS, which runs either in alerting mode (operating like an IDS) or in "shunning" mode, where the device takes some defensive action (such as dropping traffic, throttling bandwidth, blacklisting the IP for a fixed period of time, etc.).
"Shunning" mode can be dangerous, since you are essentially building in a feature to "Deny service to X for Y amount of time" into your network.
Attackers can spoof attacks to deliberately trigger the shunning of legitimate users. Because of this, it is less common to see an IDS/IPS with shunning enabled in production. It depends on where "Access to service for legitimate users" and "stopping and possibly hurting attackers" fall on the priorities list.
Agreed with some of the other posters that this isn't a bug. It would be pretty hard for a browser to make this not work. To make it not work, the browser would have to check whether there's data available in the local socket buffer before issuing an HTTP request. On Unix, you could e.g. put the socket in non-blocking mode, issue a read() for 1 byte, and then see if you get an EWOULDBLOCK. If you get data instead of EWOULDBLOCK, then (supposedly) the server is in violation of the RFC and the browser might decide to close the connection (what should it do otherwise?)
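For what it's worth, the check itself is only a few lines (a sketch; MSG_PEEK leaves the byte in the buffer, so nothing is consumed):

    import errno
    import socket

    def early_data_waiting(sock):
        sock.setblocking(False)
        try:
            # one peeked byte is enough to "convict" the server
            return len(sock.recv(1, socket.MSG_PEEK)) > 0
        except OSError as e:
            if e.errno in (errno.EWOULDBLOCK, errno.EAGAIN):
                return False  # nothing buffered: the "expected" case
            raise
        finally:
            sock.setblocking(True)

But as noted below, a False result proves nothing; the early data may simply still be in flight.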
It just doesn't make a lot of sense doing the above. Especially because there's a fundamental race condition here: there is no way to distinguish between data that's in-flight but not received prior to the browser issuing the request, and data that was generated after the remote peer read the browser's request.
You could encounter this behavior (dropping unsolicited responses) multiple "legitimate" ways (all of which suffer from the race condition you mention): reading & writing in separate threads can do it; so can an asynchronous receive mechanism.
Erlang TCP connections can be configured for asynchronous receive: any incoming data is delivered as a message to a given process, which usually immediately acts on it. Say this process has not yet sent a request; it's not unreasonable to just drop the incoming data.
Of course, I would consider such behavior non-conforming, for the reasons you point out. Time isn't really defined in a TCP stream.
Better is to utilize the flow control Erlang provides for asynchronous receive, but this is extra effort so it's plausible a naive implementation would miss this.
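Roughly the same shape in Python's asyncio, for anyone who doesn't read Erlang (a sketch; handle_response is a hypothetical handler):

    import asyncio

    class NaiveAsyncClient(asyncio.Protocol):
        def __init__(self):
            self.request_sent = False

        def connection_made(self, transport):
            self.transport = transport

        def data_received(self, data):
            if not self.request_sent:
                return  # the questionable part: drop unsolicited early data
            # the flow control a naive implementation skips: stop reading
            # while we're busy, so the peer can't flood us
            self.transport.pause_reading()
            handle_response(data)  # hypothetical handler
            self.transport.resume_reading()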
I remember this sort of thing being called "push" back in the day (1995 or so). Before animated GIF support was added to Netscape this was the only way you could achieve animation of any sort on the Web.
The only concrete example I can recall is that Suck.com used this to have an animated logo at the top of their page. (I think this predated the Java applet version that you see on the Internet Archive...)
One of the elements in SPDY is that responses to requests can be pushed by the server anticipating a request. But that's a relatively new development compared to when I figured out that this 'feature' is supported by just about every browser out there. And it's kind of logical: if you implement HTTP in the most straightforward way, then the network stack will buffer the response until the next read, regardless of what the rest of the program is doing. So when the browser issues that read (either in a separate reader thread, or in the same one if it is programmed in a single-threaded write-then-read style), it immediately finds the answer to the request it just sent out.
Strictly speaking, extra bytes sent past the end of the response to the current request (or before any request has even been sent) are a protocol violation, but I'm really not complaining about this one; after all, that line in the spec does not actually specify the timing. We all just read between the lines to see what we expect to see: ping ... pong.
Combining it with dynamically generated DNS names might be a nice "content accelerator" add-on for CDNs, etc.
i.e. a page uses resources, each of which has a unique URL.
You have custom infrastructure (that sits in front of a normal website) which dynamically generates a new subdomain for each resource, and rewrites the resource URLs to point at those subdomains.
At the top of the page (or ideally on the previous page) you include some zero-length resources with the same MIME-type as the resources you want to serve.
The browser requests these resources, and as soon as you have the connection open you reply with the zero-length resource and then the actual resource you want to serve.
Subsequently the browser requests the actual resource, and finds it already waiting.
The unique hostnames are needed to allow you to predict which resource will be requested.
(This was probably patentable until I wrote it all out, too ;))
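A server-side sketch of the core move (everything here is an assumption for illustration: the decoy scheme, the 64 KB read, the omitted Content-Type headers):

    def serve_prepushed(conn, real_body):
        conn.recv(65536)  # the request for the zero-length decoy resource
        decoy = b"HTTP/1.1 200 OK\r\nContent-Length: 0\r\n\r\n"
        pushed = (b"HTTP/1.1 200 OK\r\nContent-Length: %d\r\n\r\n"
                  % len(real_body) + real_body)
        conn.sendall(decoy + pushed)
        # the second response now sits in the client's OS buffer until the
        # browser asks for the real resource on this same connection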
This sounds less like a bug, and more like a specific tweak to his logic due to his specialized use case.
In most cases, even if the web server knows that a specific page contains images, it does not know if the browser is actually going to request those images. What if it is a bot? What if the user cancels? What if they have disabled image downloads in that browser? What if they have the images and other secondary files cached?
I do think it is worthwhile to consider such things for your individual needs, but most use cases won't change the standard request/response mechanism.
This was actually one of the easier aspects to solve. The webcam server ran on a different port than your regular web server, so it knew exactly what you were going to request; it existed for one purpose only: to serve up those images. There was no HTML or other stuff to be confused with. Technically it was probably possible for a browser to re-use the same connection and to request, say, an index.html after requesting an image, but in practice this simply never happened. After the first image request, all the subsequent requests on that same socket would be image requests as well.
While true that you cannot know whether the client will request the images, you can use their user agent to make a pretty decent prediction. There will be corner cases where you are wrong, but most of the time your prediction will be true.
For instance, if a bot is pretending to be a Chrome browser, you'd think it was a regular client, but in fact it was not. But that's the bot's fault, not your implementation.
Is it possible that the technique had become widespread and actually known about by the browser makers? i.e. it was a bug but they didn't want to break any applications so didn't fix it...
Am I missing a trick or does this only work when the only thing you're serving at that HTTP server is the JPEG image of the camera? Otherwise the user later refreshes the page thus doing a "GET / HTTP/1.1" and gets /image.jpg instead.
But how will that work if you're sending the response before you parse the request? You don't know the URL the client is after. Were you relying on the browser keeping the same connection alive so you always went index.html->jpegs?
Right, what I meant was that you can't have the camera serve a nice /index.html with the embedded image and other niceties like modern IP cameras do, because you reply with an image to every request.
Well, you can actually. All you need to do is switch modes after the first request, which you handle like every other. Which is in fact what it did... The idea here is that once you've received one request for an image, all subsequent requests on that socket will be for images as well.
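In sketch form (all helpers are hypothetical, and /image.jpg stands in for whatever the real image URL was):

    def handle_connection(conn):
        request = conn.recv(65536)
        if b"/image.jpg" not in request:      # not an image request:
            serve_normally(conn, request)     # behave like any web server
            return
        send_jpeg(conn, grab_frame())         # answer the first image request
        while True:
            send_jpeg(conn, grab_frame())     # push the next frame early...
            if not conn.recv(65536):          # ...then absorb the request it
                return                        # pre-answered; empty = hung up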
Ok, then you are relying on multiple requests on a single socket, which was what I had suggested before. Does that work reliably though if the user reloads the page while it's streaming? Wouldn't the browser reuse the same connection to request the HTML page again and get an image instead?
If you put your email address in your profile (or send me a line) I'll reply with a link to a cam that is still online from way back when using this technology.
I'd rather not post the link in the thread because the poor people sending out the stream would not be able to satisfy even a small portion of the kind of volume that HN can direct to a site in an eyeblink.
Alright, here's one of those dumb questions you seem very open to: Can we use this technique to (for example) reply with all of a page's dependencies upon the initial request? i.e. if a user goes to www.example.com/ and the server immediately replies with /, /favicon.ico, /styles.css, /script.js, /banner.png, etc? I imagine if it were possible, this would result in a massive reduction in latency...
Well, you can and you can't. See, the problem is that you have no idea what the next request will be about! So if the client keeps sending the same kind of request, you can respond with a payload of the MIME type that is expected. But for your use case you could receive a request for /style.css and respond with /favicon.ico if a client decided to make the requests in an order that you did not anticipate.
If you get lucky it will work, but if you're unlucky then you'll be sending out the wrong payloads on all but the first request.
The only reason this trick worked for the webcam is because it knows ahead of time what kind of request will come (the request for the next frame). That's why it can anticipate.
Now that I think about it, one could use a small javascript library embedded in the index page to make a number of additional requests and interpret them as the correct types via data: URLs. That would be a lot of messy hacking to shave off a few hundred ms, but might be an interesting exercise to undertake...
That's a multipart response [0]. In the early days of 3G (and multimedia phones) we used to use those for some models of phone which supported them, to give a better browsing experience - the gamble being that sending extra downstream data would be cheaper than paying another request/response round-trip, if you were pretty confident the phone was going to ask for the data anyway.
But how are you going to use UDP to send images to a browser without using a plug-in or an applet? The whole idea was to remain 'compatible' (for small values of compatible) with HTTP, which more or less guaranteed delivery.
UDP wouldn't make it through most firewalls and would make all kinds of assumptions about port forwarding and so on, besides the fact that browsers simply do not expect content to arrive via UDP.