When developing a webapp these days, I use a local proxy.
This allows me to type a staging/production URL into Chrome, and get the frontend and the backend from either my local machine or staging/prod. I can mix and match any combination by checking/unchecking a box in Proxyman.
This means there's no need to whitelist localhost for CORS or jump through other hoops. Another advantage is that you're experiencing the app with SSL, so you may notice bugs that you'd miss if you're used to working with HTTP locally. I've had these "only happens on production" bugs in the past, and they're nasty to deal with: the fix happens in a rush, because the bug comes as a surprise and impacts users immediately. They can also bypass QA if the QA environment also uses workarounds rather than a prod-like setup with SSL and the like.
I recommend giving it a try. It's a workflow I haven't seen promoted anywhere before.
I tried to achieve the same thing in the past and ran into issues with HSTS. The details escape me, but I think it was that when using the production app without a proxy, the SSL certificate was associated with the HSTS records in the browser, and when I switched to a proxy, HSTS started failing because the certificate had changed. Have you run into this at all? How have you solved it?
I don't develop at the same URL as I publish at - I'll generate keys with mkcert and host my site locally via a proxy at local.realdomainhere.com for a site that has dev.realdomainhere.com and the prod domain www.realdomainhere.com.
I see, I misunderstood. My dream setup is one where I have the same URL for TEST/STAG/PROD and just switch which BE the URL resolves to based on some configuration/tool.
I run into occasional HSTS issues on my home network because I've wildcarded HSTS and not everything internal has a cert. It's usually recently arrived IoT stuff that hasn't been cordoned off behind a stronger proxy.
Additionally, I run my own internal CA for various reasons and Firefox/Android seems to have stopped recognizing it. Irritating but it's a good razor for whether to upgrade internal or experimental apps to a Let's Encrypt cert.
I can tell you how to lose at CORS in Chrome. If your browser caches a response, and some time later you mutate the request by adding the "Origin" header (e.g. by adding crossorigin="anonymous" to a <script> tag), Chrome won't make a new request. What it will do is use the cached response, which is missing the ACAO response header, and thus the browser rejects a file from its own cache via draconian security policy.
There are many ways to lose at CORS and this one is my story.
You're right, the real issue is that CloudFront won't include Origin in the Vary response header if it wasn't included in the initial request. And if you change your HTML attributes, you're changing your request, but you essentially end up with a poisoned local cache. Rolling out crossorigin="anonymous" on previously cached assets is a subtlety you won't know about (even if you think you know CORS) until your site breaks because critical assets are missing.
Yeah, it's generally understood that you need to change the URL to cache-bust when the content changes, but it's easy to forget that you need to do the same thing if important headers change.
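For an origin you control, the simplest guard is to always declare the variance up front, even on requests that don't carry an Origin header. A minimal sketch with Node's built-in http module (purely illustrative, not from the article; the thread's actual problem is CloudFront/S3, covered below):

const http = require('http');

http.createServer((req, res) => {
  // Tell every cache (browser and CDN) that responses differ depending on
  // the Origin request header, even if this particular request has none.
  res.setHeader('Vary', 'Origin');

  if (req.headers.origin) {
    // Public asset, so any origin may read it.
    res.setHeader('Access-Control-Allow-Origin', '*');
  }

  res.setHeader('Content-Type', 'application/javascript');
  res.end('console.log("hello");');
}).listen(8080);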
Hmm, I think I'll add a section to the article on this when I'm back at my laptop.
If your data is coming from an S3 origin and the original (cached) request was not CORS (but you also want to support CORS), then you can inject the necessary headers with Lambda@Edge or CloudFront Functions.
S3 is used as an example, as it does not include a Vary header for non-CORS requests. However, the same would be true of some other origin which isn't correctly inserting a Vary header.
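For what it's worth, a rough sketch of what such a CloudFront Function might look like on the viewer-response event (treat the specifics as illustrative rather than a drop-in fix):

function handler(event) {
  var response = event.response;
  var headers = response.headers;

  // Ensure caches key on Origin even though the origin didn't say so,
  // and let any origin read these public assets.
  headers['vary'] = { value: 'Origin' };
  headers['access-control-allow-origin'] = { value: '*' };

  return response;
}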
That works, but I found a simpler / cheaper alternative to fix this particular Cloudfront problem: create a custom "Cache policy" that only includes the header "origin", and then the answer from Cloudfront will always include "Vary: origin", even if it wasn't in the request. (That behaviour isn't documented anywhere I could find; I stumbled upon it by accident.)
Ha, I feel your pain, I ran into something similar with Chrome and Cloudfront.
For us it got triggered because an image on the site appeared both as a video "poster" attribute (which loads with CORS) and as a regular image. So depending on which image the user encountered first you would see a CORS error. But it would be gone after a reload, so devilishly hard to reproduce until you realise what's happening.
Also took me ages to figure out that Cloudfront didn't include the Origin in the Vary header if it wasn't in the original request.
(@jaffathecake perhaps that Cloudfront behaviour warrants a special mention. Great article by the way, I learned a lot)
Hi Jake! (I watch your videos ;))
Many times, the (Angular) ServiceWorker has given us CORS headaches on various requests to our APIs and AWS storage files.
The only method we've found, after various AWS configs and ngsw-config.json attempts, was to add `ngsw-bypass=true` to all requests.
I'll see if I can find the Github issue, where this has come up before.
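In case it helps anyone else, the workaround is literally just tacking the parameter onto the request URL (or sending it as a header). Roughly like this (URL made up):

// The Angular service worker leaves a request alone if it carries an
// ngsw-bypass query parameter or header (the value doesn't matter).
fetch('https://api.example.com/files/report.pdf?ngsw-bypass=true', {
  credentials: 'include',
});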
Incidentally, while you're here, I'm planning on building something which will require use of SharedArrayBuffer. Will adding the (now) required Cross-Origin isolation COOP/COEP headers cause any issues to existing CORS setups?
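(For context, the headers I mean are these two; a minimal Node sketch, not tied to any particular setup:)

const http = require('http');

http.createServer((req, res) => {
  // The cross-origin isolation headers required for SharedArrayBuffer:
  res.setHeader('Cross-Origin-Opener-Policy', 'same-origin');
  res.setHeader('Cross-Origin-Embedder-Policy', 'require-corp');
  res.end('crossOriginIsolated should be true on pages served like this');
}).listen(8080);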
I usually use NodeJS, but it turns out the HTTP library they use turns the HTTP method into an enum, so only a subset is supported (https://github.com/nodejs/node/blob/d798de1c653efa5ec0015d44...). This restriction only exists in their HTTP/1 library, their HTTP/2 library supports any method.
Anyway, I couldn't use that, so I used Deno via Deno Deploy. Their HTTP library supports any method, and the APIs they use are very similar to web APIs, so it was really easy to get started. Here's the server code: https://github.com/jakearchibald/cors-playground/blob/main/i....
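It's not the actual playground code (that's at the link above), but the rough shape is something like this, which is why it was so easy:

// A Deno handler gets a standard Request, so any method string comes
// through untouched.
Deno.serve((request) =>
  new Response(`method was ${request.method}`, {
    headers: { 'access-control-allow-origin': '*' },
  }),
);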
My recollection from circa 2013 is that Node.js at least used to use the nginx HTTP parser, which was a horror of a manually-implemented state machine, written that way in the name of performance, but consequently basically unmaintainable and fairly bug-riddled. And not as fast as it should have been, anyway. (The state machine approach is fine, but it should have used a lot more code generation.) It read the method byte by byte into the state machine, and baulked at unknown methods. Evidently they’ve kept that limitation, whatever they may have changed since in the parser they use (and I think nginx did eventually abandon and replace that parser entirely).
(These are my recollections from investigation I did back in 2013 when I was writing the first serious Rust HTTP library.)
My colleagues and I came to the same conclusion some time ago. I think CORS is not that complicated in essence, but I don't need to deal with it often enough to really memorize all the details. And there are many, as the article shows. Somehow we were always able to deal with it in the end, but at one point we didn't want to anymore. You make all these plans about how this service should be service.example.com and that one service2.example.com, and it kind of feels like building something, but in reality it's just a headache unless you really need to do it for some reason.
This is just not an option for so many very legitimate use cases. Building client-side apps that aggregate and display information from multiple web services hosted on domains totally outside of my control is valuable.
Point of fairly idle curiosity about the presentation of the article: why do you put a trailing slash on your empty elements (img, link) in your code samples?
Some aren’t aware that the trailing slash is useless in the HTML syntax, simply being ignored by the parser and not doing anything. (Except for in inline SVG and MathML content, which switch the parser into a more XML-like mode where the trailing slash behaves as in XML.) I know you’re not in that category.
Some hold that the trailing slash should be encouraged because it reminds the reader it’s a void element.
I can imagine some recommending it for XML compatibility (which is related to the original purpose of the ignore-the-trailing-slash behaviour, though slightly inverted in direction), but I don’t think I’ve ever encountered anyone saying so.
I hold that in HTML syntax the trailing slash is mildly harmful, because I almost never see it applied consistently so that the document wasn’t valid XML syntax anyway (for example, your site’s source has a <link rel="preload" as="font" crossorigin href="/c/logo-font-5449c974.woff2">) and because it misleads people into thinking that it’s a way of closing elements.
So yeah, I’m curious, because it looks deliberate. Habit? Reason? :-)
(For those that don’t know about the two syntaxes for HTML: XML syntax for HTML is still a thing and probably always will be; load data:application/xhtml+xml,<html%20xmlns="http://www.w3.org/1999/xhtml"/> in your browser as a starting point for a demonstration.)
Author here! I used to have strong feelings about formatting stuff like this, but I've since realised there are better things to spend effort on.
For formatting, I just let https://prettier.io/ do its thing, and it added the />. Although I do configure it to use single quotes in JS, so I guess I still have some opinion there.
In terms of HTML, how far does your "but it isn't necessary" opinion go? Lots of closing elements are unnecessary in HTML, for instance, check out the source of https://fetch.spec.whatwg.org/
Well, for my own personal stuff I omit just about all that I can—head/body start and end tags, html end tag (not start tag because it has at least a lang attribute), tbody start tag where possible, thead/tbody/tfoot/tr/th/td/li/dt/dd/p end tags almost all of the time, attribute value quotes where valid… mostly just because it’s fun doing so, and in some cases because it makes things decidedly cleaner (especially tables). I also don’t use autoformatters because I will often disagree with their opinions in specific cases.
For stuff I’m working on with others, I’ll act more like a normal person, though I will still prefer to drop at least <head></head><body></body></html>, and I’ve never worked with anyone that wanted to put trailing slashes on void elements (and haven’t ever used Prettier on HTML, evidently).
As for Prettier putting the trailing slash in: huh, that’s a really weird decision (and no flag for it!), given that they’re not emitting valid XML (not escaping >, at the least), so it’s just the personal preference thing, for something that was just added as an XHTML compatibility mechanism.
Seems like your HTML formatting opinions are very similar to the owner of the fetch spec!
Yeah, I don't always agree with Prettier, but ugh, I wasted hours in my early career arguing about formatting with teammates. Now I just let Prettier do its thing, get over it, and spend the time on something else.
Hence me only living life on the edge like that in personal projects! Most of what I write with others these days is in Rust, and I definitely go along with using rustfmt on such projects, even if I regularly dislike its opinions (sometimes even strongly).
For me, I write the trailing slash because it's easier for me to reason about mentally. It also means I don't have to context switch if I'm writing JSX or XML instead of HTML.
For me, adding the ending slash is just like adding semicolons at the end of javascript statements. They may be optional sometimes, but there is something to be said about consistency and clarity that comes with using them always. Less cognitive overload to boot.
Although there are certainly some similarities, trailing slash on empty tags is a different case to automatic semicolon insertion. Semicolons are mostly optional, but the trailing slash is never required, and does absolutely nothing—most specifically, it doesn’t close tags, and that’s what I’m getting at with my position of the trailing slash being mildly harmful: it’s teaching a mental model that’s simply wrong.
Code (and layout) is written for the human reader first. The machine does not care about most whitespace, should we write everything into a single line? (Some of my colleagues seem to think so.)
The name is literally a "self-closing tag", isn't it? And it's better for someone else reading: you may not recall what the tag is, but you know you don't have to look for a closing tag below.
But that’s the thing—it doesn’t do that. If you want an empty div, you can’t write <div/>, because that’s equivalent to just <div>; you’ll have to write <div></div> instead.
It’s just that some tags (like IMG) are always self-closing, some are never self-closing (DIV), and some can optionally be closed (P). The trailing / just signals to the reader that the tag is meant to end then and there.
Not 100%. There are a small handful of really wicked gotchas. I think there are a lot of articles on them. I can’t find the one I like and don’t want to share one I haven’t read yet.
In this first example, ASI inserts an undesired semicolon:
return
{a: 0}
This returns undefined, and doesn’t continue on to execute the block containing a statement 0 with label a. (Change it to {a: 0, b: 0} and you get a syntax error because of this reinterpretation of what was intended as an object literal.)
In this second example, ASI doesn’t insert a desired semicolon:
f()
[].forEach.call(…)
This becomes a syntax error, because the [] has become subscripting rather than an array literal. (Incidentally, [].forEach is smelly anyway; prefer Array.prototype.forEach, maybe assign that to a constant if you’re doing it much.)
The second example could be replaced with something more common like:
f()
['foo', 'bar'].forEach(...)
Which is probably a type error, except if `f()` returns something like:
function f() {
  return {
    bar: ['Not the array', 'you were expecting'],
  }
}
Then you would actually iterate over the returned `bar` array, not the expected `['foo', 'bar']` array.
Another fairly likely example of a desired semicolon not inserted is when the following line starts with a template string literal:
f()
`This is tagged with whatever f() returns`
This will most likely be a type error, except if `f()` returns a function, in which case that function will be called on the template string to do whatever. However I have a hard time imagining when you would want to start a statement with a template string literal without doing something smelly like:
> I can imagine some recommending it for XML compatibility (which is related to the original purpose of the ignore-the-trailing-slash behaviour, though slightly inverted in direction), but I don’t think I’ve ever encountered anyone saying so.
Well, pleased to meet you!
I do use it for XML compatibility: it allows me to use XML editor modes (usually nxml in Emacs) which, being simpler to implement as they don’t have to encode the rules of which tags close where, are more likely to be available and to work well.
I’ve also done little customizations to nxml over the years, and this way I have them always there no matter whether I’m working with XML or HTML.
And then occasionally I use other XML tools on it, such as xsltproc. You can still pipe the HTML through tidy to translate it (although xsltproc has --html, tidy has been more reliable), but having it compatible with XML does spare me small chores again and again.
Yeah, that’s fair enough. In the past I was more likely to close tags like p/tr/td than I am now because the Vim indent file I was using for html didn’t handle some of those properly back then.
I tried really hard to get <video> and <audio> to default to requiring same-origin but people were still skeptical about CORS deployment, and there were also arguments for consistency with <img>. Oh well, I think we eventually got to a consensus that that "consistency" is not worth having.
I think you folks were in a really tricky spot. Making <audio>/<video> require CORS would have been the right decision for security reasons, particularly since <audio>/<video> introduces range requests. It would have prevented these security bugs https://jakearchibald.com/2018/i-discovered-a-browser-bug/.
However, the competition at the time was Flash, and making <audio>/<video> so much harder than it was with Flash would have put developers off.
Fwiw, I regret that opaque responses can go into the service worker cache, since it caused quota-sniffing issues that we had to work around. But, if we didn't allow it, it would have been a feature regression vs appcache. sigh
Yeah, when Web fonts rolled around we fought that battle again and won. Some important people saw the light :-). (And to be fair, CORS was more widely available on servers.)
I think font foundries were particularly keen on that juicy Origin header so they could tell who was using their font, and block sites that hadn't paid.
When I'm writing some frontend that's hosted on localhost, with an API that's hosted on its domain somewhere, it's always some sort of PITA to get the dev environment started.
There's a plugin for Firefox that ignores CORS, which is helpful for this. It's becoming less useful for me as my APIs now usually have a toggle to add a cross-origin header which allows localhost. Still useful.
Just use http-proxy to set up local domains so you can access your dev environment at frontend.yoursite.local (proxied to localhost:3000) and the API at api.yoursite.local (proxied to whatever 3rd-party API). Boom, problem solved. You can even rewrite headers and content of the requests in and out.
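Something like this (node-http-proxy; hostnames and ports are just examples), plus entries in /etc/hosts or local DNS pointing both names at 127.0.0.1:

const http = require('http');
const httpProxy = require('http-proxy');

const proxy = httpProxy.createProxyServer({ changeOrigin: true });

// Route by the Host header: local dev server for the frontend,
// the real third-party API for everything else.
const targets = {
  'frontend.yoursite.local': 'http://localhost:3000',
  'api.yoursite.local': 'https://api.example.com',
};

http.createServer((req, res) => {
  const target = targets[req.headers.host];
  if (!target) {
    res.statusCode = 502;
    res.end('unknown host');
    return;
  }
  proxy.web(req, res, { target });
}).listen(80);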
> writing some frontend that is hosted on localhost
I assume this would also work for CORS purposes: for some time I've not used localhost where possible, for SSL reasons. Giving the local machine a perfectly valid name that I can get a cert for via LE (or already have a cert for; I actually use a non-production name for which I maintain a wildcard cert) is slightly less faff than having my own signing cert installed as trusted everywhere I might need it. Anything I might do publicly is HTTPS-only, so my dev/test environments are too.
I wish there was a way to let me make POST requests on behalf of the user without dealing with CORS. Of course the request wouldn't include any user cookies etc to avoid the security issues.
I just want to easily let my users make API requests to an API whose CORS settings don't allow it. Instead I have to tell my users to give me their API key so I can make the request from my server. Or I'd have to tell them to run a program that makes the requests.
> I wish there was a way to let me make POST requests on behalf of the user without dealing with CORS. Of course the request wouldn't include any user cookies etc to avoid the security issues.
You can! And it can include credentials!
You can do this with a basic <form> element, so fetch() lets you do the same. What you can't do is read the response.
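Roughly (URL made up):

// Fire-and-forget, like a <form> POST. The response comes back "opaque":
// no status, no body, just the knowledge that it completed.
fetch('https://other-site.example/api/thing', {
  method: 'POST',
  mode: 'no-cors',
  credentials: 'include',
  body: new URLSearchParams({ key: 'value' }),
}).then((response) => {
  console.log(response.type); // "opaque"
});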
Thank you! Unfortunately I need to read the response (so extra thank you for pointing that out before), but maybe just posting will be enough for a future application :)
I've banged my head against some CORS puzzles recently. That's a pretty good guide, but I actually know an extra quirk that was missed:
You know that nice "Access-Control-Allow-Credentials: true" header? In theory, it means that you can make authorized requests with cookies included to the cross-origin API. It actually has some extra rules that aren't obvious though. The cookies won't actually send unless they explicitly have "SameSite=None" set. Which itself isn't valid unless you are both making the request over HTTPS, and the cookie also has the flag "Secure". See here: https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Se...
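Concretely, the cookie has to be set something like this for it to be sent on credentialed cross-origin requests at all (sketch with Node's http module; it only takes effect when actually served over HTTPS, and the origin is made up):

const http = require('http');

http.createServer((req, res) => {
  // SameSite=None requires Secure, which in turn requires HTTPS.
  res.setHeader(
    'Set-Cookie',
    'session=abc123; SameSite=None; Secure; HttpOnly; Path=/',
  );
  // And the CORS side still needs an explicit origin plus the credentials opt-in.
  res.setHeader('Access-Control-Allow-Origin', 'https://app.example.com');
  res.setHeader('Access-Control-Allow-Credentials', 'true');
  res.setHeader('Vary', 'Origin');
  res.end('ok');
}).listen(8080);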
The cookie logic is kind of mysterious. When everything is set up correctly and it works, it's pretty invisible to the code making the cross-origin requests. But if it isn't set up quite right, it will just not work, and it can be pretty tricky to figure out what's wrong. The whole SameSite logic is a pretty interesting kludge too. Ideally, most things would be SameSite=Strict, but that also means that links the user follows from third-parties to your site won't include cookies, since in theory those requests could include GET params doing who knows what.
Digging into all of the CORS rules and how they interact is almost an archaeology project into how the web was built and all of the weird things that both legit sites and malicious attackers have tried to do over the years.
The article mentions the SameSite stuff early on, then kinda mentions it in passing when it comes to CORS + credentials "The same-site rules around cookies still apply, as do the kinds of isolation we see in Firefox and Safari. But these only come into effect cross-site, not cross-origin".
Is it fair to say that the strongest CORS request allows the weakest? That is, if your server supports a preflighted, credentialed, cached CORS request with weird headers and methods, then it would support just about anything?
Perhaps a detailed walkthrough of that one specific scenario, which would enable almost all others, would be helpful.
I don't think that's very accurate generally - the CORS Access-Control headers are pretty flexible, and can be locked down to only allow the specific weird thing you want to do. The part where that is kind of true is setting cookies with SameSite=None - it's needed for the CORS credentials to work, but it might cause security issues with other uses of your website.
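That said, a rough sketch of the "everything at once" scenario the parent describes might look like this on the server (Node http; the origin, methods and headers are illustrative):

const http = require('http');

http.createServer((req, res) => {
  // Credentialed CORS: the allowed origin must be explicit, not "*".
  res.setHeader('Access-Control-Allow-Origin', 'https://app.example.com');
  res.setHeader('Access-Control-Allow-Credentials', 'true');
  res.setHeader('Vary', 'Origin');

  if (req.method === 'OPTIONS') {
    // Preflight: declare which methods/headers the real request may use,
    // and how long the browser may cache this answer.
    res.setHeader('Access-Control-Allow-Methods', 'GET, PUT, DELETE');
    res.setHeader('Access-Control-Allow-Headers', 'Content-Type, X-Weird-Header');
    res.setHeader('Access-Control-Max-Age', '600');
    res.statusCode = 204;
    res.end();
    return;
  }

  res.setHeader('Content-Type', 'application/json');
  res.end(JSON.stringify({ ok: true }));
}).listen(8080);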
You can't always access an image across origin boundaries.
If you load an IMG into WebGL as a texture, that's not allowed cross-domain. It's considered "processing" the image, rather than just displaying it. I ran into this when displaying slippy maps from map tiles. You can display the map tiles with JavaScript regardless of origin, but use WebGL, and you have to deal with the origin problem.
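The fix, once you know it, is to opt the image into CORS before uploading it as a texture (the tile URL is just an example, and the tile server has to respond with Access-Control-Allow-Origin):

const canvas = document.createElement('canvas');
const gl = canvas.getContext('webgl');
const texture = gl.createTexture();

const image = new Image();
// Without this, uploading the image as a texture counts as "processing"
// a cross-origin resource and fails.
image.crossOrigin = 'anonymous';
image.src = 'https://tiles.example.com/12/2047/1361.png';
image.onload = () => {
  gl.bindTexture(gl.TEXTURE_2D, texture);
  gl.texImage2D(gl.TEXTURE_2D, 0, gl.RGBA, gl.RGBA, gl.UNSIGNED_BYTE, image);
};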
Author here! I added a small reference to the acronym in the article, but I don't think it really matters.
I actually forgot what it stands for the other week, but it didn't prevent me understanding it. And relearning the acronym didn't help me understand it more.
I think this is misguided reasoning. Everyone starts somewhere, and the hallmark of reasonably competent writing is a ramp-up. Whether it’s one sentence or one chapter.
I think failing to define acronyms is a violation of just about any style guide out there.
Communication is a majority of engineering so I think these things are very important.
> If you don't already know what CORS is, you're probably not a Web developer and don't need to know.
This is an appallingly poor and misguided take, and goes against the most basic rules of writing technical documents. Docs need to be clear, unambiguous, and self-contained. The very first time an acronym is presented, it must come after the full name.
It takes less than a sentence to do the right thing. There is no excuse.
Sometimes OPS people get the job of fixing CORS for developers who don't understand how it works exactly.
I work with a client who built a web app in... Vue, I think. For unknown reasons they decided that it would be better that the APIs they need to call live on the same domain. At the same time, the developers decided that the API microservices should not return CORS headers. Instead it was left to operations to hack in CORS headers in the webserver/loadbalancer.
If CORS is supposed to solve the problem of credentials being sent to third-party sites without the user's knowledge, I don't understand why the solution wasn't just to not send the credentials.
Enlightening, thanks! I had thought maybe a site could in theory use your IP or even just your ability to connect to figure out who it thinks you are, but didn't know this was so common.
The problem with CORS is that most people don't understand it and won't bother to change anything to make life easier for other developers. At the very least, static files open to the public should have 'Access-Control-Allow-Origin' set to '*' by default. Vercel and GitHub Pages do that already, which is nice. Most other web hosts don't, not even Netlify, which supposedly advocates for the JAMstack.
For dynamically generated content, the site operator should also consider opening it up if the content is for public consumption and doesn't vary by the user's credentials.
For some real fun try accessing a redirect's resource when the redirector requires an Authorization header. You'll need 4 separate round trips, and the final endpoint will need to restrict requests based on Referer in addition to Origin, meaning you need to write all your own preflighting logic.
Dude, writing out the background and history of everything must have taken quite some time, but I found it very valuable in helping to understand the present state of things. I just want to say thank you.
This is the resource I wish I'd had some years ago when I was starting out as a server dev. I remember the (lack of a) Vary header in particular causing me hours of frustration.
CORS is a stupid idea that serves no purpose. If someone is really determined they will either 1) turn off CORS with a browser extension 2) simply call your precious API from something other than a browser
It is essentially security by obscurity and protects nothing. Don't get me started on how some technologies like AWS Lambda with a Gateway, when a function has an error, respond by default in such a way that the browser logs a CORS error instead of, you know, a 500 or the actual error message.
I always do the "allow anything" setting for CORS when I make an API.
> CORS is a stupid idea that serves no purpose. If someone is really determined they will either 1) turn off CORS with a browser extension 2) simply call your precious API from something other than a browser
> It is essentially security by obscurity and protects nothing.
What.
I think there's been a fundamental misunderstanding of who is being protected here on your part, and what CORS is actually for.
It's your run-of-the-mill user that CORS protects, and CORS being enforced protects them when they visit e.g. an attacker-controlled site with, for example, a valid cookie-based session on your service. It prevents the attacker's site from making dangerous authenticated requests to your API service and reading the result.
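To make it concrete, this is the kind of thing CORS stops (domains are made up): code running on the attacker's page tries to read your API with the victim's cookies.

// Running on https://evil.example in the victim's browser. The request may
// even reach your server, but unless it opts in with CORS headers, the
// promise rejects and the attacker never sees the response.
fetch('https://api.yourservice.example/account', { credentials: 'include' })
  .then((res) => res.json())
  .then((data) => console.log('stolen:', data))
  .catch(() => console.log('blocked by the browser'));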
> If someone is really determined they will either 1) turn off CORS with a browser extension 2) simply call your precious API from something other than a browser
That's not what CORS is designed to protect against at all, of course you can hit an API separately and your HTTP client will ignore the CORS headers. Your HTTP client isn't a browser! It doesn't need to worry about CORS.
Similarly, users shooting themselves in the foot by disabling CORS are only hurting themselves. They are not the attacker here.
Being an engineer who doesn't understand CORS is ok; I've worked with a few good ones who struggled with it, so clearly CORS is not a very intuitive tech.
Not understanding CORS and making a comment like this is taking that ignorance to a new level though. Please read up on what you're talking about.
Please tell me more about how REST APIs that are publicly accessible by anyone need CORS. Are you actually suggesting people have to tell companies what domains they will be calling an API from, in order to add it to the list of allowed domains, in the code base?
Does your publicly accessible API provide contextual information based on client-state? It probably doesn't, in which case you're right, CORS isn't needed, and lo and behold, this is exactly what the Origin wildcard is for. Adding it isn't a big deal.
But no, it's not really a great idea to make every single privileged API in the world completely insecure just so the admins of public APIs can avoid adding a wildcard header to their servers.
I had a quick scan, yeah. I'm sure historically it does, but I've yet to ever need or encounter anyone that makes use of these features. I've never been in a team that particularly cares about any of its supposed value either; it's merely a frustration to remove...
You should try out some connected home devices, including things like routers. Like I said in the article, many of them assume they're safe because they're on a local network, so their security is lax.
Also, I know quite a few public sites that serve debugging data if the request comes from the company's IP range. They shouldn't be doing this, but they do.
Maybe one day we can remove CORS for no-credential requests if we can detect that the destination isn't "internal", and we just decide that folks who serve debugging data by IP deserve to have their data leak. I've heard ideas around this for 10 years now, but maybe it'll happen eventually.
Did any of your teams put private data into HTML or JSON responses authenticated with a cookie, but which didn’t require anything like a CSRF header? If so they should have cared about it because without CORS policies any user logged into your site could have their data read by any other site they visited.