> we are disabling or reducing the precision of several time sources
I am really happy to see this happening at long last. One can argue that almost every x86 SW side channel attack (Spectre and many prior ones that, for instance, leak parts of cryptographic keys from variable-time SW implementations) is aided by userspace processes having access to high-frequency/high-res counters for which, in almost every case, they have no legitimate need. Perhaps future silicon should block direct access to the TSC & friends, except for processes that have a privileged flag set (hardly a new idea; IIRC some IBM PowerPC chips from the late 90s support that).
OS-level timing calls such as gettimeofday() could be degraded to a few microseconds resolution by default - poor enough to obscure delays caused by the state of various HW caches, branch predictors, etc.
Ultimately there is little reason to give every website's JS access to a nanosecond-level counter by default, and many defense-in-depth reasons not to do so. So kudos to the Firefox team here, and hopefully chrom[e|ium] quickly copies this.
Timer coarsening is a bad idea. It turns out [1] that a simple timer thread is a good enough poor man's cycle counter. What are you going to do: ban variables? Even if you could coarsen timers, adding random noise just makes attacks take longer. It doesn't make them impossible; attackers can just collect more measurements.
You need to address the specific vulnerabilities, not shoot the clock_gettime messenger.
> adding random noise just makes attacks take longer
No. It's true that with a low-res clock source in some cases an attacker can still get the precise measurement they're after by repeating an operation 1000x+ more times. In some cases though that is not possible, because at that timescale the signal they're after is degraded by noise that they cannot predict or control: operating system interrupts, memory traffic from other threads, etc.
Anyway, even if a lower-res clock source helps only 10% of the time, on defense you should always prefer to make an attack complicated, slow, partially-reliable rather than trivial, fast, highly-reliable.
Developers who need a high-res clock source for profiling, etc, should of course still be able to enable one selectively.
> You need to address the specific vulnerabilities, not shoot the clock_gettime messenger.
You can and should do both, to gain some defense-in-depth against future attacks.
Random noise isn't necessarily a hindrance -- it can help.
Contrary to intuition, the presence of random noise can actually make detectable a signal that is otherwise below the minimum detection threshold. See, e.g., stochastic resonance. (Essentially, the noise occasionally interferes constructively to 'bump' the signal up beyond the threshold and make it detectable.) If you are able to introduce and control your own noise, you may also be able to take advantage of coherence resonance.
Randomness itself can be a very useful tool in many signal detection and processing systems, e.g. sparse random sampling in Compressive Sensing techniques can reconstruct some kinds of signals at frequencies beyond the Shannon-Nyquist limit of a much higher-density fixed-frequency sampling -- something thought impossible until relatively recently.
I would not be at all confident that such 'system' noise could not be filtered out statistically; it might even be used to an attacker's advantage.
> on defense you should always prefer to make an attack complicated, slow, partially-reliable rather than trivial, fast, highly-reliable.
Always? I would think it would depend on what the trade-offs are, what costs you are paying for doing that (in inconvenience or damage or cost to the non-attacking users and use cases; in opportunity cost to other things you could have been focusing on, etc) compared to how much you lessen threats. Security is always an evaluation.
Timing attacks are the worst though. I think this may only be the beginning of serious damage to the security of our infrastructure via difficult to ameliorate timing attacks.
> In some cases though that is not possible, because at that timescale the signal they're after is degraded by noise that they cannot predict or control: operating system interrupts, memory traffic from other threads, etc.
Wrong, those signals will average out over a long enough data collection time.
It's one of those things that feel utterly counter-intuitive.
But if you have a signal overlaid with random noise [1], and you know what your signal looks like and when it occurs, you can correlate. For example, a small delay that either occurs or doesn't at certain known points will introduce a bias into a timer measuring it, no matter how noisy or coarsely quantized that timer is.
Similar techniques have been used in other fields for decades to pull useful signals from far below the noise floor (e.g. a lock-in amplifier can go dozens of dB below the noise floor, because it essentially correlates frequency and phase and thereby eliminates all but a tiny sliver of noise; GPS signals, for example, are typically 20 dB below the noise floor).
[1] It doesn't have to be random.
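To make the point concrete, here is a toy simulation (all numbers invented for illustration, and no real side channel involved): a 50 ns secret-dependent delay is invisible to any single reading of a 20 µs-resolution timer, yet because the unpredictable jitter spans a full timer tick it acts as dither, and the delay reappears as a shift in the mean over enough samples.

    // Toy simulation, not an exploit: recover a sub-resolution delay by averaging.
    function measureOnce(hasDelay) {
      const trueDuration = 307 + (hasDelay ? 0.05 : 0);      // "real" cost in µs
      const jitter = Math.random() * 20;                      // unpredictable system noise
      return Math.floor((trueDuration + jitter) / 20) * 20;   // quantize to 20 µs ticks
    }

    function meanOf(hasDelay, n) {
      let sum = 0;
      for (let i = 0; i < n; i++) sum += measureOnce(hasDelay);
      return sum / n;
    }

    const n = 5_000_000;
    console.log(meanOf(true, n) - meanOf(false, n));          // converges toward ~0.05 µs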
——
So these mitigations just make the attacks harder, hopefully hard enough that they are no longer feasible to exploit widely.
Then why can't we make sharp high resolution photos of distant planets? Shouldn't we be able to average out all the noise for every pixel if we just collect light long enough?
What type of stuff is between us and the planet and stays on the same pixel all the time? I would assume everything in the universe moves all the time. We move. The other planet moves. How can something block the same pixel of our view of the planet all the time?
It is still possible to exploit a timing attack with a low resolution timer; it simply takes more samples. Meanwhile, people have managed to perform timing attacks (such as against OpenSSL) across a network (see the link below where I link to the seminal paper, "Remote Timing Attacks are Practical"), as while the latency is large and variable it can still be trivially characterized. So, unless you are willing to apply a truly unpredictable delay function (some kind of Turing-complete noise which sometimes might block for arbitrary amounts of time) or, alternatively, a delay long enough to require the user to sit on the web page for "too long" (maybe weeks is the right calibration? I often find sketchy tabs that have been open for days) to essentially everything--and, in particular, all JS-initiated network requests--then you are likely just engaging in security theater by removing useful functionality from an API because it makes you feel more secure.
There are other side channels where an attacker cannot cause the desired operation to be repeated thousands or millions of times back-to-back.
If Alice performs an operation only once, and it leaks sensitive data through a timing variation of 100-200ns, Mallory is in great shape with a 1ns clock source and in pretty awful shape with a 10us one.
The problem with coarsening timers is that many things can be used as one. Performance registers and received-packet timing are just the obvious ones.
This problem has been studied a lot. The venerable TCSEC Rainbow Series dedicates an entire volume to covert channels (the light pink one iirc).
It is a statistical problem. Even if you reduce the timing precision or randomize it and effectively raise the noise floor, it just takes a little bit longer for the attacker to get his data.
A good analogy would be Differential Power Analysis (DPA). Measurements are collected over a period of time to enhance the signal.
JS code or containerized code running on shared hardware is one thing. But crippling all userspace applications seems like going too far, especially considering this is just papering over the problem instead of fixing it.
Also note that if you reduce precision you might still be able to tease out the data simply by gathering more samples. An exploit that gets you 1000 bytes per minute might take an hour instead, but that could be enough to find cryptographic keys.
> crippling all userspace applications seems like going too far.
Depends on the system IMO. I certainly want the TSC when I'm profiling something highly performance-sensitive on my workstation. Yet I am hard-pressed to see why any userspace app on my Mom's chromebook requires a ~1ns counter. Some stuff there may be currently using it, but that is different from "actually requires it"/"should continue to have it from an overall cost/benefit point of view".
Except that a typical user understands the implications of giving camera or location access. I doubt most would understand the implication of giving clock access.
Simulations, real-time audio applications (filters, effects, etc.), real-time anything for that matter, performance profilers... just off the top of my head.
Well, since 1/20µs = 50 kHz, it strikes me that this threshold was probably chosen with audio in mind.
I agree that for a lot of cases, it's plenty. But if you're trying to account for phase differences in multiple data sources, this puts a major blindfold on.
Let's think about it in terms of lower frequencies. Imagine that you're sampling from a pool of events that occur roughly 100 times/hour. Let's say you want to sample 2 events per hour.
Well you might say "since you're only sampling at a rate of 2/hr, why would you need any better granularity than 30 minutes?"
Here's the catch: you need to get the event that's closest to the 15 minute mark of each hour and the one that's closest to the 30 minute mark; you will then put those two together using your special formula.
Imagine that all your events came into a mailbox on a half-hourly basis. Imagine a stack of ~100 envelopes, each labeled "8 - 8:30 A.M." or "8:30 - 9 A.M.". You're tasked with the job of trying to pick the two closest to 8:15 A.M. and 8:30 A.M. Good luck! It's gonna be hard, if not impossible, to do this consistently with any accuracy.
(Even if you had 15 minute granularity, you'd still have to pick each one from either 25 or 50 envelopes, depending on where the windows fall relative to the 15 minute marks.)
Now, to make matters even more complicated, imagine a situation where you're integrating each hour's result with the previous hours' result, e.g. taking a rolling sum or product, or computing some kind of feedback/delay/reverb filter. In cases like that, you're done for.
Full disclosure, I'm coming at this from the perspective of an experimental digital artist who some day would like to do interesting creative things with high frequency sources, like the sounds of bugs, dolphins, and higher frequency ambient sounds. I'm also interested in building applications that crowd-source and integrate audio from multiple smart-phones in a room. The web seemed like a perfect platform for this.
More interesting IMO is this part of the announcement:
> In the longer term, we have started experimenting with techniques to remove the information leak closer to the source, instead of just hiding the leak by disabling timers. This project requires time to understand, implement and test, but might allow us to consider reenabling SharedArrayBuffer and the other high-resolution timers as these features provide important capabilities to the Web platform.
Typically timing attacks do not need absolute time. So for a high-precision timer one can just use a counter loop in a separate thread. This is even possible in JavaScript with web workers.
That's a fair point, but let me note that it is still less desirable (to an attacker) than the TSC in at least one case: when system load is high enough that there are other non-attacker-controlled threads running on the same core. When the counter thread's not running, the counter thread's not counting, and this loss of accuracy could at least in theory complicate an attack.
>>this loss of accuracy could at least in theory complicate an attack.
It just takes more samples and in the end it solves nothing.
>>non-attacker-controlled threads running on the same core.
Again, it might take more samples. Worse, though: in that situation the system is likely all but unresponsive anyway. Also, the timeslices are long enough to carry the task, unless there are way, way too many unpredictable context switches (which would be bad for performance).
----
Back in the days of old there were no built-in timers, and people used to count CPU cycles to accommodate external I/O.
The other operations, like posting messages, have much higher latency noise. SAB allows observing counter changes within a few CPU cycles, with stable latency.
Would you know why people suddenly started worrying about this now, when—as you pointed out [1]—cache-based timing attacks in Javascript were already practical in 2015? What's so different about these cases now? Did the earlier research just fly under the radar?
As far as I can tell there was basically a wink-wink nudge-nudge agreement between the various browser vendors/web standards authors (same people) that timing attacks which revealed things like the user's browsing history, or that allowed tracking of users, weren't a problem and weren't sufficient to block a feature from being added to browsers. See the many issues in [1][2][3][4][5], which stretch back over 6 years (and aren't all fixed), for an idea of how long this has been going on (and a previous rant I wrote here: https://news.ycombinator.com/item?id=10022315 ).
The cache attacks never received as much publicity for some reason (the papers being harder to read may be one). I do wonder what would have happened if this hadn't come out at the same time as meltdown. It's quite possible it would have been brushed under the carpet yet again.
There were token 'fixes' to these things (coarsening timers) in the past, but they never worked and everybody involved knew they wouldn't work. The introduction of features like SharedArrayBuffer revealed how (un)seriously they really took the problem. They knew it could be used to implement high precision timers but it got added to browsers anyway because it was central to the project of making the web an application platform.
They perceive a need to allow high precision timers (or features that can be used to implement them) because without that the web won't be able to do a lot of things that are possible in native applications.
I'd like to think that this is the moment that browser vendors come back to their senses and rethink what they are doing but I doubt it. Google is a multi billion dollar company based on the web as a platform, and running untrusted javascript on other peoples computers. Dropping the idea of the web as the platform to end all platforms would be an existential crisis for Mozilla. They are locked into this madness with no way to stop that wouldn't effectively be corporate suicide. Expect years of half hearted 'fixes' which don't fix the problem.
> because it was central to the project of making the web an application platform
This is very insightful. A lot of web standards are attempts to add traditional native-app capabilities to the browser (e.g. 2D graphics => Canvas).
What's very clear now is that native-like web capabilities imply native-like vulnerabilities, but delivered over the network.
Browser vendors rethinking what causes a user to launch complex Javascript on tabs they visit (or worse, in invisible iframes) would be a great start. It was one thing when all Javascript could do was style and manipulate the DOM. We can now compile vim into Javascript, and that demands a completely different response.
This is not to stop progress on web standards, but if the web community takes this opportunity to level-up their security practices, it'll help them (and web application developers and users) in the long run.
Mozilla previously reduced the precision of performance.now() to 5µs. This was trivially defeated by just running performance.now() in a tight loop, as described here: http://www.cs.vu.nl//~herbertb/download/papers/anc_ndss17.pd... see the technique in section IV called time to tick (TTT):
>The idea behind the TTT measurement, as shown in Figure 4.4, is quite simple. Instead of measuring how long a memory reference takes with the timer (which is no longer possible), we count how long it takes for the timer to tick after the memory reference takes place. More precisely, we first wait for performance.now() to tick, we then execute the memory reference, and then count by executing performance.now() in a loop until it ticks. If memory reference is a fast cache access, we have time to count more until the next tick in comparison to a memory reference that needs to be satisfied through main memory.
>TTT performs well in situations where performance.now() does not have jitter and ticks at regular intervals such as in Firefox. We, however, believe that TTT can also be used in performance.now() with jitter as long as it does not drift, but it will require a higher number of measurements to combat jitter.
So, what stops this method from working, even with 20µs resolution performance.now()?
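For reference, a minimal sketch of what the TTT technique amounts to in code (my own paraphrase of the quoted description, not the paper's actual implementation):

    // Time-to-tick: rather than timing the operation, count how many
    // performance.now() polls fit between the operation and the next timer tick.
    // A fast (cached) access leaves more of the tick period unused, so the
    // count comes out higher.
    function timeToTick(operation) {
      const t0 = performance.now();
      while (performance.now() === t0) {}   // 1. align to a tick edge

      operation();                          // 2. the memory access being measured

      let count = 0;
      const t1 = performance.now();
      while (performance.now() === t1) {    // 3. count polls until the next tick
        count++;
      }
      return count;   // higher count => operation finished sooner after the tick
    }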
Maybe the clock can be made to drift? You could make the clock a random walk with the restrictions that it never decreases and never deviates from real time by more than 20µs. That is, each tick of the clock actually just adds a non-negative interval to the clock. Since the deviation from real time at tick n depends on the deviation at tick n-1, it would require more time to get good accuracy by averaging measurements.
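A rough sketch of what that fuzzed clock might look like (realTimeNow() here is a hypothetical stand-in for the precise internal clock the browser keeps but would not expose):

    // Reported time only moves forward, in random non-negative steps, and is
    // never allowed to run ahead of real time or lag it by more than MAX_SKEW.
    const MAX_SKEW = 0.020;   // 20 µs, expressed in milliseconds
    let reported = 0;

    function fuzzyNow() {
      const real = realTimeNow();   // hypothetical precise internal source
      const candidate = Math.min(reported + Math.random() * MAX_SKEW, real);
      reported = Math.max(reported, candidate, real - MAX_SKEW);
      return reported;
    }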
> The resolution of performance.now() will be reduced to 20µs.
That will reduce the data rate of this particular covert channel, not prevent the attack altogether. Even adding random noise would not rule out the attack.
Presumably they are aware of this. If they reduce the data rate enough, the attack becomes less useful in practice. Presumably their engagements with security researchers and other browser vendors, along with internal experimentation, led them to believe that this resolution decrease offered significant mitigation.
Allowing infinite resources for remote programs is something we don't even do for local programs. Giving a ceiling to the JS runtime is a sound reasoning.
So a quick question: what is the alternative to SharedArrayBuffer? One of the web applications I work on uses it a fair amount. Is our app now just going to break?
The phrasing in the post says "The SharedArrayBuffer feature is being disabled by default", so depending on your situation, you may be able to instruct your users to manually enable it, or you can at least enable it yourself to continue development in the meantime.
Of course, that will likely be a browser-wide setting, so telling others to enable it will put them more at risk of these attacks.
(As mentioned in another comment, the more robust solution is to use window.postMessage as a fallback, although it depends on exactly what you're doing.)
> In line with other browsers, Chrome 64 will disable SharedArrayBuffer and modify the behaviour of other APIs such as performance.now, to help reduce the efficacy of speculative side-channel attacks. This is a temporary measure until other mitigations are in place.
The window.postMessage and related MessageChannel APIs. You have to set up a listener and then post a message across the "boundary" (like an iframe or webworker).
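A minimal sketch of that fallback pattern (heavyComputation, handleResult and inputData are placeholders, not real APIs): data is copied across the worker boundary as messages instead of being shared, which is asynchronous and slower but needs no shared memory.

    // worker.js
    onmessage = (e) => {
      const result = heavyComputation(e.data);   // placeholder work function
      postMessage(result);                       // structured-clone copy back to main
    };

    // main.js
    const worker = new Worker('worker.js');
    worker.onmessage = (e) => handleResult(e.data);   // placeholder result handler
    worker.postMessage({ chunk: inputData });         // copied, not shared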
Can the user be expected to make the right choice there? I have no problem deciding if Uber, a local restaurant or a news site need my location or camera data. I have no idea why they would need high precision timing data or even what "high precision" is. Maybe it's reasonable, maybe it isn't but it's not as easy to decide from the user's shoes as location/camera.
Most of the time, they won't need it. When they do, they'll need to make the case to you explicitly. Some phone apps currently do this when they need a non-obvious permission.
I think that the point is that most users can't be expected to understand the risks present in allowing a high resolution timer. It's a very nuanced decision, unlike camera and storage and location.
Most web apps are unlikely to need high-precision timers. For games, however, Date.now() offers only millisecond resolution, which can occasionally be inadequate.
I'm developing a website that will require somewhat high-frequency timing to account for network latency, but I think the standard getTime, with millisecond accuracy, will satisfy my requirements (I still need to test this, though).
I can't think of something that needs microsecond level timing though.
I agree; I would like to see a move away from trying to cram everything into a browser app, and towards adding better connectivity and community features to native apps.
Which is why you 1) want to ask, 2) default to "no", and 3) explain that the request is both unusual and potentially harmful.
The option to blacklist an app from such requests should also be present.
If the app itself absolutely cannot perform some function w/o high-precision timing, that becomes its problem to communicate to the user.
I've made a practice of denying application permissions, and deleting apps that make such requests. I'm moving to deleting Android entirely, which has proved overall to be a poorly-performing and functioning virus and attack vector, at cognitive, social, economic, political, and other levels. Also, frequently, software.
But that would be wrong. It's not your personal data. It's "do you trust x.com not to potentially serve you compromised JS that could be used to hack your computer" which 99.99% of regular web users do not have the necessary background to understand.
Given that you can apparently get high-resolution timing out of two threads and shared memory, and shared memory of some sort is needed in order to make existing thread-based code / paradigms work, this is going to be a hard sell. Essentially you're adding a permission prompt for threads.
I hope they find a better solution that keeps shared memory working.
This is basically an arms race. As long as the fundamental issue is not fixed and all silicon replaced, there will always be people coming up with new ways of exploiting the Spectre vulnerability in unexpected ways.
Shall we now have a new website permission for timers or at least high-precision timers? (It will be quite fun to get end-users to understand the ramifications of such a thing...)
Has there ever been discussion about a permission for "running as an app" that would make a distinction between web pages and apps? I like that the web gets more features that allow apps to run cross-platform, but I don't fully understand why we make no distinction between pages and apps.
I would strongly say no - while most users can immediately understand permissions to use the camera or microphone, understanding the permission for "high-precision timers" requires an unreasonable amount of effort.
If, instead of measuring time directly, an attacker incremented a counter in a loop in another thread, would they get enough accuracy and precision for the attack to work?
That's exactly what the SharedArrayBuffer thing is about. You can use a web worker to do exactly that and still get accurate enough timing to accomplish this.
As I understand it, the only synchronous channel between browser-JS threads of any sort (workers, separate tabs/iframes) is a SharedArrayBuffer. Without that, you just have async things like postMessage or leaving notes for other threads in local storage / cookies / etc., so you can't read two different values from the counter within one timeslice without SharedArrayBuffers.
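For the curious, a minimal sketch of that shared-counter clock, assuming SharedArrayBuffer were still enabled (doSomethingInteresting() is a placeholder for whatever operation gets timed):

    // worker.js - spin forever, incrementing a counter in shared memory
    onmessage = (e) => {
      const counter = new Int32Array(e.data);   // view over the SharedArrayBuffer
      for (;;) Atomics.add(counter, 0, 1);
    };

    // main.js - read the counter before and after the operation being timed
    // (in practice you would wait for the worker to spin up first)
    const sab = new SharedArrayBuffer(4);
    const counter = new Int32Array(sab);
    const worker = new Worker('worker.js');
    worker.postMessage(sab);

    const start = Atomics.load(counter, 0);
    doSomethingInteresting();                    // placeholder for the timed operation
    const elapsedTicks = Atomics.load(counter, 0) - start;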
Not being familiar with the usage of SharedArrayBuffer, is disabling it the nuclear option (i.e. bad for everyone involved and clearly not the best option if others existed)?
I see over 100 pages of commits on github referencing it, which tells me it's not exactly rare?
It looks like SharedArrayBuffer was released in July 2017 in Chrome and August 2017 in Firefox [1], so I think it's new enough that almost any non-internal website using it would need a fallback anyway. So my impression is it's not quite as nuclear as other web technologies being disabled, but still pretty bad.
I personally used it for an experimental hackathon project in April (in Chrome Canary, behind a flag), and I believe it was necessary for one of the aspects of the project, so it's a shame to see it disabled, but my impression is it's more "future web tech" than "the way the web works today".
High performance applications need to share memory between threads. That feature lets you do it in JS, and it is the basis of the future shared-memory multithreading feature in WebAssembly.
So if you want the next Photoshop or Final Cut Pro to be browser based we’ll need to make it work securely.
Seems like virtualization is the way to go for JavaScript. Hypervisors such as Xen in HVM mode on modern processors don't result in a large performance penalty, so I imagine Chrome will gravitate towards it for running untrusted code.
That's what Site Isolation is, but it has significant drawbacks: it cannot always protect against cross origin cookie leakage, and pages can have a lot more cross domain iframes than a reasonable process limit can account for.
Does anyone know how trustworthy client time accuracy is, accounting for clock skew?
After a bunch of Google searches, I have not found anything suggesting how accurate clocks are for clients besides Cristian's algorithm [0] (I want to know UTC time/epoch... whatever from each client so I can compare them), which I am concerned about adopting due to the wiki page assuming a high quality network.
These timing attacks are all about local time deltas, not absolute times, because it’s measuring the CPU caches, not anything to do with the network. Things like “this operation took 5µs, which suggests that the data it was accessing was/was not cached”.
No one can blame them for not knowing… The real question is if Intel knew, and were they in any way incentivised not to fix it sooner. It would be very “easy” to justify not fixing a bug that went undetected for, let’s say, 10 years…
If there ever was an NSA hardware bug, this is what it would look like. I doubt we’ll ever know.
The performance.now() patch changed one line of code, so backporting that shouldn't be hard. The SharedArrayBuffer change is just flipping a pref, so that should be easy as well (assuming it's even supported - isn't Palemoon a pretty ancient fork of Firefox? Even Mozilla's most recent ESR release doesn't have SAB enabled by default).
For the interim, might it be a wise idea to include an inline script at the very top of any page that includes untrusted 3rd-party code, in order to overwrite the Performance tools with a wrapper? Adding some random noise to the real numbers might help mitigate the effects...
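Something along these lines, run before any third-party script, could coarsen and jitter later reads (defence in depth at best; an attacker can still build timers other ways, e.g. the worker-counter trick discussed above):

    // Shadow performance.now() with a coarsened, jittered version.
    (function () {
      const realNow = performance.now.bind(performance);
      performance.now = function () {
        const coarse = Math.floor(realNow() / 0.1) * 0.1;   // clamp to 100 µs steps
        return coarse + Math.random() * 0.1;                // add up to 100 µs of noise
      };
    })();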