Heavy SSD Writes from Firefox (servethehome.com)
477 points by kungfudoi on Sept 23, 2016 | 342 comments



Chrome, on my system, is even more abusive. Watch the size of the .config/google-chrome directory and you'll see it grow to multiple gigabytes, mostly in the profile files.

There is a Linux utility that takes care of all browsers' abuse of your SSD, called profile-sync-daemon (PSD). It's available in the Debian repo, or [1] for Ubuntu, or [2] for source. It uses the `overlay` filesystem to direct all writes to RAM and only syncs the deltas back to disk every n minutes using rsync. Been using this for years. You can also manually alleviate some of this by setting up a tmpfs and symlinking .cache to it (a quick sketch follows the links below).

[1] https://launchpad.net/~graysky/+archive/ubuntu/utils [2] https://github.com/graysky2/profile-sync-daemon
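For the manual tmpfs approach, a minimal sketch (the paths and the 512M size are illustrative; you'd normally make the mount permanent via /etc/fstab):

    # mount a tmpfs and point ~/.cache at it, so cache writes go to RAM
    mkdir -p /tmp/cache-$USER
    sudo mount -t tmpfs -o size=512M,mode=700,uid=$(id -u) tmpfs /tmp/cache-$USER
    mv ~/.cache ~/.cache.disk          # keep the old on-disk contents around
    ln -s /tmp/cache-$USER ~/.cache    # browsers now write their cache to RAM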

EDIT: Add link, grammar

EDIT2: Add link to source


Launchpad is such a shitty website, aimed at Ubuntu and only Ubuntu; links to source code or more information are nowhere to be found...

Searching on GitHub, this seems to be it. Turns out, there are releases for Arch, Debian, etc., and it's even in the repositories. No need to add a PPA. https://github.com/graysky2/profile-sync-daemon

For Debian and co:

    $ apt-cache show profile-sync-daemon
    $ sudo apt-get install profile-sync-daemon


I checked the Activity Monitor on OS X to see if it's the same and if I should do something. It's quite surprising (bytes written / bytes read):

    bird            11,38 GB    119,8 MB
    kernel_task      8,82 GB     23,0 MB
    launchd          8,46 GB     77,7 MB
    revisiond        8,35 GB    105,8 MB
    Tweetbot         3,01 GB     50,2 MB
    Safari         879,6 MB     10,0 MB
    mds_stores     726,2 MB    570,1 MB
    systemstatsd   695,0 MB    775,0 MB
    nsurlstorag    685,3 MB      9,5 MB
    Google Chrom   681,4 MB    303,5 MB
    iTerm2         653,7 MB      1,6 GB
    cfprefsd       509,1 MB     45,8 MB
    Papers 3.4.7   465,9 MB    169,8 MB
    cloudd         441,3 MB      9,7 MB
    ruby           224,6 MB      1,7 MB
    coreduetd      217,5 MB
    Reeder         207,2 MB     28,5 MB
    apsd            65,7 MB     11 MB
    Safari          55,3 MB     12,6 MB
I've long gotten used to "bird" doing its thing (something with the cloud, I guess). But how can a Twitter client write 3 GB while I'm not even actually using it?


You can check out what bird does. This Python script showed me that the WhatsApp desktop app was consuming a couple of gigs: https://github.com/bwesterb/blame-bird/blob/master/blame-bir...


Caching tweets with all media, I would guess. Probably updates in the background all the time.


The link in question was to an Ubuntu PPA on Launchpad, but Launchpad is by no means Ubuntu-only.

See e.g. https://launchpad.net/openstack, https://launchpad.net/inkscape, https://launchpad.net/shutter, etc.


Launchpad is absolutely horrific to navigate and contribute to.


As someone who took the plunge into Debian from Ubuntu yesterday, thank you!


Arch Wiki describes how to do it manually: https://wiki.archlinux.org/index.php/Firefox_on_RAM


Thank you! I've been looking for this for ages


> Chrome, on my system, is even more abusive.

The OP is complaining about excessive write traffic, which wastes power, steals disk-time from other applications, and may wear out SSDs prematurely.

A large profile file does not imply excessive write traffic, so it's not clear from your report that Chrome is hitting the same problem at all. Definitely worth watching, but big-file != excessive-writes.


I think your point is valid, but even so the person you were replying to turned out to be correct.

There's an edit at the end of the article:

> Currently in the middle of a Chrome Version 52.0.2743.116 m test. We have been able to see a pace of over 24GB/ day of writes on this machine...

24GB is higher than the 10-12 he was complaining about from Firefox, even.


Been using it on my system ever since I started running everything from my USB drive... to prevent excessive freezing twice a minute. On NixOS, add the following two lines to your configuration.nix:

    services.psd.enable = true;
    services.psd.users = ["your-user-name"];


Beware! profile-sync-daemon erased my browser profiles on first run under Ubuntu Xenial, due to an apparent bug in its eCryptfs support (and how /home was mounted). A report was submitted to the developer.


Just for info: profile-sync-daemon is available for Jessie only if you have the backports repo.


What is the typical memory usage of this for normal web browsing?


`mount -l` shows a filesystem of type `overlay` for each browser (in my case Chromium, Chrome, and Firefox). They range from 50 MB to 200 MB. I have it set to sync to disk every 15 min, so I reckon it empties out the RAM whenever it syncs. I've never had a problem (8 GB of RAM on the machine I initially installed it on), so I haven't watched to see what the max size is. The developer [1] is great - issues are resolved very quickly.

[1] https://github.com/graysky2/profile-sync-daemon


Mine shows the following for Firefox:

    overlaid  2.0G  128M  1.9G  7%  /run/user/1000/codemac-firefox-3ckuo4v7.default


This is the first I've heard of this - why is this not a config flag?


Because not losing data is infinitely better than extending your SSD's life by a couple of years. We've had SSDs for a long time now; I'm still waiting for one to die on me. Meanwhile, I'm on my 27th dead spinning disk.


Some people are unjustifiably worried about SSD endurance.

The reality is that even with fairly heavy use, most of them will far outlive the computers they're in. And their owners for that matter!

I purchased my current SSD in March 2015 - it's a Samsung 850 EVO 512GB. It's done 7332 Power_On_Hours, and 36486573259 Total_LBAs_Written, which works out as just under 17 TB written, or about 31.6 GB/day.

That kind of sounds like a lot. But let's put it in context. Even the previous generation of TLC NAND SSDs was recorded in endurance tests doing around 1 PETABYTE of writes before failing. The 850 EVO, with its 3D V-NAND, should be capable of at least double that.

For argument's sake, let's assume it will last for 1 petabyte. I've written 17 TB in 1.5 years, or 11.3 TB per year. At that rate, this drive is still going to last at least another 90 YEARS.
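A quick way to sanity-check those numbers on your own drive, assuming smartmontools is installed, the SSD is /dev/sda, and it reports 512-byte LBAs:

    sudo smartctl -A /dev/sda | grep -E 'Power_On_Hours|Total_LBAs_Written'
    # bytes written = LBAs * 512; for the numbers above:
    echo 'scale=1; 36486573259 * 512 / 1024^4' | bc   # ~16.9 TiB, the "17 TB" figure
    echo 'scale=1; (1000 - 17) / 11.3' | bc           # ~87 years left to reach 1 PB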

So is this really something to obsess over?


>The reality is that even with fairly heavy use, most of them will far outlive the computers they're in.

I'm not sure what the lifetime of a computer is. In my family, we are still occasionally running machines from circa 1995. They work fine. When should I stop using my MacBook? The only reason I'd stop is if it died in a way that can't be cheaply fixed.


I'm curious what useful work such a computer can perform, given that a typical 1995 computer had a 33 MHz processor, 8 megs of RAM and 1 gigabyte of hard disk space.


The same work they did in 1995. Word processing, games, internet. Computers weren't useless back then.


>I'm not sure what the lifetime of a computer is. In my family, we are still occasionally running machines from circa 1995.

Pretty sure it's not 21 years, c'mon...


1995 is a stretch for anything modern, but I have kids playing Minecraft on Pentium 4s (with decent graphics cards, though - that is the critical part). These boxes would have been new around 2003 or so.


This is an incorrect assumption: http://techreport.com/review/27909/the-ssd-endurance-experim...

Most of the SSDs failed before a petabyte... and started encountering lots of errors before that (forget about the Samsung Pro series, which is the only one that survived to its multi-petabyte end).


Those were all ~250GB devices, however. Endurance increases with higher capacity.

If a 250GB 840 EVO can get to 800TB without errors, then I'd expect a 500GB 850 EVO to reach 1 PB, easily.

A 500GB 850 EVO has newer-generation, more durable NAND and twice the capacity to work with.


The warranty on your EVO is just 150 TBW. The older generations typically had better endurance; the newer generations are more dense (cheaper to produce) but less durable.

If you keep up your 17 TB per year and a half, your SSD should be OK for at least the next 11 years.


> The warranty on your EVO is just 150 TBW.

Yes, but endurance tests almost invariably show that the warranty ratings are extremely conservative. The 1TB 850 PRO (which uses exactly the same NAND chips as the 850 EVO, just a different controller) has been endurance tested to more than 7 Petabytes. See: http://packet.company/blog/

> The older generations typically had better endurance; the newer generations are more dense (cheaper to produce) but less durable.

That was true when comparing MLC flash to TLC. But the 850 EVO and PRO use 3D V-NAND, which has significantly more endurance than previous-generation TLC. This seems to be confirmed by endurance tests.

> your SSD should be OK for at least the next 11 years

The warranty runs out after 5 years, anyway. But that does not mean it will suddenly stop working on that date!


No, you're wrong. The 850 PRO uses MLC while the EVO uses TLC. But given that they're both constructed on Samsung's 3D-VNAND technology, their endurance is still much higher than the competitors out there.


Yes, you're right. I don't expect an EVO to last for a full 7 PB of writes like the PRO did. But even if it lasts a fraction of that (say, approx. 1 PB, which the previous-generation 840 EVO was able to achieve), that's still 90 years of life at current usage levels! A (1 TB) 850 PRO would last over 600 years!

In all likelihood the drive will never get anywhere near any of these values - it'll be replaced with newer, better tech at some point. It may then get used for backups or secondary storage, but in those scenarios the daily writes will drop enormously.


> their endurance is still much higher

It's not "much" higher, actually, 3D MLC as implemented by Samsung is just "up to twice" better than the planar MLC given the same die area and capacity:

http://www.samsung.com/semiconductor/minisite/ssd/v-nand/tec...

"Samsung V-NAND provides up to twice the endurance of planar NAND."

But if the size of the cell drops, the number of P/E cycles drops. Samsung's endurance declarations are real and to be believed; they initially used bigger cells than some of their other chips (or their competitors'), but Samsung engineers know what they're doing. That "some tests" achieved "much more" can be either an accident or due to errors in the test methodology (and I very much suspect the latter, because it also allows accidents to be counted as "successes"). At the time these SSDs appeared, Samsung declared half the TBW they do now, so the declared endurance is surely not too pessimistic, but based on real knowledge of what's inside.


I am one of those people who set their browser to dump all cookies, cache, and history upon closing, so I'm not really sure what data would be lost.


Do you ever miss the upside of all that persistence? I find my history useful, and I find continually re-logging-in a pain (obviously excepting the kinds of sites where I don't want my login to persist, i.e. financial ones, etc.).

However I also only close my browser once every couple of weeks as well. System updates etc.


That is the beauty of it: you can disable it. Be glad you have Firefox and not only IE or Chrome (where you can't disable all the things).

But 99% of users want it.

Heck, anecdote for anecdote: the very first thing I do on new machines is open up Firefox, go to settings, and set the browser to start up with my previous session's tabs.


This is my experience too.


Probably because most people don't want their profile to be broken if their OS crashes or their computer powers off unexpectedly?


If you miss a sync, your profile isn't going to be 'broken', just out of date.


If it happens during the sync, you could be broken arbitrarily.


Only if the sync is the equivalent of:

    rsync [...] /dev/shm/.browser ~/.browser
Instead of:

    rsync [...] /dev/shm/.browser ~/.browser.new && ln -sfn ~/.browser.new ~/.browser
Or some such equivalent, i.e. doing the full sync and then atomically juggling a symlink around.
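Note that even `ln -sfn` is unlink-then-create rather than a single atomic step. A fully atomic variant builds the new symlink off to the side and swaps it in with rename() (a sketch, GNU coreutils assumed):

    rsync -a /dev/shm/.browser/ ~/.browser.new/
    ln -sfn ~/.browser.new ~/.browser.tmp   # build the new symlink aside
    mv -T ~/.browser.tmp ~/.browser         # rename() replaces the old link atomically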


That creates many more writes to the SSD.


Does it? Consider this scheme: maintain two physical directories, one of which is current (and is named by a symlink) and the other of which is old. When doing a sync, you rsync the newest data into the old directory, then update the symlink. It'll probably result in somewhere between 1x and 2x the writes of the non-atomic single-directory scheme, depending on how well adjacent diffs combine. It will also use twice the space (barring some very clever filesystem).


He should have used --link-dest= on his rsync.

    PREVIOUS_BACKUP=$(cat ./prev_backup)
    NEW_BACKUP=$(date +%F_%T)

    # --link-dest must be absolute; relative paths are resolved
    # against the destination directory
    rsync -a --link-dest="/path/dest/$PREVIOUS_BACKUP" /path/source/ "/path/dest/$NEW_BACKUP/"

    echo "$NEW_BACKUP" > ./prev_backup
This will copy /path/source to /path/dest/$NEW_BACKUP (a time-stamped folder), taking the previous backup into account: if a file hasn't changed, rsync creates a hard link; if it did change, it copies the whole file.

And that's it.

Since the name is a time stamp, when you need to restore, just read ./prev_backup, or list the directory contents, sort them, and read the last entry.


This isn't really the sync daemon's fault; it's Linux's (or rather, ext4's, and the Linux VFS ABI's) for not supporting multi-inode filesystem transactions. NTFS has them; APFS will have them. Linux should add them too.


The sync daemon could sync with hardlink and then rename which is atomic.


That would require rewriting the entire directory.


Hardlinks for the stuff that hasn't changed. Browsers, of course, should move to append-only file formats.


> There is a Linux utility that takes care of all browsers' abuse of your SSD, called profile-sync-daemon (PSD).

I guess doing something like this on Windows is too much to hope for, but does anyone know how to do this sort of thing on the Mac?


I would be very interested to see how Chromebooks fare here.


Anyone actually noticing a speedup from using this?


Anything similar for macOS?


How do we switch it off on Chrome (Windows)?


I have been wondering and asked the same for Chrome-based browsers... haven't found anything so far.


Hi, I’m one of the Firefox developers who was in charge of Session Restore, so I’m one of the culprits behind this heavy SSD I/O. To make a long story short: we are aware of the problem, but fixing it for real requires completely re-architecting Session Restore. That’s something we haven’t done yet, as Session Restore is rather safety-critical for many users, so this would need to be done very carefully, and with plenty of manpower.

I hope we can get around to doing it someday. Of course, as usual in an open-source project, contributors welcome :)


How about this: 29 out of 30 times, save only a diff to the previous data. 1 out of 30 times, save the complete data in compressed form.

(I'm guessing there must already be functionality to diff a bunch of JSON somewhere in the millions of lines of code).

Though I'm sure this usually doesn't make a dent in an SSD's lifetime. But there are still people running Firefox on low-end Android phones with meager flash, and on Raspberry Pis with SD cards.
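A toy sketch of that scheme using stock tools (file names are made up; a real implementation would diff in memory and seed prev.json with an initial full save):

    #!/bin/bash
    # every 30th cycle store the full state, compressed; otherwise store
    # only a diff against the previous snapshot
    cycle=$(( $(cat cycle 2>/dev/null || echo 0) + 1 ))
    if (( cycle % 30 == 0 )); then
        gzip -c state.json > "full.$cycle.json.gz"
    else
        diff prev.json state.json > "delta.$cycle.patch" || true  # diff exits 1 on changes
    fi
    cp state.json prev.json
    echo "$cycle" > cycle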


> How about this: 29 out of 30 times, save only a diff to the previous data. 1 out of 30 times, save the complete data in compressed form.
>
> (I'm guessing there must already be functionality to diff a bunch of JSON somewhere in the millions of lines of code).

That is actually a good idea we haven't considered yet. A bit too brute-force for my tastes, but relatively easy to implement. We would need to determine how much CPU is needed for a diff between two 300 MB JSON files, though (yes, some users have these).

Of course, we're back to the issue of manpower, but definitely worth trying out.

> Though I'm sure this usually doesn't make a dent in an SSD's lifetime. But there are still people running Firefox on low-end Android phones with meager flash, and on Raspberry Pis with SD cards.

The implementation of Session Restore for Android is largely independent, so I'm not sure how it works these days.


Surely you'd diff the data structures in memory, not the serialized JSON; and would it really be faster to blindly write the 300 MB each time than to perform this diff?


To be benchmarked.


Why is there so much data in the session restore anyway? If the goal is to have the URLs of the currently opened tabs, I'd expect just those URLs to be enough. I think I've seen some unexpected stuff in there, like base64-encoded images? Maybe there are enough users who would be satisfied with just the URLs? At least for them the rewrite would seldom be needed.

Or, to reformulate: which wild scenarios does Firefox want to support now? I can imagine that the user's experience doesn't match the wishes. Some people who use session restore claimed they "lost everything" from time to time, and I had to fish just the URLs out of their session store files, which looked strange ("full of everything") but automatically restored to nothing.


Session store contains open tabs, windows, history for each tab, form fields, referrers (so we can re-request the page correctly), titles (so we can restore tabs without re-fetching every page), favicons, charsets, some timestamps, extension data, some kinds of site storage, scroll positions, and a few other things.

The goal of session restore is to restore your session -- your open tabs should come back, the same pages should load, scrolled to the same place, and with the right content.


I'd wish they'd also restore themselves to the proper location on the proper virtual desktop. With hundreds of windows, typically organized by task, I dread having to reorganize things every time Firefox (or the computer) restarts.

Someday the pain may motivate me to try to learn enough about Firefox's internals to do it. :)


Definitely worth filing a bug in bugzilla if they are not on the right virtual desktop... it might take years to get fixed, but they do generally get round to it.


There already is a 10-year-old bug about the issue: https://bugzilla.mozilla.org/show_bug.cgi?id=372650



Storing a lot of static images base64-encoded in JSON, every 15 seconds, is certainly not something users should be blamed for ("some users have 300MB JSON files"); it's just poor programming.

It would be interesting if somebody actually analyzed what takes up most of the mentioned 300 MB. I see a lot of base64-encoded stuff; if those are "favicons", come on. There are so many caches in Firefox already; JSON files certainly aren't the place for these images.


Nobody is blaming the user for the 300 MB. It's just a factor we need to take into consideration.

But yeah, storing favicons in Session Restore would be pretty bad. I didn't remember that being the case, though.


I don't see favicons either, but I've just done one "image extraction" attempt:

https://news.ycombinator.com/item?id=12569999

700 KB of binary images in a 1.7 MB session file, which can be compressed only to 70% of its size.

I also see a lot of things like \\u0440, which spends eight characters on one Unicode character (in another file, not from me). But that file was reduced to 37% of its initial size with LZ4. It seems LZ4 is still worth doing, as long as the content remains easily accessible with external tools, e.g. lz4cli.


As an Fx user, I'm glad that favicons are stored, though. The icons are a much easier, quicker indicator of what a tab is when I'm scrolling through dozens of tabs to see which one to click and load.


There's no need to store favicons in the session JSON. They're stored in the browser cache. If the cache gets cleared in the meantime, they can be redownloaded.


Agreed. If my computer crashes (once in a blue moon) I would be happy for it to just open the urls again. I don't care whatsoever about it keeping the current state of every tab. If any users do care about that, it should be optional (perhaps a tickbox in the preferences window).


My FF has crashed many times on OS X, and while it does restore the URLs, it has never restored the state. I know because sometimes I am in the middle of writing a comment when it crashes; on recovery, the comment is gone.



You could at least just compress the data with LZF or any of the really fast text compressors.

It'll compress to about 30% of the size, it's easy to do, and it shouldn't add more than a tiny CPU overhead over formatting the JSON itself.

It solves half the problem with like 15 minutes of work.


15 minutes of coding, perhaps.

Then an hour of writing good tests.

Then lots of manual and automated testing on four or five platforms, and fixing the weird issues you get on Windows XP SP2, or only on 64-bit Linux running in a VM, or whatever.

Then making sure you don't regress startup performance (which you probably will unless you have a really, really slow disk).

Then implementing rock solid migration so you can land this in Nightly through to Beta.

Then a rollback/backout plan, because you might find a bug, and you don't want users to lose their session between Beta 2 and Beta 3.

Large-scale software engineering is mostly not about writing code.


> regress startup performance

No, for example, LZ4 is unbelievably fast:

https://github.com/Cyan4973/lz4

almost 2 GB per second in decompression!

I've just tried compressing a backupXX.session file (the biggest I managed to find, around 2 MB) and it compressed to 70% of the original - probably not enough to justify implementing compression - and I suspect the reason is that the file contains too many base64-encoded images, which can't be compressed much.

So the first step toward sane session files could be to stop storing the favicons (and other images?) there. I still believe somebody should analyze the big files carefully to get the most relevant facts. For a start, somebody should make a tool that extracts all the pictures from the .session file (should be easy with Python or Perl, for example), just so we know what's inside.
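A crude sketch of such an extractor in shell, assuming the images sit in the file as data: URIs (base64 payloads contain no commas, so splitting on the first comma is safe):

    n=0
    grep -o 'data:image/[a-z]*;base64,[A-Za-z0-9+/=]*' recovery.js |
    while IFS=, read -r header payload; do
        ext=${header#data:image/}; ext=${ext%;base64}
        n=$((n+1))
        printf '%s' "$payload" | base64 -d > "img$n.$ext"
    done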


So "somebody" was me at the end, I've rolled up my sleeves and extracted the damn pictures: in my 1.7 MB session file approximately 1 MB were the base64 encodings of exactly 57 jpeg pictures that were all the result of the Google image search (probably of the single search). There were additionally a few pngs, representing "facebook, twitter, gplus" images and one "question mark sign" gif too.


Thank you for doing that. This explains much.

For a while now I have been running a cronjob to commit my profile's sessionstore-backups directory to a git repo every 5 minutes.
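(A minimal version of that cronjob, with an illustrative profile path, might be:)

    */5 * * * * cd ~/.mozilla/firefox/XXXXXXXX.default/sessionstore-backups && git add -A && git commit -qm autosave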

This is because, occasionally, when Firefox starts, it will load tabs in some kind of limbo state where the page is blank and the title is displayed in the tab--but if I refresh the page, it immediately goes to about:blank, and the original tab is completely lost.

When this happens, I can dig the tabs out of the recovery.js or previous.js files with a Python script that dumps the tabs out of the JSON, but if Firefox overwrites those files first, I can't. So the git repo lets me recover them no matter what.

What I have noticed is that the git repo grows huge very quickly. Right now my recovery.js file is 2.6 MB (which seems ridiculous for storing a list of URLs and title strings), but the git repo is 4.3 GB. If I run "git gc --aggressive" and wait a long time, it compresses down very well, to a few hundred MB.

But it's simply absurd how much data is being stored, and storing useless stuff like images in data URIs explains a lot.


If I understand the programmers' intention, they simply want to store "everything that can imitate to the server the continuation of the current session" even after Firefox restarts (as if the restart never happened). The images were sent by Google but probably remain in the DOM tree, which is then written out as the "session data" or something like that.

Like you, I have also observed that exactly the people who depend on tabs remaining after a restart are the ones hit by bugs in the restoration. As I've said, I believe users would prefer stable tabs and URLs over "fully restored sessions in the tabs" when all the tabs, quite randomly from their point of view, disappear. Maybe saving just the tabs and URLs separately from the "everything session" would be a good step toward robustness (since it would be much less data and much less chance of corruption), and then maybe, pretty please, a "don't save session data" option could be in about:config too?

Once there's a decision to store just the URLs of the tabs as a separate file, the file can even be organized so that only the URL that changed gets rewritten, making "complete corruption" of the file impossible and removing the need for Firefox to keep N older versions of the big files (which eventually still don't help users like you).


> Maybe saving just the tabs and URLs separately from the "everything session" would be a good step toward robustness (since it would be much less data and much less chance of corruption)

Yes, that would be very, very useful. I can get by if the tab's scroll position and favicon and DOM-embedded images - and even form data - are lost. But if the tab itself is lost, and it was a page I needed to do something with, I may never even realize it is gone...


They already have Google's Brotli imported, they'd only need a small tweak to also include the encoder. Or use Snappy which is also in the codebase already.

Add the code that's able to load compressed session backups and leave it in for a couple versions.

Once enough versions have passed enable the code that writes compressed session backups.

It's really not that hard to do unless you want to enable it now.


Actually, we had a prototype doing this. In the end, we didn't because it broke some add-ons, but it might be time to try again.


Broke some add-ons that read or, worse, write data behind the browser's back, you mean?

Also, what do these add-ons do? The only use case I can think of is figuring out whether the user has a tab open to a given site, and that's going around the browser's security model, so breaking that would be a good thing.


You also need to handle all corner cases where one of the intermediary diffs gets corrupted (you can't generally assume in a program like Firefox that data you write out is going to be readable in the future because lol common hardware). Or where the diffs are larger than a fresh snapshot. And you absolutely can't get it wrong.

It's something that sounds easy until you actually try to get it coded up and shipping.

I explained below that this isn't a factor on Android, because there the program gets suspended.


In the general case that's a huge pain. In this case, where you're writing a single blob to disk, you can stick a checksum at the end and you're good to go.


If you read the bug discussion, you'll see this is one of the most problematic parts.


This bug? https://bugzilla.mozilla.org/show_bug.cgi?id=1304389

It doesn't talk about the difficulties in getting data safely to disk. There's just a worry that taking a hash of the entire session state is expensive. I'm skeptical that a fast hash would take long compared to the time spent serializing to JSON in the first place, let alone time spent diffing.


Firefox does generally assume that the filesystem is reliable (as long as you use fsync properly etc). Witness eg its reliance on SQLite for data storage.


SQLite is used exactly because it is so robust even on bad hardware. And you still need to check the data that SQLite is giving you.


Until you have a NFS homedir...


On the other hand, SQLite with concurrent local and SSHFS-remote updates (where the SSHFS server is run with caching disabled) is entirely stable.


I'm curious, could you expand on this comment? Do you mean that you run Firefox from a profile loaded over SSHFS? I can only imagine that being unbearably slow because of I/O latency. And shouldn't disabling caching in SSHFS make it much, much worse?


Does the lack of proper locking on NFS matter when you're only running a single instance of firefox?


As Microsoft has discovered with the binary patches it ships through Windows Update: if you've ever run sfc /scannow and then dism /online /cleanup-image /scanhealth, you have almost certainly seen corrupted binary patches.


Long-time Netscape/Firefox user here:

1. Thanks for your work.

2. How do I really contribute? Could you link to what I need to do to start working on this now? I'm affected by this problem and want to figure out if I can fix it.


http://whatcanidoformozilla.org is a good starting point.


"Please enable JavaScript in your browser!"

That advice was a lot simpler than I thought it would be.


Why write to disk if Firefox is idle and its state doesn't change? Is it repeatedly rewriting the same data over and over, like a log database or journal file?


Unless there is a bug, Session Restore is not written if Firefox is idle. Of course, many applications are not idle even if you're not using them, and with the current architecture, just one tab using, say, DOM Storage, or updating a session cookie or a hidden form field, is enough to force a rewrite.


As a heavy user of session restore with tons of open tabs, I would be OK with ignoring updates to tabs that have not been active for a while.

For example, let's say I have 10 tabs open but I have been using only the latest tab during the last 5 minutes. In this scenario I don't care about the cookies (eg due to ongoing ajax) nor the state of the other 9 tabs. If the browser crashes I'm OK with those 9 tabs being restored with a 5 minutes old snapshot.

So for instance, FF should be smart enough to save the state of new/recent tabs promptly, and should slow down progressively on old/inactive tabs.


Don't generalize from your personal use case.


What would an updated version of Session Restore look like? I mean, what would be your vision for a new version?


I haven't worked on Session Restore in some time. But off the top of my head, it would be as follows.

On the back-end, I believe that Session Restore should be backed by a database, with each tab updated independently, rather than a big bunch of JSON data. The rationale being that:

- it's performance-critical;

- it's safety-critical;

- we have users with 300+ MB of data in their Session Restore, and JSON isn't meant for this scale of data;

- we wouldn't need to rewrite x MB of data every 15 seconds, just do a per-tab update;

- if we're using a relational database, it would be easier to trust the code to not screw up with the data;

- we wouldn't need to load the entire Session Restore upon startup.

(On the minus side, this might make backups a bit more complicated.)

On the front-end, we would need a high-level API for Session Restore, which would let us do things such as accessing per-tab data, (de)hibernating tabs, etc. Oh, and it would need to be accessible by WebExtensions.

In the middle, we would need to re-engineer Session Restore to make sure that we don't need to maintain this huge object representing the entire state of the session. We would also need improvements e.g. to cookie management, to avoid having to re-collect cookies so often.


Using a SQLite database for Session Restore seems to make sense. However, it would make it even more difficult to get data out of it when necessary. As ugly as a mess of JSON is, at least it's text, and I don't have to write a SQL query to grep for URLs.

Why not just use the filesystem? Imagine a layout like this:

    + profilename.default
    |--+ sessionrestore
       |--+ <timestamp>
          |--+ window1
          |  |--+ tab1
          |  |  |--- cookies
          |  |  |--- formdata
          |  |  |--- title
          |  |  |--- url
          |  ---+ tab2   
          |     |--- cookies
          |     |--- formdata
          |     |--- title
          |     |--- url
          ---+ window2
             |--+ tab1
             |  |--- cookies
             |  |--- formdata
             |  |--- title
             |  |--- url
             ---+ tab2   
                |--- cookies
                |--- formdata
                |--- title
                |--- url
One directory per timestamp, containing one directory per window, containing one directory per tab, containing one file per data type. All plain-text. Change a tab, just write to that tab's files, and update the timestamp directory name. Every x minutes, make a new timestamp tree, keeping y old ones as backups.

No serializing and writing entire sessions to disk at once. Only small writes to small files. All plain-text, easy to read outside of the browser. Easy to copy, backup, modify, troubleshoot. No complex JSON, serializing, or database code. Use the write-to-temp-file-then-rename-over-existing-files paradigm to get atomic updates.
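(The write-to-temp-then-rename idiom, as a sketch - $new_url stands in for whatever value is being saved:)

    printf '%s\n' "$new_url" > window1/tab1/url.tmp
    mv window1/tab1/url.tmp window1/tab1/url   # rename() is atomic: readers see
                                               # either the old or the new file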

Need to know if part of a session on disk is stale? Check the mtime for that file, see if it's older than the last time that data was changed in memory.

Simple stuff. Straightforward. No overengineering. No bloat.

Why not do this?


Not cross-platform without a tonne of extra work that, surprise surprise, things like SQLite already take care of. Overengineering is ignoring the off-the-shelf part that solves your problem because you "know" you could do it better yourself, in a simpler fashion.

It's a non-issue anyway; I have yet to see anyone show any actual evidence of an SSD dying prematurely or suffering degraded performance from this behaviour.


what operating system does not have filesystem support? :(


Have you considered the minimum FS cluster/block size? Older versions of IE used to write cookies to plain-text files, so you'd waste 4096 bytes for a few bytes of storage.


How does that much data get into Session Restore? Mine are only about 3KB per tab, and Firefox starts melting at more than a thousand tabs.


I seem to remember that DOM Session Storage and Session Cookies are pretty big offenders. Other websites (e.g. Google Docs) use lots and lots and lots of hidden forms to remember lots and lots and lots of data.

Also, people who keep Session Restore tabs open (you know, the tab that lets you restore your session, when you have crashed) and continue browsing – Firefox needs to store several nested Session Restore JSON files.


I've always been impressed by that, because it's essentially bullet-proof.


I wish it were, but every now and then, I restart Firefox and get empty, title-only tabs that go to about:blank when I reload them, completely losing the tab. I have to use a Python script to dump the session data (which still appears in the JSON file, even though Firefox won't use it), eyeball the list to figure out which tab was broken, and reopen that URL manually.


OTOH SSD lifetime is rather safety-critical for many users...


_Safety critical_? I doubt that...


How can I disable it completely?



I think you are safe to rewrite it. It is not very reliable anyway, as I have lost my tabs after crashes more than once. It might have been possible to save them, but after a crash the first thing I would do is restart Firefox, and that would sometimes just completely overwrite recovery.js.

To mitigate this I have created a daemon to copy the file to a separate directory every few hours. I then delete those old versions manually every year or so.


I have had the same problem. I finally made a git repo in the sessionrestore directory, set up a cronjob to commit it every 5 minutes, and then use this script to dump the session contents when necessary:

https://gist.github.com/alphapapa/bc7f3b25025fb99dad56

The git repo grows in size rapidly, but can be compressed way down. (Today it was 4.3 GB, but "git gc" compressed it to 230 MB). Every now and then you can blow away the repo or filter-branch to get rid of outdated sessions.


Why not leave the existing code (SR1) as-is and build Session Restore 2 (SR2)? Keep using SR1 until it has nothing left to restore, then use SR2 moving forward.


That sounds pretty hard to pull off. What's the expected benefit?


That's a pretty standard M.O. for server-side software development. But in the case of desktop software such as Firefox I see no gain.


> Session Restore is rather safety-critical for many users

Maybe you could give users the option? (until it can be fixed fully)


That's what the interval config setting is for?
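For reference, the pref is browser.sessionstore.interval (in milliseconds). It can be changed in about:config, or via a user.js in the profile directory, e.g. (run from inside the profile directory; 60000 means saving at most once a minute):

    echo 'user_pref("browser.sessionstore.interval", 60000);' >> user.js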


I think brador is referring to the options page.

about:config shows a big warning message "This may void your warranty".


Which warranty?


Have you looked at about:config recently? :)


Woosh :-)


That's not an SSD-only problem; HDDs are affected too. The FF UI just freezes when using an HDD (Linux; tried a WD RE4 7200 rpm and a WD Blue 7200 rpm). Several years ago I wrote a script for myself to keep the profile data on a RAM disk and sync it back to the HDD, and the problem was gone.


Session Restore is already a little buggy. Better than it used to be, certainly, but it occasionally and apparently irreversibly turns the odd window into a window full of empty New Tabs rather than their original contents.


link to bugzilla?


I have been running Firefox for a long time with an LD_PRELOAD wrapper which turns fsync() and sync() into a no-op.

I feel it's a little antisocial for regular desktop apps to assume it's their place to do this.

Chrome is also a culprit: similar syncing caused us problems at my employer's, inflating pressure on an NFS server where /home directories are network mounts, even after we had already moved the cache to a local disk.

At the bottom of these sorts of cases I have on more than one occasion found an SQLite database. I can see its benefit as a file format, but I don't think we need full database-like synchronisation on things like cookie updates; I would prefer to lose a few seconds (or minutes) of cookie updates on power loss than over-inflate the I/O requirements.


If my memory serves correctly, Session Restore doesn't use sync()/fsync() at all.

Edit: Double-checked in the code: I'm right, it doesn't.

Edit: If you're interested in how Session Restore improves the safety of its writes without using sync()/fsync(), the details are on my blog: https://dutherenverseauborddelatable.wordpress.com/2014/06/2...


So you're relying on the implicit write barrier many filesystems supply that ensures that the contents of a file hit the disk before a renaming of that file does. That's adequate, but I'd personally checksum the file instead of relying on corruption causing JSON to become structurally invalid.
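A sketch of the checksum approach with standard tools (the browser would of course do this internally at write time):

    sha256sum recovery.js > recovery.js.sha256    # alongside every write
    sha256sum -c --quiet recovery.js.sha256 || echo "corrupt; fall back to an older backup"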


Well, it's better when this write barrier happens (which is not the case on ext4, IIRC), but we rely mostly on the fact that operating systems flush automatically within 20 seconds (I don't remember the actual numbers, but I'm almost sure it's within at most 5 seconds in practice).

Checksums/signatures would be more precise, indeed.


> I have been running Firefox for a long time with an LD_PRELOAD wrapper which turns fsync() and sync() into a no-op.

For completeness, this is probably the eatmydata command.
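Usage is just a prefix, e.g.:

    eatmydata firefox   # LD_PRELOADs a shim that turns fsync()/sync() into no-ops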


I don't think the use of SQLite is a problem, since it can be configured to be very careful or very reckless.

http://www.sqlite.org/pragma.html#pragma_synchronous
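For example, with the sqlite3 CLI (cookies.sqlite is just an example target here, and don't touch a live profile while the browser is running):

    # journal_mode persists in the database file...
    sqlite3 cookies.sqlite 'PRAGMA journal_mode = WAL;'
    # ...but synchronous is per-connection, so the application itself has to
    # set it each time it opens the database (0=OFF, 1=NORMAL, 2=FULL)
    sqlite3 cookies.sqlite 'PRAGMA synchronous;'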


It's a problem if it's not configured appropriately. Remember the use of SQLite is entirely internal to the application; it's not for users to know how it's used, or even that it's used at all.

In the case of Chrome (many versions ago; it may have changed), we manually tweaked that pragma in all our users' profiles. I recall something like an order-of-magnitude reduction in pressure on the /home server.


Would you be willing to post a how-to or example of doing this? I'm curious about how this trick works!


It's probably best to just disable sessionstore if you don't want it, as overwriting fsync would disable it for all operations Firefox might use, not just sessionstore.


That's exactly what I wanted to achieve; I didn't have the time or inclination to work out exactly which part of Firefox is responsible.

In my case it was not SSD wear I wanted to reduce, but the short pauses during interactive use.


Take a look at libeatmydata. I'm doing something practically identical.


Serious question: Is 12GB a day really going to make a dent in your SSD's lifespan? I was under the impression that, with modern SSDs, you basically didn't have to worry about this stuff.


Samsung's low-cost line is warrantied for up to 40 TB/year of writes. Tests have shown that the actual drive lifetime is several times that.

So basically, no, you don't have to worry about this. Essentially, people are complaining that the disks they bought are being used for their intended purpose: making sure your data doesn't disappear on a power loss.


Not everyone buys Samsung; there are a lot of cheap SSDs out there that are rated for only a handful of TBs a year.

There is also the case of the silly SSHDs and the like, which combine a tiny SSD with a mechanical component that can die considerably faster.

Overall, the number of writes an SSD can handle is tied directly to its capacity, since that determines the average number of writes per time unit that each cell will experience - as long as the firmware actually does its job (in Samsung's case it may very well not, as they have had more than a few firmware screw-ups in the past).

This can also affect you more severely if you have a medium-sized drive (say 128-256 GB) that's pretty full.

A good SSD will move things around somewhat to even out the wear, but that still increases the number of writes you incur.

If your SSD does have an "even wear" feature, then writing 12 GB to a drive that is, say, 50% or more full usually triggers more than 12 GB of effective writes, as the firmware moves already-written data around to ensure every cell is written to approximately the same number of times (if you only have 20 GB of free space and you write and delete 12 GB daily, your effective writes would be closer to the full capacity of the drive per day than to 12 GB).

You also have a lot of embedded/mobile devices which use considerably shittier storage than modern SSDs; you can easily chew through an SD card or similarly rated flash storage with anywhere between a few hundred gigs and a handful of TBs (or only a few times the capacity of the card if you buy bargain-bin memory cards).

Since these machines are more commonly used these days, this behavior shouldn't be so quickly hidden from the user, especially since there doesn't seem to be any "intelligence" behind how conservative the browser is with writes to local storage.

If you install Firefox on an SBC/netbook/microcomputer, or even a mobile phone (if this affects the mobile version too) with eMMC/MMC-rated storage, 12 gigs a day can and will kill it rather quickly.


Android Firefox has this set to 10 seconds, rather than the 15 on a desktop.


Android suspends backgrounded applications, and will kill them at will, so that actually makes a lot of sense: you have less risk of data loss, and the write issue doesn't exist unless you leave the browser in the foreground 24/7 (and the phone unlocked with the screen on etc).


I think there's something wrong with OP's Firefox. Looking at IO_WBYTES in htop, I see exactly 42 MB written after 2 h 11 min of operation with 5+ tabs open at any given time. I have my sessionstore.interval set to 1 min. Assuming that running 4x as often (the default) would write 4x as much, that would be 168 MB per 2 h of operation, whereas OP reports 1 GB every 2 h.


5 tabs isn't very many. If the OP has 40+ tabs open at a time, following that math, it should hit 1GB pretty easily.


I don't know what this translates to for a mobile device; I don't know if it stores the same data, nor how the fact that mobile sites tend to be lighter affects it.

Other things like add-ons, the number of tabs open, etc. could also affect it.

It would be interesting if someone with an SBC with either SD or onboard eMMC-rated storage could run this test, see how much is being written, and calculate just how much of the storage's lifespan is being lost to this.

IIRC even the iPhone 6s still uses TLC flash for its storage; TLC is only good for about 500-1000 P/E cycles, so it needs pretty good capacity overhead and, well, "TLC" to live long.

In comparison, SLC gets you over 100K P/E cycles these days; MLC can get to over 30K.

Also, if anyone knows whether mobile phones do wear leveling on their eMMC/MMC storage, I would love to know that ;)


And Pale Moon (a Firefox fork) sets it to 60.


Firefox's session restore code was recently rewritten. Given that Pale Moon is based on an outdated fork, it's not clear the behavior is comparable, or that the setting means the same.


this was a very insightful comment, thank you!


No, "people" are not complaining about that; it's a fine attempt at a straw man, though. What the article complains about through implication is that Firefox is unnecessarily writing too much data to disk.


Why would it be unnecessary? The amount of data follows rather logically from the amount of website state it's capturing, and the interval seems totally like personal preference to me. Would you rather lose 60 seconds of work or 15? Uh, I'll take less data loss.


The problem isn't that data is being written (that's desirable, in fact). The problem, IMO, is that data is being written unnecessarily.

Firstly, it shouldn't be backing up every N seconds; it should be backing up after every change, coalescing changes into time-interval bins (e.g. 1-5 second blocks) for performance.

Secondly, it should only write the information that changes. Sure, websites contain a lot of data, but most of it doesn't change very fast, so it's a big win to only back up the deltas (and reconstructing the state from them is fast on SSDs).

If a page does have lots of transient data... write lots of data! That's partly how the time coalescing parameter is tuned.

Ideally it would only write out the entire page of data when it loads, or when the size of the deltas exceeds some threshold (possibly different for HDD vs SSD).


If the data hasn't changed during the last interval, then it doesn't rewrite the file. The article mentions loading a couple of dozen pages that are probably all pinging a dozen ad servers every 15 seconds, updating their cookies and localstorage with the results, etc. It all adds up. If you really want to solve the problem, use an ad blocker, or push the pages you visit to remove the bloat.


In a web browser?

You're acting like the browser is a major local-storage application (like Excel, for example); almost everything I'm going to be working on is an online application, and it takes care of storing my data on the server infrastructure. So yeah, 15 seconds or 60 seconds isn't going to be a major data-loss issue either way.


> it takes care of storing my data on the server infrastructure

Do you manually hit the save button every 15 seconds?

Some webapps do this for you. Some do not. Firefox has got your ass covered even for those that do not.


You apparently haven't seen some of the SPAs that are in use out there... I can very validly consider losing even a minute vs. 15 seconds an issue, in favor of the latter.

It might be nice if this were tunable via a setting, though.


> Why would it be unnecessary?

Because you hear the data volume and the function, and you think there has GOT to be a better way.


What information would you lose, exactly?


If you can get a petabyte of writes out of an SSD, then 12 GB/day of writes will exhaust the device in 228 years (1 PB * 1000 TB/PB * 1000 GB/TB / 12 GB/day / 365 days/year).

If there is also an associated 12 GB/day of reads, halve that estimate (though reads are unlikely to equal writes, given the nature of the cache). Either way: not a significant degradation.

Edit: Degradation or not, I'd personally rather have my drive resources available. In a context switch away from Firefox, the cache process could interfere with loading other data.


The times of the Samsung 830 using 27 nm SLC are over; it's <1000-cycle 16 nm TLC now.

Drives that reach >1 PB of write endurance are no longer on the consumer market. You can buy them (SLC), but you won't after seeing the price.

You can run nice tests that last for hundreds of TBs... until you pause for 24 hours and realize all the data is gone.


It was the 840 Pro that went 2.4 petabytes during a drive test that ended in 2015 [0]. That was 21 nm tech. I think the endurance is better than you think.

[0]http://techreport.com/review/27909/the-ssd-endurance-experim...


AFAIR this test was run 24/7 with no power-down time. An earlier test (the one with the 830? I think he moved computers mid-test and had to power everything down) showed drives look healthy until you let them sit offline for a while; then it turns out all the data has evaporated.

The 840 is especially broken and dramatically slows down when you don't touch the data for more than a month. The "fix" is firmware silently rewriting everything in the background, burning through the rewrite cycles.


Some cheap drives (e.g. Crucial BX100) have an endurance rating of only about 40GB/day.


So even with a cheap drive, if you leave your browser open with tons of tabs and the PC on 24/7, it'll still easily handle it?

Real lifetimes are often a multiple of the endurance rating, BTW.


You're still using a quarter of your write endurance on your fricking web browser. And exceeding the endurance rating can make you ineligible for warranty service/replacement of the drive.


I dunno but I think plenty of people consider their web browser a pretty vital application.


> I dunno but I think plenty of people consider their web browser a pretty vital application.

No one is arguing against that point. However, most people do not need their browser state to be backed up that aggressively (every 15 seconds, which is about the time it took me to type everything before this sentence). I'm OK with losing up to 20 minutes of my state in a browser. Most work done in a browser is synced to an upstream server every few minutes anyway - auto-saved emails or multi-step web apps.


And when you've just spent 10 minutes finding the link you need and your browser crashes, you're going to spend a lot more time finding it again. There are plenty of cases where I'm really glad my browser reopens all the tabs, with cookies in play, etc.

If your job includes having to use a web application for significant portions of said job, it's even more invaluable. I've worked on several applications that account for more than 40% of some employees' time per day... This was back in the IE6 days; they bemoaned having to close and reopen once a day because of memory issues (IE6 had horrible GC across COM boundaries).


That link you need & just found is saved in your history, isn't it? Session Restore is different from history, I'm pretty sure.


TBH, I wouldn't even have thought about searching history... Also, every change in every window would save history, requiring a sync... And for some things, cookies, sessionStorage, localStorage, WebSQL and the like are pretty important too.


The web browser is basically everything I do, though. My email is in it, my code review is in it, my IM is in it, on some of my machines my terminals are in it (via the Chrome mosh app).


Email/code review/IM/terminal doesn't justify 12 GB of writes per day (and the attendant hit to drive endurance and battery life). Those are lightweight activities. Emacs can do all those things, and it doesn't write to disk unless you tell it to. Heck, even bloated-beyond-belief Outlook + Visual Studio + Skype + PowerShell wouldn't create anywhere near that kind of background activity.


What does "justify" mean here? My web browser does them in a lot more useful fashion to me than Emacs would - in other words, it is worth 12GB of writes per day for me to get these into my web browser instead of into Emacs. Am I wrong for thinking that this is a good tradeoff?


Did you actually decide to make this tradeoff? Did Firefox at some point explain it was going to do 12GB of writes per day and you clicked the button that said yes?


Those applications have the benefit that they know exactly what the user data is, what their own state is, and how much of that is due to user input. The browser doesn't have that advantage. For one, state is coming off of the network all the time.


Anybody buying a drive so cheap it can't handle Firefox is probably putting the drive in a machine so cheap that it will only be lasting 3-4 years maximum anyway.


The 56GB and 120GB versions of the Intel 535 SSD have even less: just 20GB of host writes per day [1].

[1]: http://www.intel.com/content/dam/www/public/us/en/documents/... section 2.6


Anecdote: I have one of the OG Intel X25 80 GB MLC SSDs, and after 5+ years of usage, first as a boot drive and now as a video-editing scratch disk, the wear-leveling firmware has done an excellent job. Intel's monitoring software still claims the disk has 95% of its write capacity remaining, and I have not observed any changes in its benchmark IO performance over the life of the drive.


You're lucky... mine died just after the original 1-year warranty (a few days over; it detected as a 10MB drive or some such). I replaced it before they had a fix for the issue... That said, every drive I've had since has been very solid; I'm mostly using Samsung drives these days.


That has nothing to do with write endurance; it's a known bug in the firmware.


I know that it's (now) a known bug... unfortunately it wasn't known when it happened to me, and I'd long since replaced the drive by the time I found out.


The capacity error is fixable by using hdparm to secure-erase the drive, which as a side effect restores its original capacity.

You still lose all your data though :(


No, 12GB/day isn't going to make a dent.


Also, does this affect hybrid disks?


Hybrid disks are likely a much bigger issue, given the really small amount of SSD space and its use as a scratch cache.


Doing all this work is also probably burning battery life. An SSD can use several watts while writing, versus as low as 30-50 milliwatts at idle (with proper power management).


Can confirm: closing Chrome typically doubles my battery life.


+1. This is why I use PSD (mentioned above) to mitigate this. A RAM cache will use much less power than constant disk writes.


Even better, just disable session restore entirely via browser.sessionstore.enabled. Since Firefox 3.5 this preference is superseded by setting browser.sessionstore.max_tabs_undo and browser.sessionstore.max_windows_undo to 0.

As I understand it, this feature is there so that if the browser crashes it can restore your windows and tabs. I don't remember having a browser crash on me since the demise of Flash.


> I don't remember having a browser crash on me since the demise of Flash.

Just try browsing reddit with the RES extension, or streaming heavy data for a few seconds (like a server-side PDF processor that takes more than 30 seconds to finish - it's the same old behavior as blocking JS dialogs from years ago, the white screen of death). When I'm using an iGPU (512 MB) and I open some heavy images in sequence, the browser crashes after a certain amount of time, but now it freezes the entire computer for a few seconds first, so I suspect it's OOM (iGPU). I never bothered looking for a solution; nowadays I don't even care when my browser occasionally crashes. My overall desktop (software) experience gets worse every year, so I've gotten used to it.

But Firefox still is my favorite browser and I wouldn't trade it for anything else!


> RES extension or streaming heavy data

Yeah, that's the difference - I only use a few well-known extensions, like uBlock Origin, Self-Destructing Cookies, LastPass and Decentraleyes. No trouble with any PDFs either. Anyway, this is getting off track with the crashes or not - my point was that even if it crashes, it's no big deal if you can't restore; just reopening will work in most cases.


> I only use a few well-known extensions

RES is more well known (in terms of number of downloads/users) than half of your listed extensions, for the record.


Replace "browser crash" with: accidentally pressing Ctrl-W.

And yes, there are browser crashes without Flash


Why should it save the whole page when it only needs to remember the URL? At most, maybe also any form data you have entered. That's small enough to keep thousands of pages in memory.


It doesn't save the whole page. Just the URL, scrolling position, form data and session cookies. If you're browsing with 500 tabs, most of which contain nested frames, including Facebook "Like" icons and the like, that's actually quite a lot.


And what if the server sends something different back when you visit the URL again?

The internet allows it to do that.


The job of a browser is to view the internet. Not to cache it.


It needs to save enough state to give you the same view.


Go disable your browser cache and try browsing the internet for a while and tell me if you really believe that.


Hmm, I really haven't had FF or Chrome crash on me in a long time - and I do browse a lot. I can't be that lucky. I am not saying it doesn't happen - I am saying it's so infrequent that it's better to just deal with it when it happens instead of killing your SSD with constant writes of 15GB per day. Most people don't do anything that needs to be recovered after a crashed browser.


I don't remember the last time Chrome crashed. Rarely one tab crashes (usually due to Adobe Flash).


Ctrl-W is not that bad, but sometimes I want to close a tab with Ctrl-W and instead hit Ctrl-Q...


I've been setting browser.showQuitWarning to true for years to stop ^Q from killing the browser and with it my state.


Could have sworn that has crap all effect if you also have Firefox save your tabs on quit.



I use https://addons.mozilla.org/en-US/firefox/addon/cmd-q-catcher... to force me to hit it twice to actually quit. Dunno if it works on platforms besides Mac though.


Thanks, I do that all the time. Installed.


Or learn Dvorak. The Q and W are pretty far apart.


You're right. On second thought, learning Dvorak is the easier solution.


Ah, that's what I meant


Who really cares about restoring browser state? Maybe I lose part of an HN or Facebook post I'd been writing, but that's it. The feature should be opt-in, IMHO.


Hadn't thought about it until now, but this feature may be why I don't see "I wrote you a long message but accidentally closed the tab / hit the back button / my computer crashed and I lost it, so here's the highlights I remember" emails from friends and family anymore. I would much rather discover this caveat and opt out than expect my grandmother to know it's even possible to opt in.


At this point, isn't that responsibility largely handled by server-based autosave? GMail / Google Docs being the big ones; I assume FB and other social networking sites do the same.


Also, couldn't this session-restore feature be triggered for this specific tab _after_ I've started to type text in a box?


I certainly care about it, although it's possible that my 50-odd browser tabs would be considered "clutter" by other people.

I stopped using Edge because it only saves your tab session if you turn off the PC, not if you quit the browser.

Mind you, for a while I ran Firefox using a RAMdisc for the profile simply because all that flushing makes a big negative impact on PC performance.


I also keep 50-100 tabs open at all times. How do you manage your tabs?


You're in good shape. I'm at 337 tabs. ;-)


I made a simple Firefox/Chrome extension for people like you and me that hoard tabs. You might find it useful for finding tabs and quickly navigating to them by clicking the link in the list. It's free and open source. The GitHub page has a gif showing usage.

Chrome Extension: https://chrome.google.com/webstore/detail/tabist/hdjegjggiog...

Firefox Extension: https://addons.mozilla.org/en-US/firefox/addon/tabist/

source code: https://github.com/fiveNinePlusR/tabist

Let me know if you find it useful or have any suggestions.


Your chrome link is broken.

Anyway, I used an extension like this for a while, called Session Buddy. But now I just use Ctrl+D and bookmark all my tabs. Then you get bookmark syncing, and you can save your tabs when you export bookmarks, or wget them to have a local archive, etc. It's just more convenient, and one less extension I have to worry about stealing my personal information or corrupting my data.


That's fair, and the extension is not for everyone. It's not really meant for saving sessions or anything like that; it's more for finding lost tabs in a sea of windows like I typically have. This is for tabs that are not worth saving to bookmarks but that you keep open for a long time as a transient bookmark.

FWIW, Mozilla looks at all extensions before fully approving them and the app is fully open source and fairly easily vetted if people are worried about that sort of thing.

Thanks for the feedback!


this is brilliant. P.S. your chrome link is broken.

One suggestion - please add a button to save all of this as a "session" or something. I'm a tab hoarder and sometimes I just want to dump it somewhere and start afresh.

Please help me !


Thanks! The link is fixed now... I actually did just this thing the other day but didn't know if other people would want it. I will update the addon with it.


If you could do one more thing - allow me to search within the "sessions". That would be awesome (search should work on both domain and title).

thanks!


I had about ~250 tabs and I closed them all now (bookmarked first). This "feature" of FF is really frustrating. I had tons of GB written to my SSD without any benefit (I'm not a "safety-critical user"). And the worst thing is, all of this data was written to 40 GB of SSD space.

For this session restore feature, let's just save the URLs in a list. That data structure can't be GBs upon GBs, right? ;) I think that's enough for, let's say, the typical FF user.

Now that I know about this issue, I will bring my tab usage down to no more than 10 tabs.


I really care about it. I take advantage of it every day. I close tabs and use Ctrl+Shift+T to reopen them, I reopen my browser's previous state after closing it, and after computer crashes I don't have to worry about losing any of my precious tabs.


Yeah, if you only write on FB or HN, I see why you're not missing anything.


Other things than a browser crash can cause the browser to unexpectedly quit. For example, a battery dying, or the power going out.


Right, but that doesn't happen often either. And if it does, how important is restoring the tabs and windows? Probably not very, to most people other than serious creators using a web editor of some sort who would lose work if the browser disappears. Even then, it's flaky to rely on something like this. Much better to have periodic backups of whatever you're doing.


You say "doesn't happen quite often", which I'm assuming means something like "doesn't happen quite often to me". But if you consider hundreds of millions of users using Firefox every day, then even a relatively rare event is happening to multiple users every single day. It is absolutely worth it to them that Firefox does its best to preserve their data.


You have a point, but power outages are very correlated.


Sad that the crash feature (which I don't particularly like or use) is directly tied to the undo close tab feature (which I do).

I don't like the anxiety of opening Firefox in front of someone, and wondering if my previous tabs will pop up.


It happens rarely to me, but it sometimes does. Also, your machine can crash, someone can trip over a power line, ...

At work, I routinely have something like 30-40 tabs open, if my browser can restore the session after a crash, that is very valuable to me.


I actually find that webGL-heavy sites like Google Maps are quite good at crashing FF.


Yep, StreetView in particular. And don't get me started on Shadertoy, which seems to make "crashing FF" its mission statement.


I need to regularly kill Firefox. I just did, in fact. It didn't technically crash, but it becomes totally non-responsive for seconds at a time. Maybe it's a misbehaving plugin and not FF, but the effect is the same.


It'll also cover OS crashes or power failures. Agreed that those aren't very common either.


I haven't had a browser crash for a while, but fairly routinely Windows will reboot itself to apply an update. (Once every other month, at least)


Thanks for the tip.

I'll see if it has any effect on my laptop's battery life and fan usage.


It happens relatively often for me, maybe 1-2 times a month. No crazy extensions.


It’s always annoying when an issue like this is reported yet no bugzilla reports are mentioned. Has anyone else filed this already, or shall I?



If it's not already listed there, please do (and post the URL here!)

I was thinking the same and will do so myself, if no-one else reports doing so.


12GB/day is about 140kB/second, or one Apple 2 floppy disk every second.
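
(Checking the arithmetic: 12 GB ÷ 86,400 seconds in a day ≈ 139 kB/s, and an Apple II floppy held about 140 kB.)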

It is also about single CD speed (yes, you could almost record uncompressed stereo CD-quality audio all day long for that amount of data).

All to give you back your session if your web browser crashes or is killed.

Moore's law at its best.


I've already moved all my browser profiles to `/tmp` and set up boot scripts to persist them across boot/shutdown. E.g. for Arch Linux see https://wiki.archlinux.org/index.php/profile-sync-daemon

This is a far superior solution to fiddling with configuration options in each individual product to avoid wearing down your SSD with constant writes. Murphy's law has it that such hacks will only be frustrated by the next version upgrade.

And no, using Chrome does not help. All browsers that use disk caching or complex state on disk are fundamentally heavy on writes to an SSD. The amount of traffic itself is not even a particularly good measure of SSD wear, since writing a single kilobyte cannot be done at the HW level without programming a whole flash page, and overwrites eventually force the erase of a whole erase block, which is generally several megabytes in size. So changing a single byte in a file can be nearly as taxing as a huge 4 MB write.
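
As a rough worst-case illustration, assuming a 4 MB erase block (sizes vary by drive):

    logical write:   1 kB changed in a file
    physical cost:   up to one 4 MB block erased and reprogrammed
    amplification:   ~4000x

Real controllers mitigate this with write caching and remapping, but frequent tiny rewrites of the same file remain the least flash-friendly pattern.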


Are these writes being sync'd to disk?

Because FF may die, but the OS will write it out later. That's fine.

Not every write to a file means a write to disk.


sessionstore does not fsync.


Maybe I am not understanding this right, but is this saying that Firefox will continually keep writing to the disk while idle? Does anyone know more about this? Why would this be needed to restore session/tabs? Seems like it should only write after a user action or if the open page writes to storage? Even if it was necessary to write continually while idle, how could it possibly consume so much data in such a short period of time?


> Maybe I am not understanding this right, but is this saying that Firefox will continually keep writing to the disk while idle? Does anyone know more about this? Why would this be needed to restore session/tabs?

This is very much the feature working as intended: Firefox captures webpage state every 15 seconds, so a crash, power outage, accidentally closing the browser, etc. will not result in data loss for the user. For storing the data persistently, it uses persistent storage, i.e. your hard drive. For the HN crowd, that's going to be an SSD. The SSD can easily deal with the resulting write traffic with negligible effect on its lifetime - I have no idea why people even think this is an issue.

> Seems like it should only write after a user action or if the open page writes to storage

Webpages can change their contents or state without user interaction.

Webpages are also big and bloated. Storing a little data, all the time, 24/7, adds up over time to a surprisingly large number.


I understand that pages can change their state without user interaction, but shouldn't the browser only write to disk when that actually happens? The browser has to be able to detect when these changes happen, right?

I understand that it's probably not an issue for most users/SSDs. However, if I open just a static HTML file and leave it idle, I don't think the browser should be continually writing to disk. If it is, then to me that is a bug.

EDIT: Also, from what I remember of restoring a browser session after a crash, all the tabs get reloaded; it doesn't restore the page state, just the tabs/URLs (i.e. the browser makes new requests for each restored tab). I usually use Chrome though, so I'm not sure if this has happened to me on Firefox.


It won't write to disk with static pages. But most of the web doesn't fit that description, and you're bound to have a tab open that does something in the background.

Yoric pointed out here that not being able to save tabs individually is a weak point of the current implementation.


> but shouldn't the browser only write to disk when that actually happens

The problem is that it's happening continuously. Any page with analytics or ad engines (I'd guess easily 95% of the top-visited web) is continually logging how long you've been on the page, scroll position, idle time, etc., and then setting a cookie to remember the session info / last time it was updated.

It would be interesting to see what installing an ad blocker would do to these numbers.


In Firefox, it might look like it's doing requests; however, it'll show me a days-old HN news page that it cannot possibly have gotten from the server.


> I have no idea why people even think this is an issue.

There is some element of slippery slope. People say, "So what if your web browser uses 8GB of RAM for six tabs? You've got 16GB!" But I'd argue that if every piece of software squandered RAM with such abandon, your machine would quickly choke.

It's also got some good shock value, like the one-paragraph web article that overloads your phone & generates 75MB of traffic.


Spotify does some pretty evil I/O as well. https://community.spotify.com/t5/Desktop-Linux-Windows-Web-P...


I observed something similar several years ago: http://www.overclockers.com/forums/showthread.php/697061-Whe...

I still think the worry about this wearing out an SSD is overblown. The 20GB-per-day endurance rating is extremely conservative and mostly there to ward off more pathological use cases - like taking a consumer SSD, using it for some write-heavy database load with 10x+ write amplification, and then demanding a new one on warranty when you wear it out.

Backing up the session is still sequential writes, so write amplification is minimal. After discovering the issue I did nothing and just left Firefox there, wearing on my SSD. I'll still die of old age before Firefox can wear it out.


Yep, I have a brand-new SSD that over the course of a few months accumulated several TERAbytes (yes - TERA) of writes, directly attributable to the default FF browser session sync interval coupled with the fact that I leave it open 24/7 with tons of open tabs.

Once I noticed that excessive writes were occurring, it was easy for me to identify FF as the culprit in Process Hacker but it took much longer to figure out why FF was doing it.
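
The arithmetic is plausible: even at the article's ~12 GB/day baseline, 90 days is already ~1.1 TB, and the per-save payload grows with the number of open tabs.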


I have fixed this issue forever. I got a ThinkPad P50 with 64 gigs of RAM. So, I just mount a tmpfs over ~/.cache.

I actually use a tmpfs for a few things:

    $ grep tmpfs /etc/fstab
    tmpfs	/tmp			tmpfs	nodev,nosuid,mode=1777,noatime	0 0
    tmpfs	/var/tmp/portage	tmpfs	noatime	0 0
    tmpfs	/home/zx2c4/.cache	tmpfs	noatime,nosuid,nodev,uid=1000,gid=1000,mode=0755	0 0


Fwiw, Session Restore doesn't hit ~/.cache at all. (Afaict, I'm the last person who changed how Session Restore writes its files).


Are these writes happening in ~/.cache?


I checked my system - Firefox wasn't writing much, and what it is writing goes to my user directory on the hard drive instead of the program directory on the SSD, so that's nice. But still, I don't want my browser cluttering up my drive with unnecessary junk - history, persistent caching from previous sessions, old tracking cookies, never mind a constant backup of the state of everything. I try to turn all that off, but there's always one more hidden thing like this.

If I want to save something, I'll download it. If I want to come back, I'll bookmark it. Other than those two cases and settings changes, all of which are triggered by my explicit choice & action, it really shouldn't be writing/saving/storing anything. Would be nice if there were a lightweight/portable/'clean' option or version.

When I tried Spotify, it was pretty bad about that too - it created many gigabytes of junk in the background and never cleaned up after itself. I made a scheduled task to delete it all daily, but eventually just stopped using Spotify.


It's overwriting a few files when the data they contain changes; it's not leaving a huge pile of unused data to clutter up your disk.


The interesting question here is, why is the browser writing data to disk at this rate?

If it's genuinely receiving new data at this rate, that's kind of concerning for those of us on capped/metered mobile connections. The original article mentions that cookies accounted for the bulk of the writes, which is distressing.

If it's not, using incremental deltas is surely a no-brainer here?


> If it's not, using incremental deltas is surely a no-brainer here?

Might be surprisingly hard or inefficient given that you're trying to capture "webpage state". Think about React sites etc.


On a related note: also see http://windows7themes.net/en-us/firefox-memory-cache-ssd/

Just another Firefox SSD optimization.

Edit: And see bernaerts.dyndns.org/linux/74-ubuntu/212-ubuntu-firefox-tweaks-ssd

It talks about sessionstore.


Theodore Ts'o wrote about a similar Firefox issue back in 2009: https://thunk.org/tytso/blog/2009/03/15/dont-fear-the-fsync/


I'm not seeing these numbers, using the I/O columns in Process Explorer. I'm running Nightly Portable with maybe 80 tabs open/restored.


I'm not seeing these numbers either, and I'm running Firefox 49. In the resource monitor, I see Firefox writing 10kB to disk every 15 sec. I bet there is more to this guy's story.


Nightly Portable might have this particular setting changed, because the lifetime of random USB sticks is surely going to be much less than a real SSD.


I thought that too, but the interval setting is at 15000 in the config, so maybe something else is optimizing it out?


Does Firefox sync() the data? If not, these continuous overwrites of the same file may not even hit the disk at all, as they could all end up being cached.

Even if some data is being written, it could still be orders of magnitude less than the writes issued by the program.

There are legitimate pros/cons to using sync() or not. Omitting it could mean that the file data is lost if your computer crashes. But if Firefox crashes by itself, the data will be safe.
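
On Linux, for instance, un-synced writes sit in the page cache until the kernel's writeback timers flush them, so repeated overwrites of the same file within that window can coalesce. The usual knobs (values are in hundredths of a second; 3000/500 are common defaults):

    $ sysctl vm.dirty_expire_centisecs      # how old dirty data may get before it must be written out
    $ sysctl vm.dirty_writeback_centisecs   # how often the flusher thread wakes up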


fsync() is a very different thing from sync(), and it's the former that is relevant here. ext3 configured in a once-popular manner had the side effect of making fsync() require nearly as much IO as sync(), which is probably where the confusion comes from.


sessionstore does not sync.


Using private mode and a RAM disk is a quick solution for this issue. Easy to set up on Linux, and there is a free RAM disk utility for Windows as well.
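
On Linux that can be as simple as a tmpfs mount over the profile directory (path and size here are placeholders; everything vanishes on unmount or reboot):

    $ sudo mount -t tmpfs -o size=512m,mode=0700,uid=$(id -u) tmpfs ~/.mozilla/firefox/<profile>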


Firefox has been terrible for disk access for many years. Back in about 2003 I had a post-install checklist (I never actually automated it) that I would run through on my Linux boxes to cut down on this and speed up the whole system.

Basically chattr +i on a whole bunch of its files and databases, and everything's fine again...
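
Something like this, for the curious (profile path is a placeholder; +i makes a file immutable, so Firefox simply cannot update it - only freeze databases you can live without, and do it with the browser closed):

    $ cd ~/.mozilla/firefox/<profile>/
    $ sudo chattr +i places.sqlite cookies.sqlite   # freeze history/cookie DBs
    $ sudo chattr -i places.sqlite cookies.sqlite   # undo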


There's been some development in the last 13 years.

Chrome was also pretty crappy in 2003. /s


Well, I'd say it was crappy in 2003 - it wouldn't even exist for another 5 years ;)


Oh there's been lots. Looks like it's still incredibly write-happy though.


I do wonder if their mobile version has a similar problem. I have noticed it chugs badly when opened for the first time in a while on Android, meaning I have to leave it sitting for a while so it can get things done before I can actually browse anything.


I've disabled images and JavaScript in mobile Firefox. When I'm only interested in reading, I use Firefox. Otherwise I use Google Chrome.


That's the thing: it has nothing to do with page rendering.

It kicks in about 10-15 seconds after the initial launch of the browser and basically churns it to a near halt for at least 30 seconds before it becomes usable.

Could be that this tablet of mine provokes the behavior by having a slow eMMC that is nearly full, but it has resulted in me minimizing my use of a browser I have been using for close to a decade across various computers and operating systems.

And no, it has nothing to do with extensions. I still see it with no extensions active. Heck, I even see it in guest mode.


Do you have Sync enabled?

If you can grab an adb log, come post it in #mobile on irc.mozilla.org.


Yep, I have Sync enabled.

Grabbing an adb log may be a bit out of my league though.


So Firefox is also expensive to run in terms of energy consumption. No wonder the fans on my MacBook Pro always sound like a jet engine whenever I have several tabs open. Seriously!

Disclaimer: I dual boot (Boot Camp) Windows 7 on my Mac.


I doubt the cpu load of session restore is responsible for the fans kicking in...


Well, I must say: I changed my setting to 45 minutes and OMG, the fans haven't spun up loudly since. Wow!


A few simple mitigations:

- Change the save interval to 30s (pref shown below).

- Windows file compression: cookies.sqlite went from 1MB to 472KB, sessionstore-backups from 421 KB to 204 KB.

- Move the TMP cache folder to a RAM drive, e.g. ImDisk.
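
The save interval itself is the browser.sessionstore.interval pref, in milliseconds, so in about:config the first item is:

    browser.sessionstore.interval = 30000    (default is 15000, i.e. every 15 s)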


I'm sorry to just drop this in, but it seems not many people know about cache poisoning. I have always kept the suggested settings since the age of JavaScript.


I continue to be impressed with the content and community at servethehome - it's slowly migrated its way into my daily browsing list.


Firefox is relying too much on session restore to deal with bugs in their code. Firefox needs to crash less. With all the effort going into multiprocess Firefox, Rust, and Servo, it should be possible to have one page abort without taking down the whole browser. About half the time, session restore can't restore the page that crashed Firefox anyway.


It has been months since the last crash on my machine, and it has definitely improved over the years (3.0 used to crash much more frequently). And even in Chrome, which has been multiprocess from the beginning, I've had the whole browser crash after closing a "sad tab".


IIRC per-tab e10s already exists, just isn't enabled by default yet because it has some rough edges.

The Rust part of your comment is irrelevant. There is Rust code shipping in Firefox, but nowhere near the amount needed for observable changes in behavior. Work is being done to put larger components in but nothing is shipping yet. Also, IME most crashes are asserts (in all browsers) and that's not a problem Rust fixes. If you get a segfault-based crash please report it, it could be exploitable. (report normal crashes too, really)


If process-per-tab is enabled, can each tab crash separately, or does the whole browser still have to go down?


It seems like the dom.ipc.processCount setting sets the total number of content processes to use.

Note that this means that tabs will share content processes, so a crash in one tab may crash a few others. You can reduce this by ramping up the number of processes, though that can lead to bloat.

No idea how stable this is right now. For me, on Dev Edition with regular multiprocess mode enabled, things work fine and I don't get crashes.
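
For reference, that's an ordinary about:config pref too; a value like this is illustrative, not a recommendation:

    dom.ipc.processCount = 4    (more processes = better crash isolation, more memory)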


They should crash separately. I'm not sure of the details.


I use Opera on Windows; no idea how to check or change the session storage interval.

Anyone got ideas on that?


Wow, that's really unfortunate.

I just built a new PC with SSDs, and switched back to Firefox. Even with 16GB of RAM on an i3-2120, Firefox still hiccups and lags when I open new tabs or try to scroll.

This new issue of it prematurely wearing out my SSDs will just push me to Chrome. Hopefully it doesn't have the same issues.


According to the blog post Chrome has the same issue, they just haven't investigated yet.


My home desktop is an i5-1620 with 64G RAM running Ubuntu 16.04, and Firefox just hiccups and lags, period. It is sad.

I need to find time to experiment with Chrome while blocking all the Google chatter to see how functional it remains without being able to tattle to the mothership (otherwise I won't use it).


Hiccups and lags are seen in all browsers and are due to bloated websites. Also, Chrome will write just as much to your SSD as Firefox.


Why not Chromium?


I personally wouldn't recommend chromium because they've been known to do some shady things: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=786909

I'd imagine someone like the above poster would be concerned about that. At least I know I am.


Try Vivaldi.


I have an old dual-core with 4 gigs of RAM, and Firefox runs better than Chrome.


Try out the Pale Moon Firefox fork. I'm running it on an old Core 2 Duo 6400 (2.13 GHz) with 8 GB of RAM. I have over 700 tabs in my session right now, with at least 300 of them fully loaded. It's using less than 3GB of RAM. And it's fast/snappy with smooth scrolling.


You need to enable multiprocess in Firefox, so the thing doesn't shit the bed on you constantly.

https://wiki.mozilla.org/Electrolysis


Look out for addons: Electrolysis will still not work with many of them. I force-enabled it a few weeks ago, and discovered that it's disabled by default for good reasons.


Have about ten plugins, no issues here. It's worth it to actually have a browser that doesn't consume all available resources.


Nice idea, but a load of useful extensions don't work with it. Of the extensions I use, NoScript and 1Password were the main ones, and around half were tagged as not compatible.


Heh. I tried that out, and while it didn't affect stability, it made switching tabs take 2-3s on a Haswell system. By disabling all extensions with it, I was able to roughly match the performance of non-multiprocess mode.


uBlock also keeps writing "hit counts" to disk all the time, and for some strange reason they've chosen a database page size of 32k, so each update writes at least 32kB.


Smaller block sizes caused large performance issues. SQLite isn't very good at keeping its data unfragmented, and not everyone has SSDs.


Sounds sweet, I'll try it out. How does it compare to ack (ack-grep)?



Can this be avoided in FF and Chrome with private tabs?


Using a display font for body text--


On Linux, where does this get written? Inside the home folder?

Maybe moving this folder to an HDD would suffice.
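
A minimal sketch, assuming the profile lives under ~/.mozilla and the HDD is mounted at /mnt/hdd (do this with Firefox closed):

    $ mv ~/.mozilla /mnt/hdd/mozilla
    $ ln -s /mnt/hdd/mozilla ~/.mozilla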


Why would you not want to use your SSD for what it's meant for?


Some things are on the SSD, others on the HDD. This seems to wear the SSD while being seldom used. Looks like a good case for the HDD.


Maybe because they would rather not have their SSD constantly being hit? Seems reasonable to me to move this if you have concerns about longevity, or maybe IO bottlenecks.


Or ramdisk if you have a stable system.


That would lose the point of a recovery system. I'd rather just turn it off.


I have easily more than 100 Firefox crashes for each kernel crash, and I guess that is similar for other people(?)

So I think session restore is almost always used in a case where Firefox crashed but the system is fine.


That goes to show how space/memory-hungry and bloated browsers have become.



> goes to the point of installing weird programs to be "pro" about their ssd life

> failed to read the very first recommendation on every single guide for ssd life: use ram disk cache for browser temp files.

yeah, let's upvote this


For comparison, ancient Opera Presto stores about 500 bytes per tab in its session file.


Observed similar behavior with Skype.


The whole "restore your session" thing is the one of the most user hostile behaviors there is.


Putting aside how this may not be all that bad for most SSDs, does anyone know when this behavior started?

Firefox really started to annoy me with its constant and needless updates a few months back, the tipping point being breaking almost all legacy extensions (in 46, I believe). This totally broke the Zend Debugger extension; the only way forward would be to totally change my development environment. I'm 38 now, and apparently well beyond the days when the "new and shiny" holds value. These days I just want stability and reliability.

Firefox keeps charging forward and, as far as I can tell, has brought nothing to the table except new security issues and breaking that which once worked.

I haven't updated since 41 and you know what, it's nearly perfect. It's fast, does what I need it to do, and just plain old works.

Firefox appears to have become a perfect example of developing for the sake of.


> developing for the sake of

It's exactly the opposite of this. Have you read through the FF update notes? Web standards are a moving target - constantly evolving.

On top of that, they've been slowly landing incremental improvements to Electrolysis that have allowed them to turn it on for more and more users without big risks to stability. You can't do that with a browser that has "big bang" releases every year.

In the last few years I think it's been shown that the only model that works for browsers is "evergreen".


Much of the recent breakage was to introduce essential features that users have been loudly asking for for years, such as multi-process support and content processes in a security sandbox.

If you're intentionally running very old versions which by now have published exploits, then I can see that these are not things you would consider important.


In the end: thank you for your comment :) My annoyance with the plugin situation, which at first had no resolution, was fixed in 45: https://support.mozilla.org/en-US/kb/add-on-signing-in-firef...

I'm now running the latest version of FF, but still able to use my plugin.


Progress at what cost? Does maintaining security have to also mean breaking the systems that pay my bills?


I do not think you want to be paying your bills on a browser that's been compromised, no.


It's a dev box that only touches local resources.


This is what Firefox ESR is for IIRC.


This has been the case for a long time; I just googled and the first hit was an article from 2008.

source: http://www.ghacks.net/2008/07/09/change-the-session-store-in...


Wikipedia claims the session restore feature was added with 2006's 2.0 release. That's what I would have guessed, since that release was (famously) a giant step backwards in performance, which was never really addressed except by waiting for hardware to improve.

I'm holding out hope that whoever's responsible for Firefox abandoning "good, light, cross-platform web browser" as its differentiator stays away from Servo, at least for a while. That's a niche that's unserved currently, AFAIK.


I seriously dislike Firefox, but must use it at work due to browser incompatibility issues with Chrome and sites I use heavily. Anything that makes the experience better is much appreciated.


It is perfectly valid for you to dislike Firefox. But in the context of this post, it seems you assume Chrome is better than FF in terms of disk usage. Do you have any data to back that claim up?



