Good news: Mozilla fixed their damn browser

jmillikin · on Dec 20, 2011

Not to discount the work that Mozilla has been doing, but it seems like the "improvement" they added is actually just a workaround for distros that configure filesystems incorrectly.

That is, his machine is configured to use an obsolete filesystem (ext3) in a mode with significant performance problems (data=ordered).

There are numerous ways that a machine can be mis-configured such that it suffers from poor performance. It would be better if Firefox detected these and warned about them, rather than just silently working around the broken-ness.

CrLf · on Dec 20, 2011

I find these statements amusing...

As it happens, neither is ext3 an obsolete filesystem (I have many machines using it just fine, thank you, with no intention to migrate to less tested filesystems just because they're "new"), nor does "data=ordered" have significant performance problems.

More amusing than this just the fact that the filesystem is a bottleneck for a web browser... an application that should be network and CPU bound...

akkartik · on Dec 20, 2011

Alright, I'm going to start calling people on HN about this pet peeve of mine:

Please stop 'finding it amusing' when you disagree with people. It makes you sound arrogant and passive-aggressive. Even if the other side is totally wrong, hammering that fact in is rhetorically counter-productive.

jmillikin · on Dec 20, 2011

data=ordered certainly does have significant performance problems. Most notably, it causes fsync() to actually act like sync(), so flushing a single file to disk can potentially cause a flush of the entire disk buffer instead!
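For readers who haven't used these calls, here is a minimal Python sketch of the two being compared. `os.fsync()` asks the kernel to flush one file descriptor; `os.sync()` asks it to flush everything. The helper names `write_durably` and `flush_everything` are illustrative only, not anything from Firefox; the complaint is that on ext3 with data=ordered the first can end up costing about as much as the second.

```python
import os
import tempfile


def write_durably(path, payload):
    """Write payload to path and ask the kernel to flush only that file."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        written = os.write(fd, payload)
        os.fsync(fd)  # per-file flush: in principle only this file's data
    finally:
        os.close(fd)
    return written


def flush_everything():
    """Whole-system flush, shown only for contrast with fsync()."""
    os.sync()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        target = os.path.join(tmp, "bookmark.txt")  # hypothetical file name
        print(write_durably(target, b"https://example.org\n"))
```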

nknight · on Dec 20, 2011

That behavior for fsync() is permitted by POSIX (and incidentally matches a permitted, but not required, interpretation of sync()). It's also something that, given reasonable application design, should rarely be an issue.

Part of Firefox's problem was that practically everything a user would do resulted in fsync() being called -- that's unnecessary and unreasonable. It's one thing to fsync() when one adds a bookmark, or saves a password. Doing it every time a link is clicked is just insane, and would pose a performance problem regardless of ext3's eccentricities.
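A rough sketch of that distinction, in Python. The `HistoryLog` class and its `durable` flag are hypothetical and are not how Firefox's places database works; the point is only that routine events can stay in the page cache while explicit user actions get the fsync().

```python
import os
import tempfile


class HistoryLog:
    """Append-only log that fsyncs only when the caller says it matters."""

    def __init__(self, path):
        self._fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o644)
        self.fsync_calls = 0

    def append(self, line, durable=False):
        os.write(self._fd, line.encode("utf-8") + b"\n")
        if durable:
            # Explicit user action (bookmark, saved password): flush now.
            os.fsync(self._fd)
            self.fsync_calls += 1
        # Otherwise the data stays in the page cache and the kernel
        # writes it back on its own schedule.

    def close(self):
        os.close(self._fd)


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        log = HistoryLog(os.path.join(tmp, "history.log"))
        for n in range(100):
            log.append(f"visited page {n}")  # link clicks: no fsync
        log.append("bookmarked page 42", durable=True)  # one fsync
        log.close()
        return log.fsync_calls
```

The trade-off is that a crash can lose the most recent non-durable entries, which is usually acceptable for browsing history and not for a saved password.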

bzbarsky · on Dec 20, 2011

Browsers are CPU-bound, network-bound, GPU-bound, memory-bound (esp. memory bandwidth, though some people count that as "CPU"), and filesystem-bound. Depending on exactly what you're doing with the browser.

gravitronic · on Dec 20, 2011

Try comparing the deleting of a large file on ext3 and ext4.

On ext3 it can take 20 minutes to delete a file that in ext4 is deleted in a few seconds. That is including a followup "sync".

It is an obsolete filesystem.

nohat · on Dec 20, 2011

I would agree that calling ext3 obsolete is odd, but I don't know enough about the particulars of file system performance to understand why data=ordered is or is not problematic. This leads to the interesting problem of judging the opinion of several contradicting, but apparently similarly qualified people. I'm sure someone on HN is able to decisively explain: can someone do so?

yew · on Dec 20, 2011

Originally, ext3 filesystems maintained journals of all metadata and data changes. This substantially decreases the likelihood of the filesystem ending up in an irrecoverably inconsistent state, but it also has major performance implications.

Relatively early on ext3 was extended with "data=ordered" and "data=writeback" modes (with "data=ordered" being the default and recommended mode, as "data=writeback" is marginally faster but offers few consistency guarantees). In these modes only metadata changes are committed to the journal.

In "data=ordered" mode the filesystem attempts to guarantee that data will actually be written to the disk (not merely cached) before any related metadata writes occur, as opposed to caching and writing all data according to an outside schedule. Obviously, this also has certain implications for write performance.

I would say that Firefox was incontrovertibly broken in this case, but my views on the adoption of ext4 would probably be considered rather conservative. Regardless, ext3 isn't going away any time soon so this fix will be appreciated.
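If you want to see which of these modes a filesystem was mounted with, the data= mount option is the thing to look for. Below is a small Python sketch that parses text in the /proc/mounts layout (device, mount point, type, options). The function name `journal_mode` and the sample lines are made up for illustration, and a kernel may not list the option at all when the default is in use, so treat a `None` result as "not listed" rather than "not ordered".

```python
def journal_mode(mounts_text, mount_point):
    """Return the data= option for mount_point, or None if it is not listed."""
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4 or fields[1] != mount_point:
            continue
        for option in fields[3].split(","):
            if option.startswith("data="):
                return option[len("data="):]
        return None  # mount point found, but no data= option shown
    return None  # mount point not found


# Invented sample in the /proc/mounts format, not output from a real machine.
SAMPLE = """\
/dev/sda1 / ext3 rw,relatime,errors=remount-ro,data=ordered 0 0
/dev/sda2 /scratch ext3 rw,relatime,data=writeback 0 0
tmpfs /tmp tmpfs rw,nosuid,nodev 0 0
"""

if __name__ == "__main__":
    print(journal_mode(SAMPLE, "/"))
```

On a real Linux machine you would pass in the contents of /proc/mounts instead of `SAMPLE`.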

nohat · on Dec 20, 2011

Ah, I gather firefox assumed that it could write to a given file repeatedly and it would just stay in the cache. Thanks for the explanation.

yew · on Dec 20, 2011

Not quite, though that's part of what the problem was. There's a mechanism (fsync) for forcing the changes in a particular file to be written to disk. As a result of the consistency guarantees made by the "data=ordered" mode, synchronizing a single file often forces the filesystem to make other, unrelated writes to the disk as well. In the worst case the entire cache can be forced to disk. The end result is substantially more disk activity than would be expected.
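A toy model of that worst case, in Python. This is not how ext3 is implemented, and the file names and block counts are invented; it only shows why an fsync() of a small file can cost far more than the file's own dirty data when unrelated writes get dragged along.

```python
def blocks_flushed_by_fsync(dirty, target, ordered):
    """Count dirty blocks written out when `target` is fsynced.

    dirty maps file name -> number of dirty blocks waiting in the cache.
    ordered=True models the worst case described above, where the whole
    cache is forced out; ordered=False models a flush of one file only.
    """
    if ordered:
        return sum(dirty.values())
    return dirty.get(target, 0)


# Invented numbers: a small database next to two large unrelated writers.
DIRTY = {"places.sqlite": 4, "download.iso": 50000, "mail.mbox": 1200}

if __name__ == "__main__":
    print(blocks_flushed_by_fsync(DIRTY, "places.sqlite", ordered=True))
    print(blocks_flushed_by_fsync(DIRTY, "places.sqlite", ordered=False))
```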

rbanffy · on Dec 20, 2011

> neither is ext3 an obsolete filesystem

It's certainly not obsolete as in "you shouldn't use it anymore", but it certainly is as in "there is really no reason to continue using it". While moving to BtrFS may be premature for most people, not moving to ext4 is somewhat eccentric. Ext4 is more than good enough.

> More amusing than this just the fact that the filesystem is a bottleneck for a web browser...

Indeed.

erez · on Dec 21, 2011

Not an opinion about ext3, but "I have many machines using it" says nothing about whether a system is obsolete, for obvious reasons. Besides, with companies still running COBOL/FORTRAN software on decades old hardware, what is "obsolete"?

RexRollman · on Dec 21, 2011

In my opinion, obsolete software is software that is no longer maintained.

jrockway · on Dec 21, 2011

The current version of the "extended filesystem" is 4, not 3.

nknight · on Dec 22, 2011

That doesn't mean ext3 is unmaintained.

http://git.kernel.org/?p=linux%2Fkernel%2Fgit%2Ftorvalds%2Fl...

RexRollman · on Dec 22, 2011

Exactly, and both JFS and XFS are still maintained as well. In fact, this is where all file systems should strive to get: stable releases with bug fixes only.

CPlatypus · on Dec 20, 2011

The problem with data=ordered is that it imposes a global order, allowing one I/O stream to stall others. Imagine if waiting for data to be acknowledged on one TCP socket meant waiting for all data sent earlier on all sockets - even those to slower peers - to be acknowledged as well. That's completely FUBAR, but it's basically what fsync with data=ordered does to an entire filesystem. There are hundreds of performance reports showing this effect, and nobody cares if one user's I/O patterns are too lame for them to have noticed.

The problem is that, in ext3, not using data=ordered is also problematic - this time in terms of potential corruption and security issues. That's why even the people who brought you ext3 abandoned it years ago in favor of making ext4 do these things the right way. If you want reasonable fsync behavior, plus niceties like block-size awareness and trim/discard support, you need to use a modern filesystem.

A piece of software that embodies a broken approach to a solved problem is by definition obsolete, and that includes ext3. Please don't make claims about what's obsolete when you don't even understand the issues that would make it so.

sciurus · on Dec 21, 2011

Calling it obsolete is hyperbole. A standard definition of obsolete is something that was once common but no longer is. Just last week I had to revert a number of ext4 filesystems back to ext3 due to performance problems on RHEL5, so I'm afraid that ext3 will be in common use for years to come.

(The performance problems aren't present in Fedora 16, so you could argue that RHEL5 is obsolete.)

CPlatypus · on Dec 21, 2011

Yes, the RHEL5 kernel is obsolete in that it lacks the barrier support necessary to implement proper behavior with decent performance. The fact that obsolete ext3 runs better on also-obsolete 2.6.18 kernels doesn't really tell us much except that sometimes two pieces of software evolve together, and sometimes versions that were developed together work better than versions that were not.

talmand · on Dec 20, 2011

But this is a common theme with people who label software as bad. It's possible the problem is something other than the software being used, but since they only see the problem while using said software then obviously the problem is the software itself.

noelwelsh · on Dec 20, 2011

I've been running Aurora for a few months. It's been stable and gets the new toys quicker. Some add-ons break on some updates, but the few web dev. add-ons I use are always quick to update. I get questionnaires from time to time, but they don't take long to complete and are a trivial way to contribute to Mozilla. In short, give Aurora a try. It works as my full-time browser.

cpeterso · on Dec 21, 2011

I've been running Aurora for six months without any major problems. The only annoyance was some Add-ons were disabled, but Mozilla's Add-on Compatibility Reporter Add-on allows you to force-enable older Add-ons. That said, I think this Aurora release cycle will automatically opt-in all Add-ons.

zobzu · on Dec 21, 2011

Note that you can disable the questionnaires if you don't like them (you do, but some may not)

sciurus · on Dec 20, 2011

Can anyone point to the bugzilla entries for the fixes?

mbrubeck · on Dec 20, 2011

I'm not sure which exact bug was the culprit here, but this is a list of bugs related to responsiveness problems caused by the "places" database: https://bugzilla.mozilla.org/showdependencytree.cgi?id=69150...

Some notable ones that were fixed recently include http://bugzil.la/690354 and http://bugzil.la/686025

joejohnson · on Dec 20, 2011

What are people's overall impressions of Firefox 9?

viraptor · on Dec 21, 2011

I don't remember anymore - running 11.0a now. I'm using nightly builds since ~2009 and haven't experienced any issues so far (really - not even a random crash). The difference is so huge in overall performance, I don't want to go to the "stable" release anymore.

The only problem is some less common plugins, but adblock and firebug get updated within days... and I just switch to the latest official release for quake live.

otherpope · on Dec 21, 2011

"I don't remember anymore - running 11.0a now. I'm using nightly builds since ~2009"

Likewise. I also have several extensions enabled and generally have 30+ tabs open all the time, sometimes many times that. I have had an odd crash but very rarely.

tytso · on Dec 21, 2011

So, a couple of corrections on points that people are getting wrong over and over again in this discussion. The reason why data=ordered is the default is not because it's more likely for the file system to be recovered after a crash. It's the default for security reasons. The "data=writeback" mount option has the tradeoff that there can be uninitialized blocks attached to files after a crash. These uninitialized blocks could contain someone else's love letters, porn stash, medical records, etc.

The main issue with "data=ordered" is not that it imposes a global order, but that if you do block allocations, fsync() requires a journal commit, and the security guarantees behind "data=ordered" require that all newly allocated blocks must be written out before the filesystem-wide journal commit can be allowed to complete. (Ext4 doesn't have this problem because it uses delayed allocation.)

This isn't an issue with other databases such as Oracle, DB2, MySQL, etc. They have no problems running on ext3. This is partially because they don't use fsync() at all. Oracle and DB2 use O_DIRECT to blocks that are allocated once (new blocks are only allocated when the table space file needs to be grown), and MySQL uses fdatasync() to files that aren't constantly being created and destroyed. SQLite, because it is trying to pretend that it only needs one file, uses many, MANY temporary files which are being constantly created and destroyed. This is not efficient, and is why SQLite has all of these problems, but millions and millions of dollars worth of enterprise servers can use ext3 running on RHEL3, RHEL4, and RHEL5 on Oracle without hitting these issues.
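To illustrate the pattern (not the actual storage engine of MySQL or any other database), here is a Python sketch that allocates a file once and then overwrites fixed-size records in place, flushing with fdatasync(). `RECORD_SIZE`, `preallocate`, and `update_record` are hypothetical names. Whether this actually avoids the expensive journal commit depends on the filesystem and kernel, so take it as a sketch of the idea only.

```python
import os
import tempfile

RECORD_SIZE = 64  # hypothetical fixed-size record


def preallocate(path, records):
    """Create the file at full size once, so later updates allocate nothing."""
    fd = os.open(path, os.O_RDWR | os.O_CREAT | os.O_TRUNC, 0o644)
    os.write(fd, b"\0" * (RECORD_SIZE * records))
    os.fsync(fd)  # one full flush at creation time
    return fd


def update_record(fd, index, payload):
    """Overwrite one record in place and flush the file's data."""
    record = payload.ljust(RECORD_SIZE, b"\0")[:RECORD_SIZE]
    os.pwrite(fd, record, index * RECORD_SIZE)
    # fdatasync may skip metadata that is not needed to read the data back;
    # fall back to fsync on platforms where fdatasync is unavailable.
    getattr(os, "fdatasync", os.fsync)(fd)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        fd = preallocate(os.path.join(tmp, "table.db"), 8)
        update_record(fd, 3, b"hello")
        os.close(fd)
```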

The sad thing is most people like to use SQLite because it doesn't require manual scheme generation, not because it only uses one file. (In fact it doesn't; there are multiple temporary files, some of which must be there in case of an unclean shutdown. So if you copy just the single file after a program crash, you'll screw up the database.) If someone created a lightweight database that uses SQLite's interfaces, but used a single directory to hold all of the database's files, and then didn't constantly copy data back and forth between temporary files which were being constantly created and destroyed, but instead used the storage strategies used by the more sophisticated database systems, the result would work well on ext3, and on all file systems it would use fewer data writes, which would save battery usage, SSD write endurance, and many other things. The last time I measured it, Firefox's "awesome bar" was consuming a third of a megabyte of write bandwidth to the disk per click; and it was updating at most a few hundred bytes of data. The rest was all overhead due to the catastrophic inefficiencies of SQLite.
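To put a rough number on that overhead, here is the arithmetic in Python. The written figure comes from the measurement quoted above; the 300-byte payload is an assumption standing in for "a few hundred bytes", so the ratio is an order-of-magnitude estimate and nothing more.

```python
BYTES_WRITTEN_PER_CLICK = 1_000_000 // 3  # "a third of a megabyte"
USEFUL_BYTES_PER_CLICK = 300  # assumption for "a few hundred bytes"


def write_amplification(written, useful):
    """Ratio of bytes sent to disk to bytes of data actually changed."""
    return written / useful


ratio = write_amplification(BYTES_WRITTEN_PER_CLICK, USEFUL_BYTES_PER_CLICK)

if __name__ == "__main__":
    print(f"about {ratio:.0f}x more bytes written than changed")
```

With these inputs the ratio comes out at roughly a thousand to one.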

P.S. And if you do implement this, please consider releasing it under the Apache license so I can hopefully convince the Android team to drop SQLite in favor of something that was a bit more written towards performance --- and Android users all over the world will thank you. :-)

bryanlarsen · on Dec 21, 2011

I think we can consider this authoritative: Theodore Ts'o is the maintainer of ext3 & ext4.

PissedOffHNer · on Dec 21, 2011

Fixed?

I ran Google Chrome open with 80-100 tabs for 3 WEEKS without anything going wrong. Sunday the 18th I finally closed it and restarted my computer (after 3 weeks of non stop running).

Monday I launched the latest build of FireFox and within 30 minutes it crashed. It then crashed again a few hours later.

Goodbye FireFox.

Athtar · on Dec 21, 2011

This is unfortunately my main issue with FF. I am the type of person that has multiple tabs open in multiple windows for days at a time and FF tends to choke in that scenario after a few days. I was hoping the 64-bit version might fix the issue but it's even worse.

It's still my main browser but this one issue just kills the entire experience.

zobzu · on Dec 21, 2011

I ran Chrome for 4 years (yeah) and closed it today after this news. Started Firefox Aurora and it crashed at opening.

Damn. Oh and this is my 3rd post.

cpeterso · on Dec 21, 2011

To be fair, Aurora is Mozilla's not-even-beta channel. And today of all days was Day 1 of a new Aurora release cycle: in 6 weeks - 1 day, Aurora will be elevated to Beta channel.

PissedOffHNer · on Dec 21, 2011

I saw the downvotes and just chuckled. I literally DID have it open for 3 weeks. What I posted DID happen, no exaggerations.

jrockway · on Dec 21, 2011

I would blame Flash for this.

rhizome · on Dec 20, 2011

It would be more accurate to say "is (probably) fixing," possibly with an (on Linux) suffix.

redmethod · on Dec 20, 2011

Even with fixes, most people seem to be switching to Chrome. It's hard to beat Google's integration

wwweston · on Dec 20, 2011

Google's integration is actually the reason why I'd switch back to Firefox in a heartbeat if the performance kept up -- there are some serious privacy issues involved in browsing the web using a Google browser while logged in to a Google account.

happyfeet · on Dec 20, 2011

Absolutely. Most often I catch myself doing private browsing in Chrome and 'feel' safer with Firefox. Maybe my fears are unfounded and yet I do it often.

pyre · on Dec 20, 2011

I like Chrome's "private browsing" mode better than Firefox's though. Mostly because I can have private and non-private windows open at the same time (and they're properly sandboxed from each other).

gruturo · on Dec 21, 2011

./firefox -P someotherprofile -no-remote

(assuming you created someotherprofile via Profile Manager - it's dead easy)

(Works on Windows, too)

pyre · on Dec 21, 2011

Right, but in Chrome it's as easy as Ctrl + Shift + n from another Chrome window.

edit: I'll note that I don't believe that the parent to this post should be down-voted. It was informative and on-topic.

gruturo · on Dec 22, 2011

Maybe I would have deserved a downvote for forgetting to mention that after opening the second Firefox session you still have to switch it to Private Browsing :-))))

redmethod · on Dec 20, 2011

True. I'm not saying that google's integration is necessarily a good thing, but many people like having everything easily accessible, and don't consider the downsides to putting all their private eggs in one basket. I keep wanting Opera to be faster and better on Mac OS X, and while I enjoy Firefox, I'm not a huge fan of the design of it on a Mac.

NARKOZ · on Dec 20, 2011

Use Chromium if you complain about privacy

gcp · on Dec 20, 2011

I don't see how Chromium solves any of the issues he was complaining about. It's not like it rips out the basic designs of the browser, which are still tightly tied to Google.

intranation · on Dec 20, 2011

Chromium and Chrome are identical except for one compile flag (which means you can at least check it yourself):

http://news.ycombinator.com/item?id=3034755

kakuri · on Dec 20, 2011

SRWare distributes a Chromium build called Iron that strips out privacy problems.

http://www.srware.net/en/software_srware_iron.php

otherpope · on Dec 21, 2011

I used to recommend Iron but I don't anymore. I became increasingly uncomfortable with the lack of documentation and lack of transparency in the project. They delivered their source code as two code dumps on rapidshare last time I checked. Something just doesn't feel right at all.

Zirro · on Dec 21, 2011

http://chromium.hybridsource.org/the-iron-scam may be worth checking out, as well.

otherpope · on Dec 22, 2011

Thanks. That confirms my suspicions.

bad_user · on Dec 20, 2011

What integration are you talking about?