You don't want to scale vertically once you hit millions of users, and most other web companies like Google, Facebook, Yahoo and Microsoft have proven that horizontally scaling without big iron is the way to go. Currently the only way to scale a relational database such as MySQL or PostgreSQL is by sharding and partitioning, and these things ruin most of the good relational properties that your database structure may have.
> You don't want to scale vertically once you hit millions of users
Hundreds of the top websites in the world (the majority, I would hazard, though without the data to support it) have scaled vertically to millions and tens of millions of entities just fine. Far more than have scaled horizontally. It works. It's been done. And it doesn't give up the sort of transactional niceties that make problems like this easier.
> most other web companies like Google, Facebook, Yahoo and Microsoft
You're confusing "the very biggest web companies" with "most other web companies." Most other web companies continue to use commodity products, and Google/Facebook/Yahoo!/MS certainly would (and do) insofar as it's possible at that scale. Expending resources now to be as horizontally scalable as Google is wasteful premature optimization.
Notably, Yahoo! runs the largest PostgreSQL installation in the world, and Google and Facebook both continue to use MySQL.
> horizontally scaling without big iron is the way to go.
You can get 32-core machines with 128GB of RAM from Dell (a mildly tweaked R910) for $30k these days. Is that big iron? How does its price compare with the amount of developer salary and benefits you'll have to spend to grok a non-relational data store, migrate your data to it, and reimplement the ACID features of a relational store in the code for your app? How many developer-days will you spend maintaining that code and how many developer-nights will you spend triaging a crashed site because of the complexity and likely bugginess of that reimplementation? How many users' feature requests will you have to reject as "too difficult to implement" because you feel the need to scale to Google/Facebook levels despite having only a few million users now and predicted growth which shows you'll never in a million years catch up to them?
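To put rough numbers on that comparison, here's a back-of-the-envelope sketch. Only the $30k server price comes from the paragraph above; the fully loaded developer cost and the number of working days per year are made-up assumptions, so plug in your own.

```python
# Back-of-the-envelope: how many developer-days does one big server cost?
# Only SERVER_PRICE_USD comes from the comment above; the other two
# constants are hypothetical placeholders, not real data.
SERVER_PRICE_USD = 30_000          # the ~$30k Dell box mentioned above
DEV_COST_PER_YEAR_USD = 150_000    # ASSUMPTION: salary plus benefits
WORKING_DAYS_PER_YEAR = 230        # ASSUMPTION


def developer_days_for(price_usd,
                       dev_cost_per_year=DEV_COST_PER_YEAR_USD,
                       working_days=WORKING_DAYS_PER_YEAR):
    """Return how many developer-days cost the same as `price_usd`."""
    return price_usd * working_days / dev_cost_per_year


if __name__ == "__main__":
    days = developer_days_for(SERVER_PRICE_USD)
    print(f"One server costs about {days:.0f} developer-days")
```

If learning a new data store, migrating to it and reimplementing the transactional guarantees takes more developer time than this function returns, the hardware is the cheaper option under these assumptions.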
> Currently the only way to scale a relational database such as MySQL or PostgreSQL is by sharding and partitioning
It will be years before the vast majority of startups exhaust reasonable, cost-effective options for vertical scaling. The recent fervor for non-relational, horizontally scalable data stores is simply the new way of scratching the intellectually masturbatory premature optimization itch that programmers have had since ENIAC.
Scaling depends a lot on how much data you have, how much data you generate and how much you plan to grow. Given the size of Basecamp and 37signals' future projections, I doubt it would be wise to hope that they can scale vertically, because once you hit the limit you are pretty screwed and need to buy _much_ more expensive hardware or rewrite most of your database-related code and do lots of migrations. (And rewriting database-related code to support sharding is usually error-prone, since you can't use joins or foreign keys, need to copy data around, etc.)
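To illustrate the point about joins and foreign keys, here's a minimal sketch of what a cross-shard "join" turns into once users are split across shards. The shards are plain in-memory dicts, and the names (`USER_SHARDS`, `PROJECTS`, `shard_for`) and the modulo sharding scheme are hypothetical, chosen only for the example.

```python
# Toy illustration: users are sharded by user_id % shard_count, so the
# database can no longer join projects to users for us.
# All data and names here are hypothetical.
USER_SHARDS = [
    {2: "bob", 4: "dana"},      # shard 0: even user ids
    {1: "alice", 3: "carol"},   # shard 1: odd user ids
]

PROJECTS = [
    {"id": 10, "owner_id": 1, "name": "website"},
    {"id": 11, "owner_id": 4, "name": "billing"},
    {"id": 12, "owner_id": 7, "name": "orphaned"},  # owner 7 does not exist
]


def shard_for(user_id, shard_count):
    """Pick the shard index that holds a given user."""
    return user_id % shard_count


def projects_with_owner_names(projects, user_shards):
    """Application-side stand-in for a projects-to-users JOIN.

    Each project needs its own lookup against the right shard, and with
    no foreign key a dangling owner_id simply comes back as None.
    """
    rows = []
    for project in projects:
        shard = user_shards[shard_for(project["owner_id"], len(user_shards))]
        rows.append((project["name"], shard.get(project["owner_id"])))
    return rows


if __name__ == "__main__":
    for name, owner in projects_with_owner_names(PROJECTS, USER_SHARDS):
        print(name, owner)
```

The third project shows the foreign-key problem: nothing stopped it from pointing at a user that doesn't exist, so the application has to handle the missing owner itself.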
Do note that I am not saying that small websites should shard or scale horizontally, but big sites with millions of users and tons of data should not scale vertically (it can't pay off and at some point they'll hit the limit).
> Scaling depends a lot on how much data you have, how much data you generate and how much you plan to grow.
No doubt, but 37signals shouldn't have a lot of relational data. The bulk of their per-project bytes, it seems likely, is non-relational stuff like attachments.
> Given the size of Basecamp and 37signals' future projections, I doubt it would be wise to hope that they can scale vertically
Isn't it less wise to pay a cost you don't yet need and may never need to pay? You pay a significant price in development velocity by forgoing a relational database and using a non-relational data store. Certainly any reasonable organization should be able to project when they will actually need to pay that cost.
> big sites with millions of users
Single-digit millions of users isn't that big.
> and tons of data should not scale vertically (it can't pay off and at some point they'll hit the limit).
It can certainly pay off if you never actually need to convert to a non-relational data store. The limit is a lot higher than you seem to think: banks and financial institutions process billions of transactions for hundreds of millions of users daily on the same ACID, relational data stores that you're saying a site like Basecamp will hit the limit of. I don't buy it.
Millions of users is not really that many. It's certainly within the realm of what can be reasonably vertically scaled.