We have 4,000 clients at the moment but hope to increase that quickly (20,000 within a year). That's terabytes of data a month. A SAN was considered, but it is too expensive to scale :(
EDIT: well, not over-the-top expensive. But commodity servers with software are cheaper/better for us.
Also, at some point we need an HTTP interface to the outside world.
Your use-case doesn't sound like rocket surgery so I'd say you're on the right track. Write once, read many is trivial to scale. Just build it according to your needs and throw in memcached hosts as needed.
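To make the caching idea concrete, here is a minimal cache-aside sketch in Python. It is only an illustration: the `ReadThroughCache` class and the `fetch` callback are hypothetical names, and a plain dict stands in for a real memcached client. Because the data is write-once, entries never need invalidating, which is what keeps this simple.

```python
from typing import Callable, Dict


class ReadThroughCache:
    """Cache-aside wrapper for immutable (write-once) blobs.

    Hypothetical sketch: the dict below stands in for a memcached client.
    """

    def __init__(self, fetch: Callable[[str], bytes]) -> None:
        self._fetch = fetch                  # slow path, e.g. a disk or storage node read
        self._store: Dict[str, bytes] = {}   # stand-in for memcached
        self.hits = 0
        self.misses = 0

    def get(self, key: str) -> bytes:
        # Serve from cache when possible; data never changes, so no invalidation.
        if key in self._store:
            self.hits += 1
            return self._store[key]
        # Cache miss: read from the backend once and remember the result.
        self.misses += 1
        value = self._fetch(key)
        self._store[key] = value
        return value
```

With a real memcached deployment you would also have to think about eviction and memory limits, which this sketch ignores.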
If your access pattern allows for good caching you could also skip all the hassle of maintaining physical spindles and just move your stuff to S3. Their storage fees are quite affordable; 100 TB would set you back around $15k a month.
The killer with S3 is in the traffic bill, though. So that's only an option when most of your requests can be served from cache.
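A rough way to see how the cache hit ratio drives the bill, as a Python sketch. The storage rate of $0.15 per GB-month is just the $15k per 100 TB figure above (taking 100 TB as 100,000 GB). The transfer rate of $0.10 per GB is a placeholder I made up for illustration, not a quoted price, and per-request fees are left out. `monthly_s3_cost` is a hypothetical helper, so plug in current pricing before trusting any number.

```python
def monthly_s3_cost(
    stored_gb: float,
    requested_gb: float,
    cache_hit_ratio: float,
    storage_usd_per_gb: float = 0.15,   # derived from ~$15k per 100 TB above
    transfer_usd_per_gb: float = 0.10,  # ASSUMPTION: placeholder, not a real quote
) -> float:
    """Estimate a monthly bill: storage plus traffic that misses the cache."""
    if not 0.0 <= cache_hit_ratio <= 1.0:
        raise ValueError("cache_hit_ratio must be between 0 and 1")
    storage = stored_gb * storage_usd_per_gb
    # Only cache misses reach S3 and show up on the traffic bill.
    transfer = requested_gb * (1.0 - cache_hit_ratio) * transfer_usd_per_gb
    return storage + transfer


# Compare a poorly cached workload with a well cached one.
for ratio in (0.0, 0.5, 0.9, 0.99):
    cost = monthly_s3_cost(100_000, 500_000, ratio)
    print(f"hit ratio {ratio:.2f}: ${cost:,.0f} per month")
```

The storage term is fixed, so everything hinges on how much of the requested traffic your own cache absorbs.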