I feel like I need to note the pricing. $0.01 per 1,000 queries. That doesn't so...

itcmcgrath · on March 16, 2017

To add extra color, for about $3M/month @ list prices of Cloud Datastore [1], you can, in a Multi-Region active-active synchronous replication configuration, run a workload with the following profile:

Reads: >1.1M entities/second
Writes: >380K entities/second
Deletes: >190K entities/second
Storage: 100TB

And that's if you don't use any of the nearly free optimizations like Projection queries & keys-only queries, which any large scale customer does.

That's not pre-provisioned usage, it's actual pay-as-you-go usage - so if you have no traffic, you have no costs (except for what's already stored). It's been that way for 8 years too.
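For a sense of scale, the per-second rates in the profile above can be turned into monthly volumes and a rough blended price per million operations. This is only a back-of-the-envelope sketch: it assumes a 30-day month, treats the ">" figures as exact lower bounds, rounds the bill to $3M, and ignores the storage line item. All names in it are illustrative.

```python
SECONDS_PER_MONTH = 30 * 24 * 60 * 60  # assumption: 30-day month

# Lower bounds from the quoted profile, in entities per second.
RATES_PER_SECOND = {"reads": 1_100_000, "writes": 380_000, "deletes": 190_000}

MONTHLY_BILL_USD = 3_000_000  # "about $3M/month", rounded; storage ignored


def monthly_volume(rates=RATES_PER_SECOND, seconds=SECONDS_PER_MONTH):
    """Entities touched per month for each operation type."""
    return {op: rate * seconds for op, rate in rates.items()}


def blended_usd_per_million_ops(bill=MONTHLY_BILL_USD, rates=RATES_PER_SECOND):
    """Bill divided by total operations, scaled to one million operations."""
    total_ops = sum(monthly_volume(rates).values())
    return bill / total_ops * 1_000_000


if __name__ == "__main__":
    for op, count in monthly_volume().items():
        print(f"{op}: {count:.3e} entities/month")
    print(f"blended: ${blended_usd_per_million_ops():.2f} per million operations")
```

Under those assumptions the workload is on the order of a few trillion operations per month, at well under a dollar per million operations blended.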

[1]: https://cloud.google.com/products/calculator/#id=e21b61d5-4a...

(PM for Cloud Datastore - if you're looking at 1M+ QPS workloads, feel free to message me)

evanweaver · on March 16, 2017

Huge scale is what FaunaDB On-Premises is for; the pricing model is different. That's what NVIDIA uses for example. Nevertheless, we will have volume discounts and reserved capacity in Cloud too.

I see where you're coming from. People make the same argument against using cloud services at all when you can buy hardware yourself and operate it. The lack of flexibility is the hidden cost.

Our cloud pricing is competitive with other vendors, most of which require you to massively over-provision in order to get high availability, especially global availability, as well as predictable performance. In traditional cloud databases, you have to provision for peak load. Usually this is an order of magnitude difference from average load. An order of magnitude difference happens to match your Spanner example exactly; however, with Spanner, you still have to manage your capacity "by hand".

Architecture docs are on the way.

mdasen · on March 16, 2017

You're right that it was a bit unfair to compare a flexible FaunaDB to Spanner, which you'd need to provision for peak traffic. But even if it's an order of magnitude more, $16M vs $630M is still quite a gap. It really doesn't match the Spanner example. And if you're able to handle incredibly spiky loads, information on how is kinda important. If I go from a steady state of 100 QPS to 15,000 QPS for a 20 minute period, will that just be pain?
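To put numbers on that spike scenario, here is a rough sketch of what it would cost under pure per-query billing, and how far the peak sits above the average. It assumes the $0.01 per 1,000 queries figure quoted at the top of the thread, exactly one 20-minute spike per 24-hour day, and instantaneous ramp-up and ramp-down. The function names are made up for illustration and say nothing about how any vendor actually meters or absorbs the burst.

```python
PRICE_PER_1000_QUERIES_USD = 0.01  # figure quoted at the top of the thread
SECONDS_PER_DAY = 24 * 60 * 60


def daily_queries(base_qps=100, peak_qps=15_000, spike_minutes=20):
    """Total queries in one day containing a single spike (assumption)."""
    spike_seconds = spike_minutes * 60
    steady_seconds = SECONDS_PER_DAY - spike_seconds
    return base_qps * steady_seconds + peak_qps * spike_seconds


def daily_cost_usd(queries, price_per_1000=PRICE_PER_1000_QUERIES_USD):
    """Cost of a day's queries under pay-per-query billing."""
    return queries / 1000 * price_per_1000


def peak_to_average(peak_qps, queries):
    """How many times larger the peak rate is than the daily average rate."""
    return peak_qps / (queries / SECONDS_PER_DAY)


if __name__ == "__main__":
    total = daily_queries()
    print(f"queries/day: {total:,}")
    print(f"cost/day: ${daily_cost_usd(total):.2f}")
    print(f"peak/average: {peak_to_average(15_000, total):.1f}x")
```

With these inputs the spike accounts for roughly two thirds of the day's queries, and the peak is closer to 50x the daily average than 10x, which is why the question of how capacity is added during the burst matters more than the per-query price.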

You've said that Spanner makes you manage capacity by hand, but the marketing copy says, "FaunaDB is adaptive, because it lets you change your infrastructure footprint on the fly. Dynamically shift resources to critical applications, elastically add capacity during peak events, and replicate data around the world—all in a unified data fabric." So, if I'm expecting a burst of traffic, do I have to "change my infrastructure footprint" manually? How quickly can one "elastically add capacity"? I mean, I've seen plenty of systems that one can add capacity to that, well, get humbled when copying data to new nodes. Like, you had 10 nodes and now you want 15 because you're being hammered. And wonderful, it's trying to copy data to the new nodes while it was already having capacity issues and only making response times worse and errors go up. I'm not saying that will happen to you, but there's no information to make me think that problem is addressed.

Honestly, people involved in FaunaDB seem to know enough about databases that I'd just expect more real information on the website. When Kudu came out, they published a paper that basically read like, "well, we created a column store kinda like one would if you'd read the C-Store paper and these are the trade-offs and we seem to have done reasonably" and I came away from reading it thinking, "ok, these people know the score. It may or may not be executed well enough, but there's an understanding." They led with a paper that might not have been revolutionary, but really showed that they understood the space and explained how it was designed such that someone with databases knowledge could see that it was reasonable.

Introducing your database with so much, well, non-information doesn't help you (in my opinion). Without digging, it looks like another DB vendor that promises everything will be perfect and that it's great for any workload.

The whole "About FaunaDB" page doesn't tell me much. Like, there's a comment in here that tells me you're using logical clocks, I can see from Daniel's Twitter that you're using some of his research, etc. I mean, you actually have cool technical details to highlight - details that make your DB seem a lot more real. But the page makes it feel like you don't have cool technical details - that you're trying to hide information because it's not good. I mean, adding in some details about how things are achieved makes a product seem a lot more real. I know what logical clocks are. Calvin is a research paper I can read. I mean, finding that makes FaunaDB seem way more real - there's something substantive. Like, I can read Calvin tomorrow and some of the ways you're achieving things will come to light and I might be impressed.

But right now, it's really hard to find the information that would impress technical readers.

evanweaver · on March 16, 2017

I'm with you. That level of detail is coming soon.

vgt · on March 16, 2017

If we're on the topic of comparing with Spanner, here's the 15-second live demo of resizing Spanner from 70 to 99 nodes [0]. The act itself is quite unremarkable, but the complexity abstracted away is awesome.

Both Spanner and Datastore do quite well in the cloud for "huge scale" as fully managed services. And with any on-premises deployment, one certainly must manage their own capacity "by hand".

(work at Google Cloud)

[0] https://youtu.be/kwnWfHq2EfQ?t=11m48s