I am having trouble understanding what a "Serverless Database" is. When I do a se...

256DEV · on Nov 16, 2021

Basically I understand the idea of serverless in this context as abstracting away everything related to managing & maintaining the database - i.e. anything like VMs, containers, OS, DB processes etc.

So you can set up a PlanetScale MySQL DB and use it just like a normal instance of MySQL, but also keep adding data from one small set of records all the way up until you have gigantic petabyte volumes of data without having to do anything beyond sending the data through your MySQL connector.

In theory it should just keep working in a performant way from 100 user records for your new startup to the scale of running parts of Slack. No choosing bigger and bigger AWS RDS instances as you scale, no need for autoscale strategies in case of traffic surges or worrying about replicas for perf etc. etc.

As someone who is honestly quite resistant to parts of the serverless paradigm, this offering actually appeals to me. I prefer running my own fleet of VMs and a traditional PHP/Nginx type stack, but have already moved to AWS RDS to abstract away some of the replication complexity required to achieve a high-availability DB with minimum hassle. This seems like the logical next step, and despite being allergic to the kind of hype you mention, I find this is something I'd definitely try out before moving other parts of my infrastructure onto anything like Lambda.

nixpulvis · on Nov 16, 2021

Ironically, they've only hijacked our ecosystems.

Now we need separate access controls, separate networking tools, separate monitoring and diagnostics. It's becoming apparent to me that this kind of stuff is the scam of the century.

soamv · on Nov 16, 2021

There is no clear definition with universal agreement. It's a hype-y term applied with... varying levels of rigor.

However, roughly speaking, "serverless" rolls together 3 features:

1. Fine grained pay-per-use (e.g. pay for a query by the number of rows scanned)

2. The pricing dial goes down to zero when usage is small enough.

3. You generally don't control VM/instance-level scaling but something closer to the abstraction level of the product being claimed as "serverless". For example, in PlanetScale you get no control over how many MySQL instances actually run your queries. This is great for reducing operational complexity but not so great for controlling performance. Performance tends to be quite opaque -- for example, there's nothing I can find in PlanetScale's docs about latency and throughput. The operational benefits are real, though. It's a tradeoff.
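A minimal sketch of points 1 and 2, assuming a made-up price list: the `UsagePricing` class, its rates, and `monthly_bill` are all hypothetical and do not reflect any real provider's pricing. It only shows the shape of the model: the bill is a function of usage alone, so it drops to zero when usage does.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UsagePricing:
    """Hypothetical usage-based price list; the rates are invented for illustration."""
    per_million_rows_read: float = 1.50
    per_million_rows_written: float = 1.50
    per_gb_month_stored: float = 0.25


def monthly_bill(pricing, rows_read, rows_written, gb_stored):
    """Bill depends only on what was used; there is no per-instance or per-hour term."""
    return (
        rows_read / 1_000_000 * pricing.per_million_rows_read
        + rows_written / 1_000_000 * pricing.per_million_rows_written
        + gb_stored * pricing.per_gb_month_stored
    )


pricing = UsagePricing()
idle_bill = monthly_bill(pricing, rows_read=0, rows_written=0, gb_stored=0)
busy_bill = monthly_bill(pricing, rows_read=2_000_000, rows_written=0, gb_stored=0)
```

Note what is missing from the function signature: instance count, instance size, and hours of uptime. That absence is roughly what point 3 describes.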

thor24 · on Nov 17, 2021

This is good info, thanks. I have some cloud infra experience, so I am interested in knowing how they keep the data stored and remove the "query" servers when not in use.

Possibly some kind of EBS-equivalent storage which is attached to the VM when it's booted up? I wonder whether that creates more failures as the price of operational simplicity?

herodotus · on Nov 16, 2021

Thanks

topspin · on Nov 16, 2021

"Serverless Database"

It's just a database service. You don't run the servers. You pay someone to do that and you just connect to it and use it and someone else maintains the servers, storage, scaling, backups, patches, etc., according to some SLA and other terms.

There are indeed servers somewhere. "Serverless" is misleading cloud speak for the otherwise easily understood concept of a service.

alvarlagerlof · on Nov 16, 2021

They shut down when not in use.

haggy102 · on Nov 17, 2021

They almost never shut down the actual VMs, they are simply re-allocated. (Semi)Auto scaling exists behind the scenes but once a provider becomes popular enough the VMs become more expensive to stop.

romero-jk · on Nov 16, 2021

DynamoDB is truly serverless: they give you an HTTP endpoint and that's it. You don't have to deal with database connections.
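To make the "just an HTTP endpoint" point concrete, here is a rough sketch of what a DynamoDB GetItem call looks like at the wire level: a JSON POST with an `X-Amz-Target` header. The table name `users`, the key `id`, and the helper `build_get_item_request` are hypothetical. The sketch also leaves out the SigV4 signature that AWS requires, so it only assembles the request and never sends it.

```python
import json


def build_get_item_request(region, table, key_name, key_value):
    """Assemble (but do not send) an unsigned DynamoDB GetItem request."""
    body = {
        "TableName": table,
        # "S" marks the attribute value as a string in DynamoDB's JSON format.
        "Key": {key_name: {"S": key_value}},
    }
    return {
        "method": "POST",
        "url": f"https://dynamodb.{region}.amazonaws.com/",
        "headers": {
            "Content-Type": "application/x-amz-json-1.0",
            "X-Amz-Target": "DynamoDB_20120810.GetItem",
        },
        "body": json.dumps(body),
    }


# Hypothetical table and key, for illustration only.
request = build_get_item_request("us-east-1", "users", "id", "123")
```

Each call is an independent HTTP request, so there is no long-lived session to open, pool, or close on the client side. In practice an SDK handles the signing and retries.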

ignoramous · on Nov 17, 2021

> ...deal with database connections.

I think you mean connection pooling?

Amazon Aurora has an HTTP endpoint too, for its serverless offering.

I use S3 with DuckDB files in 'em as a sharded OLAP database of sorts. No transactions or joins though, but my workload is write once, read plenty. Hoping Databricks' serverless SQL is a good fit once it emerges out of beta.

moralestapia · on Nov 16, 2021

You don't provision and manage the infrastructure associated with your DB. Someone else does it for you and you just focus on querying/storing the data and whatever else your business requires you to do.

zild3d · on Nov 16, 2021

Not provisioning & managing is part of it, but the pricing model is important too. e.g. RDS and Mongo Atlas are managed database services so you're not provisioning and managing the infra, but you are paying for dedicated machines and their sizes, etc.

moralestapia · on Nov 16, 2021

>but you are paying for dedicated machines and their sizes

I see this as provisioning & managing as well, but yes, I see your point. Ultimately it's about the user only caring about their data and nothing else (to the extent possible).

jedberg · on Nov 16, 2021

RDS still requires you to provision servers even if you aren't running them.

brazzledazzle · on Nov 16, 2021

I think Aurora Serverless may be the one exception to that.

redwood · on Nov 16, 2021

Atlas offers serverless today

mupuff1234 · on Nov 16, 2021

"Serverless" just seems to be the trendy rebranding of "as a service".

zild3d · on Nov 16, 2021

With a dedicated MongoDB cluster (server-full), you are paying for a certain cluster size, per hour. It doesn't matter if you read or write any data to it. You're paying for a machine with a specific amount of storage capacity and CPU. Use it or lose it.

With DynamoDB (considered serverless), you're charged based on the read and write throughput, and how much you have stored (GB-month). If you don't store any data, you're not being charged for unused storage capacity, as you would be with a dedicated cluster. You don't think about the instance size, amount of memory, etc.

- https://www.mongodb.com/pricing

- https://aws.amazon.com/dynamodb/pricing/on-demand/

sahaamity · on Nov 17, 2021

DataStax also launched Serverless Cassandra (Astra) recently.

https://www.datastax.com/products/datastax-astra/pricing

https://thenewstack.io/the-serverless-database-you-really-wa...

Disclaimer: I work for DataStax.

ldehaan · on Nov 17, 2021

Serverless means infrastructure as code and you're in as much control of the system as they want, which can be near 0, and they'll tell you it's for your benefit to have no control. Why use a bank vault when you can have a crypto wallet.

DeathArrow · on Nov 16, 2021

Probably it can be defined to be "database as a service".

pm90 · on Nov 16, 2021

“Pay for only what you use” database.

mbesto · on Nov 16, 2021

I'm curious about what kind of workloads people would be using a serverless DB for?

If I understand it correctly, let's say my DB is only being used to process 1M transactions from 9am-1pm, I'm basically only paying for the load during that time, versus a managed DB where it's being paid to be on 24/7. With most serverless there is a penalty though - cold startup.

So is it purely an economic play for esoteric DB workloads? If my DB is realistically going to be churning 24/7 anyway, why would I ever use serverless DBs?
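One way to frame the economic side of that question is a break-even calculation. This is a back-of-the-envelope sketch with invented rates (`hourly_rate=0.5`, `price_per_million=1.25`); the function names are hypothetical, and it ignores cold starts, storage charges, and any operational savings.

```python
HOURS_PER_MONTH = 24 * 30


def dedicated_cost(hourly_rate, hours=HOURS_PER_MONTH):
    """Fixed cost: billed for every hour, whether or not any queries run."""
    return hourly_rate * hours


def serverless_cost(requests, price_per_million):
    """Usage cost: billed per request, nothing when idle."""
    return requests / 1_000_000 * price_per_million


def break_even_requests(hourly_rate, price_per_million, hours=HOURS_PER_MONTH):
    """Monthly request count above which the always-on instance becomes cheaper."""
    return dedicated_cost(hourly_rate, hours) / price_per_million * 1_000_000


# 1M transactions a day, as in the example above, is about 30M a month.
monthly_requests = 30 * 1_000_000
usage_bill = serverless_cost(monthly_requests, price_per_million=1.25)
fixed_bill = dedicated_cost(hourly_rate=0.5)
threshold = break_even_requests(hourly_rate=0.5, price_per_million=1.25)
```

With these made-up numbers the bursty workload costs 37.50 on usage pricing against 360 for the always-on instance, and the two only meet at 288M requests a month. A workload that really does churn 24/7 may sit well above that threshold, at which point the remaining argument for serverless is the operational one rather than the price.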

spelunker · on Nov 16, 2021

Lower operational burden - if you're using a serverless DB, maintaining servers is one less thing to think about. Same with any infrastructure - you don't have to worry about orchestration, load balancing, authentication, etc.

mbesto · on Nov 16, 2021

I mean, I can essentially do this with any PaaS DB, no? RDS, Azure SQL, Cosmos, etc.

freen · on Nov 16, 2021

Scales up and down automatically.

leros · on Nov 16, 2021

As I understand it, they've separated the storage and compute parts of the database and are running them in a scalable way such that they can automatically add more compute or storage as needed.
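A toy model of that split, purely as an assumption about the general pattern and not a description of how any particular provider is built: `SharedStorage`, `ComputePool`, and the `queries_per_worker` scaling rule are all hypothetical. The point is that the data lives in a layer that outlasts the compute workers, so the worker count can drop to zero without losing anything.

```python
import math


class SharedStorage:
    """Stands in for the durable storage layer; it outlives any compute worker."""

    def __init__(self):
        self._rows = {}

    def put(self, key, value):
        self._rows[key] = value

    def get(self, key):
        return self._rows.get(key)


def workers_needed(pending_queries, queries_per_worker=100):
    """Hypothetical scaling rule: enough workers for the load, zero when idle."""
    return math.ceil(pending_queries / queries_per_worker)


class ComputePool:
    """Stateless query workers that hold no data of their own."""

    def __init__(self, storage):
        self.storage = storage
        self.workers = 0

    def scale_to(self, pending_queries):
        self.workers = workers_needed(pending_queries)
        return self.workers

    def query(self, key):
        if self.workers == 0:
            # Cold start: a worker has to be brought up before the query can run.
            self.workers = 1
        return self.storage.get(key)


storage = SharedStorage()
storage.put("user:1", {"name": "Ada"})
pool = ComputePool(storage)
pool.scale_to(0)               # idle: no compute running
row = pool.query("user:1")     # data is still there; one worker is started
peak = pool.scale_to(250)      # traffic surge: scale compute only
```

The cold-start branch in `query` is where the latency penalty mentioned elsewhere in this thread would show up in a real system.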

unfunco · on Nov 16, 2021

Nothing to manage, and it should scale to zero cost when not in use.