I wonder how much their setup costs. Naively, if one were to simply feed 100 PB into Google BigQuery without any further engineering efforts, it would cost about 3 million USD per month.
For indexing, they need 2800 vCPUs[1], and they are using c6g instances; on-demand hourly price is $0.034/h per vCPU.
So indexing will cost them around $70k/month.
For search, they need 1,200 vCPUs, which will cost them around $30k/month.
For storage, it will cost them $23/TB/month * 20,000 TB = $460k/month.
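For reference, a minimal back-of-the-envelope sketch of those figures, assuming roughly 730 hours per month and the on-demand per-vCPU price quoted above:

```python
# Rough AWS cost estimate from the numbers in this comment (on-demand retail prices).
HOURS_PER_MONTH = 730
VCPU_HOUR = 0.034          # c6g on-demand, per vCPU-hour

indexing = 2800 * VCPU_HOUR * HOURS_PER_MONTH   # ~$69.5k/month
search   = 1200 * VCPU_HOUR * HOURS_PER_MONTH   # ~$29.8k/month

# 100 PB of raw logs at ~5:1 compression -> ~20 PB = 20,000 TB on object storage
storage  = 23 * 20_000                          # ~$460k/month

print(f"indexing: ${indexing:,.0f}/month")
print(f"search:   ${search:,.0f}/month")
print(f"storage:  ${storage:,.0f}/month")
print(f"total:    ${indexing + search + storage:,.0f}/month")
```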
Storage costs are an issue. Of course, they pay less than $23/TB, but it's still expensive. They are optimizing this either by using different storage classes or by moving data to cheaper cloud providers for long-term storage (fewer requests mean you need less performant storage, and you can usually get a very good price on those object storages).
On the Quickwit side, we will also improve the compression ratio to reduce the storage footprint.
[1]: I fixed the number of indexing vCPUs; it was written as 4,000 when I published the post, but that figure corresponded to the total number of vCPUs for search and indexing.
At this level they can just go bare metal or colo. Use Hetzner's pricing as a reference. Logs don't need the same level of durability as user data; some level of failure is perfectly fine. I would estimate $100k per month or less, $200k at most.
You can get a spinning disk of 18TB (no need for SSD if you can write in parallel) for 224€. Let's round that to $300 for easy calculations.
To store 100 petabytes of data by purchasing disks yourself, you would need approximately 5556 18TB hard drives totaling $1,666,800.
Of course, you'll pay for more than just the disks.
Let's add the cost of 93 enclosures at $3,000 each ($279,000), and account for controllers and network equipment ($100,000), plus power and cooling infrastructure ($50,000, although it's probably already cool where they will host the thing). That comes to about $2.1M.
That's the total, and it's for the uncompressed data.
You would need 3 times that for redundancy, but it would still be 40% cheaper over 5 years, not to mention that I used retail prices. With their purchasing power they can get a big discount.
Now, you do have the cost of a team to maintain the whole thing, but they likely have their own data center anyway if they go that route.
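A quick sketch of the hardware math above (retail prices, uncompressed data, decimal TB for simplicity):

```python
# Rough bare-metal hardware estimate for 100 PB, using the retail prices above.
raw_tb = 100 * 1000                        # 100 PB in TB (decimal)
drive_tb, drive_price = 18, 300

drives = -(-raw_tb // drive_tb)            # ceiling division -> 5,556 drives
drive_cost = drives * drive_price          # ~$1.67M

enclosures    = 93 * 3_000                 # ~$279k
network       = 100_000
power_cooling = 50_000

single_copy = drive_cost + enclosures + network + power_cooling   # ~$2.1M
with_redundancy = 3 * single_copy                                 # ~$6.3M for 3 copies

print(f"drives: {drives}")
print(f"single copy:    ${single_copy:,}")
print(f"3x redundancy:  ${with_redundancy:,}")
```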
> disk of 18TB (no need for SSD if you can write in parallel)
Do note that you can put, at most, maybe 1TB of hot/warm data on this 18TB drive.
Imagine you do a query and 100GB of the data to be searched is on 1 HDD. At roughly 100-200 MB/s of sequential throughput, you will wait 500-1000s just for this hard drive. Now imagine slightly higher concurrency on this HDD, like 3 or 5 queries at once.
You can't fill these drives full with hot or warm data.
> To store 100 petabytes of data by purchasing disks yourself, you would need approximately 5556 18TB hard drives totaling $1,666,800.
You want 1,000x more drives and to fill each of them to only 1/1000 of its capacity. Now you can do parallel reads!
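To make the latency point concrete, a tiny sketch assuming ~150 MB/s of sequential throughput per spindle (an assumption, not a measured number):

```python
# Why you can't fill an 18 TB spinning disk with hot data: a single HDD does
# roughly 100-200 MB/s sequential, and far less under concurrent access.
hdd_mb_per_s = 150            # assumed mid-range sequential throughput
query_gb_on_one_disk = 100

seconds = query_gb_on_one_disk * 1000 / hdd_mb_per_s
print(f"one query, one disk: ~{seconds:.0f}s")          # ~667s

# To finish the same scan in, say, 10 seconds, the data has to be spread
# across many spindles (or live on SSD/object storage with high parallelism).
target_s = 10
spindles_needed = query_gb_on_one_disk * 1000 / (hdd_mb_per_s * target_s)
print(f"disks needed for a {target_s}s scan: ~{spindles_needed:.0f}")  # ~67
```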
For this purpose you would likely not buy ordinary consumer disks but rather bulletproof enterprise HDDs. Otherwise a significant number of the 5,556 disks would not survive the first year, assuming they are under constant load.
Quickwit's big advantage is that you can point it at anything that speaks S3 and it will be happy. So ideally you delegate the whole storage story: hire someone who knows their way around Ceph (erasure coding, load distribution) and call a few DC/colo/hosting providers (for the initial setup and the regular HW replacements).
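As an illustration of why the erasure-coding expertise pays off at this scale, a rough comparison of raw capacity under 3x replication vs. an 8+3 erasure-coding profile (both are example numbers, not anyone's actual setup):

```python
# Raw capacity needed for ~20 PB of compressed logs under two durability schemes.
logical_pb = 20

replication_factor = 3
raw_replicated = logical_pb * replication_factor   # 60 PB raw

k, m = 8, 3                      # e.g. EC 8+3: survives the loss of 3 chunks
ec_overhead = (k + m) / k        # 1.375x
raw_ec = logical_pb * ec_overhead                  # 27.5 PB raw

print(f"3x replication: {raw_replicated} PB raw")
print(f"EC {k}+{m}:        {raw_ec} PB raw")
```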
Good question. I thought it would be a no-brainer to put it on S3 or similar, but that's already way too expensive at $2M/month before API requests.
Backblaze storage pods are an initial investment of about $5 million; that's probably the best bet, and at that level of savings, having 1-3 good people dedicated to this is probably still cheaper.
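At those numbers the break-even is almost immediate; a very rough sketch, where the hardware figure, cloud bill, and salaries are all just the assumptions floated in this thread:

```python
# Very rough break-even: self-built storage pods vs. a cloud object-storage bill.
upfront_hw    = 5_000_000       # assumed "storage pods" initial investment
monthly_cloud = 2_000_000       # assumed S3-like bill, before API requests

team_size, salary = 3, 200_000  # hypothetical dedicated ops team, annual salary
monthly_ops = team_size * salary / 12

months_to_break_even = upfront_hw / (monthly_cloud - monthly_ops)
print(f"break-even after ~{months_to_break_even:.1f} months")   # ~2.6 months
```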
But you could/should start talking to the big cloud providers to see whether they are flexible enough to go lower on the price.
I have seen enough companies, including big ones, be absolutely shitty at optimizing these types of things. At this level of data, I would optimize everything, including encoding, date formats, etc.
But as I said in my other comment: the interesting questions are not answered :D
Indeed. They benefit from a discount, but we don't know the discount figure.
To further reduce storage costs, you can use S3 storage classes or a cheaper object storage provider like Alibaba for longer retention. Quickwit does not handle that, though, so you need to manage it yourself.
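On AWS, for example, that kind of tiering is just a bucket lifecycle rule managed outside Quickwit. A minimal boto3 sketch, where the bucket name and prefix are placeholders; note that classes without instant retrieval (Glacier, Deep Archive) would be too slow to search interactively:

```python
import boto3

s3 = boto3.client("s3")

# Hypothetical bucket/prefix holding the index data; adjust to your own layout.
s3.put_bucket_lifecycle_configuration(
    Bucket="my-quickwit-indexes",
    LifecycleConfiguration={
        "Rules": [
            {
                "ID": "tier-old-splits",
                "Filter": {"Prefix": "indexes/"},
                "Status": "Enabled",
                "Transitions": [
                    {"Days": 30, "StorageClass": "STANDARD_IA"},
                    {"Days": 90, "StorageClass": "GLACIER_IR"},  # still instant retrieval
                ],
                "Expiration": {"Days": 365},  # drop data past the retention window
            }
        ]
    },
)
```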
Logs should compress better than that, though, right? 5:1 compression is only about half as good as you'd expect even naive gzipped json to achieve, and even that is an order of magnitude worse than the state of the art for logs[1]. What's the story there?
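For a sense of scale, even naive gzip on repetitive JSON log lines tends to land well beyond 5:1. A quick self-contained experiment with synthetic logs; the field names and value distribution are made up, so treat the exact ratios as illustrative only:

```python
import gzip, json, lzma, random

# Generate synthetic, repetitive JSON log lines (real logs are usually even
# more repetitive, so these ratios are if anything on the low side).
random.seed(0)
lines = []
for i in range(50_000):
    lines.append(json.dumps({
        "ts": 1_700_000_000 + i,
        "level": random.choice(["INFO", "WARN", "ERROR"]),
        "service": random.choice(["api", "ingest", "search"]),
        "msg": f"request handled in {random.randint(1, 500)} ms",
        "status": random.choice([200, 200, 200, 404, 500]),
    }))
raw = ("\n".join(lines)).encode()

for name, compress in [("gzip", gzip.compress), ("xz", lzma.compress)]:
    out = compress(raw)
    print(f"{name}: {len(raw) / len(out):.1f}:1")
```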
"Object storage as the primary storage: All indexed data remains on object storage, removing the need for provisioning and managing storage on the cluster side."
So the underlying storage is still object storage; base your calculations on whichever backend you are using: S3, GCP Object Storage, self-hosted Ceph, MinIO, Garage, or SeaweedFS.
Yeah, using your preferred cloud data warehouse with an indexing layer seems fine for this sort of thing. Compared to something specialized like this, it has the advantage that you can still easily do stream processing / Spark / etc., plus it probably saves some money.
Maybe Quickwit is that indexing layer in this case? I haven't dug too much into the general state of cloud dw indexing.
Quickwit is designed to do full-text search efficiently with an index stored on object storage.
There is no equivalent technology, apart maybe from:
- Chaossearch, but it is hard to tell because they are not open source and do not share their internals (if someone from Chaossearch wants to comment?).
- Elasticsearch makes it possible to search an index archived on S3. This is still a super useful feature as a way to occasionally search your archived data, but it would be too slow and too expensive (it generates a lot of GET requests) to use as your everyday "main" log search index.