qoega's comments | Hacker News

It is meant for a single reader/writer workload, so it is not meant to be used as a service.


In ClickHouse it is just `INSERT INTO t FROM INFILE 'data.csv.gz'`. Any supported format and compression, autodetected from the file name and a sample of the data to get column types, delimiters, etc. Separate tools to convert CSV are not necessary if you can just import into the DB and export as SQL statements.

echo "name,age,city John,30,New York Jane,25,Los Angeles" > example.csv

clickhouse local -q "SELECT * FROM file('example.csv') FORMAT SQLInsert"

INSERT INTO table (`name`, `age`, `city`) VALUES ('John', 30, 'New York'), ('Jane', 25, 'Los Angeles');
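For completeness, a minimal sketch of the INFILE path mentioned above, assuming a running server reachable by clickhouse-client; the table `t` and its schema are hypothetical, chosen to match the example.csv created here:

# a hypothetical target table for the example.csv created above
clickhouse client -q "CREATE TABLE t (name String, age UInt8, city String) ENGINE = MergeTree ORDER BY name"

# compress the sample file; format and compression are inferred from the file name
gzip -k example.csv
clickhouse client -q "INSERT INTO t FROM INFILE 'example.csv.gz' FORMAT CSVWithNames"

# export back out as SQL statements
clickhouse client -q "SELECT * FROM t FORMAT SQLInsert"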


Are you guys comparing 16 vCPU/32 GB vs 8 vCPU/32 GB and saying yours is only 1.6x faster?


Hi there, CEO of Tablespace here. No, we are comparing the 16 CPU/32 GB shape that Tablespace uses to the 24 CPU/96 GiB shape that ClickHouse Cloud uses. Tablespace is 1.6x faster running the benchmark even though we are using a much smaller compute shape.


Now you rarely use the basic MapReduce primitives; you have another layer of abstraction that runs on the infrastructure that used to run MR jobs. This infrastructure can efficiently allocate compute resources for "long"-running tasks in a large cluster with respect to memory/CPU/network and other constraints. So basically the schedulers of MapReduce jobs and the cluster management tools became that good because the MR methodology had trivial abstractions but required an efficient implementation to make it work seamlessly.

Abstraction layers on top of this infrastructure can now optimize the pipeline as a whole by merging several steps into one when possible and adding combiners (partial reduce before the shuffle). This requires the whole processing pipeline to be defined in more specific operations. Some of these layers propose using SQL to formulate the task, but it can be done with other primitives. And given such a pipeline, it is easy to implement optimizations that make the whole system much more user-friendly and efficient than MapReduce, where the user has to think about all the optimizations and implement them inside individual map/reduce/(combine) operations.
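As a hedged illustration of that point: the classic word count, which in raw MapReduce needs a hand-written mapper, combiner, and reducer, collapses into one declarative query that the layer can plan with partial aggregation before the shuffle. The sketch below uses ClickHouse's SQL dialect, and `docs` with a String column `line` is a hypothetical input table:

-- Word count expressed declaratively; the engine decides where to
-- map (tokenize), combine (partial counts per node), shuffle, and reduce.
SELECT word, count(*) AS cnt
FROM
(
    SELECT arrayJoin(splitByChar(' ', line)) AS word
    FROM docs
)
GROUP BY word
ORDER BY cnt DESC;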


You can even migrate your ZooKeeper to ClickHouse Keeper. It requires a small downtime, but you will have all your ZooKeeper data inside, and your clients will just work once your Keeper is back up.
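Roughly, the migration goes through the keeper converter tool. A minimal sketch, assuming the usual default ZooKeeper data paths; check the current ClickHouse docs for the exact flags and the full procedure:

# stop ZooKeeper so its snapshot/log dirs are consistent, then convert them
# into a ClickHouse Keeper snapshot (paths below are common defaults)
clickhouse-keeper-converter \
    --zookeeper-logs-dir /var/lib/zookeeper/version-2 \
    --zookeeper-snapshots-dir /var/lib/zookeeper/version-2 \
    --output-dir /var/lib/clickhouse/coordination/snapshots

# start clickhouse-keeper pointed at that snapshot directory; clients
# reconnect to the same host:port and see the migrated data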


Did not expect to see an issue I created here.


Open-source ClickHouse also handles both real-time and large historical data.


QuestDB, kdb+, and the others mentioned are more geared toward time-series workloads, while ClickHouse is more toward OLAP. There are also exciting solutions on the streaming side of things, such as RisingWave.


I think atwong is just promoting his product: https://news.ycombinator.com/threads?id=atwong


It is nice to have an image of the expected dashboard in the README.


Passion and experience

