Hi, I've been looking for something like this!
Do any of your customers have a success story migrating off BigQuery to your platform?
And how do you compare to MotherDuck? (Looks like you built some of your stack on top of DuckDB.)
Yes, we've had many BigQuery / Snowflake converts. The reality is, most companies don't have 100 TB of data (which is what those platforms are optimized for). MotherDuck has a good post[0] on this:
> There were many thousands of customers who paid less than $10 a month for storage, which is half a terabyte. Among customers who were using the service heavily, the median data storage size was much less than 100 GB.
I'm a fan of what MotherDuck is doing. We're building something different (an opinionated, instant data stack), but yes, we both use DuckDB under the hood.
Has anyone tried comparing this with a Qwen VL based model?
I've heard good things about its OCR performance compared to other self-hostable models, but I haven't actually benchmarked it.
Our initial approach was to implement periodic full table re-syncing. We're starting to work on CDC with logical replication for incremental syncing. Here is our roadmap: https://github.com/BemiHQ/BemiDB#future-roadmap
Just earlier today I wanted to check if exp(inx) is an orthonormal basis on L^2((0, 1)) or if it needs normalization. This is an extremely trivial one though. Less trivially I had an issue where a paper claimed that a certain white noise, a random series which diverges in a certain Hilbert space, is actually convergent in some L^infinity type space. I had tried to use a Sobolev embedding but that was too crude so it didn't work. o1 correctly realized that you have to use the decay of the L^infinity norm of the eigenbasis, a technique which I had used before but just didn't think of in the moment. It also gave me the eigenbasis and checked that everything works (again, standard but takes a while to find in YOUR setting). I wasn't sure about the normalization so again I asked it to calculate the integral.
This kind of adaptation to your specific setting, instead of just spitting out memorized answers in common settings, is what makes o1 useful for me. Now again, it is often wrong, but if I am completely clueless I like to watch it attempt things and I can get inspiration from that. That's much more useful than seeing a confident wrong answer like 4o would give.
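For the trivial exp(inx) question above, a quick numerical sanity check is easy to sketch. This is only an illustration, not what o1 produced: it assumes the standard L^2((0, 1)) inner product, and the helper name `inner_product` and its `freq` parameter are made up here. It reads exp(inx) literally (frequency 1) and also tries frequency 2*pi, which may be what was really meant.

```python
import cmath

def inner_product(n, m, freq=1.0, steps=20_000):
    """Midpoint-rule approximation of the L^2((0, 1)) inner product
    <e_n, e_m> = integral over (0, 1) of exp(i*freq*n*x) * conj(exp(i*freq*m*x)) dx.
    (Hypothetical helper, for illustration only.)"""
    h = 1.0 / steps
    total = 0j
    for j in range(steps):
        x = (j + 0.5) * h
        # exp(i*freq*n*x) * conj(exp(i*freq*m*x)) = exp(i*freq*(n - m)*x)
        total += cmath.exp(1j * freq * (n - m) * x)
    return total * h

# Squared norm of a single mode: the integrand is identically 1 on (0, 1).
norm_sq = inner_product(3, 3)

# Overlap of two distinct modes, reading exp(inx) literally (frequency 1).
overlap_plain = inner_product(1, 2)

# Same overlap with frequency 2*pi, i.e. exp(2*pi*i*n*x).
overlap_2pi = inner_product(1, 2, freq=2 * cmath.pi)

print(abs(norm_sq), abs(overlap_plain), abs(overlap_2pi))
```

Under these assumptions the norm is already 1, so no normalization constant is needed on (0, 1). The overlap between distinct modes only vanishes for the 2*pi version, so the literal exp(inx) family would not be orthogonal on that interval.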
My best bet for now would be dlt if you have a dedicated DE team, but sling will get you a long way for moving data around your warehouse.