srpeck · on Feb 2, 2017

Have a look at the benchmarks here: http://kparc.com/q4/readme.txt

Also: https://hn.algolia.com/?query=http:%2F%2Fkparc.com%2Fq4%2Fre...

geocar · on Feb 2, 2017

> You're comparing completely, utterly different results here, and it's really hurting any point you're trying to make.

Then argue with the point you think I could be making instead of the point that you think I'm making.[1]

[1]: http://philosophy.lander.edu/oriental/charity.html

> you would have to compare KDB and Java/Spark both running on the Xeon Phis, and/or running both on 11x m3.xlarge AWS instances - and even then, if Java/Spark does poorly on the Xeon Phi test...

If Spark can solve the business problem in less real time in another way, I think that would be worth talking about. But it's my understanding that a bunch of mid/large machines connected to shared storage is the typical Spark deployment, and the hardware costs are similar to the Phi solution.

So my larger question still stands: What is the value in this approach, if it's not faster or cheaper?

tobz · on Feb 2, 2017

If "this approach" is using Java/Spark, instead of something that is a smaller binary, then there are some easy answers to your questions:

- people don't want to write C (or K, or whatever yields a small binary)

- the cost of switching languages is not worth the speed-up

- it's already fast enough

I don't think you're wrong overall: kdb specifically can be much faster than an equivalently sized Spark cluster. But simply being faster does not invalidate other approaches, which is what you seem to be arguing.

geocar · on Feb 3, 2017

I'm not arguing for anything: I'm asking what we get for this cost.

It sounds like you're suggesting we get:

* Not having to write in SQL (note KDB supports SQL92)

Maybe something else? I'm not sure I understand.