Kibana and histograms are reporting. Now the snark is even more confusing, since you’re doing exactly what I say is a poor fit, but claiming it’s not your use case. I spend what time I can trying to show those very same sysadmins you’re talking about why ES is a poor architecture for log work, particularly at scale.

As an SRE, I’ve built high-volume log processing at every employer, in multiple verticals including web. I know what sysadmins do. Not a fan of the condescension and assumptions you’re making. I have an opinion. We differ. That’s fine. Let it be fine.




> Kibana and histograms are reporting. [...] you’re doing exactly what I say is a poor fit, but claiming it’s not your use case.

You must be from the species that can predict each and every report before it's needed. Good for you.

Also, I didn't claim that I don't use reports known in advance; I do use them. But there are cases where preparing such a report just to see one trend is overkill, and there's still ad-hoc troubleshooting that a query language helps with. Your defined-in-advance reports don't help with that.
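
To be concrete, this is the kind of one-off trend query I mean; Kibana builds something like it under the hood. A minimal sketch, assuming Elasticsearch on localhost:9200 and a logstash-* index with @timestamp and status fields (the index and field names are assumptions for illustration):

    # Ad-hoc trend: how many 500s per hour, without any pre-built report.
    import json
    import urllib.request

    query = {
        "size": 0,
        "query": {"match": {"status": "500"}},
        "aggs": {
            "per_hour": {
                # "calendar_interval" is ES 7+; older versions use "interval"
                "date_histogram": {"field": "@timestamp", "calendar_interval": "1h"}
            }
        },
    }
    req = urllib.request.Request(
        "http://localhost:9200/logstash-*/_search",
        data=json.dumps(query).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        for bucket in json.load(resp)["aggregations"]["per_hour"]["buckets"]:
            print(bucket["key_as_string"], bucket["doc_count"])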

> I spend what time I can trying to show those very same sysadmins you’re talking about why ES is a poor architecture for log work, particularly at scale.

OK. What works "particularly at scale", then?

Also, do you realize that "particularly at scale" is quite a rare setting, and "a dozen gigabytes a day or less" is much, much more common? ES works (worked) reasonably well for that.


You should read the Dremel and Dataflow papers as examples of alternative approaches, and dial down your sarcastic attitude by about four clicks. You don’t need to define reporting ahead of time when the system is architected well; it’s quite possible to do ad-hoc and post-hoc queries without indexing per record. At small scale, your questions are infrequent and the corpus is small, so waiting on a full scan isn’t the end of the world.
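
To make “full scan, no per-record index” concrete, a minimal sketch, assuming gzipped JSON-lines logs under logs/ with status and path fields (the layout and field names are assumptions). On a dozen-GB corpus a single machine finishes a pass like this in minutes:

    # Ad-hoc, post-hoc question answered by brute-force scan: top paths by
    # 500 count. No index is built or consulted; every record is read once.
    import collections
    import glob
    import gzip
    import json

    counts = collections.Counter()
    for path in glob.glob("logs/*.json.gz"):
        with gzip.open(path, "rt") as f:
            for line in f:
                record = json.loads(line)
                if record.get("status") == 500:
                    counts[record.get("path", "?")] += 1

    for url_path, n in counts.most_common(10):
        print(f"{n:8d}  {url_path}")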

A dozen gigabytes a day or less means: use grep. Pointing ES at that volume is like throwing Hadoop at it.

This was an opportunity to learn from someone with a different perspective, and I could learn something from yours, but instead, you’ve made me regret even saying anything. I’m sorry, I just can’t engage with you further.

(Edit: I’m genuinely mystified that discussing alternative architectures is somehow arrogant “pissing on” people. Why personalize this so much?)


So, basically, you have/had access to closed software designed specifically for working with system logs, and based on that you piss on everybody who uses what they have at hand on a smaller scale. Or at least that's how I see your comments here.

I may need to tone down my sarcasm, but likewise, you need to tone down your arrogance about working at Google or someplace comparable.

But still, thank you for the search keyword ("dremel"). I certainly will read the paper (though I don't expect too many very specific ideas from a ten-page publication), since I dislike the current landscape of only having ES, flat files, and paid solutions for storing logs at a rate of a few GB per day.

> A dozen gigabytes a day or less means: use grep. Pointing ES at that volume is like throwing Hadoop at it.

No, not quite. I do also use grep and awk (and App::RecordStream) on that data. I still want a query language for working with it, especially combined with an easily usable histogram plotter.
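
Something this small already covers a lot of that "query plus histogram" ground; a minimal sketch, assuming JSON-lines logs on stdin with ts (ISO timestamp) and msg fields (the format and field names are assumptions):

    # Filter, bucket by hour, draw a quick ASCII histogram; the sort of
    # territory bare grep doesn't quite cover.
    import collections
    import json
    import sys

    buckets = collections.Counter()
    for line in sys.stdin:
        record = json.loads(line)
        if "timeout" in record.get("msg", ""):
            buckets[record["ts"][:13]] += 1  # "YYYY-MM-DDTHH" -> hourly bucket

    peak = max(buckets.values(), default=1)
    for hour in sorted(buckets):
        bar = "#" * round(40 * buckets[hour] / peak)
        print(f"{hour}  {buckets[hour]:6d}  {bar}")

Usage would be something like zcat app.log.gz | python3 hist.py (file names hypothetical).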



