
> I'm talking about the internals of couchjs.

_list functions stream, and were specifically designed to be able to stream. However, streaming does not help much in this case; moreover, chunks that are too small dramatically reduce the already awful _list performance.

Taking this into account, I see no value in having views stream. Sending all the emitted KVs for a doc to the Erlang side in one turn seems both much more predictable and safer.
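For reference, a rough sketch of both shapes as they look in couchjs (emit/start/send/getRow/toJSON are the stock query-server builtins; the doc fields are made up):

  // Map function: emit() only collects KVs; couchjs ships them all back
  // to the Erlang side in a single response per document, no streaming.
  function (doc) {
    emit(doc.type, 1);
  }

  // _list function: streams explicitly, one chunk per send().
  function (head, req) {
    start({ headers: { "Content-Type": "application/json" } });
    send('{"rows":[');
    var row, sep = '';
    while ((row = getRow())) {
      send(sep + toJSON(row));
      sep = ',';
    }
    send(']}');
  }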

> Multiple CVEs have been reported including a full RCE, which was patched in later versions

The last CVEs have nothing to do with SpiderMonkey. BTW, they can be patched without an upgrade, with a 5-LOC design document in Erlang.

The root cause of the last CVEs is an inconsistency in how the Erlang parser handles malformed JSON. Namely, most JSON parsers process '{"abc":1, "abc":2}' as {abc:2}, but old jiffy parses it as {abc:1}. BTW, severe inconsistencies in JSON parsing are pretty common across implementations; see http://seriot.ch/parsing_json.php for more details.
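To make that concrete: couchjs, like most parsers, keeps the last duplicate key, so the JS side and the old Erlang parser could see two different documents for the same request body:

  // Most parsers, including couchjs, keep the *last* duplicate key:
  JSON.parse('{"abc":1, "abc":2}');   // => { abc: 2 }
  // Old jiffy kept the first one, so the Erlang side saw {"abc": 1}
  // for the very same body; that disagreement is the root of those CVEs.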

> Citation needed. There's an issue specifically about this in the CouchDB issue tracker if you'd like to read more.

I can't give you a citation, sorry, because we discovered the effect during internal tests. The reason is simple: accessing values in deep, branchy JSON is generally faster in JS, because it's the native format. JSON in Erlang is a monster like {[{<<"abc">>, 1}]} for {abc:1}, which, when it has many levels and many nodes at each level, performs badly when selecting a random node.
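A rough illustration of the point (the ejson shape is the one above; modeling it in JS is just for the sketch):

  var doc = { a: { b: { c: 1 } } };
  // Native JS: direct property access, one cheap lookup per level.
  var fast = doc.a.b.c;

  // ejson models the same doc as {[{<<"a">>, {[{<<"b">>, {[{<<"c">>, 1}]}}]}}]},
  // i.e. nested lists of key/value pairs; in JS terms, roughly:
  var ejson = [["a", [["b", [["c", 1]]]]]];
  // Each level is a linear scan of the pair list (proplists:get_value/2):
  function get(pairs, key) {
    for (var i = 0; i < pairs.length; i++)
      if (pairs[i][0] === key) return pairs[i][1];
  }
  var slow = get(get(get(ejson, "a"), "b"), "c");  // O(width) per level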

> Curious what kind of database sizes you're working with too

We sometimes measure the number of docs with an M postfix ) Not very often, however. In my humble opinion, if you plan to run CouchDB with, say, 100M docs in a single bucket, you have probably chosen the wrong solution.

BTW, the same goes for picking a heavyweight kitchen-sink DB for buckets with, say, 10K docs.




> I can't give you a citation, sorry, because we discovered the effect during internal tests

We discovered the issue in internal tests as well, and reported it upstream, where it was confirmed; there's nothing to dispute here.

> BTW, they can be patched without an upgrade, with a 5-LOC design document in Erlang.

I don't have to hand-patch other database systems. Further, as of now there are no functioning packages for multiple versions of Ubuntu. Multiple competitors do not have any of these problems.

> The last CVEs have nothing to do with SpiderMonkey

It was a full RCE that didn't have vendor packages ready in time.

Sorry, but using seven-year-old language runtimes is daft. It might be fine for you, but it's not appropriate in environments where you care about security and performance, or about the overhead of making your team reason about these things unnecessarily.

> We sometimes measure the number of docs with an M postfix

Yeah, we did this in 2003 on commodity hardware, and even that built indexes faster than CouchDB builds map/reduce indexes. Fix the external view server protocol, or be honest with people and kill it off; the status quo is unacceptable.
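For anyone following along, the protocol in question is a line-per-message JSON exchange over couchjs's stdin/stdout, roughly like this (trimmed; the map function and doc are made up):

  // couchdb -> couchjs: one JSON line per message, one round trip per doc
  ["reset", {"reduce_limit": true}]
  ["add_fun", "function (doc) { emit(doc._id, 1); }"]
  ["map_doc", {"_id": "doc1", "value": 42}]
  // couchjs -> couchdb: all KVs emitted for that doc, in one response
  [[["doc1", 1]]]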

Finally, here's a five-year-old issue that's still open, admitting what I just explained: https://issues.apache.org/jira/browse/COUCHDB-1743


100M docs? In JSON format? On commodity hardware? In 2003?

Hahaha, hello, long-awaited friend from a parallel universe!


Easy enough to convert to JSON on the way out, using a mature database system without these performance flaws. The app got done and performed well, and we weren't at the mercy of an unresponsive community that leaves seven-year-old dependencies in critical paths.

Keep going; it'll bite you eventually. Don't say you weren't warned.



