Curious about the architecture here. Where does the 20x speedup come from?
I recently had a look at Tantivy as well, although compared to raw Lucene, its performance is actually inferior. I wonder if there are specific benchmarks here that measure performance, and whether they compare tail latencies as opposed to averages.
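To illustrate why the tail-versus-average distinction matters, here is a minimal sketch with made-up latency samples (not taken from any real benchmark). The function name `latency_summary` and the nearest-rank percentile method are assumptions chosen for illustration.

```python
import statistics

def latency_summary(samples_ms):
    """Return mean, median and p99 of a list of latencies (nearest-rank percentiles)."""
    ordered = sorted(samples_ms)
    n = len(ordered)

    def pct(p):
        # Nearest-rank percentile: ceil(p * n / 100), done in integer arithmetic.
        rank = max(1, (p * n + 99) // 100)
        return ordered[rank - 1]

    return {"mean": statistics.fmean(ordered), "p50": pct(50), "p99": pct(99)}

# Hypothetical data: 98 fast queries and 2 slow outliers.
samples = [10.0] * 98 + [500.0, 1000.0]
summary = latency_summary(samples)
print(summary)
```

With this made-up data the mean stays fairly low while the p99 is 50 times the median, which is the kind of gap an averages-only benchmark would hide.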
The speedup comes from a number of architectural and low-level performance optimizations in Manticore Search.
Manticore has a modern multithreading architecture with efficient query parallelization that fully utilizes all CPU cores. It supports real-time indexing - documents are searchable immediately after insertion, with no need to wait for flushes or refreshes.
It uses row-wise storage optimized for small to large datasets, and for even larger datasets that don’t fit into memory, there's support for columnar storage through the Manticore Columnar Library.
Secondary indexes are built automatically using the PGM-index (Piecewise Geometric Model index), which enables efficient filtering and sorting by mapping keys to their memory locations. The cost-based query optimizer uses statistics about the data to choose the most efficient execution plan for each query.
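For readers unfamiliar with the idea behind a learned index, here is a toy sketch of mapping keys to positions with piecewise linear models. It is not Manticore's implementation and not the real PGM-index algorithm (which builds an optimal piecewise linear approximation under a fixed error bound). This version assumes fixed-size segments, sorted unique integer keys, and a hypothetical class name `PiecewiseLinearIndex`.

```python
import bisect

class PiecewiseLinearIndex:
    """Toy learned index over a sorted list of unique integer keys."""

    def __init__(self, keys, segment_size=64):
        self.keys = keys
        self.segments = []  # (first_key, slope, start_pos, max_err) per segment
        for start in range(0, len(keys), segment_size):
            end = min(start + segment_size, len(keys))
            first, last = keys[start], keys[end - 1]
            # Line through the segment's first and last (key, position) points.
            slope = (end - 1 - start) / (last - first) if last != first else 0.0
            # Record the worst prediction error so lookups know how far to search.
            max_err = 0
            for pos in range(start, end):
                pred = start + round(slope * (keys[pos] - first))
                max_err = max(max_err, abs(pred - pos))
            self.segments.append((first, slope, start, max_err))
        self.first_keys = [seg[0] for seg in self.segments]

    def find(self, key):
        """Return the position of key, or None if it is absent."""
        if not self.keys or key < self.keys[0]:
            return None
        # Pick the segment whose first key is the largest one <= key.
        seg_i = bisect.bisect_right(self.first_keys, key) - 1
        first, slope, start, max_err = self.segments[seg_i]
        pred = start + round(slope * (key - first))
        # Binary search only inside the window allowed by the error bound.
        lo = max(0, pred - max_err)
        hi = min(len(self.keys), pred + max_err + 1)
        pos = bisect.bisect_left(self.keys, key, lo, hi)
        if pos < len(self.keys) and self.keys[pos] == key:
            return pos
        return None

keys = [i * i for i in range(1000)]
index = PiecewiseLinearIndex(keys)
print(index.find(keys[123]))
```

The point of the design is that the index stores a few numbers per segment instead of every key, and the final search touches only a small window around the predicted position.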
Manticore is SQL-first: SQL is its native syntax, and it speaks the MySQL protocol, so it works out of the box with MySQL clients.
It's written in C++, starts quickly, uses minimal RAM, and avoids garbage collection — which helps keep latencies low and stable even under load.
As for benchmarks, there's a growing collection of them at https://db-benchmarks.com, where Manticore is compared to Elasticsearch, MySQL, PostgreSQL, Meilisearch, Typesense, and others. The results are open and reproducible.
If I had to guess, I would say it’s the 20x smaller feature set compared to Elasticsearch.
We built a custom search engine on top of Elasticsearch. Our query builder regularly constructs optimised queries that would be impossible to implement in any of the touted alternatives or replacements, which almost always focus on simple full-text search, because that's all their developers ever used ES for. There's a mind-bogglingly huge number of additional features that you need for serious search engines though, and any contender will have to support at least a subset of these to deserve that title in the first place.
I’m keeping an eye on the space, but so far, I’m less than impressed with everything I’ve seen.
What are the missing features though? Auto-sharding, something related to ranking?
Also curious: why not go with Algolia, which as I understand it is built for product-facing search use cases?
I didn't dig into the docs, but now having seen the "create table whatever(name string)" makes me super paranoid: does your mention of "dynamic mapping" as a missing feature mean that if a document shows up with <<{"name":"Fred","birthday":"1970-12-25"}>> it'll drop the document?
There's a JSON field type that lets you use any schema, but it doesn't support full-text filtering. If you are not using it and if your next document has a different schema, it will cause an error when you try to insert it.
It's somewhat smaller, but I believe not 20 times smaller. Among the major features, probably only authentication and auto-sharding are missing. Both are already in progress. On the other hand, the main feature missing in Elasticsearch is proper SQL support, which many Manticore users really appreciate.
Complex boolean queries work great. Manticore also supports over 20 full-text operators — a lot more than Elasticsearch. That's one reason it's popular in areas like patent and legal search, where strong full-text matching is especially important.