The storage mechanism isn't fast or flexible enough to do anything really great. It's fairly easy for us to do precise matches (with some caveats), but given that everything is cached at the database level, we can't easily compute results based on your search query.
Effectively what we do is:
SELECT * FROM groups WHERE id IN (
    SELECT group_id FROM index WHERE key = :searchKey AND value = :searchValue
    UNION [...]
)
(This is a simplification but it should give you an idea of how the queries are built)
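To make the shape of that query concrete, here is a minimal sketch using Python's built-in sqlite3 module. It is not the real schema or code: the `tag_index` table name (standing in for `index`, which is a reserved word), the `message` column, the helper names, and the sample rows are all assumptions made up for illustration.

```python
import sqlite3

# One lookup per (key, value) filter, mirroring the simplified query above.
# "tag_index" is a hypothetical stand-in for the index table.
SUBQUERY = 'SELECT group_id FROM tag_index WHERE "key" = ? AND value = ?'


def search_groups(conn, filters):
    """Return (id, message) rows for groups matching any of the filters."""
    if not filters:
        return []
    # The per-filter lookups are combined with UNION, so a group only needs
    # to match one of the filters to be returned.
    subqueries = " UNION ".join([SUBQUERY] * len(filters))
    sql = 'SELECT id, message FROM "groups" WHERE id IN (' + subqueries + ") ORDER BY id"
    params = [part for pair in filters for part in pair]
    return conn.execute(sql, params).fetchall()


def make_demo_db():
    """Build an in-memory database with invented sample data."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE "groups" (id INTEGER PRIMARY KEY, message TEXT);
        CREATE TABLE tag_index (group_id INTEGER, "key" TEXT, value TEXT);
        INSERT INTO "groups" VALUES
            (1, 'ConnectionError'), (2, 'ConnectionError'), (3, 'TimeoutError');
        INSERT INTO tag_index VALUES
            (1, 'environment', 'production'),
            (2, 'environment', 'staging'),
            (3, 'environment', 'production'),
            (1, 'level', 'error'),
            (2, 'level', 'error');
        """
    )
    return conn


if __name__ == "__main__":
    conn = make_demo_db()
    print(search_groups(conn, [("environment", "production")]))
```

In this toy data, adding a second filter such as ("level", "error") widens the result to all three groups rather than narrowing it, because the lookups are unioned.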
Because of the way the model works, it's very hard to do certain things like exclusion queries, and more importantly, all of the results you're seeing are still cached. The biggest pain point here is that if you're searching for e.g. "ConnectionError environment:production", you really don't want to see anything related to non-production. We're solving that problem immediately, but it's just the tip of the iceberg.
Next year we're kicking off a large project to overhaul the key infrastructure which powers the stream/search functionality, with some pretty big ambitions.
Other services in the space generally use something like Elasticsearch, which can provide some of this out of the box. We've always been built on SQL/Redis, and given that Elasticsearch has its own set of problems, we've decided that it's likely best for us to move to a columnar store format that doesn't cache results (e.g. counts), but rather computes them in real time, much like Scuba.
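As a rough illustration of the difference between cached counts and counts computed at query time, here is a toy sketch over a column-oriented layout. It says nothing about how the planned store will actually work: the `EVENTS` data, the field names, and the `count_by_group` helper are hypothetical.

```python
from collections import Counter

# Hypothetical column-oriented layout: one list per field, aligned by row index.
EVENTS = {
    "group_id": [1, 1, 2, 1, 3, 2],
    "environment": [
        "production", "staging", "staging",
        "production", "production", "production",
    ],
}


def count_by_group(columns, **filters):
    """Count events per group at query time, reading only the columns involved."""
    matching = range(len(columns["group_id"]))
    for field, wanted in filters.items():
        column = columns[field]
        # Keep only the row indexes whose value in this column matches.
        matching = [i for i in matching if column[i] == wanted]
    return Counter(columns["group_id"][i] for i in matching)


if __name__ == "__main__":
    # Counts reflect the filter instead of a single precomputed total per group.
    print(count_by_group(EVENTS, environment="production"))
    print(count_by_group(EVENTS))
```

Because nothing is precomputed, the per-group count for environment="production" differs from the unfiltered count, which is exactly what a cached total cannot express.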
I'm super happy to read that you plan on improving search!
I really like Sentry. The UI is pretty good, but still a bit counter-intuitive or limited in some ways. For example, when looking at an issue, if I click on a release in the "last seen" section on the right, it takes me to the page dedicated to this release, but when I click on the release tag for the event (at the top in the main section), it links to the global search, from which, if you click on an issue, it won't show you the events for the release. There is also no easy way to see only the events for an issue that match a specific release (you have to do the exact search yourself).
I also have trouble with the release management and the regression detection, but that's because we have multiple production branches in flight at the same time. I guess we're a bit outside the expected use case here :)
Sentry 6.x experimented with throwing various attributes of the event object into ES. Have you ruled out using ES in the future?
FWIW, I'm still running a version of 6.x where I put the tags for every event into Splunk of all things. The Splunk events then link back to the corresponding Sentry event. It's slow and klunky, but it gives the Sentry users at $dayjob a much better search interface and ability to slice and dice on tags.
(Note, I'm not suggesting you use Splunk in your backend!)