They don't mention BM25, which still outperforms much of semantic search. A fun exercise is to watch the benchmarks of the latest semantic embedding models and see that they still struggle to match good ol' BM25.
BM25 uses the relative statistical frequency of words to identify relevant material, along with some adjustments. It doesn't use ML at all, but it works very well, especially for technical content.
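For anyone who hasn't seen the scoring spelled out, here is a minimal sketch of the usual Okapi BM25 formula. The function name `bm25_scores`, the toy documents, and the defaults `k1=1.5` and `b=0.75` are my own assumptions for illustration; real libraries differ in details such as the IDF smoothing.

```python
import math
from collections import Counter

def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Score each tokenized doc against a tokenized query (Okapi BM25 sketch)."""
    n_docs = len(docs)
    avg_len = sum(len(d) for d in docs) / n_docs
    # Document frequency: how many docs contain each term at least once.
    df = Counter(term for d in docs for term in set(d))
    scores = []
    for d in docs:
        tf = Counter(d)
        score = 0.0
        for term in query:
            if term not in tf:
                continue
            # Rare terms get a higher weight; the "+ 1" keeps the IDF non-negative.
            idf = math.log((n_docs - df[term] + 0.5) / (df[term] + 0.5) + 1)
            # Term frequency saturates (k1) and is normalized by doc length (b).
            norm = tf[term] + k1 * (1 - b + b * len(d) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores

# Hypothetical toy corpus, already tokenized on whitespace.
docs = [
    "segfault in malloc after free".split(),
    "how to bake bread at home".split(),
    "malloc and free explained with malloc examples".split(),
]
scores = bm25_scores("malloc segfault".split(), docs)
```

The first document matches both query terms, including the rarer "segfault", so it ranks above the third, which only repeats "malloc"; the bread document scores zero.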
SPLADE works well in some areas but is slow, and it often doesn't offer much of a benefit (or is worse) versus BM25 for technical searches, where specific technical words don't have many synonyms for it to pull in.
The best search systems today use a mix of semantic search and BM25 or SPLADE, depending on the type of material and the speed required.
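One common way to do the mixing is reciprocal rank fusion (RRF). That is my choice for illustration, not something the article prescribes. A rough sketch, where the function name, the constant `k=60`, and the document ids are all assumptions:

```python
from collections import defaultdict

def reciprocal_rank_fusion(rankings, k=60):
    """Merge several ranked lists of doc ids into one list (RRF sketch)."""
    fused = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            # Each list contributes 1 / (k + rank); a larger k flattens the curve.
            fused[doc_id] += 1.0 / (k + rank)
    # Highest combined score first.
    return sorted(fused, key=fused.get, reverse=True)

# Hypothetical outputs of a BM25 retriever and a semantic retriever.
bm25_ranking = ["a", "b", "c"]
semantic_ranking = ["b", "d", "a"]
fused = reciprocal_rank_fusion([bm25_ranking, semantic_ranking])
```

Documents that both retrievers rank highly ("b" and "a" here) float to the top, and neither retriever's raw scores need to be on a comparable scale.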
I've had pretty good success with BM25 + stemming, or even easier, BM25 with trigram tokenization. If the index isn't too big, the whole search can be done client-side and is lightning fast.
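By trigram tokenization I mean something like the sketch below: overlapping three-character windows instead of words. The helper name `trigrams` and the normalization choices (lowercasing, collapsing whitespace) are assumptions, not a fixed recipe.

```python
def trigrams(text):
    """Split text into overlapping lowercase character trigrams."""
    s = " ".join(text.lower().split())  # normalize case and whitespace
    if len(s) < 3:
        # Too short for a full trigram: keep the whole string as one token.
        return [s] if s else []
    return [s[i:i + 3] for i in range(len(s) - 2)]

# Related word forms share trigrams, which gives stemming-like matching.
shared = set(trigrams("tokenizer")) & set(trigrams("tokenization"))
```

Feed the trigrams to BM25 in place of word tokens and you get partial-word matching without needing a stemmer for each language.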
Isn't the big problem that BM25 (and friends) will help you find (and rank) exact search terms (or stemmed varieties of that search term), whereas semantic search can typically find items out-of-dictionary but "close" semantically? SPLADE, on my reading of it, seems to do a "pre-materialization" of the out-of-dictionary part.
They're various measures of recall rate. Recall@500 is the percentage of the time that the target document shows up in the top 500 results from the retrieval system.
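In code, that definition comes out roughly as follows. This is a sketch that assumes one known target document per query; the function name and the sample ids are made up.

```python
def recall_at_k(ranked_results, targets, k):
    """Fraction of queries whose target doc appears in the top k results."""
    hits = sum(
        1
        for ranked, target in zip(ranked_results, targets)
        if target in ranked[:k]
    )
    return hits / len(targets)

# Hypothetical ranked results for three queries, best match first.
ranked_results = [
    ["d3", "d1", "d7"],
    ["d2", "d9", "d4"],
    ["d5", "d6", "d8"],
]
targets = ["d1", "d4", "d0"]  # the one relevant doc for each query

r1 = recall_at_k(ranked_results, targets, k=1)
r3 = recall_at_k(ranked_results, targets, k=3)
```

Here no target is ranked first, so Recall@1 is 0, while two of the three targets show up within the top 3.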
I found BM25 and everything resembling it (like TF-IDF) to be nearly useless. It was (back in the day) really necessary to use external semantic info, or at least data gathered by examining the whole document set for things that go beyond term frequency. I was excited by the first part of the SPLADE article because I thought it was going to use LLMs to somehow find concept embeddings in documents and let you search for those. But as someone said, it turns out to be a version of synonym search, except that the thesaurus is generated automatically. I remember someone did that with Word2Vec some years back and it was sort of useful, but generally the problem with search systems is too many results rather than missing some relevant ones.
This feels like a big missing piece at this stage of AI's evolution. I just searched on Amazon for five inch chair casters. They used to have them but don't anymore. But that took me a long time to find out. Instead it just dumped all of the chair casters and let me read the details to find out the hard way that none of them were what I wanted across 10 pages of results. But I've been spoiled by modern chatbots. I want it to read the product copy and figure it out, and just tell me "We don't have any of those. But here are some others you might like..."
It seems inevitable that search boxes will take prompts rather than just keywords, and that they will become conversational and include the context of previous searches.
> This feels like a big missing piece at this stage of AI's evolution. I just searched on Amazon for five inch chair casters. They used to have them but don't anymore. But that took me a long time to find out. Instead it just dumped all of the chair casters and let me read the details to find out the hard way that none of them were what I wanted across 10 pages of results. But I've been spoiled by modern chatbots. I want it to read the product copy and figure it out, and just tell me "We don't have any of those. But here are some others you might like..."
Why would a company want to provide this? Modern-day Amazon doesn't win when you find out that what you want isn't there; they "win" when their search is so bad that you spend a long time browsing counterfeit or not-quite-there results that you might buy. The future of an Amazon-designed chatbot that I see would be for it to try actively to snowball me into buying an inappropriate product, not to help me quickly discover that what I want isn't there.
Have you seen Miracle on 34th Street, where Kris Kringle convinces Macy's and Gimbels to tell their customers that their competition has what they're looking for? That isn't as fictional as Santa Claus. I was in a Walmart yesterday when I heard a clerk tell a customer that the hardware store down the street has the thing that Walmart didn't. It happens a bazillion times a day.
> Have you seen Miracle on 34th Street, where Kris Kringle convinces Macy's and Gimbels to tell their customers that their competition has what they're looking for? That isn't as fictional as Santa Claus. I was in a Walmart yesterday when I heard a clerk tell a customer that the hardware store down the street has the thing that Walmart didn't. It happens a bazillion times a day.
One of my favorite movies, yes! And I absolutely believe that a clerk will do this; if there was ever a loyalty of staff to the retail companies for which they worked, it was probably misplaced then, and is definitely misplaced now. But that's a matter of human discretion that overseers haven't yet been able to engineer out, whereas here we're talking about something that they would specifically have to engineer in. The unwillingness of Amazon even to try to confront the massive review-gaming and counterfeit-items problem leads me to believe that, to the extent that they ever viewed their mission as getting the customer what they wanted, they don't now, and so will not change their search in a way that would facilitate that.
Just wanted to share a fun experience I had when visiting Pompeii last month: we were walking around the city (the archaeological park, that is) and I wanted to see one of those Roman public bathrooms. So I googled "pompeii latrine" and the search results were links on how to find the bathrooms (as in, the guest facilities). I was initially confused as to why such a clear query got me the completely wrong answer, until it hit me that bathroom and latrine are semantically similar, but not in my context.
At least that's my headcanon, who knows. And it seems like the cool preserved latrines were in Herculaneum anyway. Still, fun to think about.
This article is a fancy way of saying "we take keywords and break them down into synonyms", bundling in "machine learning", and acting like it's a magical solution. Not to be dreary, but it's just not exciting.