Ask HN: Why do recommender systems not seem good?
47 points by subharmonicon on Feb 26, 2024 | 63 comments
I personally find recommender systems on platforms that I am on to be very poor.

I would expect with all the effort that has gone into these, and all the progress in machine learning, these systems would be fantastic and provide recommendations that I really enjoy. But they don’t.

YouTube seems to have a massive recency bias and music, film, and TV recommendations rarely end up being things I enjoy.



I think the issue is that similar things are always assumed to be the better recommendation. At least for me and other people I've talked to, a lot of the time we're looking for something different from the last thing. For example, if I spend a week listening to vintage surf rock, the recommendations I get might be 60s pop or more surf rock. But what actually I wanted was to listen to experimental jazz with a retro funk twist on it. How could they anticipate that? Talk to anyone who's deep into some art form (movies, TV, music) and you'll see that the recommendations they give maintain a vibe rather than just pointing at similar things.


I think Pandora is one of the better systems out there for this. You prime it with a few examples of stuff you like but then it seems to continually drop slightly different stuff on you to see how you react.
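
The "drop slightly different stuff on you to see how you react" behavior resembles a standard explore/exploit loop. Below is a minimal epsilon-greedy sketch of that idea in Python. It is an assumption for illustration only, not Pandora's actual method; the style names, `pick_style`, `record_reaction`, and the `explore_rate` parameter are all hypothetical.

```python
import random

def pick_style(avg_reaction, explore_rate=0.2, rng=random):
    """Usually play the best-rated style, but sometimes try a random one."""
    styles = sorted(avg_reaction)
    if rng.random() < explore_rate:
        return rng.choice(styles)                # explore: probe any style
    return max(styles, key=avg_reaction.get)     # exploit: best known style

def record_reaction(avg_reaction, plays, style, liked):
    """Fold a thumbs up (True) or down (False) into the style's running average."""
    plays[style] = plays.get(style, 0) + 1
    previous = avg_reaction.get(style, 0.0)
    avg_reaction[style] = previous + (float(liked) - previous) / plays[style]

# Hypothetical listening session.
avg_reaction = {"surf rock": 0.0, "60s pop": 0.0, "funk jazz": 0.0}
plays = {}
record_reaction(avg_reaction, plays, "surf rock", True)
record_reaction(avg_reaction, plays, "surf rock", False)
record_reaction(avg_reaction, plays, "funk jazz", True)

print(avg_reaction)                                # surf rock 0.5, funk jazz 1.0
print(pick_style(avg_reaction, explore_rate=0.0))  # funk jazz
```

With `explore_rate=0.0` the picker never leaves the best-known style, which is roughly the "more of the same" behavior people complain about. A nonzero rate trades some misses for the chance of finding the funk jazz.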


> Pandora

That's a name I've not heard in a long time. I had no idea they were still in business, I may give it a try.


> But what actually I wanted was to listen to experimental jazz with a retro funk twist on it. How could they anticipate that?

Supposedly ML should be able to figure that out, by monitoring millions of other people's listening habits. We are not as unique as we think we are. Apparently the models they use are not very good.


IMO the best recommendation algorithms don't bother recommending things they think one will like; instead they recommend novel things one won't dislike.


You buy a $1000 scooter from Amazon and then it keeps recommending to you other good scooters for months, even after the return window for the first one is closed.

Yeah, you'd expect more from ML at this point. I wonder how much of ML research actually gets utilized in industry.


“Why is Amazon trying to sell me an X? I just bought an X. Idiots!”

Amazon is not an idiot:

> You need two insights here:

> 1) Conditional probability is a mathematical technology that does exist.

> 2) Buying X is not entirely random across the population.

> The ways X is not random vary based on the good. Consider refrigerators.

> You probably buy one every ten years. If I don't know where you are in your refrigerator cycle, my prior estimate should be 2.7e-4 probability of you buying it tomorrow.

> Suppose I know you bought yesterday. In my life on earth, I have realized that buyer's remorse is a thing.

> What's a SWAG for how often a purchase immediately goes wrong? Not right color? Fridge DOA? Shoot I mismeasured my kitchen? Wife just hates it? Call that 2%. If I fix it within a week, then 2% / 7 = 2.9e-3 probability of purchasing a new fridge.

> That's a 10X relative risk.

Source: patio11 (Patrick McKenzie)

https://threadreaderapp.com/thread/982208307057246209.html
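
For anyone who wants to check the quoted arithmetic, here is a minimal sketch in Python. The inputs are just the rough guesses from the quote (a ten-year replacement cycle, a 2% chance that a purchase goes wrong, a one-week window). The variable names are made up for illustration, and nothing here reflects how Amazon actually scores recommendations.

```python
# Worked version of the quoted refrigerator arithmetic.
# All inputs are the rough guesses from the quote, not measured data.

DAYS_PER_YEAR = 365
replacement_cycle_years = 10   # "You probably buy one every ten years"
p_purchase_goes_wrong = 0.02   # the 2% guess for an immediate problem
fix_window_days = 7            # "If I fix it within a week"

# Baseline: chance that a random person buys a fridge on any given day.
p_baseline = 1 / (replacement_cycle_years * DAYS_PER_YEAR)

# Recent buyer: chance per day of buying again, spread over the fix window.
p_recent_buyer = p_purchase_goes_wrong / fix_window_days

relative_risk = p_recent_buyer / p_baseline

print(f"baseline per day:     {p_baseline:.1e}")      # 2.7e-04
print(f"recent buyer per day: {p_recent_buyer:.1e}")  # 2.9e-03
print(f"relative risk:        {relative_risk:.1f}x")  # 10.4x
```

The result matches the "10X relative risk" in the quote, but the conclusion is only as good as the 2% guess: halve it and the ratio halves too.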


That's kind of a dismissive way to say "maybe it's optimized for buyer's remorse or returns?"

Maybe it is, but I'm not totally convinced. For one it doesn't explain why Amazon would recommend literally the same SKU so often. Also, does Amazon really want to incentivize returning large items like fridges?

And the poker thing from the original post

> People who are compensated strictly based on their ability to predict the future, like poker players... tend to be much better at high school math than Twitter users.

Professional gamblers in games like poker usually don't try to predict the future. They just have a small set of hands with known probability and mainly have to focus on things like sizing their bets. (Poker players also have to read human signals, but I'm not sure they explicitly assign probabilities to these).

Another explanation is just that the cost of recommending you a fridge you already own is lower than the cost of tuning the algorithm better or having multiple algorithms depending on tuples of factors (price, customer, customer behavior). If that's true, then we should expect Amazon to do less of this in the future. If Patio11's explanation is correct, then we should expect Amazon to continue to recommend these items even as the algorithms are updated to the more recent generation of AI.


The reason is much simpler: Amazon is correctly optimizing for the person who buys like 10 scooters. This doesn’t sound rational, but it’s the same principle as video game IAP. God rolls a 100-sided die and if it’s a 1 or a 2, you’re born someone who just buys an absolutely ridiculous amount of stuff. These people are rare, but the system is not wrong.


Sounds like the argument that social media is basically abusing mental illnesses and people with impulsive spending behaviors. Many of the "whales" in online gaming, gambling, shopping, etc. are people who YOLO their future away due to underlying illnesses, and the MINT people abuse it while telling themselves it's creating value for the customer.


patio11 has interesting insights, but does give off an "angry nerd" vibe

Whether poker players are predicting the future or not, I would argue Amazon is doing the same thing:

- Each potential customer can be considered a hand with a certain probability.

- Amazon is sizing their "bet" (ad budget) according to the probability that customer will convert.

I also agree with this observation about poker, and can imagine Amazon using a similar strategy: https://hw.leftium.com/#/item/39462238

Successful marketers were already teaching this decades ago: https://thegaryhalbertletter.com/newsletters/direct_marketin...


Amazon ARE idiots.

So many people complain about this behavior of recommender systems and here comes Patrick dropping some math and saying: “Well, actually there’s this and this probability for this to happen”.

Don’t piss on me and tell me that it’s raining. Before I make a purchase, especially online, I do my research and make a choice, that’s it.

I’ve never returned anything online and I’ve never needed to buy a second item after buying the first one.

If you insist that your math is right, give me a button I can press so I don’t have to care about your probabilities.


I fully agree with you. It feels like a local maximum that at the same time repulses a lot of people from further shopping (like us) and then the engineers point at the data and say: "look at these probabilities, nobody else is buying anything" - yes, because we're upset with the unempathetic UX of your website and we closed the tab.


This sounds like something written by Nassim Taleb.


I find this endless quoting of patio11 a bit exhausting, people just repost his ideas like it's some sort of religious artifact and eternal truth. He's a clever guy but I get a bit sceptical when a clever guy spends too much time writing clever thoughts instead of building nice things...


Yes, absolutely hilarious that after you buy a good air fryer, one that will last 5+ years, these companies break every shred of moral and legal lines to hell and back to make sure that what happens is "WE HEARD YOU LOVE AIR FRYERS! Here are 10 more!!!"


Was puzzled by this too, but realized that the chances of a repeat buy are probably higher than a near-blind shot at another category. So it does make sense on some level.


It's weird but tiktok is the only one that seems to do a good job. Seriously. The tiktok recommendation algorithm is so good. And then for ads, instagram seems to be the only one that does a good job. I regularly get ads for things I actually want from instagram, never from any other platform.


You might reconsider your thoughts when you recognize the fact that TikTok's first priority is not to make money but to deliver pro-Russian and pro-Chinese propaganda messages into your brain.


Not sure how that's related to the quality of recommendations the algorithm gives.


Because the algorithm has been shown to recommend nice videos which show that the Russians have very good reasons to annex Ukraine, or that the Chinese economy is going so well, or that this right-wing fringe party actually has a very good answer to the speech of a government official.

But yes, it's all just a coincidence and does not affect the quality of recommendations.

TikTok recommends 9 good videos and then 1 where they explain the political topic of the day from their perspective because you are located in Europe.


I live in Europe and browse TikTok regularly, and I have yet to see a political post. It's mostly gardening and cycling content on my feed. Maybe this is a "search and it will find you" type of situation?


Often they just use "people who watched X also watch Y", which actually selects for the most popular stuff instead of the most similar.
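
A small Python sketch of why raw "also watched" counts drift toward whatever is popular, and how dividing by each item's view count (cosine similarity here) changes the ranking. The watch histories are invented and the function names are hypothetical; no real platform's method is implied.

```python
from collections import Counter
from itertools import combinations
from math import sqrt

# Hypothetical watch histories; "hit" is a blockbuster almost everyone watched.
histories = [
    {"hit", "surf_doc_1", "surf_doc_2"},
    {"hit", "surf_doc_1", "surf_doc_2"},
    {"hit", "surf_doc_1"},
    {"hit", "cooking_show"},
    {"hit", "cooking_show"},
    {"hit", "news"},
    {"hit"},
    {"hit"},
]

# How many viewers each item has, and how often each pair is watched together.
views = Counter(item for h in histories for item in h)
together = Counter()
for h in histories:
    for a, b in combinations(sorted(h), 2):
        together[(a, b)] += 1
        together[(b, a)] += 1

def also_watched(item):
    """Rank by raw co-occurrence count ("people who watched X also watched Y")."""
    scores = {b: n for (a, b), n in together.items() if a == item}
    return sorted(scores, key=lambda b: (-scores[b], b))

def most_similar(item):
    """Rank by cosine similarity, which divides out each item's popularity."""
    scores = {
        b: n / sqrt(views[item] * views[b])
        for (a, b), n in together.items()
        if a == item
    }
    return sorted(scores, key=lambda b: (-scores[b], b))

print(also_watched("surf_doc_1"))  # ['hit', 'surf_doc_2']
print(most_similar("surf_doc_1"))  # ['surf_doc_2', 'hit']
```

The blockbuster wins the raw count simply because everyone watched it. After normalizing, the other surf documentary comes first.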


Most similar has some hilarious pitfalls too, though. I saw a talk by someone who worked on the recommendation system for Google Play Music, or whatever it's called these days, and his canonical example was that a pure similarity metric would recommend the all-women band "Lez Zeppelin" to fans of Led Zeppelin. How's that going to score amongst listeners?


I think a lot of recommendation systems take only the positive signals into account and don't take the negative signals that seriously.

So if I have liked some movies and disliked others, don't just show me movies from users who liked the same ones I liked. First, filter out or downgrade all users who liked what I disliked, and then create my recommendations.
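
The "filter or downgrade first" idea can be sketched in a few lines of Python. This is one possible reading of the suggestion, under stated assumptions: the `recommend` function, the `penalty` parameter, and the users' likes are all hypothetical, and the scoring rule (agreements minus a penalty per conflict) is just one simple choice.

```python
from collections import defaultdict

def recommend(my_likes, my_dislikes, other_users_likes, penalty=2.0):
    """Score unseen items by neighbour weight. Neighbours who liked things
    I disliked are downweighted, and dropped once their weight is <= 0."""
    seen = my_likes | my_dislikes
    scores = defaultdict(float)
    for likes in other_users_likes.values():
        agreement = len(likes & my_likes)
        conflict = len(likes & my_dislikes)
        weight = agreement - penalty * conflict
        if weight <= 0:
            continue  # filter this neighbour out entirely
        for item in likes - seen:
            scores[item] += weight
    return sorted(scores, key=lambda item: (-scores[item], item))

# Hypothetical data: three other users and what they liked.
me_likes = {"Alien", "Blade Runner"}
me_dislikes = {"Transformers"}
others = {
    "ann": {"Alien", "Blade Runner", "Arrival"},
    "bob": {"Alien", "Blade Runner", "Transformers", "Battleship"},
    "cat": {"Alien", "Transformers", "Pearl Harbor"},
}

# Using the negative signal: only ann survives as a neighbour.
print(recommend(me_likes, me_dislikes, others))
# ['Arrival']

# Positive signal only: the disliked movie comes back as the top pick.
print(recommend(me_likes, set(), others))
# ['Transformers', 'Arrival', 'Battleship', 'Pearl Harbor']
```

Without the dislike information the system cannot even tell that "Transformers" was already rejected, which is the failure described above.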


You and your enjoyment are no longer central to the recommendations. Ads/money, engagement, and your time are the only metrics that matter.

Results these days seem worse than in the old days of AltaVista and Lycos.


But this could also be reinforcing a local maximum while alienating more and more users. And then your data shows much better metrics for the cohort which puts up with the bad experience, because everyone else churns. This might work for some time, but I don't think it is sustainable.

Right now it feels everyone is ready to ditch Apple products, Twitter, Tesla, Amazon, YouTube, Gmail once something better comes around because of all these small quirks and weird UX things where you notice that the vendor's interests are just not aligned with you as a user/customer.


Mostly conjecture on my part, and an anecdote: for some things, I have very specific and niche tastes that come in the form of things that are difficult to label.

Example: rap music and hiphop. For the most part, I don't enjoy it that much. There are a few things though that will make a track palatable to me (or instantly turn me off despite anything else positive about it):

    - Sentimentality or romance in the lyrics
    - Backing track or samples that are harmonically interesting
    - No egregious sexism, misogyny, glorifying of violence, thug/gangbanger culture, etc
    - Beats featuring stereotypical trap hi hats kind of annoy me
 
I've enjoyed tracks like Deja Vu by Post Malone, or Lucid Dreams by Juice WRLD. Browsing the rest of their discography consistently disappoints me though, because tracks like these are few and far between.

The way I assume recommendation systems are traditionally designed does not account for this. It sees me listen to these tracks, and thinks I'll probably like something by similar artists or the same artists. As far as I'm aware, Spotify's recommendation system is not aware of things like tempo, meter, tonality, themes of the lyrics, harmony, etc. and so there's no way it can pick tracks like this out from the crowd.

And why would they bother? Those are all much more technically difficult things to implement than forming correlations between IDs in a database.
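
To make the contrast concrete, here is a rough Python sketch of scoring tracks by their attributes instead of by artist or listener overlap. Everything in it is an assumption for illustration: the attribute names mirror the list above, the 0 to 1 values are invented, and producing such values for a real catalog (audio analysis, lyric analysis, or human tagging) is the hard part this sketch skips.

```python
# Hypothetical per-track attributes on a 0..1 scale.
tracks = {
    "track_a": {"sentimental_lyrics": 0.9, "harmonic_interest": 0.8,
                "trap_hihats": 0.2, "glorifies_violence": 0.0},
    "track_b": {"sentimental_lyrics": 0.1, "harmonic_interest": 0.2,
                "trap_hihats": 0.9, "glorifies_violence": 0.0},
    "track_c": {"sentimental_lyrics": 0.8, "harmonic_interest": 0.9,
                "trap_hihats": 0.0, "glorifies_violence": 0.9},
}

# Listener taste: positive weights attract, negative weights repel.
taste = {"sentimental_lyrics": 1.0, "harmonic_interest": 1.0, "trap_hihats": -0.5}
# Attributes that rule a track out no matter what else it has going for it.
deal_breakers = {"glorifies_violence"}

def score(features):
    """Weighted sum of attributes, or None if a deal breaker is present."""
    if any(features.get(name, 0.0) > 0.5 for name in deal_breakers):
        return None
    return sum(weight * features.get(name, 0.0) for name, weight in taste.items())

def rank(candidates):
    """Order the tracks that survive the deal-breaker check, best first."""
    scored = {name: score(features) for name, features in candidates.items()}
    kept = {name: s for name, s in scored.items() if s is not None}
    return sorted(kept, key=lambda name: (-kept[name], name))

print(rank(tracks))  # ['track_a', 'track_b']
```

Here track_a and track_b could be by the same artist and still land far apart, and track_c is excluded despite scoring well on everything else. Correlating IDs in a database cannot express either of those outcomes.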


This was the promise of the Music Genome Project, the database for Pandora's recommendation engine.


That was supposed to be the value proposition of Pandora and the music genome project. I don’t think the Pandora algorithm is very good, I’d guess they’re using something simpler.


> YouTube seems to have a massive recency bias

This is likely intentional to encourage more content creation. Competing with two decades of content is almost impossible, so they make them compete with just 2 weeks of content.


I have a theory that some people's likes are based on things that are too subtle, intangible, or unrepeatable to identify, predict, or data mine. At least that's what I've figured is why recommendation systems don't work for me and why I run into so many dead ends when I try to use clear aspects of what I like to find other likes.


Scale. Providing accurate recommendation algorithms for thousands+++ of people across thousands+++ of data items is surprisingly expensive in compute and electricity. For any one user, sure you can do whatever you like. When you divide your resources across your userbase the prices get larger and larger.


Sam Altman said something about Instagram which is interesting. He said at midnight, the quality of the recommendations is higher because few people are on the platform.

Does it mean at peak hours IG switches to a simpler algorithm?

Otherwise, why would the quality of a RS drop during those hours?


I bet their recommendation algorithms use some kind of iterative refinement to keep recommendations "fresh" and that at midnight it's just done with lower latency.


Did Mr Altman happen to elaborate on what time zone he had in mind? ;)


Because recommendation is psychology-complete.

It is extremely hard to predict human behavior beyond simple schemes such as most popular items (or most similar items to those you’ve seen before).

(bio: six years of xp in a leading recommendation company)


I think the problem might be the quality of the data. For example on Steam the problem is that the tags are used very liberally to the point where these tags lose any meaning and thus the recommendations suck as a result.

Let's say for example that you've enjoyed the recent hit game Baldur's Gate 3 and you'd like to play something similar. You check out the Steam page, see that the game is tagged as an "RPG", so you click the tag and expect to get something similar. What you get instead are games that are not only very different but also so far removed from the genre that no one will ever list them in a forum thread talking about RPGs. Examples include titles such as Dota 2, Warframe, Palworld and Horizon Zero Dawn. There are genuine RPG games as well but the fact that there are so many titles that you need to ignore is pretty bad.

Tags aren't the only way Steam recommends new games. Going back to the Baldur's Gate 3 Steam page, there's a section called "more like this". I'd expect it to match more closely to BG3 and in many cases it does. But when it doesn't, it shows ridiculous recommendations like The Sims 3 or Tom Clancy's The Division, games that have nothing to do with what Baldur's Gate 3 is.

And all of this is for an extremely popular game that at the same time doesn't do anything revolutionary. Trying the same approach with a more unconventional title that you've liked is a quick recipe for failure. I've just checked the recommendation page for Undertale and it's full of random games that have nothing to do with the title.


Not an expert, but as a learning experience I wrote a board game recommender years ago based on data from boardgamegeek.

There were a lot of little things that added up:

  1. Everyone interprets the 1.0 - 10.0 rating scale differently.
  2. Most users just rate the same, universally known games.
  3. For the other users, the games they've played are usually really different. It's a sparse matrix.

Every attempt at game-to-game analysis flopped. User-to-user analysis seemed to work better.

I managed to find a few dozen similar users. Found some hidden gems by going through their pages manually. Fewer than I would have hoped though.
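
A rough Python sketch of the user-to-user approach described above, not the original code. It subtracts each user's own mean to cope with different readings of the 1.0 - 10.0 scale, and it only compares users on games both have rated, giving up when the overlap is too small. The ratings, function names, and the `min_overlap` threshold are all made up for illustration.

```python
from math import sqrt

def centered(ratings):
    """Subtract the user's own mean so a harsh rater's 6 and a generous
    rater's 10 can both mean 'above my average'."""
    mean = sum(ratings.values()) / len(ratings)
    return {game: r - mean for game, r in ratings.items()}

def user_similarity(a, b, min_overlap=2):
    """Cosine similarity of mean-centered ratings over co-rated games only.
    Returns None when there is too little overlap to say anything."""
    ca, cb = centered(a), centered(b)
    shared = ca.keys() & cb.keys()
    if len(shared) < min_overlap:
        return None
    dot = sum(ca[g] * cb[g] for g in shared)
    norm_a = sqrt(sum(ca[g] ** 2 for g in shared))
    norm_b = sqrt(sum(cb[g] ** 2 for g in shared))
    if norm_a == 0 or norm_b == 0:
        return None
    return dot / (norm_a * norm_b)

# Hypothetical ratings on the 1.0 - 10.0 scale.
harsh = {"Catan": 4.0, "Brass": 6.0, "Root": 5.0}
generous = {"Catan": 8.0, "Brass": 10.0, "Root": 9.0}
opposite = {"Catan": 10.0, "Brass": 6.0, "Root": 8.0}
stranger = {"Catan": 7.0, "Azul": 8.0}

print(round(user_similarity(harsh, generous), 3))  # 1.0, same taste on a different scale
print(round(user_similarity(harsh, opposite), 3))  # -1.0, reversed preferences
print(user_similarity(harsh, stranger))            # None, only one game in common
```

The `None` case is the sparse-matrix problem: most pairs of users share too few rated games to compare, which would fit with finding only a few dozen similar users.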


Because to make good, personalized recommendations they need to know what you don't know, and there isn't a good source of information about it.

For example, if you have not seen The Shawshank Redemption, chances are you will like it. It's #1 on the IMDb Top 250 list. But a recommender does not know whether you've seen it. If you've seen it already, it's a bad recommendation.

So the same recommendation for the same person can be good today and bad tomorrow, depending on something the recommender engine does not see. That makes it very difficult to tune and measure performance.


Because the recommendations aren't for you. They are whatever generates the most ad revenue for the platform, which roughly correlates with the amount of money advertisers are willing to spend to reach people like you. That is in no way a guarantee that the recommendations will be good for you, because the incentives are not aligned.


It's a tricky balance. The weaker the quality of recommendations, the more likely a user is to switch to another platform.


This falls apart when you look at platforms like Netflix. They recommend their original content because they don't pay royalties. The platforms aren't interchangeable though. If you want Disney movies, you aren't getting them anywhere else but Disney plus.


Sure, but what I mean to say is that, at a certain point, it's very easy to watch a Disney movie instead of a Netflix movie (or neither).


This a lot. They aren’t optimizing for the user most likely.


Do you think these are optimized for what YOU want or for what the company wants?

YouTube: Is that recommendation for good content or for the highest value content you will consume?

Netflix: The more you use it, the less they make. There is a perverse incentive to put just enough good content in front of you to stay subscribed but not use it more.

Amazon: They don't give a fuck what you buy. The sellers are now in a race to the bottom and that business pays for itself. AWS makes all the money.

Find the perverse incentive and optimize for that.


I have found YouTube to be good, but it gives you exactly what you ask for. It's like a yes-man, for better or worse (this seems to be extended to people who are making videos now, too). But once you are aware of that, it is not so bad; you just have to search for and like the right stuff. I like new, low view count videos like people doing $hobby with no commentary. That is mostly what I get now, mixed in with some other things that I like.


Excellent question. All the streaming providers only seem to suggest movies from my generation and absolutely nothing new. It feels like they have created an ultra generic advertising profile for me and only use that to recommend content. It's incredibly depressing. YouTube is slightly better but still extremely siloed. No amount of "algorithm training" seems to help.


When I log in to video platforms, I often spend some time bookmarking videos I want to watch.

When it comes to relaxing and watching something, I don’t like _my own recommendations_.

Maybe recommendation is like that friend who invites you to watch a movie—you know it’s a gamble. Haha.


I think it's partly just a reflection of how complex and poorly understood taste is.


Have some new ideas for that. Have a Ph.D. in math, and derived some math for that, wrote it out with theorems and proofs in TeX. Should do better than current AI. Have Web site code, running as intended, for that. Collecting data.


This is very exciting. Do you need beta testers? How do we track your progress? Ship it!!!


Thanks! Intend to announce here at HN. Yes, beta testers will be very welcome. For the feedback, guessing a Facebook page?


Have you used the tiktok one? It's too good for my liking. It can make weird inferences that no other algo can (this is actually why I deleted tiktok, far too addictively good)


For non-TikTok users, for what types of media is it useful as a recommendation engine?


They don’t seem good because they aren’t good.


They work well if there are lots of people like you (i.e. you are a normie).

All machine learning algorithms struggle at the edges — they’re very good at predicting aggregate behavior.

If you have eclectic tastes there’s probably not enough data on your demographic.


Do you use ad-blocking technology?


Agenda. They want you to consume what they want. Have you tried using search on YouTube? First three results somewhat related to what you want, followed by a section of shorts and the rest is complete bullshit.

...and I'll never stop saying this: they have some sort of monetized recommendation system in place. Can't prove it but I can see it working almost every week. A video from a big company, "celebrity" or TV channel that I'd never watch finds its way into my feed.


Would LLMs change anything?


They are not optimized for what you enjoy, or even want.

A passable analogy: you buy a car and get hassled, often hard-sold for a pre-paid maintenance package, tire insurance, financing insurance, undercoating, bla bla. You don't want any of it but it's what they push the hardest.


The biggest problem is explaining to people that companies don’t do what’s best for their customers.

It is counterintuitive, but yes, companies do what is best for them. Often that happens to be aligned with what customers want, but quite often it is not.

For example, loyal customers are more often ripped off because of vendor lock-in, while new customers get huge discounts.


> It is counterintuitive

Is it? This is how any (public) company works: they will do whatever it takes to maximize shareholder return. This inherently disregards what's best for the "customer." Any belief to the contrary is a naive assumption that doing what the customer wants == the most revenue, e.g. "the customer is always right."

Maybe in some fields, but definitely not in circumstances like this where the "customer" is the product (or more accurately - the advertisers are the real customers).

By extension, "companies don't always do what's best for their customers" also applies to their employees, maybe even more so. I'm very certain that the largest US companies would kill their own employees if it was legal and resulted in the most cost savings/profit for them.



