
Search engines tend to produce neutral garbage, not harmful garbage (i.e. small tidbits of data buried in an ocean of SEO fluff, rather than outright incorrect facts). LLMs tend to be inaccurate because, in the absence of knowledge supplied by the user, they will sometimes make knowledge up. It's plausible to imagine that the two will cover each other's weaknesses: the search engine produces an ocean of mostly-useless data, and the LLM can find the small amount of useful data in it and interpret that into an answer to your question.



The problem I see with this "cover for each other" theory is that, as it stands, a good search engine is a prerequisite for good output from RAG. If your search engine doesn't turn up something useful in the top 10 results (which most search engines currently don't for many types of queries), then your LLM will just be summarizing the garbage that was turned up.

Currently I do find that Perplexity works substantially better than Google for finding what I need, but it remains to be seen whether they're able to stay useful as a larger and larger portion of online content becomes AI-generated garbage.
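
To make that dependency concrete, here's a minimal retrieve-then-generate sketch in Python. The keyword scoring, toy corpus, and call_llm stub are all hypothetical placeholders rather than any real search or model API; the point is just that whatever the retrieval step ranks highest is exactly what ends up in the prompt, garbage or not.

    # Toy retrieval-augmented generation (RAG) loop: the LLM only ever
    # sees what the retrieval step hands it, so bad retrieval means a
    # confident summary of bad sources.

    def retrieve(query: str, corpus: list[str], k: int = 3) -> list[str]:
        """Rank documents by naive keyword overlap with the query."""
        terms = set(query.lower().split())
        return sorted(corpus,
                      key=lambda doc: -len(terms & set(doc.lower().split())))[:k]

    def build_prompt(query: str, passages: list[str]) -> str:
        """Stuff the top-ranked passages into the prompt as context."""
        context = "\n".join(f"- {p}" for p in passages)
        return f"Answer using only this context:\n{context}\n\nQuestion: {query}"

    def call_llm(prompt: str) -> str:
        """Hypothetical stand-in for a real model call."""
        return f"[model output conditioned on]\n{prompt}"

    corpus = [
        "Top 10 best gadgets of 2024 - buy now (SEO listicle)",
        "Unrelated press release stuffed with keywords",
        "Forum post that actually answers the question",
    ]

    question = "what is the actual answer to my question"
    top = retrieve(question, corpus)               # garbage in...
    print(call_llm(build_prompt(question, top)))   # ...garbage out

Swap in a real search API and a real model and the shape doesn't change: the quality ceiling is set by whatever retrieve() returns.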


> Search engines tend to produce neutral garbage, not harmful garbage (i.e. small tidbits of data buried in an ocean of SEO fluff, rather than outright incorrect facts)

Wasn't Google's AI surfacing results about making pizza with glue and eating rocks? How is that not harmful garbage?


It's not plausible to imagine that such a perfect complement exists.


You just described the value proposition of RAG.


Maybe it's just me but I have no interest in having a computer algorithm interpret data for me. That's a job that I want to do myself.


then you are blissfully unaware of how much data is already being interpreted for you by computer algorithms, and how much you probably actually really like it.


> how much you probably actually really like it.

This comes off as condescending. As things have gotten more algorithmic over the last two decades, I've noticed a matching decrease in the accuracy and relevance of the information I seek from the systems I interact with that employ these algorithms.

Yes, you're right that there are processing algorithms behind the scenes interpreting the data for us. But you're wrong: I fucking hate it, it's made things worse, and layering more on top will not make things any better.


What do you mean?


Ranking the results is a prerequisite for any search engine, and that's interpretation, isn't it?


No, I wouldn't say that. I mean they could try to influence what sources I check first, but I'm the one doing the interpretation.


So summarization is interpretation, but ranking and filtering are not? Most would disagree.


I don't think anyone can disagree. If you ask someone to give you an interpretation of the works of, say, Allen Ginsberg, or of the theory of relativity, and they come back with a pile of documents ordered in some fashion, you won't be satisfied because that's not what you asked for.


99.99% of all data is complete garbage and impossible for a human to sift through. Most spam email doesn't even end up in your spam inbox. It gets stopped long before that.


You prefer the spam-laden inboxes of yore? I sure as shit don't. Gmail's spam filtering keeps so much crap out of my inbox.


Sure, then just do that. No one will hold you at gunpoint and force you to use any kind of tool.


The criticism isn't that they are forcing me to use this tool.



