These figures show an extremely precipitous (and permanent) decline in traffic over the course of a few days in May of 2022 [0], during which the number of daily new visits dropped from ~1M to ~300K, the number of total daily page views dropped from ~20M to ~14M, and the number of daily sessions dropped from ~9.4M to ~6.1M.
However, there is no commensurate decrease in posts/votes during the same time period. Posts/votes remained relatively constant through 2022 (modulo normal seasonal fluctuations), until February 2023 when both fell off a cliff (I assume due to the rise of LLMs). Traffic data are sourced from Google Analytics, while post/vote data are computed internally by StackOverflow [1]. I wonder if the apparent precipitous drop in traffic in May 2022 is simply an artifact of Google Analytics suddenly changing how it tracks traffic/visitors.
From the comments on this answer (https://meta.stackexchange.com/a/391625/136010), it is suggested, and agreed by staff, that the change in May 2022 was the rollout of a proper cookie consent form. If you don't have performance cookies, SO can't work out the analytics.
From staff member Catija:
"@JourneymanGeekOnStrike Yeah, if you go back further, the "traffic" numbers see a 40M/week drop between April and May 2022, which is when the cookie tracking changed, and then normalizes again until December. So, prior to the cookie changes, traffic was about 140-150M per week. But, to be clear - this is stuff we're aware of and have "corrected" for, I guess."
The post doesn't explain where they got these traffic numbers, and it seems unlikely they have access to Stack Overflow's real traffic stats. They're using some sort of estimation here. There's always a chance that their estimates are wrong, especially if they're showing implausible shifts like this.
That makes sense. "New visits" are first time users, likely young coders who are looking up answers to things on a search engine, find what they're looking for on Stack Overflow, maybe click on an ad, and leave. They probably don't vote or post much. A sudden die-off there suggests something very bad happened to organic traffic (change in Google? Terrible new SEO scheme? A sudden stop in ad buys?)
The new content rate has been dropping at a dismally constant rate for a long time, but the first few months of 2023 were awfully grim. I wonder what might've corresponded to that.
If SO was worried about that drop I think they would have bought back some of that traffic. More likely something has changed how they count the visits or they blocked some bad traffic. Traffic data is often sampled as well.
The fall in the beginning of 2023 may be the introduction of ChatGPT. A more worrying idea is that the numbers reflect not just the decline of SO but a decline of the whole IT business.
Which would make sense, right? You are more likely to get an answer on StackOverflow for questions that touch very common technology (because more people are likely to answer). And that is exactly where Copilot probably shines too (I don't use it): because that is where there is a lot of training data.
I personally used to like StackOverflow as my last recourse: I grew up in those years where we had to RTFM, and I kept the habit. So if I go ask on StackOverflow, it is a tricky question. It used to be fine, and I was getting an answer eventually (sometimes after adding a bounty).
But in the last few years, I have had legit questions downvoted or even closed, and it was obvious that the people voting to close them did not even understand them. I agree that the moderation culture on StackOverflow is toxic. If every time I contribute something I have to fight not to get downvoted or closed, then I will slowly stop contributing.
The most help I ever got from SO for questions not already there, was because of their (perceived) strictness. The process of writing a high-quality question, with a minimally viable example, clearly lined out thought-processes, and other things tried, solved the question for me in most cases without me ever having to post it.
> The process of writing a high-quality question, with a minimally viable example, clearly lined out thought-processes, and other things tried, solved the question for me in most cases without me ever having to post it.
Nevertheless post the question and provide an answer. Everybody wins: you reap the upvotes, and everyone else benefits from the shared knowledge.
You don’t mark answers as duplicates, you mark questions as duplicates. And if it’s a duplicate question, the new answer should be posted to the old question. So it’s correct to mark the question as a duplicate. Otherwise all the people arriving at the original question won’t see the new answer.
My SO account is almost 12 years old, with just over 2k reputation, and I don't really care. Even now I still somewhat help answer basic questions in the mobile development tags; my only gripe with SO is the hostile nature of some mods with large reputation. Some seem to get a kick out of it and forget that reputation does not translate to expertise.
For 12 years, they have not figured this one out. New users will ask a very valid question and then won't respond anymore. I have seen this play out every single day. Back in the day, users were generous with upvotes even for a simple, basic question; this is not the case anymore today.
I think that with the rise of push notifications, no one really goes to a site to check notifications anymore. So the new user may have not developed the muscle memory to go back to SO and participate. I suspect this also has something to do with the decline of forums. Reddit still works because the app sends 200 notifications a day, but without it, I don’t think it would be as popular.
Also SO is participation hostile unless you’re a pro, so as a newbie I’m not going to do anything other than ask and lurk, because I’m not worthy
At least part of the reason for the hostility is that SO is a game. You get points, but you can also prevent others from getting points by voting down or removing their questions and answers.
On SO this hostility is pronounced because participants believe that if they get a lot of points they have easier time finding a well-paying job.
I don't know if we all do it the same way. I barely use push notifications for anything, because I don't want to be disturbed by random sites (least of all LinkedIn or SO).
Since SO is often used in a professional capacity, that problem could have easily been fixed by dev tools providing a formal way to link to SO traffic for topics that are relevant to the team.
It's just been a while since anyone has started trying to integrate tools with each other, outside of the established players.
These peasants with high reputation think they are Jon Skeet.
Reputation is meaningless and bloated on Stack Overflow now; there are many 100k-reputation people simply from asking or answering basic shit about javascript/python/pandas/git.
Every time I post on SO (or other SE sites), I have to clarify my question with something like "I know it's probably not a good idea to do A, and I understand B could be a better solution, but in my specific situation I really want to do A."
Then people will still try to close my question because it's a duplicate of B.
I've literally included the search terms I used to ensure it wasn't a duplicate. Other times I've explained why it is clearly not a duplicate. Nope. Closed for being a duplicate.
Sometimes it can be as simple as "version 2 of this software does things this way, but I'm using version 14, how can I do this?". "Closed as duplicate: [question from 12 years ago]".
I think the problem is Google losing the fight with spammers.
It's been a while now that I've had to put "stackoverflow" in the search query to avoid sites with scraped content.
Google is not "losing" any fight. Google is deliberately letting spam thrive because that spam may contain Google Ads/analytics and increases engagement on the SERP, as people who click on the spam go back to try something else (potentially one of the sponsored results). All of this contributes to Google's bottom line.
Problem is that in addition to people whose salary depends on it, there seems to be plenty of people out here defending Google and spreading misinformation despite having no obvious profit motive.
That's my guess too; I'm sure Google drives the overwhelming majority of SO traffic.
A few years ago, my programming-related queries would hit Stack Overflow as the first or second result. Now it's very frequently spammy garbage in the top 2-3 slots.
What kind of spam do you get when searching something specific and technical? Who is trying to SEO their way to the top for "how to set redis max memory"? A lot of comments here saying the spam is beating out SO, but what spam and from who and why??
Especially YouTube links. Sadly, it would not surprise me if these people are earning decent enough money from ads to make it worth their while to be "content creators" solely off search results from Googs.
It is both infuriating and sad that Google can't figure out a way to compensate for this SEO spam. Is there an easier problem than doing it for SO? (And yes, coding is a big enough problem for Google, imho, to be worth investing a little here.)
Which means they aren't applying any sort of primacy to the information.
If three segments of the internet think the same piece of information is relevant, that should affect the score of all 3 copies, not just the largest segment.
I'm not sure I'm reading you right--you're suggesting it should work this way?
When content republished on some bizarre/sketchy/unaccountable ~adfarm outranks the site where it first appeared, users of Google's search service end up at higher risk of getting phished or infected with malware.
Is there some benefit you see here that outweighs this downside risk?
Applying SEO to a copy of someone else's content gets you highly ranked on Google. I'm saying that at this point Google is doing enough processing that they should be able to detect duplicates after a fashion, and weight the oldest copy more heavily than duplicates.
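To sketch what "detect duplicates after a fashion" could look like (a toy illustration of mine, not a claim about what Google actually runs), near-duplicate pages can be caught with something as simple as word shingles plus Jaccard similarity, and the earliest-crawled copy in a near-duplicate cluster could then get the ranking weight:

    # Toy near-duplicate check via word shingles and Jaccard similarity.
    # Purely illustrative: not what Google actually does.
    def shingles(text, k=5):
        words = text.lower().split()
        return {tuple(words[i:i + k]) for i in range(max(1, len(words) - k + 1))}

    def jaccard(a, b):
        sa, sb = shingles(a), shingles(b)
        union = sa | sb
        return len(sa & sb) / len(union) if union else 0.0

    # Pages scoring near 1.0 are near-copies; the oldest copy in the
    # cluster would be the one that keeps the ranking weight.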
Well, if they served up a high-quality site, you'd just go there, and it might not even have ads. Whereas the dozen SEO garbage sites they do serve up are all hosting ads Google gets a cut of.
That's a very short sighted business strategy if true. Simply liquidating their reputation. Those junk AI results have certainly led to me using Google less.
> That's a very short sighted business strategy if true. Simply liquidating their reputation.
Why should they care? They're too big to fail…
Google controls almost the whole end-user realm through Chrome & clones, and Android, the dominating end-user OS by a wide margin.
At the same time, end users are completely helpless and can't do anything against Google's liking because they don't understand anything about IT tech.
Computers are black magic to most people, so they're trapped. This never changed! Especially millennials and gen-z are completely clueless, as they never had the chance to use personal computers, where you had at least some control over the device and needed to know at least some basics about its inner workings. All the younger people know are the tightly sealed black-box devices you don't have any control over, called mobiles, which are fully operated by big tech. Google search + Android apps are "the internet" for most people. They mostly don't even know there is something beyond that, so Google can do whatever they want, and by now this has exactly zero consequences for them.
Google's move to roll out "browser DRM", the next "trusted computing" initiative, regardless of what anybody thinks about it, is very telling.
Now they will violently reap the fruits of their monopoly, and likely nobody will be able to stop them in the next decade. People were warned about the consequences of this monopoly for many, many years. Nobody cared. Now it's payout day for Google.
When do you think millennials were born? The very youngest millennials were in their tweens when the first iPhone came out, and the oldest were pushing 30. They definitely experienced pre-smartphone computing. In fact, it's probably the defining characteristic of the generation: millennials grew up with modern computing, but before the smartphone. Gen Z grew up in a world where smartphones were ubiquitous.
I think we're rapidly approaching a point where any content that comes with ads is suspect. The fact that only Wikipedia has managed to largely escape deterioration (or as some call it, "enshittification") is a testament to this. A search engine that can selectively search non-sponsored content or soft-paywalled content would potentially be quite popular. However, monetising such a service without ads will be a challenge.
What’s interesting is that Google was known for how hard it was to figure out the Google algorithm.
Remember when people were hired because they knew the secret sauce for getting the best Google ranking? Google experts?
Well, it turns out that the person at Google that was responsible for keeping the algorithm fresh and the search results fresh retired and everything went to shit when they left.
Actually, I'm betting that person did leave the company, but the real damage happened when someone came along and convinced everyone they knew the real trick to better search results, and now we have the shit that is Google. Nice work, new guy! Let me rephrase that. Nice work to the guy who thinks they are smarter than everyone else and still thinks their approach is the best, despite evidence to the contrary.
Really sounds as if your made-up story is deeply rooted in your own experience. I am sorry if something like a new guy taking your position and claiming to be smarter has happened to you, but creating imaginary stories is not quite what this comment section needs, and you'd probably be better off dealing with this in a different way.
Not at all. Nothing personal. Although, it looks like you are the one self-projecting here.
It’s simply how times change and people with it. Knowledge is lost when people move on and the reasons why certain decisions were made are not transferred.
At any rate, I imagine people at Google are trying to figure out why there is such a negative opinion on their search results lately.
Matt Cutts was instrumental in community outreach and helping SEO differentiate from spam. When he left, Search pivoted to stuff like using Twitter data and lifting content directly from websites into results. While it’s probably hard to attribute all the changes to one person, Matt Cutts made a huge impact on the product.
I left SO because I was downvoted to oblivion for an answer that took me 2 minutes to write - but I had answered a similar question several years before (which I actually didn't remember). Searching for my own answer would have taken way more time than it took to write a new one.
When I pointed out that it's not the responsibility of the one answering to search for dupes, but for the one asking, I was told that I should still invest the time or otherwise don't answer at all.
Yes, especially if you know you have answered the same thing before: you look for your original.
Remember, all users are moderators. There are some explicit moderators, but they don't close or downvote often; they deal with other problems, or on smaller sites just use normal user powers to vote and close.
Then SO's reasons for this policy need to be explained more.
The aim of SO is to provide answers to a question.
You do not want many questions with the same answer, because if you have a new answer, or a comment on one of the duplicated answers, you then need to add it to all the questions. Thus we want to collapse all these multiple questions into one.
Also, the person I was replying to did not seem to understand that they were a moderator; moderators are not a separate set of people from users.
Citation needed that it actually provides better, more consistent data. All it leads to is a flood of closed questions in Google search results. No curation whatsoever.
I'd also like to highlight "non-hostile" as a reason why folks might prefer ChatGPT.
Stack Overflow has a lot of stridently opinionated jerks contributing to it, and if I can just ask ChatGPT a question and get an answer that works rather than having to deal with being belittled by those people, then I'm probably having a much better day as a result.
This post, to me, is about the rise of ChatGPT — but I do think over-moderation is a huge problem.
I had a hard moment on the gamedev stackexchange where I was stuck trying to learn how to do something in OpenGL. A moderator immediately closed my question as a duplicate because there was a similar question about OpenGL ES, which is a (related but) different API. I tried to plead my case, but was shut down.
Shortly after that, I gave up on the game I'd been working on for a couple years. The mod's decision contributed to that.
I felt stuck by a wall between me and answers to some of my game programming questions. Over-moderation is more than an inconvenience. It can destroy the ability of users to get things done.
The graphs in the post show the traffic decline starting around May 2022, months before ChatGPT was available. I'd wager the cause is a change in Google's algorithm. Most of the time I end up on Stack Overflow, it's because I've typed a question into a search engine.
The top search results used to be either an SO answer, a forum post, or the actual docs containing the answer to the question. These days it's either a dupe site copy-pasting it verbatim, a regurgitated and slightly modified variant of the former, or an "AI"-generated answer, all full of ads. And to make it worse, none of them are useful, as they obfuscate the answer or are simply wrong.
It looks like what mostly changed is that Google started to prioritise ads even more over actually useful results.
This is a problem on the other side of the experience spectrum too. Sometimes I want to ask an advanced question and interact with other experienced users on SO. However I have to battle the mods (who clearly don’t understand my question) to keep it open.
My questions usually go unanswered for years with several "me too"s and "did you ever figure it out?"s nailing my inbox.
I do typically self-answer if I figure it out, but you know, if I'm going to be ignored maybe it should be a github issue so I can get the sweet zero replies and that juicy 90 day auto-close from inactivity.
I remember trying to learn front end development around 2013, was fascinated by responsive web design and twitter bootstrap. Asked some questions on that site, was mostly ridiculed for my amateur questions several times, never touched the site again and also never learned front end. So this is my story with that site.
Same. I got put in Stack Overflow jail for posting my contribution as an answer because I didn't have enough karma to post a comment on a previous answer (or maybe it was the other way around, I forget). Never mind that I was earnestly trying to help the original poster and pointed out a legitimate mistake in one of the answers. I broke protocol and had to be punished.
Not quite true, though indeed getting upvotes seems not to be as easy as it once was. I still get an upvote on some of my relatively recent answers once in a couple months, although it depends on the answer.
The quick way to accumulate reputation is through bounties. However, it's very much a lottery: bountied questions are often about ultra-specialised niche topics. You may need to hunt for a long time to find something you actually know something about.
You don't need any karma to answer; any logged-in thing can answer. I say "thing" because ChatGPT is being used to produce a load of crappy, wrong answers now.
Thus I don't understand your issue. This is an XY problem :) I know enough about the subject to know that your issue is not the actual issue, since anyone can answer; if you had issues, then something else was going on.
Ask a dumb question about a trendy JS framework and you get hundreds of votes.
Answer a difficult question on a barely documented part of software (e.g. low-level) and you'll get a couple of votes, at most. And you're lucky if the answer gets accepted.
There are a few unsung heroes on certain hard/obscure SO tags. They dedicate a lot of time and get little reward. Whatever follows SO should find a way to fix this.
The problems I’ve noticed with Stack Overflow are a few and hard for me to narrow down but basically:
- google used to return really relevant results for SO, and it stopped doing so at some point a while ago
- moderation on SO has gotten progressively more horrible. can’t tell you how many times I found the exact, bizarre question I was asking only to see one comment trying to answer it and then a mod aggressively shutting it down for not being “on topic” enough or whatever.
- because of the previous bullet, oftentimes the best answer is buried in comments and has very negative feedback despite answering the exact question
Due to a combination of these things, filtering against the noise for what I wanted became increasingly more difficult and often the solution to my problem was easier found searching github comments or random blogs.
> - google used to return really relevant results for SO, and it stopped doing so at some point a while ago
SO might be horrible now, but it still holds years worth of answers that were just fine a few years ago - so why aren't they showing up now? Google's current recommendation of going to w3schools or - even worse - geeks4geeks or any other content farm is and always will be worse than stackoverflow. I don't have a clue what their algorithm is doing but it's surely trying to kill Google search as fast as possible.
Another joke is the fact that searching for "[language] [symbol]" also brings me to these content farms instead of the documentation. You seriously can't find anything useful these days using Google.
This whole situation exposes everything we hear about SEO as a lie. Stack Overflow has the exact text, it loads incredibly fast (it should be commended more for this), doesn't require ten megs of JavaScript to render, and as far as I know generally meets HTML standards.
These scam sites load megabytes of junk, load slowly, have text interspersed with ads and modals that render right on top of them, and if you open devtools you'll often see pages of warnings about deprecations and/or invalid HTML, yet despite having the same scraped text they always score higher on Google.
It has long been a mantra in SEO land that user generated content sites in general and forums in particular are to be aggressively down ranked. The reason for this is that industrial strength spam farms otherwise spin up tens of thousands of forum domains to pass link juice to what they are targeting. This naturally penalizes real forums, which often contain the best content for a query.
This is why Google has basically surrendered and why so many search result categories are now dominated by whatever sites Google has arbitrarily declared the winner through editorial decision making. In many search categories, we are effectively back where we started with Yahoo directories and hand-picked search rankings. What you see on the first page SERP is the best that they can do under the circumstances based on the fundamentals of how search works.
The web was a fun idea while it lasted, but if you are using it as a primary information resource, you are wasting your time.
Funny, I just rented an introductory book about signal processing and was (re)amazed to see how well the information is explained, with tons of examples, and a real plan to guide you through the ton of knowledge you have to master.
Well, those sites are often running ads provided by Google, so it's understandable why Google doesn't really have a good incentive to follow your first suggestion.
They do penalize heavy ads. There have been shenanigans on this front also (penalizing non-Google ad networks and favoring Google ad networks). They do penalize some content farms and favor others. The issue they are concerned about with SO and MDN among other user generated content sites is covert seeding at scale for the purpose of manipulating search results.
There's just a lot of fraud on the internet related to search and advertising manipulation, but it's under-policed, in part because of its internationalized nature and because it is hard to bring fraud cases in the United States due to the particularized pleading standard. That should not stop the feds from bringing criminal cases, but generally the feds care about large-dollar-value frauds (as they probably should) rather than policing very large numbers of small-dollar frauds that have a major aggregate impact on the online economy. They like going after the guys who steal $100 million from deaf children with lupus rather than doing 200 $500k fraud prosecutions.
Anecdotal, but my website lost a lot of search traffic after Google's core update in March, which seems to have affected SO as well looking at the first chart.
If I look at Google's guidelines, my articles follow all of them: in-depth, well-researched, demonstrating personal experience, better than other articles appearing in the search results. And yet, they were "penalized" by this update for who knows what reason.
I looked into it and some other websites benefited from the update, so who knows what changes they made and why.
Again anecdotal, but I lost the majority of my SEO traffic late last year around the time of a core update. I've spent the best part of a year attempting to repair it, on the assumption I'd committed some heinous SEO crime. The more time that passes, I'm starting to think that the issue isn't mine so much as Google's. It's baffling. I wrote about it here:
I don’t see what you did wrong, it must have been the algorithm change. My parents had a business that was killed by a Facebook algorithm change. My brother took a significant hit from an Amazon algorithm change. Building a business around any of the big tech companies seems very risky.
I think Google search has just declined a lot. I guess they’re losing the constant cat and mouse game with SEO. It seems worse than it has ever been, I’m relying more on ChatGPT and copilot now.
I can only imagine that LLMs will be the end of any content based search ranking. I don’t know how they’ll adapt to that.
"If you operate a paywall or a content-gating mechanism, we don't consider this to be cloaking if Google can see the full content of what's behind the paywall just like any person who has access to the gated material and if you follow our Flexible Sampling general guidance."
Does Google run other indexers for the purposes of catching cloaking? Are there other strategies that can be used? One of the problems of SO is that most of the valid content is out there and easily available without having to scrape the site which may mean penalizing for bad content is harder.
Does it even make sense to serve different content to a bot than what a human would see? Isn't the search engine trying to rank content made for humans?
It's an adversarial process. The search engine is, in theory, trying to rank by usefulness to the user, and the site owner is trying to maximize revenue by lying to the search engine. And the user.
I'm generally puzzled by Google's reluctance to do manual intervention in these cases. It's not like this is a secret. Just penalize the whole domain for 60 days every time a prominent site lies to the crawler.
There are very many sites where the content you see as a non-logged-in user is different from what you see if you have in your possession an all-important user cookie.
If Google's support is any indication, Google doesn't like to involve humans in their processes. There probably aren't enough humans to do the manual intervention you propose.
Eh, Google chooses to be identifiable as Googlebot and to obey robots.txt for other reasons of "good citizenship", because not everybody wants to be crawled.
Google is really failing hard in this regard, and I'm fairly sure it's intentional on their part. Searching "Typescript array" has obvious intent from the user, and an obvious "correct" first result. Google returns the documentation page in the 3rd result, but it's a link to a deprecated version of the page. The rest of the above-the-fold links are websites that contain Google ads.
DuckDuckGo returns the up-to-date documentation link 2nd and the MDN result 3rd, with W3Schools 1st. Bing returns actual content on the results page, describing exactly what you need to understand a TS Array.
Google has an incentive to push the poor sites, because they earn revenue from doing so. Bing and DDG don't have that incentive, and return much more relevant and useful links. That doesn't feel like a coincidence.
I spent years learning a programming language well, then further years delivering a training course, iterating, and then providing sections of the course free online on the website, both as advertising and to get new people started. Your "typescript array" search returns in the top 5 one of the sites that basically copy-pasted, via thesaurus, many of my articles. I checked, and it turns out they offer $50 for people to submit content for any language/technology. So you have someone in a cheap country paid to go copy content and reword it on that site. Then they rank higher than you, as they do this over many languages and thus seem more authoritative. Even more worryingly, with ChatGPT they won't even have to pay the $50 any more. So the whole internet may become like this, leaving me little incentive to publish material except that which solely entertains myself. Mmm, facebook/twitter = not a good outcome.
I have a friend who does something similar, but only does video, with the text gated behind a paid-only site. He makes pretty good money, but the exact reasons you listed are why the site is paid-only. They have a much harder time stealing (as in posting as their own content) the video.
I also notice and appreciate that Kagi returns older results while Google continues to push newer webpages. I have found so many useful results from perfectly fine content on older webpages. At this point, I’d be extra happy if Kagi had a Web 1.0 filter that focuses on basic html websites.
Yes, Google search is nowadays, like everything else, run by AI. What nobody tells you is that the AI is trained to maximize Google's revenue. That's why they figured out it is better to put these ad sites on top.
Failing at what though? Is it anything they care about, that they want to do?
If not, then it's not so much failure as it is a change of plans on their part. They don't want to do that anymore, and there's no one else to pick up the slack.
There are browser extensions for blacklisting domains from your Google searches. I've been so incredibly happy using one of them. If I see one of those despicable content farms, I just blacklist it and move on. Often when I search on Google for technical stuff I only get 2 visible results on the first page: 1 SO and 1 documentation. Soooo relaxing.
The business reasons why Google doesn't take steps to remove the bad content and make their product pleasant to use again are so far from my understanding that it might well be aliens running the company for all I know.
My understanding is that Google has an incentive to send people to content farms because those farms will show Google's ads. Stackoverflow doesn't. So they can increase ad exposure.
Thinking of it, it would be an interesting test to compare the ranking of two similar sites, one with google ads, another with ads from another provider. Might be good evidence for antitrust litigation. But what do you do if they just prefer sites with more ads? Because due to their market position, that benefits them, but it isn't anti-competitive against other ad-pushers.
Maybe you're correct. I've heard that explanation before but it just seems too incredible that they'd undermine their monopolistic global billion dollar business for a measly share of the revenue of geeks4geeks.
The way you phrase it there makes it sound minuscule, but scale that up to the size of the SEOified internet and the numbers are surely into the billions.
I was thinking the same. Taking into consideration the vast amount of such SEO farms, there's surely a lot of ad money to be spent/earned if you prioritize the "right" sites.
I don't think it's an intentional decision anyone has taken, or that they intentionally made the search engine work the way it does now; it's more of a "there's nothing wrong here from our perspective, so what's there to fix?" kind of thing.
I've been using Kagi[0] for a while now and it's pretty great in general - but also has options to boost up / down / totally ignore certain domains. It also has "lenses" that let you set a context (example: I'm searching for code stuff so just include sites a,b,c).
It's really good and IMO more than worth the price.
Yeah, my Kagi list of content farms / SO clones which are completely dropped from all results keeps growing. On the other hand, searching just SO from Kagi still seems to give decent results.
Your experience matches mine. Spend two or three weeks blacklisting sites as you hit them and they disappear.
Some people argue that Google possibly can't win the fight vs spam sites, but obviously it works perfectly fine manually blacklisting them.
It takes time to build ranking.
The underlying reason is probably that the spam sites use Google Ads (revenue which is tied to thousands of PMs' and managers' bonuses) and that Google as an org is deeply dysfunctional at this point.
Surely, they do. But they reserve that for stuff that's really way beyond the line. For everything that might be legitimate they leave it to the ranking algorithm to sort out and it's a game of cat and mouse.
Anecdotally Wikipedia is often the top result for me ... with the twist that it's the Google widget, with the side bar and related videos.
Only way below this block (which takes about 120% of my whole screen's height) come the "organic" results, that aren't great, but probably match what Google assumed I wanted to see.
Then consider using DDG & '!w term', or some other method (searching Wikipedia, extension, I think Firefox has something engine-agnostic built-in) instead?
Oh I do use DDG. I think it might suffer from the same problem, actually, since now I'm wondering why I see Wikipedia results very far down in Google when I don't really use Google.
I never really liked StackOverflow. The only questions they seem to allow are “How do I get the length of a string in Python?”. Most of the problems where I am scratching my head and really need the benefit of somebody’s experience are software selection problems that aren’t allowed.
The competing answers paradigm is also fundamentally broken, I don’t want to see 15 answers to “How do I get the length of a string in Python?” I just need to see
len(x)
Programming splogs do better than SO does in this respect. In fact, even the Q/A paradigm is bad, because the average SO post requires scrolling past at least one extensive code example that does not work.
For more than 10 years I thought the world needed a search engine for programmers. You really ought to be able to upload your POM file or equivalent and have the system automatically search the correct version of documentations. (Any attempt to look up things in the Java manual has to be written like “JDK17 javadoc {className}”; Javascript libraries like reactstrap, react-router and such often have a few wildly incompatible versions and I don’t want to waste a millisecond with the wrong version doc, …)
I wouldn’t mind searching answers from stackoverflow but I only want the best correct answer and I don’t want to read a long confused question, etc. As this would clearly save coders time maybe they’d pay for a subscription as they do for Jetbrains tools.
Years ago, Google announced they would crack down on content farms, and SEO advice was really like "you NEED to have this meta tag if you duplicate content from elsewhere, or else Google will fuck you over HARD!", but it seems they earn more money off of content farms than off the sources.
This will hurt them long term I presume, but they won't care because they earned money.
Google's current recommendation is usually heaps of Pinterest randomness, and then they wonder why people start relying on ChatGPT. "Oh, it's not a search engine" - sorry folks, Google isn't one (anymore) either.
Google has gone down the drain. As I've written recently somewhere here, they could easily fix their search by hiring maybe a dozen people per country to moderate common search request results or to, hell, listen to users like here, and respond by booting the scammers.
The problem is, they won't, because active moderation beyond responding to legal (DMCA, right to be forgotten, anti-CSAM) demands would massively endanger their "we are an impartial search engine" defense.
It's been over 10 years and it still endlessly frustrates me that searching for any Ruby or Rails documentation will send you to an APIDock page for Rails 3.2, and you basically have to goad Google into giving you the official documentation for either.
I suppose the real frustration is that Google became so pervasive that bookmarking a website and using its own search functionality is a total afterthought.
Try Kagi (kagi.com). SO answers are almost always the first ones for my geeky questions (as they should be in most cases), and it also extracts and displays the official answer to the question that best matched your search.
Try out Kagi Search. You can manually increase a website's weight and completely block others. E.g. I have increased Stack Overflow's weight and blocked those stupid content farms. Works great.
Conspiracy theory: Bad initial search results forces people to search more often, hence allowing google to show more ads. Since few people switch away as a result, they continue doing this.
This is a bit like the cosmetics industry. There are very clearly probiotic solutions to body odour that could be developed with the coins down the back of P&G's sofa cushions, but if you fix everyone's body odour, then how are you gonna sell them anti-perspirant from now to the end of time?
Now, in an ideal world competition would solve this problem, but the cosmetics companies heavily collude and anti-compete to prevent this.
This is where I want to remind you that Stack Overflow is a Q/A site that sometimes contains content stolen from, as you put it, so-called "content farms" and from the official resources.
Now, I do have a Stack Overflow account as well, but I actually prefer publishing my ideas on my own site rather than helping build someone else's content farm for free. Stack Overflow is, itself, a content farm, and it can be very hard for new users to join the site. You cannot even post comments without first earning enough points. For a very long time I actually resisted joining the site for that reason. I have only recently earned enough points to comment.
Now, I happen to own a so-called "content farm" too, and the choice is either to create a standard blog with very little traffic or to try and cover everything you can possibly think of in order to compete with other "content farms" in your niche. It is very difficult, if not near impossible, for a single individual to create a valuable resource and maintain it, and it is simply not sustainable if you have paid authors working on it as well. There is no way you can monetize it decently. Stack Overflow probably found a way around this problem by simply leaning back and monetizing their users' content.
Once your site grows big enough, you also deal with a ton of spam- and hacking attempts. Everything combined just requires an inhumane amount of time to deal with.
Of course, authors are desperate because of how difficult it is, and perhaps especially authors from poor countries that might not have other sources of income. Their basic business model seem to be: create a content farm with ads, fill it with copy-written spam and hope Google indexes. Often these sites even have multiple authors, which is quite baffling given the extra expense it must create for them. But I do not think they have actually thought the idea through – because it is just not profitable.
Weirdly, it's often in the technology niche, which they are clearly not proficient in, with the content more or less consisting of stolen solutions and little original material added.
I have seen a few sites like this, rife with some of the nastiest grammar too. It is interesting that they are able to rank simply based on their volume. Of course they must be using blackhat techniques, including link building if you analyze their link profiles, because there is no way that something so poorly designed and maintained gets that much attention compared with official sources or Stack Overflow.
For those of us who own blogs, such sites are often easily outranked simply by writing a comprehensive article on whatever tiny topic they have posted about.
Yes, if you cite a solution, the mods there get angry when you don't copy-paste the third-party site's content instead of just linking to it. The stated reason is to make sure the content isn't lost. In other words, to ensure the content is duplicated on SO.
I have no allegiance to SO ownership so when the fake SO sites show up in results instead of SO, usually reading them will just give me the answer more quickly than finding the actual SO source.
They want enough of an excerpt so the answer doesn't become useless years later when someone redesigns their blog URL schema or shuts it down. That's reasonable, and probably falls within fair use.
>mods there get angry when you don't copy-paste the third-party site's content instead of just linking to it...
There's a good reason for that. Sites come and go and as a result links to solutions die and you wish someone had just answered the question instead of just linked to it.
> it still holds years worth of answers that were just fine a few years ago - so why aren't they showing up now?
Are they just fine today, too? To judge that, you have to look at the date of the question and its answers, make an educated guess at what OS/language/library versions they are about, judge whether that makes a difference for the version(s) you’re using, and only then evaluate whether the reply even was correct at the time (it may have had a thousand upvotes, but still be dated)
I think a really good Q/A resource would require posts to be tagged with version info. Most people think manual tagging isn’t fun, though, so it’s hard to get such a set from volunteers.
An alternative would be to require test cases that the site can run to check what version(s) replies are valid for, but writing such tests that do not break over time is hard, and, again, in general volunteers don’t like writing them.
That leaves generating tags or test cases. I don’t think we’re there, quality wise, to do that.
I was involved in SEO-related projects some time ago; not that I'm an expert. I've heard Google understands that the site is a search engine and does not index it. However, it should be smarter: do not index SO's search pages, but do index question pages, because the original content is there. SO might have run out of the crawl budget which Google assigns to each site, and/or Google prioritises fresh content. But I agree with the sentiment: what we know as SEO is nothing more than playing games with Google's indexing algorithms, based on rumours about the recent changes in them, or improving page performance beyond reasonable boundaries. The other day I was looking at apple.com internals and spotted a few things which we were "fixing" on our pages. I asked SEO experts "what is the point of doing X, since there are examples of a well-indexed page having that same problem?". And the answer was like "when we will be as big as Apple…"
uBlacklist can help by culling the spam results. While it's mostly a manual thing, it's fast and easy, and a little effort goes a long way, I've found.
Unfortunately, it still doesn't solve the issue that sometimes the good results are buried pages away, or simply don't come up at all due to Google's shitty algorithm.
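In case it helps anyone: uBlacklist rules are just match patterns, one per line, so a ruleset is nothing more than a few lines like the sketch below (the domains here are placeholders for whichever farms you keep hitting, not a recommendation):

    *://*.example-content-farm.com/*
    *://*.scraped-answers.example/*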
Is that actually true though? So, I literally just went to Google this morning for a toddler python question (very much not my first language, heh).
"how to load a file all at once in python" returns a first hit pointing to a blog post answering the question correctly, a second pointing to a SO answer that is actually for a slightly different problem but contains the correct answer, answer #3 is a youtube video that probably answers the question correctly.
Geeks4geeks doesn't show up until #4, well below Stack Overflow. (FWIW, their answer was fine too).
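(For reference, the one-liner all of those results boil down to; a minimal sketch, with the filename as a placeholder:)

    # Read an entire file into a single string in one go.
    with open("example.txt", encoding="utf-8") as f:
        contents = f.read()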
> You seriously can't find anything useful these days using Google.
That really feels more like a meme than reality. Are there other subareas where the SEO is doing better than this one? It seems like a pretty representative question.
The answer is that the content farms are doing a better job of interacting with Google's algorithm than SO. Of course it is a problem with Google search, but search was always hackable. The made-for-Google sites know very well how to play the game.
I wonder if Google should make their SEO prevention worse but simpler. Everyone has always wanted to SEO for Google, as long as Google has been around. It has seemed like only recently that good sites predictably lose.
Perhaps 10 years ago Stack Overflow was able to do some minimal SEO and then get by on content strength. Perhaps nowadays Google is doing a good job preventing basic stuff from working, so the only people to get good results are SEO-ologists that only know about exploiting SEO, and have nothing interesting to say on any other topic.
I think the answer is simpler. To rank well on Google you need to integrate with Google (search console, analytics and similar). I guess SO is not giving all their data to Google, so they cannot "optimize" for the site in the way that content farms are willing to.
I think they need to bypass Google somehow to keep it going. Embracing LLMs could be a way out.
I already go to ChatGPT to cut through the SEO-optimized crap that Google offers me in the first couple of result pages. I would bet that a lot of the responses given by ChatGPT come from Stack Overflow.
Now, what if we had StackGPT, which offered similar functionality to ChatGPT, but better? E.g. respond with some code and an explanation, but also link to the sources (which are probably within their site, so they have prime access to them). Or offer, as an explicit option, to respond using sources other than their archive, but perhaps without citing sources.
My theory these days is that indexing services like Google are now too big to work properly. There's more and more noise added every time new information is indexed, to the point where strong bias is necessary for it to return relevant results to the average user.
Maybe there's a point where the internet, with decades of old information piling up, becomes unbearably big for indexing services to handle all of it in an efficient manner. Hence the recent "optimizations" that companies swear haven't worsened searchability.
1. Respect exact match searches - this used to work by enclosing the search terms in "" quotes, but no longer does. If there are no exact match results, return nothing.
2. Allow blacklisting or removing results from certain websites entirely e.g. I want to be able to configure geeks4geeks to never show up in any results ever
If someone could make this new search engine they would have a good shot at replacing Google :)
Both features exactly as described already exist in Kagi search [1] (founder here).
We are not trying to replace Google though, but offer an alternative to people who care so much for the quality of their search experience, that they are willing to pay for it.
You won me over by summarizing listicles to a short list :-)
To be honest, I think your pricing is too high. $25 for unlimited queries might be fine for somebody who needs good search to work and earn appropriately.
But as a (former) PhD student I ran through the 100 free queries in 2 or 3 days and just would not have been able to afford 25€.
I would gladly pay 10€ (for unlimited searches) or 15€ (for an unlimited family option). But to me, 25€ just seems too high. That's 5 meals at my workplace's cantina right now (Germany, NRW).
(I assume you are aware of pricing issues as pricing options have changed at least once while kagi is on my radar)
Thanks for listening! At $10 per month unlimited searches I'll immediately switch.
Also, thanks for creating Kagi. Kagi was the first "alternative" search that convinced me that there can be competition to Google. YaCy just does not work, and most competitors (DDG, etc.) just repackage the big engines. I use Presearch as my daily driver right now, but am somewhat put off by the NFT shenanigans behind it. Kagi looks like the only engine that stands on its own, and is definitely something worth paying for.
I'm sure everyone has thought of this, but is any search engine trying to add LLMs to the crawler pipeline? That might be more useful than at the user side (like Bing) where the index is already polluted.
> moderation on SO has gotten progressively more horrible. can’t tell you how many times I found the exact, bizarre question I was asking only to see one comment trying to answer it and then a mod aggressively shutting it down for not being “on topic” enough or whatever.
Related: I've often been looking how to do X, find an SO question asking that, but the answerers there refused to answer until the person explained why they wanted to do X, and then all the answers (correctly) told the person that they actually needed to do Y and explained quite well how to do Y.
I actually need to do X, so those answers are useless to me.
Then I find another question on how to do X, and the mods close it as a duplicate of that earlier question. Even when the questioner specifically notes in their question that it is not a duplicate of that earlier question because they really need to do X the idiot moderators close it.
I ran into that a lot when I was doing low level firmware programming. The answers to someone's question would be something like "that feature is only intended for ultra-specialized low level programmers". And it's like "in this case, I am an ultra-specialized low level programmer".
I'm thinking of things like assigning constants to pointers in C and/or manipulating pointers directly.
> Related: I've often been looking how to do X, find an SO question asking that, but the answerers there refused to answer until the person explained why they wanted to do X, and then all the answers (correctly) told the person that they actually needed to do Y and explained quite well how to do Y.
> I actually need to do X, so those answers are useless to me.
I know what you mean. Whenever I (rarely) ask a question on Stack Overflow, I always have to defensively load it up with language anticipating misinterpretations and instructing people to answer my question and not some other one.
Otherwise, internet-point-chasers will come out of the woodwork giving easy, worthless advice. Even with all the defensive language, a few always show up.
I've never really thought about how contributors trying to avoid the XY problem really stand at odds with StackOverflow's mission of being a repository of answers rather than a helpdesk. Not all Ys present as X, and not all Xs are actually Ys. Sometimes it's an XZ problem.
The best you can hope for is some answer down the page that says something like "to answer the actual question..."
The mods are so ubiquitous and so busy on SO, I wish they'd spend some of their time silencing the "let's figure out what your real question is" pseudo-trolls.
I call them pseudo-trolls because I think they are well-meaning, but they function as trolls: overrunning a web site, hijacking discussions with repetitive and irrelevant content, and making most potential users feel that participating isn't worth the time and effort of interacting with them.
Even if X isn't the right solution to my use case I still often want to know _why_ X (or my implementation of X) doesn't work. The answer to that might be a really valuable learning independent of the problem at hand.
The very popular Zalgo answer is a perfect example of this problem [1]
The user asks a question that can be answered quite easily, and dozens of people post answers claiming that this is the wrong way to do it and that they should use some other tech to solve the problem.
Some people on Stack Overflow care more about showing off how smart they are rather than answering questions, and I think the point system attracts these people.
Hilariously, the accepted (I assume by default, not by the asker) answer is flagrantly breaking, like, half a dozen rules and guidelines… but because it’s cynically and unhelpfully crapping on a newbie, it stays up. Or maybe there’s a good reason for it to stay up, but at a glance it sure isn’t a good look.
I actually think SO is a great site and resource, but I also think a lot of that is despite the bitter old timers in the community, not because of them.
That answer is only there because it's really old, from the early days of S.O. where people were allowed to ask questions that weren't super serious binary yes/no style. It'd get moderated and deleted in a heartbeat today. A forlorn monument to the cool place that S.O. once was
Maybe I’m just a fun-hating asshole but personally I find this kind of thing annoying, not cool. People are just trying to get work done, not see someone’s attempt at cringey “nerd culture” humor.
That is not what most of us complain about I think.
I, and I think many others, are sad that S.O. removes many serious work-related questions (I have lost count of how many times I saw the perfect question with the perfect answer, with a note that this isn't what Stack Overflow is made for and that the question only exists for historical reasons).
Oh come on, it's not crapping on a newbie. It's a funny comment that serves as a reflection of the days and weeks this guy spent debugging these kinds of systems.
As someone who was a newbie at the time when it was posted, who was looking for a way to parse HTML, I took away that it's just really the wrong way to go about it. I didn't feel crapped on at all.
The question is about tokenizing XHTML, not parsing it into a tree structure like a DOM, which is a critical distinction. Regular expressions are a perfectly valid way to tokenize. This is why the snarky answer does not suggest a better solution - there isn't one!
If you scroll down long enough, you will see answers explaining that. But they aren't upvoted as much as the answers suggesting the questioner is an idiot.
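To make the tokenize-vs-parse distinction concrete, here is a rough sketch of mine (not from the thread): a regex is perfectly capable of splitting markup into tag and text tokens; it is only reconstructing the arbitrarily nested tree that regular expressions can't do.

    import re

    # Toy lexer: split XHTML-ish markup into tag tokens and text runs.
    # No attempt is made to build or validate a nested tree.
    TOKEN = re.compile(r"<[^>]+>|[^<]+")

    def tokenize(markup):
        return TOKEN.findall(markup)

    print(tokenize('<p class="x">hello <b>world</b></p>'))
    # ['<p class="x">', 'hello ', '<b>', 'world', '</b>', '</p>']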
The question isn't looking to parse HTML (or XML). Regexes are inappropriate for HTML because they can't adequately match the starting and ending tags, not because of black magic. The OP isn't looking to do that, so regexes look like a perfectly acceptable way to go.
But the asker is very clearly a newbie. The question does not contain further context. The asker's suggestion is wrong (I think). And we've all worked with junior engineers who try to use the wrong tool.
The answer is a whimsical way of making an appropriate suggestion in this inferred context.
Also, to be fair, I think it's not mathematically impossible to use dark regex magic (with look-behinds and such) to parse HTML, but that's a discussion for another day...
The answer can only ever be accepted by the asker, not even mods can change that. It's actually not that rare that the accepted answer is not the one with most votes in which case the accepted answer is somewhere further down, not the first one on top.
The question explicitly invites the kind of witty reflection shown in the accepted answer, by adding: "and what do you think?"
As mentioned elsewhere, this is an old question and both the kind of question and answer wouldn't be allowed these days.
However, I also fundamentally disagree that questioning the assumptions in a question is unhelpful. You want to solve a problem, find an approach, and want help because you have problems with that approach? What if the approach you took _is_ wrong? It's very helpful, especially for advanced beginners or at the intermediate level, to be given a different way of solving the problem, even if that is not what you asked.
It depends on context if this is just pedantry or genuinely helpful. The best answers I found start with answering the question that was stated, but then proceed in showing how the problem behind the question can also or better be solved.
“You didn’t actually want to do X, here’s how to do Y instead” may indeed be helpful for the beginner who initially asked the question, but it’s very unhelpful for me who finds the page years later actually wanting to do X.
> “You didn’t actually want to do X, here’s how to do Y instead” may indeed be helpful for the beginner who initially asked the question
Stack Overflow isn't a site for beginners, it's for "professionals". At least, that's what all the Stack Overflow defenders tell me every time I criticize the snarkiness, rudeness and patronizing manner of many answers/comments you receive on Stack Overflow.
> "Some people on Stack Overflow care more about showing off how smart they are rather than answering questions, and I think the point system attracts these people"
> and dozens of people post answers claiming that this is the wrong way to do it and that they should use some other tech to solve the problem
Yes, and they are factually correct in doing so. The correct answer to the question: "How do I use a hammer to tighten a screw" isn't a lengthy description of how to perform some magic with a hammer. The correct answer is: "You don't. Use a screwdriver."
HTML isn't parseable with regex. The various answers under the question explain in great detail why [1] that is the case.
SO isn't a help forum, it's a question archive. The purpose of an answer isn't to solve one guy's specific question, but to provide an answer that is useful to all people who ever stumble upon this question.
Your response and the other responses are proving our point. It wasn't about context-free grammars, level 2 or level 3, etc. It was a very limited subset of a problem. The answer should have been: "while I don't recommend doing it the way you want to do it, that should work for your limited subset".
Yes, and answers on that very page, with lots of upvotes, do exactly that. People looking for answers online can reasonably be expected to scroll down a page with results.
Poster is not asking for this. He is asking how to parse a specific subset of HTML. And it is demonstrably parseable.
> The correct answer to the question: "How do I use a hammer to tighten a screw" isn't a lengthy description of how to perform some magic with a hammer. The correct answer is: "You don't. Use a screwdriver."
It is not the appropriate way to tighten a screw, but there likely is a correct way to do it with a hammer.
It is fine to point out that there are better ways to parse that HTML, but it is not wrong to do it with regexes.
Sorry to be blunt, but having coworkers like you make the job really annoying. I'm not a newbie, but a seasoned programmer. If I'm asking a question and am in a domain with a fair amount of experience, don't give me patronizing answers.
Poster is not the one answers are for. Answers are for everyone who stumbles upon this question in the future, and the general topic of the question is very much about parsing some HTML with regex.
Again: SO != Help Forum
> but there likely is a correct way to do it with a hammer.
No, there isn't. Because the correct way is to use a screwdriver. There is certainly a way to do it with a hammer, same as there is a way to write a webserver in brainf__k. Doesn't mean that way is good or should be done.
> Sorry to be blunt, but having coworkers like you make the job really annoying.
Bluntness is fine. I will be blunt as well: Having to fix code full of hammers used to tighten screws is a lot more annoying than having colleagues who try to prevent a codebase full of hammers in the first place.
> The correct answer to the question: "How do I use a hammer to tighten a screw" isn't a lengthy description of how to perform some magic with a hammer. The correct answer is: "You don't. Use a screwdriver."
My car broke down in the middle of the desert because of a screw that came loose and all I have is a hammer. You have just condemned me to death because you assume you know better.
Nothing in that question makes me think the person asking it wants to parse HTML. Most HTML parsers will never give the result the question described. And unless you want to dig into the tag structure, solving that question is an essential part of creating a parser.
The funny thing is that the person who wrote the top-voted answer is not smart at all. He might look smart to a newbie, but the question is about tokenization and not about parsing.
So here is the reason: the top voted answers are wrong.
> If you parse HTML with regex you are giving in to Them and their blasphemous ways which doom us all to inhuman toil for the One whose Name cannot be expressed in the Basic Multilingual Plane, he comes.
To be fair, that question was written during a period when "how do I parse this html with regular expressions" was asked multiple times per day. And "regular expressions are not a reasonable tool to do that, use a parser" was the correct response to 99% of them. And, at some point, someone decided to throw out a more amusing version of that response. It _was_ funny at the time.
The thing is that oftentimes people who want to do 'X by Y' are actually asking how to accomplish Q. They think 'X by Y' is the solution and get hit by a roadblock, not knowing that it will not help them and they are wasting time.
This is called the XY problem and is extremely common on tech related forums and mailing lists.
Sure, but the issue is that SO was used largely for people working in companies with arcane rules. I can’t tell you how many times I’ve gotten one of these annoying “don’t do X, do Y” when I already know this. I have to do X for some reason, I don’t know how to do X because I do Y when given a choice and now no one will answer how to do X because someone killed interest in the question by apparently answering it. I use whatever points I get to downvote these answers.
The thing people don’t get is: when you answer on SO you’re not answering that poster. You’re answering anyone who will ever have this question. It’s quite arrogant to assume it will be an XY for every single person forever more.
The proper way to answer is to answer the question exactly as asked and then insert your "but you probably should be doing Y instead" at the end.
Again, you’re not answering the person who asked but every person who ever will. Some of them will be asking because the “right way” is not an option in their situation.
And those people can look for questions where the "right way" is justifiably unusable, or pose those questions themselves (and find out if they really have to avoid it.)
Because you're answering every person who ever will ask, a lot of the people who pass through your question & answer will be people who don't know the difference between the right way and the wrong way. If they want to know how to do something the wrong way, because they don't know what the right way is, an answer that simply tells them how is a bad resource.
It's not enough to tag caveats onto such dangerous answers, because people can't read. Instead, newbies should have to overcome a sufficient amount of opposition to filter out those who don't know why they're doing what they want to do, and the rest can make the little effort of being very explicit about why they want to do something the wrong way.
Exactly. I've seen precisely this "documentation antipattern" occur many times. "How do I do X with Y"? "You probably want to do Z instead". Upvoted, question answered, all other related questions of "no, really I do want to do Y" get closed as duplicates.
Then Googling for doing X with Y gets you a bunch of closed questions and a labyrinth of links all leading to a question that was answered 10 years ago on a different software version where Z possibly was the right way to do it but now isn't.
And of course there's no way to reopen the question because it has been closed by a level 15 Magister Templi moderator and a lowly level 3 apprentice moderator like yourself needs to either answer 146 more questions or moderate 192 other questions to clear enough arbitrary hurdles to achieve holy question reopening powers.
And there's possibly an appeals process but that involves recruiting 13 moderators who you have to convince to give this question special treatment and declare that one of their number of sacred moderators made a mistake.
Yes. StackOverflow mods frequently mark questions duplicate that are not. That is something that has been observed by many many people.
Some of it is that SO has gamified shitting on and suppressing the question/asker instead of gamified providing the answer, and built a culture of toxicity that tolerates the abuse of the tools in this fashion.
And when the CEO asked them to tone it down maybe 5 years ago they basically did a collective “am I so out of touch? no, it’s the askers who are wrong”. Extremely funny to read the meta responses to that at the time.
(admittedly "women and people who don't speak english well are particularly unlikely to adopt to the pedantic neckbeard culture we've built" is a spicy take for your average SO'er, or wikipedian, but it's also not actually a wrong one either. SO's culture problems probably do disproportionately chase away users with marginal engagement, nobody likes putting up with formalized neckbeard culture and those users have absolutely encountered it before and absolutely have an aversion/revulsion to entering yet another online neckbeard nest. I think this is a case of “he’s probably right but the medicine would have gone down better with the manchildren if he hadn’t mentioned women and minorities”, and he’s also right that those issues have continued to bury SO over the last 5 years.)
> Because you're answering every person who ever will ask, a lot of the people who pass through your question & answer will be people who don't know the difference between the right way and the wrong way.
Then you have to do two things in your answer:
1. Correctly answer the question as asked.
2. Add your opinion about the "right way" to do it.
If you only do #2, you are failing "every person who ever will ask."
Again, I don't think this is enough - because it's a well-acknowledged fact that people can't read[0] (as I said in my comment.) How many newbies are going to see a working solution, try it out, and immediately skip all the extra text that they don't think they need?
> Again, I don't think this is enough - because it's a well-acknowledged fact that people can't read[0] (as I said in my comment.) How many newbies are going to see a working solution, try it out, and immediately skip all the extra text that they don't think they need?
You know that's not your responsibility. If some newbie makes a mistake, that's their responsibility (and a learning experience for them).
And frankly, I think you greatly overestimate how valuable and essential your non-responsive "you're asking the wrong question" answer is.
>That link is about users. You're misapplying its lesson if you're using it to justify not answering a developer's development question.
Why do you think "users" is an inaccurate description of the role question askers have on a developer Q&A board?
Put another way - when was the last time you used a development tool, or a library, or some other resource, and sat down to read the full documentation of it? I would posit that that's very rare as an activity, even for developers who need to develop a deep understanding of what they're using.
It's much more common to learn by doing, and the limit of that learning is very often what the developer can't do. Answers which easily enable developers to do something are overwhelmingly likely to lead to developers doing that thing - much in the same way that a long page of library documentation which gives an example is likely to lead to developers repeating that example, even if at the end of the docs, there's a little caveat saying that you shouldn't follow the example for so-and-so reason.
>If some newbie makes a mistake, that's their responsibility (and a learning experience for them).
But is it a good experience? Sure, maybe they'll learn that they always have to read the whole answer before they use any part of it. But we sensibly have abandoned this no-guardrails approach to teaching in almost every arena where it's been used, because it's not really suited to the way people do things in real life - and in real life, people often end up affecting others with their mistakes.
Does a junior developer who learns how to glue SQL strings together in their favourite programming language, and makes the "small mistake" of not learning anything about SQL injection in the process, benefit from the learning experience when they cause a data leak? Do their customers? Or should the learning resources they access maybe use the pedagogical tools available to make sure those kinds of mistakes are really hard to make, even if it occasionally inconveniences a seasoned pro?
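As a concrete illustration of that "small mistake" (my own sketch, not from the comment; the table and data are made up), here is the string-gluing habit next to the parameterized query that guards against it, using Python's built-in sqlite3:

    import sqlite3

    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (name TEXT, role TEXT)")
    conn.execute("INSERT INTO users VALUES ('alice', 'admin')")

    def find_user_unsafe(name):
        # Gluing strings together: input like "' OR '1'='1" rewrites the query.
        return conn.execute(
            "SELECT * FROM users WHERE name = '" + name + "'").fetchall()

    def find_user_safe(name):
        # Parameterized query: the driver treats `name` as data, never as SQL.
        return conn.execute(
            "SELECT * FROM users WHERE name = ?", (name,)).fetchall()

    print(find_user_unsafe("' OR '1'='1"))  # every row leaks out
    print(find_user_safe("' OR '1'='1"))    # no rows: the payload is just an odd name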
> Why do you think "users" is an inaccurate description of the role question askers have on a developer Q&A board?
There are a lot of different kinds of "users," and I think the kind of thinking in that article is totally inappropriate when applied to a developer Q&A board.
To be perfectly blunt: the result of what you're advocating is to condescendingly treat experienced people as newbies so dumb that their question should not be answered, because you think they're so dumb the real answer might distract them from the lecture you want to condescendingly give them.
People like that are super annoying and almost always unhelpful.
Every single fucking question I ask on SO has some lazy condescending dude chiming in to answer the easy question he thinks I should have asked, after he totally failed to understand the constraints that made my question hard. Of course, lazy condescending dude always thinks he knows better.
No, the best thing is not to assume you know better than anyone who will ever ask this. It's good to mention what the right way is and why, but your answer should always include the answer to the question exactly as asked, at a minimum.
I agree, but sometimes the answer to the exact question leads to wrong things. So sometimes you don't provide the answer to the exact question but include the reason why the exact question is not good. This gives the questioner the option to comment on why the exact answer is needed.
My experience with less experienced developers is that they ask the exact question because that is where they are stuck, but they are ignorant of the better ways.
I do tend to answer differently depending on the questioner's reputation. If they have a higher rep then I can assume they know what they are doing.
You know what's at least as common on SO? "I don't understand the thing you're asking, so I'll pretend it's the XY problem and tell you about something I do understand"
I’m not convinced anyone interacting on SO can diagnose something like this. The act of breaking down a problem to a tiny part so you can post it kinda guarantees this scenario.
But I think it will always be up to the user of SO (not the poster or answerers) to make the real judgement on what is useful.
Often I think SO is useful to use as a bunch of puzzles folks solved. You gotta decide if they are relevant.
SO is at its best when it's actual error debugging IMO. When you google some specific error, whoever else has had a similar error, it's right there. I feel like GitHub is replacing this more and more though - I often get the specific GitHub issue ranked higher than Stackoverflow these days. Usually you get better discussions on the GitHub issues too, for a multitude of reasons. Two off the top of my head:
1. all of the people working on the stuff related to the issue are very close by
2. the moderation is not nearly as heavy handed as SO.
ChatGPT is also much better than SO if you can give it enough context and the thing you are working on wasn't built on stuff released after 2021.
I also really like Stackoverflow for current event type stuff, like black swan type events. One recent example is when google’s Paris data center was on fire and infra guys were helping each other out trying to get systems online.
All of this combined means that StackOverflow the forum is probably on its way out though. They made the mistake of taking VC money and the model hasn’t really proven profitable so they have really made some poor decisions to please the vc overlords.
I won’t miss Stackoverflow much other than nostalgia unfortunately - better alternatives have arrived. Seeing the decline of all of the other Stackexchange sites kind of sucks though. There aren’t better alternatives for many of those
Just out of curiosity, what are the alternatives?
I still find the moderation approach well made, even if it looks heavy handed. It’s important to create information for the future, not just for now
ChatGPT is not in any way better than SO - see the current moderation strike.
Both sides can identify ChatGPT answers as being wrong. The question is how they can be deleted. The moderators say they can delete a lot by manual inspection. SO says that the AI tools were deleting the wrong ones.
My biggest problem with github issues is similar to the problems with SO:
Bots closing issues because someone doesn't spam the page. Closing as duplicate of (non related bug). A slew of random solutions that are only tangentially related and don't really solve the problem.
The issue is that sometimes people just want 'X by Y'. To get a question answered, you shouldn't have to list every constraint and design decision that led you to that point.
Comes up all the time when people ask how to do things in bash/sh. I know there are better tools for the job, but this is the one I have.
Oh god, that just reminded me how often people ignore the question asking for POSIX shell or "/bin/sh" or some other specific shell scripting language… and proceed to answer the question using bash, zsh, Perl, Python, or even the slightly less wrong option (because it is kinda weird to be shell scripting without a moderately normal Unix environment) of using a bunch of Unix binary programs to do the requested job without actually solving the core problem of the question, because the tools made it easier…
And then to ice the cake you find the question because your question has been marked as a duplicate of the older question where they answered using Unix binary tools… and you specifically asked about doing something in "pure shell script" or something similar to that phrase.
Stack overflow is fundamentally a system design that breaks down at scale due to misalignment of incentives that are necessary for it to work well at smaller scales (as can be seen in the successful operation of various smaller Stack Exchange sites for various topics such as Law, Aviation, Physics, etc)
The bash-not-sh issue is due to ignorance on the part of the answerers, not an XY problem. Also, shell can't do everything - you are in a POSIX environment, so use POSIX tools. The Unix environment is about putting together many small tools rather than just using one, so any shell script can call POSIX tools as a minimum. So writing a complex pure-shell script rather than using those tools does need an explanation of why.
I must admit that I don't buy into that philosophy and like using one tool, so for scripting I would do it all in Python, so I would not be answering that question.
Many Linux users think that their way is the only one, and that means bash as the shell and many other things like GNU coreutils, gcc, etc. I am a macOS user and my professional career includes several non-Linux Unixes, so I know bash is not the only shell - try csh for fun - which is partly why I use Python, or previously Perl, as they are the same on all machines.
Since I apparently did a terrible job explaining it (the link does a much better one) -- it is when someone has a problem that they are trying to solve in a way which will not solve their problem adequately or at all -- it is not when they are using the perceived wrong tool for the job.
And what all the replies are telling you is that the most XY problems are misdiagnosis.
Explaining what the XY problem is to people who are telling you about it's high false positive identification is, itself, an XY misdiagnosis.
Your reply is an example of what people are complaining about - you are addressing the issue you wished was asked, not addressing the issue you were presented with.
Sometimes people ask questions like "how do I shoot myself in the foot and still have a working foot", though. Questions are not always reasonable. A question is not always "pure" either, but can embed incorrect assumptions.
> Sometimes people ask questions like "how do I shoot myself in the foot and still have a working foot", though. Questions are not always reasonable. A question is not always "pure" either, but can embed incorrect assumption
But those are correctly diagnosed XY problems. No one is complaining about those.
My parent was told the issue is too many incorrectly identified XY problems, and responded with an explanation of what the XY problem is.
That is the example of a misdiagnosed XY problem, which was kinda my point. This sort of behaviour makes the actual experts leave the site in droves.
If, when answering a question, one were to discard the answer the minute they write "Why would you want to do this?", you'd get much fewer incorrectly diagnosed XY problems.
As I said in a different thread, ChatGPT sometimes does this as well, but at least with ChatGPT, when it is answering a question that was never asked, it doesn't also act like a condescending jackass. There are no "Why would you want to do this?" type of questions.
Asking why does not have to be condescending. I agree that some responses can read that way, or seem in some other way hostile. In text, or with any reasonable spoken tone, I would not assume that a person asking me why is condescending.
But on second consideration, I suppose you would not, either, and I suppose you are specifically talking about responses which, each taken as a whole, are easily interpreted as some form of hostile.
I do see both of your points of view. There are some good answers on SO that capture both. They first explain why it’s infeasible, talk about a better approach, then lastly give pointers on how to achieve what is asked regardless using their best reasoning.
Think it just depends on the quality
> And what all the replies are telling you is that the most XY problems are misdiagnosis.
I responded to the person above me when there were literally two other comments in this thread.
> Your reply is an example of what people are complaining about
Defining something is not an example of an XY problem.
> - you are addressing the issue you wished was asked, not addressing the issue you were presented with.
As I am not much of a programmer but work on electronics and computer hardware I deal with different types of people than would be on SO, so I am not addressing anything but my own experiences.
> So sorry then, but listing every constraint and design descision it is.
You don't need to do that, a simple "I know what the XY problem is, and this isn't it" prefixed to every question you ask should be enough to stop the race to tell you all about the XY problem.
I mean, at this point it's clear that more people know about the XY problem than people who don't.
> you shouldn't have to list every constraint and design descision that led you to that point
True, but that logic goes both ways: unless told otherwise, whoever reads the question isn't required to assume that there is a constraint.
If I get asked how to water plants with a sieve, without being told why getting a watering can is impossible, "You don't, use a watering can" is a perfectly acceptable answer.
Especially when the question is asked in a question archive, rather than a help forum.
If specific constraints apply to a question, then they should be a part of the question.
The trouble with SO that I've seen is that there are more false positive identification of the XY problem than false negatives.
IOW, any time you think you have spotted an XY problem, you're probably wrong.
And that's the problem with SO moderators and regulars. They classify everything as an XY problem because it allows them to answer the question they know the answer to rather than answer the question that was asked.
Part of this is because problems (X) are complicated and people just as commonly demand the "simplest possible example" (Y) that demonstrates the problem. So people ask how to do Y, and then others ask why on earth they would do that.
One common example I've run into lately, as I've been reading about state machines, is people asking how to implement a simple react component as a state machine, and others objecting to the premise of the question since using a state machine for a simple react component is obviously a bad idea.
Telling people they're doing the wrong thing is extremely common. People actually wanting the wrong thing is common, but not _extremely_ common, and the mismatch is one of the more rage inducing things on the internet.
Alternatively, people might have to use X and Y to accomplish Q because of their organization or team. If it's technically doable, there should be a solution and explanation for that X and Y problem somewhere.
Like how to use a database as a queue, which generally works much better than any queuing system I’ve ever used, except ones that use redis as the database engine for the queue.
I’m sure if you’re twitter Kafka is actually a better solution, for everyone else, it isn’t.
Want to host videos on a laptop (which has a big SSD) and stream them to a Pi (which is attached to a big screen) over a LAN? Hey, here's a post about how to host videos on a Pi and stream them to a laptop! Upvote and share! My point is, you don't even have to be trying to do something all that strange for people to apply the XY Problem logic and refuse to help you.
(Solution: NFS mount and a patient understanding that the Pi cannot play certain kinds of video, so you'll need to transcode some of them first. See? Nothing bizarre, but surprisingly outside-the-box given what I could find online at the time.)
My assumption whenever I see that behaviour in a response is that the responder simply does not know the answer to the question asked.
It’s fine, I think, to answer a question and then suggest a better method. It’s presumptuous in the extreme to dismiss a question with some pseudoacademic neologism.
If doing X by Y is possible and will achieve Q, then the best way to respond to these questions is either of:
1. To do Q that way, you would do <solution, or at least pointers to how to find the solution elsewhere>, but you will likely find it far more efficient/easy/whatever to do <something else> instead.
2. Q can be achieved far more efficient/easy/whatever by doing <alternative>, but if you are stuck with using Y then try <solution or pointers as above>.
Of course this relies on you correctly deriving that they are trying to achieve Q, or them explicitly stating the fact. Maybe instead they are trying to get to K.
Everybody is aware and has read esr etc. The point is that people who are asking X by Y want to learn Y, with accomplishing X as a side goal at best. It's funny that you call wanting to learn Y a waste of time. Is it because you believe Z is a superior way of doing X? Why do you believe that? Experience, science, or mathematical proof?
Often the better answer is to just explain what's being asked by strongly encouraging them in the Z direction. Sometimes people just want to understand what's happening behind the scenes rather than just looking to solve for Y.
But that's a different question. And it should be asked differently.
"I'm trying to learn foo.js and attempt to do so by porting Tetris to it. I know foo.js is a terrible option for a game. So. How can I write to the canvas from foo.js?"
Is often answered properly, if only because to many it's a nice puzzle.
"How can I write to the canvas from foo.js?" is different in that it will attract a lot of people explaining that foo.js deliberately did not allow writing to the canvas, because Z.
This is part of the problem. You know your constraints, other people do not, but like to assume they do. So you end up having to write a defensive argument about your problem rather than just plainly asking your real question.
This may help less experienced engineers who don't understand their problem set, but for a more experienced engineer it's absolutely obnoxious to think of all the ways to defend my question so I don't have to deal with a rush of "oh but you should do this instead" answers getting upvoted that don't actually answer my question, and then being asked to accept the answer.
To one set of users it's possibly helpful, to another it's useless if not also condescending, often condescending to both sets.
The bottom line is that, by now, SO is so filled with these types of responses that I can't expect to get a very specific question answered, which is really the only reason I'd ask a question in the first place, so why use it?
There are plenty of chat groups now via Slack and Discord in my field where I can get much more direct answers. People aren't worried about getting downvoted for a bad question, and people aren't giving low-quality answers to boost their points. So for me, SO is practically dead except for the occasional obscure error message that I can query for there.
> So you end up having to write a defensive argument about your problem rather than just plainly asking your real question.
My experience is that it's not so much a defensive argument, but context. My example was poor in that it could be misread as a defensive argument, sorry about that.
I meant it to show how adding some context changes the question. Because, in programming, it is all about context. AKA that "It Depends" meme.
That's unkind, and doesn't really address the nature of the problem at hand.
What's the name for the "I don't want to use a sledgehammer to solve a problem that should be solvable with a screwdriver" problem?
Or in this case, the "spinning up 300 lines of code to integrate an XML parser vs. a dozen lines of code based on regexes" problem. For reasons that are unclear to me, XML parser libraries tend to be painfully difficult to use (speaking from personal experience with 4 different XML parser libraries).
I don't think it's a surprise to anyone involved that an XML parser is going to solve the problem.
LOL. It's not "called the XY problem" just because some Dunning-Krugerite decided to make a website on a budget TLD.
Here, I'll coin a name for a problem I see much more often, which is called the XX problem:
1. User has problem X, and asks for a solution for it.
2. People viewing the question decide that the original user actually has problem Y.
3. Those people tell the original user that they actually want to solve problem Y, condescendingly flame the original user for not asking about problem Y, and if they have the power to do so, edit the original user's question to be asking about problem Y.
4. Those people use poorly-thought-out pop-social-psychology to justify their shitty behavior.
5. The original user still doesn't have a solution to problem X, and they really needed a solution to problem X all along.
It is not without irony that examples of the XX problem are sometimes also examples of mansplaining.
I understand your frustration but honestly the tone and style of your comment is dismissive and condescending. It strikes me that you are complaining about people treating others high-handedly and without understanding by epitomizing that attitude in your own post.
"You're saying your problem is (X) people don't answer questions, but have you considered that your problem is actually (Y) that your tone is condescending and dismissive?"
You realize that it's dismissive and condescending to ignore the problem I'm describing and respond with an assumption that I'm unaware of the tone of my post, right? Pot, meet kettle.
I'm not ignoring anything, and this conversation has gotten strangely emotional for people responding to me. I joined this thread when it had a few comments and explained what an XY problem was. I don't use stackexchange and I am a little bewildered why people are accusing me of being condescending.
> I don't use stackexchange and I am a little bewildered why people are accusing me of being condescending.
Because you're propagating the idea that you (or anyone asking questions) knows better what a person asking needs than the person asking does. Telling people you know what they need better than they do is pretty close to the definition of condescending.
People are emotional because nearly everyone who asks a question on the internet has to deal with people telling them "You don't actually want an answer to your question, you want an answer to this other question." By boosting the signal of the horrible XY problem idea, you're contributing to that problem.
I'm not saying this kind of miscommunication never happens, but the opposite is actually far more common.
I ask follow up questions and get at what the person really wants, confirm it, and help them with it. I'm sorry if I am making troubleshooting harder -- I think that the answerer of the questions should be obligated to solve the problem or they shouldn't be helping.
Unfortunately I had no idea about the perverse incentives for question answering and the terrible moderator practices on SO, so I walked into a minefield giving an answer here that I thought was helpful based on my experience as a hands on technician working directly with people -- but it turned out to be a lightning rod for people's frustrations on these issues.
Is it SO's purpose to make assumptions about the intention or context of questions? I would say no, and that it's even harmful, as it prevents the question-asker from learning the flaws in their solutions on their own.
I actually finally made an account this year to leave a comment on this question[0]. I'm not familiar with how or why curricula are designed the way they are, so I can't really answer the question, but as a math teacher I thought it'd be helpful for the OP to know that calculus and linear algebra are deeply related; the derivative of a function is the best linear map (i.e. matrix) that approximates it! For practical purposes, calculus is a toolbag for turning (intractable) nonlinear problems into (tractable) linear ones.
But apparently I can't comment as a new user so I guess the discussion will just be "because calculus classes use matrices in examples".
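For readers who want the "best linear map" claim above spelled out, this is just the standard definition of the total derivative (nothing specific to that SO question), written in LaTeX:

    % f is differentiable at a when a matrix J_f(a) -- the Jacobian -- makes
    % the linear approximation error vanish faster than ||h||:
    f(a + h) = f(a) + J_f(a)\,h + o(\lVert h \rVert) \quad \text{as } h \to 0,
    \qquad
    J_f(a) = \left[ \frac{\partial f_i}{\partial x_j}(a) \right]_{i,j}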
This is precisely what kept me away from asking anything on SO. It is so much easier and quicker to search or figure things out by myself than to prepare a question defensively and then answer all the questions about why I'm asking about that thing and how I got there.
When I find a straight and relevant answer that is great, and I like browsing SO, but I wouldn't dream of asking a question just to find myself in a word duel with some very eager contributor steering me into his favourite domain. I want to get ahead with a task, not be on a quest to find The Ultimate Solution or have a social life over interesting and related topics.
Sprinkle in a 100K+ reputation user commenting "this is easily answered with `man Y`" — oblivious to the fact that Y's manpage is a 188-page monster, and that manpages are awkward to search in.
Here's a question maybe someone can answer for me, or maybe I'm just stupid and other people don't need it, but...
What's up with the "man Y" aversion towards the usage of examples in the man pages? I don't expect some behemoth like ffmpeg to have a billion examples in the man files, but damn, every other reasonably sized CLI app would be so much easier to use, if you had just a dozen or two of examples at the end of the man file.
I may be missing something, but the top answer is pretty good. And the comments even explain why searching for an answer on google might be difficult, and suggests how to.
Quote from the actual answer (which is not the bogus accepted one):
The question wasn't "should it be done?" But, for the same reason men climb mountains, "could it be done?" The answer, [...] is yes. Thus, we introduce [...]
This reminds me of my last ever SO question. It was many years ago, so I might be misremembering some details, but:
I wanted some information on figuring out what texture serialisation was supported on a given client for a WebGL app. I needed to know this because I was optimising the client and had to deal with very large textures (it was an AR / augmented reality context).
Cue a barrage of comments along the lines of "you shouldn't need to do this" "the abstraction means you can simply assume the GPU has infinite texture memory" (!) "just provide all the formats and let the GPU bridge figure it out". Then the question had a downvote and that was that, cast to the bottom of the pile.
It seemed to me everyone responding to my question had this assumption "questioners are morons". A moron asking about texture serialisation is a paradox, ergo the question must have a faulty premise.
Should get rid of the reward system -- or at least reform it drastically to discourage pettiness. An accepted solution should not result in a reward to anybody. Somebody just starting to learn a topic posts a basic question and gets an answer from an old timer - and they both get rewarded! For what? There seems to be an army of "moderators" who are ready to pounce on easy questions, and they end the answer by reminding the OP that he should upvote or accept the answer! Did any of them add anything to the knowledge base? You can get excellent answers to basic questions from ChatGPT anyway. Also, they won't allow new members (perhaps experts) to comment until they have picked up enough credits. The whole system is downright silly.
SO changed its policies a long time ago already... they don't want to answer your questions. They hate your questions (and by extension, they hate you!).
It's ok, you deserve the hate. After all, you're asking the wrong questions.
They want questions that are "textbook-worthy" or possibly "encyclopedia-worthy". They're on record as having said this officially. If programming and technology is just messier than that, if it's more complicated than that, and you still have questions that don't fit because of it... well, fuck you. They don't exist to answer your questions, they exist to answer great questions that they can use to build up this pretty little site that now has some purpose other than whatever it was you thought you were using it for.
The Z guy, he's playing their game. He's awesome. You, you need to be punished until you comply.
My experience is that the "real" answer depends on the situation. Sometimes one really should use the alternative (e.g. cleaner, more general solution, most updated API), while other times they should address the original question as is (e.g. avoid additional dependency/third-party library).
The question is very clearly formed. The accepted (and currently top) answer does a good job. But at one point the other answer which is badly worded and confusing was at the top. And I don't think it should ever get the top position of the answers.
Or a befuddled asking of the same "Why?". Or an honestly curious asking.
Yes, there are plenty of rude replies, but sometimes it helps to assume what somebody really means is they failed to do a context switch. Exchanging more details can help turn a 'madness' into a method. Even if they were outright rude, doing this can lead to an answer that might not otherwise be given (even though that answer may be provided by someone else).
They're talking about the XY problem-problem. The problem when other programmers mistakenly think you have an XY problem and ignore your words.
Sometimes, the original poster is not mistaken. Especially when an expert asks a question, there's a reason for it. Assuming the expert to be a beginner who hasn't tried easier solutions is degrading, and forces experts off the site.
Actively insulting your userbase, especially your expert-level userbase, is a bad idea. It leads to StackOverflow falling and collapsing over the years.
I never saw this term before, but it is perfect. So many times I have experienced it when asking questions in mature tech domains. It is so frustrating. Frequently, my question is asked poorly from the view of an expert, so they sweep it aside as not a real problem. Only after getting help (from comments) to provide more info or improve the writing, does the expert suddenly agree it is an issue.
This is also why I no longer waste my time raising bug tickets for open source projects. You just get shouted down and feel terrible about yourself. I raised many "WONTFIX" bugs in my career against open source projects. What a waste of my time and a harm to my self esteem.
I was active on SO years ago, and my answers and questions typically were either upvoted to a large number or downvoted to a large number, despite my best attempt at explaining the context.
So I do know this problem (problem-problem) very well and I've since stopped asking or answering on SO.
I was learning Golang nearly a decade ago while participating in a performance competition. Asking questions about unsafe in Golang Nuts was nearly rage-inducing.
IMHO Copilot Chat has picked up on these bad habits.
I've noticed a similar tendency in ChatGPT (answering a question that was never asked), but at least in ChatGPT it doesn't act all condescending towards you.
I literally cannot remember the last time SO was useful, due to the false-positive identification of the XY problem.
At least with ChatGPT, it's much faster to get it to answer the question asked and not the question that was not asked.
That is incredibly annoying to find when you search for help with your problem. While maybe the OP could/should do something else, you have great reasons to do X by Y. Of course, if you make your own question about doing X by Y, you get referred to the other question and have yours closed.
> How do I shoot myself in the foot? I’m pointing the gun at my foot and pulling the trigger, but nothing is happening.
Is it:
> A common reason for guns not firing is that they aren’t loaded. Try loading the gun and trying again.
…or is it:
> Woah, hang on a sec! What are you trying to achieve exactly? It seems like you are doing something very wrong here. I’m sure there’s a better way to do whatever it is you are trying to do.
There are many technical questions that give the very strong impression that somebody is asking how to shoot themselves in the foot. It’s not responsible to blindly answer the question regardless of the consequences. Yes, people sometimes overcorrect for this which can be annoying, but they are only trying to steer newbies away from shooting themselves in the foot.
When a user is describing exactly how they're aiming at their foot and pulling the trigger but nothing happens, I'd assume they actually want to shoot themselves in the foot...
Thus, the first response is the correct one.
On preventing people from doing what they intend to do... I think the main issue is that the world is a large place, people face many situations, and most advice given in good faith only has an extremely limited view of what people might need to do. And it's kinda awkward to ask for someone's whole life story to decide if they are right in wanting to straight-up rename a table column on their live production DB.
Something could be a bad idea 99% of the time. But that leaves millions of people in the 1% for whom it's the best course of action.
> There are many technical questions that give the very strong impression that somebody is asking how to shoot themselves in the foot.
Sure, but some of the stuff I've looked up in the past is answered with helpful comments like "You won't need to do this, your CA will do this for you" or "this is handled by your certificate verification stack, you don't need to be involved in this stuff".
Well, thanks, but I'm actually implementing both of those things right now and I'm having an annoying issue with figuring out how the API for the (very popular but IMHO poorly documented) library I'm using for part of it hangs together.
> Yes, people sometimes overcorrect for this which can be annoying, but they are only trying to steer newbies away from shooting themselves in the foot.
I think sometimes it's because, comparatively, the respondents are newbies and are repeating received wisdom.
How I usually try to answer these kinds of questions is "Shooting yourself is generally not a good idea, for these reasons. If you really want to do it, you can try loading your gun and pulling the trigger again, but a better alternative might be to take off your shoe by untying your shoelaces and carefully pulling on your shoe, following the curve of your foot."
But it's hard! On the one hand, you don't want to teach people to do the wrong thing (especially not on SO, where answers are often copied wholesale and pasted into production code), but also not answering the question as asked usually doesn't help the OP at all.
If it's not clear what they're trying to achieve I usually leave a comment asking for clarification rather than answering, though.
The correct answer is obviously the first one. I come to StackOverflow because I want to learn some piece of information about some specific technology, not to be lectured by someone who thinks they know better than me about what I should or shouldn’t be doing (and who is virtually always wrong).
ChatGPT has replaced most of my usage of StackOverflow/Google.
It probably won't last forever without people generating new answers somewhere else, but it answers a lot of things correctly, and the things it gets wrong are easy enough to verify.
I really hope ChatGPT causes StackOverflow to change.
> I really hope ChatGPT causes StackOverflow to change.
It already has, and not for the better. SO is currently awash with nonsensical answers that are clearly the result of feeding the question into ChatGPT. Not to mention questions like this: https://stackoverflow.com/questions/76748781/how-pythons-bui...
Well, perhaps that's what moderation should be focusing on blocking then, rather than driving away humans (as was mentioned a lot by other answers in this thread).
The impression I get from commenters is that somehow they think their careers are tied to StackOverflow points. Trying their best to downvote correct answers and promoting their own in order to further their agenda.
SO ran a careers site for a while, and as a part of that, provided a resume/profile page upon which you were encouraged to share your score and/or your own questions and answers. I wonder if there was actually something to that…
Yes, it's stupid af. Better to keep silent than to answer the question indirectly. Or even if you want to suggest "another problem", keep it indirect (not the accepted answer).
For me personally it’s the rise of good documentation. 12 or 13 years ago I needed to build something for our sharepoint 2010 in a world where sharepoint 2013 was out.
I’m not a share point developer, at all, it was my first time with it. I am of the old school however, so figuring out how systems work by reading the manual or specification isn’t foreign to me. I’ve worked with the bitmap format, I’ve worked with solar inverters, I’ve worked with embedded construction software and so on, so sharepoint should’ve been easy. But I couldn’t have done it without StackOverflow.
Fast forward to 2023 and I need to build something for sharepoint again. I haven't touched it since, so I was sort of dreading it. Only this time the official documentation made it so easy I never needed anything else. I'm sure StackOverflow could have helped me, but I didn't need it.
Those are very isolated examples, but really, that is my personal experience with almost everything I work on these days. Yes, I've also got 10 more years of experience under my belt, but I do think it's because we as an industry have become much better at working through the official channels. I mean, when you have a problem with something today, do you go on StackOverflow or do you go to the GitHub issues (or whatever else they have)?
Out of interest, do you think your recent experiences with SharePoint were because Microsoft decided to unify pretty much all of their services behind the common Microsoft Graph API? SharePoint used to suffer from having multiple different overlapping styles of APIs depending on what API style was fashionable at the time that feature was developed - which could make development a nightmare... (OK, an even bigger nightmare).
I sort of like ODATA on the client end of things. It's an absolute nightmare on the server side, at least with .NET, but for clients it's usually fairly easy to consume. This is true for every part of the Microsoft Graph API that I've worked with, except for the SharePoint part, which for whatever reason works differently when you query things. It's sort of hard for me to answer though. I've spent quite a lot of time in enterprise organisations, and I've integrated with basically everything, and it's almost all bad in one way or another. On that scale I think the Microsoft Graph API in general is an 8/10, maybe even a 9/10, but the SharePoint parts of it are at most a 5/10.
I'm not 100% sure if this is the fault of the API, if it's because of SharePoint's indexing, if it's because of the 3rd-party metadata indexing that we buy for our document library, or if it's because of how terrible our own architecture for the data flow is. But I've had to build a lot of redundancy and caching into the terrible piece of gaffa tape which is our integration, because SharePoint won't give me every document every time I ask for them.
So, a bad experience? But at least it's not "FTP (yes, not SFTP) pulling different file formats from 9000 solar plant inverters" bad.
Yes, and documentation that is actually usable. I don't enjoy the standard JavaDoc pages but I do enjoy the, for example, GoDoc pages. I think the programming community did a good job in establishing user-friendly documentation.
A typical pattern I've repeatedly seen in various shapes and forms:
Q: [very explicit title about F in context X] I'm specifically not asking about F in context Y which I know about but is irrelevant; I'm specifically not asking about libT which may accept it in accordance with standard U; instead, in context X, is F working for libS?
A1 [100+] [+50 bounty] [auto-accepted] [date d]: "it works" + long winded answer about decontextualised substring of F proposition working for libS in context Y
Comment 1: this is not the correct answer, see A15
Comment 2: complaint that A15 note about U is from a second hand site so answer is warrant of discredit [even though authoritative docs about U are not publicly available]
A2 through A7 [30+]: "it works" (rehash of A1, posted within d+[1..365])
A8 [30+]: Q is a dupe, see this answer [link to question and answer about libS in context Y]
A9 through A14 [30+]: "it works" (gives example for libT, which behaves differently)
A14 through A17 [30+]: "it works" (gives example of standard U, which libS does not comply with)
A15 [5] [d+5]: Q being explicit about F for libS in context X and explicitly _not_ about context Y, let's address it for libS in context X: *it does not work* because foo (link to libS doc || source code || example). Tangent for the sake of completeness, context X is irrelevant because bar is orthogonal to foo + context Y. standard U does not apply because libS is not compliant [quote or link about U]
Comment 1: this is the correct answer
Comment 2 [d+1095]: answer is invalid because libS vN+1 has just been released and adds F in context X
A16+ [<0]: completely haphazard lottery answers
On meta.so this has been asked repeatedly and the consensus is that no, it cannot be done and furthermore, "accepted" doesn't mean "the best", it just means whatever the original asker marked as helpful for their situation, so "auto-accepting" would be meaningless. Also see: https://meta.stackoverflow.com/a/262915/147346
As for upvotes: if a question is upvoted that means the community upvoted it. The "community" in SO is just regular people, programmers both knowledgeable and newbie. I don't know that any "open access" platform can solve the problem of people upvoting the "wrong" answers; it's not a tech problem. It happens here on HN, too!
I have two opinions about your post. One: I agree. Two: I disagree. My point: I see both.
I see "answer as comment" in my most mature tech subjects because people are afraid of downvotes. In my experience, the most mature tech subjects are carefully guarded by a small community of very unfriendly "fastest gun in the West" types. The C++ community is incredibly unwelcoming and negative towards most questions. Note: Comments can only receive upvotes. (In theory, comments can be flagged as off topic, or offensive, but assume -- for my commentary here -- they are not.) Examples of more mature tech subjects: Python, Ruby, ASP.NET, Java (language/foundation libraries), C# (language/founding libraries), C (but not yet C++!), Win32, etc.
For less mature (or faster moving) tech subjects, you see many more answers. Examples of less mature tech subjects: Python AI/ML libraries, Java/C# open source libraries (Spring, etc.), C++, Qt, Gtk+, Zig, Swift, Golang.
People are afraid of downvotes because people will throw out downvotes without even reading your answer, and once you get the first downvote, everyone else will pile on, also without reading your answer.
The worst part about XY problems is that even if the user could use a different approach, often the actual answer to the question would be very interesting. But because of SO, it's an XY problem, marked as duplicate, and there is no answer to the original question.
Owners of the SO sites are control freaks. Mods started ruining the site around 10 years ago. Now, when I do a fresh registration I can't do basic things like upvote an answer or write a helpful comment, because I need X reputation.
So I can't add relevant info/feedback to a topic I landed on via Google, because I don't have reputation, so in essence the site strangles itself. Because in 2023 I'm not going to farm(!) 50 points just to share my info. All this because a few bad apples tried to game the system and someone came up with this lame idiocy of "you must have X reputation"...
But you can ask questions.
But if no one upvotes your answer because it's a hard one, and/or no one knows the answer to your problem, then you are stuck. So should you ask some blatant question, like "is the sky blue and why?" But it will be a duplicate. No rep for you. So it's a totally braindead system; they deserve to die.
Why don't they introduce a system where you can say, I need this answer for 5 bucks. But that will not be implemented, because it would set a dangerous precedent where you could harness others' valuable knowledge almost in an instant.
> Why don't they introduce a system where you can say, I need this answer for 5 bucks.
I think the problem with that is how do you prove it's a good or bad answer? Say you offer a bounty and I generate a great answer. You look at the answer and use it and then flag the answer as bad so you don't have to pay.
Someone reading your profile will have to parse whether the not-paid answer actually answered the question, before they can determine whether it's a "non-paying customer" or just someone refusing to pay for an irrelevant non-answer.
You can show something akin to ebay's reliability value based on user feedback. People can establish payment conditions before doing any task, I don't think that would be a problem.
This issue of having the answer in the comments definitely wasn't there 5-10 years ago, and it makes the site much more difficult to skim and read, hence provoking more hapless questions.
Maybe they should expire comments or remove them altogether.
"I was asking only to see one comment trying to answer it and then a mod aggressively shutting it down for not being “on topic” enough or whatever."
Stackoverflow once set out - with the very best intentions - to be better than expertsexchange et al.
I think that's why they are so strict with their moderation.
The issue, of course, is that this is a hard problem. To moderate a community of experts you need to be an expert yourself. With Stackoverflow I feel most moderators are just overwhelmed with the task, so they aggressively shut down everything they don't understand.
Many of them are probably aware of that and just don't know how to handle it better. Others are so clueless that they do not even recognize the good questions.
I just realized how much SO's role has changed for me as a dev. Like most devs, I used to use it as a common reference. However, I realized just now that it's been months since I've looked at SO.
That's not because I was intentionally avoiding it, but a natural side-effect of it becoming much less useful, largely because of the thing you're citing here. The quality of answers has fallen quite a lot, and SO seems to actively avoid hard questions (and if I'm searching a site like SO, it's because I have a hard question.)
Also, the site itself - the ethos of the owners. I used to have an account, became disenchanted with the mods, and decided to delete my content and close the account.
I then discovered:
1. You cannot delete answers which have been accepted.
2. The mechanism for deleting your own content allows you to delete a max of something like five replies per day.
I then took the time to look at the T&C, and as I remember it, SO simply makes all of your work their property.
I deleted everything I could, over the course of a significant number of days, and left.
I think not allowing answers to be deleted is OK; it's something like asking Wikipedia to delete content you contributed. At some point it isn't really your content anymore, it's part of an article. If your answer is accepted, others are discouraged from posting another answer, so deleting content retroactively is really damaging.
Besides, none of the answers should contain personal information, so I don't really see a reason to delete them.
Read the T&C. The answers are all licenced under creative commons with attribution so anyone can copy them. They are not owned by SO. This is why sites can copy all of SO to get better Google SEO also this is why ChatGPT etc can build their answers based on SO.
so regardless of who owns them, they can be reproduced, and probably are. So in many senses there is no deleting them.
Also, don't understand why you wanted to delete. Despite its problematic policies, SO is an important public resource and I assume so were your answers on it...
On point 2, even worse is when you ask a question and the mods close it and point to another post that isn't the same. Or they mark the question as too open ended when it's not, or tell you to split up the questions into multiple posts even though that's dumb because they're dependent on each other's answers.
I understood what you meant. But first of all, many languages have libraries relevant for several decades; second, asking questions about legacy code is often valid (and often where you need the most help); third, some people even still use jquery! (Poor guys)
I recently was looking into why wget --quiet --content-on-error didn't output anything with a 403. Turns out it's an ancient bug they never fixed... it's not documented anywhere, except some crusty old thread on SO.
I am old. I find what you wrote resonates with my SO experience, and, in my antiquity, I wonder, grumpily, why "RTFM" isn't faster than Stack Overflow. I think the answer is that the quality of the manuals to 'freaking' read has decreased. I'm just old enough to realize the Unix manpages were excellent in proprietary Unixes, and then the Linux manpages got complicated by some Emacs-based reader. Anyway, Unix manpages used to be a quality go-to for answering questions - BSD and Sun (BSD and SysV variants) and NeXT and Sequent etc. I'm still inclined to do that, but it's not always adequate.
I responded to a question on SO once where I elaborated on the accepted answer - my solution was more functional, as it provided an assignment operator override.
Some hot shot user with an absurd amount of badges told me, in a condescending tone, that my solution did not answer the question and linked me to some TOS article or something about staying on topic. This was the first time I had ever answered a question on SO.
My takeaway from the situation is that SO is full of accounts that farm badges/rep. To what end, I do not know - perhaps they reference it on their resume or portfolios.
I called the guy out; it seems he has since deleted the comment.
I honestly feel that Google and increasingly LLMs are the challenge here.
Behaviour of users on SO has always been, well, as all the complaints here suggest. It's never really bothered me that the top answer is someone going off on their favourite approach; most questions have multiple answers, and I've learned a lot about coding from reading the top few and trying to understand where the answers are coming from.
The discussion's as important as the answers sometimes, and that's why a potential collapse is a problem.
There is also a bizarre thing that I noticed only when I enabled RSS for some of the tags I wanted to watch. My RSS reader routinely had questions that were nowhere to be found on the site, even though they did not seem to be off-topic or have some other kind of structural problem.
Of course the users themselves could have deleted the questions, but regardless, I did not expect to see that.
It could be a nice experiment to enable RSS for some niche topics and check automatically after a number of days to see which questions are gone.
IME moderation was bad after a year or so of SO. Something about the moderation role in general seems to attract certain personalities that have a negative effect on the content they are supposed to be moderating.
That said I find SO still very useful, you can always ignore the moderators and caustic comments. Often a question has a high quality answer that I can use even though the question is locked for being off topic or the comments are filled with irrelevant argument.
This. Stack overflow contains many diamonds. We just can't find them anymore because Google randomly decided to stop being a proper search engine at some point.
moderation on SO has gotten progressively more horrible. can’t tell you how many times I found the exact, bizarre question I was asking only to see one comment trying to answer it and then a mod aggressively shutting it down for not being “on topic” enough or whatever.
Yup. And that's why I stopped using SO quite a while (years) back.
I'll add one more with moderation. I've seen far too many cases where a person asks "how do I solve X" and it's closed as a duplicate of a question asking "how do I do Y". Two unrelated questions, but they have some similar wording, so the moderator shuts it down.
I'm glad SO is going down, because it is a really nasty site. Every time I click on a page it asks again for cookie permission. Probably they want me to register to the web site, which I refuse to. So, no wonder people who are not already members think it is not a good search result.
> google used to return really relevant results for SO, and it stopped doing so at some point a while ago
Oh, so maybe DDG is ahead of Google in terms of quality alone in at least one domain now? Because it definitely still gives answers from SO at the top.
My least favorite thing is how many highly upvoted answers are telling the OP to "do something else" instead of actually answering the question. Sometimes requirements come from above and we don't have a choice!
The sweet spot for SO for me is workarounds for esoteric problems, not for more common issues because the solutions tend to be out of date (but correct for an older version of the library or whatever).
The last two reasons are why I stopped posting on SO 10 years ago... I tried to explain something and some mod kept criticizing my English and closing it for being "off topic".
> google used to return really relevant results for SO, and it stopped doing so at some point a while ago
I haven't noticed that at all. I also haven't noticed any degradation in Google's results overall, contrary to what people claim.
What I have noticed though is that every time someone complains about search results and shares the way they search on Google, it becomes obvious that the problem is between the keyboard and the chair and not in Google.
For example, my girlfriend always gets completely unrelated results when searching. But she searches for an unintelligible mess that even a human wouldn't understand.
I’ve been using stackoverflow for 15 years. Search results were fine for at least 10 of those. But yea, I’m sure it’s me and all the other people agreeing search got worse are also dumb like your girlfriend, thanks for the condescending and helpful answer, mr google turfer
Really? For programming they are immensely helpful! ChatGPT 3.5 even rewrote Rust code for me that wouldn't compile because of borrowing issues, replacing iter() with iter_mut() and other things. I also had a question about a specific configuration with an old version of Chart.js that I could not find an answer to on Google, and ChatGPT figured it out.
I had it write an Ansible job to enable a service with a specific name, and I got what I expected: something that looks correct at first glance, but with some subtle errors.
Rather than ask it to write a full ansible job, try asking it the question you would ask on Stack Overflow (i.e. you probably wouldn't ask Stack Overflow to write the full job for you).
I wanted good ways to deal with pointers to 2D arrays in C, which aren't hard, but you need to remember that [] has precedence over *, so (*pointer)[x][y] is needed to dereference. It's not that hard to mess up. Ultimately GPT had an OK answer, but... it had that immediately, without searching, and I didn't have to craft some well-written example; it picked up my dirty explanation no problem.
I have the 2D array typedef'ed now, but it's still confusing to read and hard to work with. I'll search for it tomorrow.
Asking anything C or C++ on SO has become an invitation to a ritual hazing. If, after dealing with everyone's ego, you can successfully demonstrate that you already know enough and you're not asking to pass a CS test, the Wise Ones may - may - deign to bestow you with an answer. Which usually won't relate to your specific issue.
Also, in C++, you can set your clock to someone commenting that you should just use smart pointers. It doesn't matter if the question is entirely unrelated.
As someone who spent a decade helping people answering code questions on Flashkit and later SO, I find the SO community and moderation now to be so off-putting that I avoid asking anything there if I can. I still give answers sometimes, but I'm much less likely to be on the site at all.
Not to mention that you have no rights over the content you create on SO. You can't take your content off the site if it happens to be an answered question or an accepted answer. Their reasoning is "but the content would be gone and would stop helping people". The way it completely dismisses your efforts and your emotional connection with your own creation is the greatest indictment of how SO lacks the human perspective. No other platform, not even post-Elon Twitter or Facebook, does this. The only exception could be Wikipedia, and it communicates its content format and collective editing mechanics very clearly. SO just isn't that, but acts as if it were whatever happens to work for them.
I do because I'm not a robot? I wrote answers that made it to HN's frontpage (this one: https://news.ycombinator.com/item?id=5243389). Later, that answer was edited by automod to remove the "Hello" line, because somebody hated that people greeted each other on SO.
Something long enough to be a blog post like that isn't what the site is for. I'm not saying you shouldn't care about things like that.
And I wasn't talking about the issue of mods deleting entire answered questions. I was talking about single contributions, and mods deleting things is the exact opposite of what my post was addressing!
On the other side, that compiler post is fun but it's not some big effortful thing. Caring about it some makes sense, but I wouldn't care about it that much.
> Something long enough to be a blog post like that isn't what the site is for
It wasn't a blog post on the site; I extended it to a blog post after mods closed it and flagged it for deletion. As a top 0.5% or whatever contributor of SO, believe me, I know what the site is for.
> but it's not some big effortful thing
It doesn't have to be. It's my thing. It sounds like me; it carries my spirit. Nobody else would have written it then. It reflects a period of my life, how I perceived things; and more importantly, its continued existence has an impact on me regardless of how it's licensed. Say, if my writing style, or my choice of words, or my tone were associated with a traumatic event in my past, SO's insistence on keeping it up would be explicitly abusive, don't you agree?
Even if no such event had occurred, SO's or HN's insistence on keeping my content online against my wishes is also abusive; it deprives me of control over my thoughts and my words. Is it legal? Yes, 100%. But, is it ethical? No, I don't think so. And for what? For keeping the answer for "how can I simulate a click on a DOM element?" online, as if that problem immediately becomes "unsolvable" when that comment, heck, the whole SO web site goes down. What a pretentious excuse to keep your income stream steady.
> Say, if my writing style, or my choice of words, or my tone were associated with a traumatic event in my past, SO's insistence on keeping it up would be explicitly abusive, don't you agree?
No, I'd say your trauma is causing you to make an unreasonable demand. The writing style in a short technical post existing somewhere should not be harmful.
> SO's or HN's insistence on keeping my content online against my wishes is also abusive; it deprives me of control over my thoughts and my words. Is it legal? Yes, 100%. But, is it ethical? No, I don't think so. And for what?
It's supposed to be a collaboration, especially SO, and keeping things intact is important for that.
Just yesterday I found a helpful guide on reddit where half the posts were "." It's pretty clear how a site where the primary purpose is guiding people would do a worse job if it worked that way.
> And for what? For keeping the answer for "how can I simulate a click on a DOM element?" online, as if that problem immediately becomes "unsolvable" when that comment, heck, the whole SO web site goes down. What a pretentious excuse to keep your income stream steady.
If it would affect the income stream, then it's something that makes the site bad for users. You can't argue both sides of that at the same time.
> As a top 0.5% or whatever contributor of SO, believe me, I know what the site is for.
Maybe? I don't think most contributors would be anywhere near as upset about an inability to delete posts.
SO wants text like you gave them, I don't think they want the level of emotional investment in those pieces of text. (They might want emotional investment into the site itself, but that's a different thing.)
> I don't think they want the level of emotional investment in those pieces of text
Of course they don't. They want you to be a free ChatGPT as much as possible. The less human you are, the better for them. That doesn't mean that what they are doing is okay or harmless.
The question was about whether Hacker News allows you to remove your content. It does not. You can send an email to have your content anonymized, but it will remain forever.
I happen to have tried this multiple times. Every now and again I want to erase all trace of me on the internet. HN doesn't allow it. They tell you that you can give a spreadsheet with comment IDs and how you want to rephrase things in case you've given TMI about yourself (in order to anonymize). I just don't have the time to go through my entire history to do this. So I attempt to post less, knowing I don't own my comments and I can be tracked.
Perhaps if HN eventually improves in this area then we will have all the proof we need that it is in fact becoming more and more like Reddit. Because it seems like a lot of the discourse is already checking a lot of those other boxes.
The content is licensed under the Creative Commons license, which ensures that everyone can take it and make something useful out of it - for example, in case Stack Overflow turns evil, adds a paywall, or something like that.
This is a lot better than almost every other comparable site with user-created content. In most cases the company has the rights there and can do whatever they want with it, and users have no right to reuse the content of the site.
Well, I must be the only one left to think Stack Overflow is great then.
Sure, you may encounter the occasional rough moderator but having to correspond with random people on the internet in writing as my daily job, I can't blame them. If you don't have a rep, it's up to you to gain some by crafting the best question, linking to the relevant docs, transcribing screenshots, correcting typos etc.
The only issue with SO imho is that it's getting too big and there should be a lot to gain from further splitting to other StackExchanges like computer science, computer graphics, databases, infosec, Vim etc.
It would be great if there were a way to transfer an SO question over to the more relevant community, I guess.
Those rough moderators would close off discussions for the most trivial reasons, yet SO still allowed those discussions to be indexed by Google (despite rel="nofollow" having been a thing since 2005 or so). Presumably this was because SO still wanted the Google juice.
It was irritating as hell to have the top several Google links go to SO discussions that had been shut down by the moderators.
> I can't blame them
I'm sorry, this is like saying you can't blame cops for beating up the occasional suspect because they encounter a lot of genuinely bad people every day.
If you can't do the job in the presence of shitheads, you have no business being a cop. Or a forum moderator.
> Those rough moderators would close off discussions for the most trivial reasons, yet SO still allowed those discussions to be indexed by Google (despite rel="nofollow" having been a thing since 2005 or so).
At least for "closed as duplicate" that is working as intended: "Closed" doesn't necessarily mean "you shouldn't have asked that question". In the ideal world, a post being closed as duplicate just allows people that phrase the duplicated question differently to also find it.
For police and moderators, having a false positive rate that equals 0 seems impossible.
There are lots of ways to respond to a positive indicator that could reflect a self-awareness of your own false positive rate. Being trigger-happy probably isn't the correct response.
It isn't clear how SO moderates the moderators; do they have an internal affairs (IA) team that investigates corruption?
Except the moderators are unpaid and they don't hurt much more than egos?
Also, Stack Overflow is not supposed to be a public service (unlike the police), so the moderators are not accountable to you, only to the site guidelines (which you can appeal to, I guess).
Given the value it gives to society, it should be turned into a public service before some rich prick takes it over and turns it into a cow to be milked or a self-aggrandizing venture.
> Except the moderators are unpaid and they don't hurt much more than egos?
Read again. It has nothing to do with "ego". By allowing their "moderated" threads to continue polluting search engine search results, they are wasting my time. They are wasting the time of everyone else who has a similar question and makes the mistake of clicking through an SO link.
I would not be at all surprised if they got downranked because people at Google were tired of SO wasting their time.
I joined SO in 2013, have been asking questions since then, and have had no problem with moderation. But it takes effort to ask a question: I always aim to gather all the bits of information I already know and, if possible, write a minimal working example (many times it was not possible, like when I had a problem with Apache and Kerberos). Most of the time I ended up answering my own questions a few weeks later, but that's because not many people faced the problems I was facing.
I have the same feeling. I recently started contributing more, even after such articles were already present, and I have to say it's a fun experience and I feel like I learn a lot, despite the occasional hang-ups I mentioned in a sibling comment to parent.
Especially nice to be able to link my own answer to things.
Having a high reputation shouldn't be a free pass to be an obnoxious jackass, but often that is what ends up happening. There is virtually no consequence to coming into a valid question (or answer) and just closing it like a dickhead. No, in fact this behavior is rewarded.
If you are either someone who likes wielding power recklessly or autistic then becoming a power user isn't a problem. But demanding users "prove" they are trustworthy in a system where blatant abuses of authority go unchallenged is farcical.
Even though StackOverflow in the common use case has been taken over by ChatGPT, I sincerely hope it keeps operating, stays strict (even if it causes collateral) and keeps its ban on LLM-generated content.
The wheels of this kind of stuff turn slowly, but obviously ChatGPT was trained partly on data only gainable from a healthy StackOverflow-like site, with users actively asking unique questions and enough people answering those unique questions with well-thought-out answers. The shitty future outcome is that StackOverflow goes out of business and LLMs stagnate on this front: while still capable of answering in fuwwy speawk when prwompted, they would be limited to / biased towards older versions of libraries, software, languages, tools etc.
If you loosen up your definition of LLM, the moderators and posters are really just LLMs that have been jailbroken to insult you and close your post.
Stack Overflow is the programmer's internet bloodsport.
It has for me, and several developer friends, and considering the fame ChatGPT has gotten, and that StackOverflow's fall has accelerated, it's obvious the milkshake's migrating. Not all of it of course, as I stated, "for the common use case"
It's painfully slow. I can Google the question, click one of the top results, skip to the relevant part and read it faster than GPT can generate two sentences. You also have to build an elaborate prompt instead of throwing two/three keywords into it.
It doesn't help that GPT is insistent on replying in the three paragraph format, meaning that the first 30-40 words it creates are just trash to be ignored.
I found it useful once - when I had to write an essay about ISO 27001 for college and just wanted it to go away. Took what it generated and spent 20 minutes editing it to look closer to my style. For real work it isn't as useful.
> I can Google the question, click one of the top results, skip to the relevant part and read it faster than GPT can generate two sentences.
Ironically, this is why people like me prefer LLMs (when they're accurate). With Google, about 50% of the time the top SO hit is not answering my question. So I have to click 5-10 SO links and parse each one to see if:
1. The question being asked is relevant to my problem.
2. The answer actually answers it.
I may be able to do it quickly, but it is a tedious burden on my brain. While GPT doesn't always work, the nice thing about it is that when it does work, it has taken care of this burden for me.
Also, GPT's pretty much memorized a lot of the answers. I once asked it an obscure question involving openpyxl. It gave a perfectly working answer. I wondered: Did it reason it and generate the code, or is there a SO post with the exact same answer? So I Googled it, and sure enough, there was an SO question with the same code!
Except GPT's solution was superior in one tiny respect: The SO answer had some profanity in the code (in a commented line). GPT removed the profanity :-)
I find it incredible you find an LLM slower and less full of useless chitchat about a question than Stack Overflow.
I don’t even open SO anymore; if it has a direct answer to your question, the LLM almost certainly does too; and asking new questions on SO is basically impossible.
If you manage to survive the gauntlet of "too specific, already answered, not general interest, arbitrary moderator activity", getting an answer that actually addresses your question can take forever; most likely you'll get a stupid answer that doesn't answer it, upvoted by idiots who don't understand that it's not an answer to the actual question, and, ultimately, because the question "already has an answer", it is ignored, never to receive a real one.
Maybe one day, a passing savant will answer in a comment.
…and yet, you find it faster and more reliable?
You, and I, have had different experiences on stack overflow in the last two years.
I think maybe you haven't been using GPT4 (the one where you have to pay money). Or else you're coming at it with a very strong prior, or you're not asking it about software engineering questions, or you're not phrasing your questions carefully. GPT4 is demonstrably extremely useful for technical questions in the realm of software engineering, and in addition to surfacing useful answers, it (obviously) presents a completely unprecedented conversational interface.
Can you give an example of a technical software question where you found it wasn't helpful? I'll see if I can get a good answer and post the permalink for you. I suspect you're not phrasing your questions well.
I have 110% replaced it with ChatGPT. Perhaps SO would still have a chance back in its glory days but there's no comparison to having a direct, specific, instant answer vs having to fight against SEO or moderators for hours.
I haven't. Because (free, as in free beer) ChatGPT is extremely slow, I have to make a rather extensive prompt to get the result I want, and then I still have to debug most of the code.
That's not very convenient, at least for now. I got so used to search engines by now that it only takes a few keywords to get the expected result, be it an SO answer or a documentation page. And as people have mentioned, ChatGPT was trained on the stuff that's on the internet, so if there is never any new stuff, because people just use AI, then it will not learn and won't answer your new questions. For some edge cases I might try AI here or there, but usually it's not for me.
Hell, an example even comes to mind. I recently asked ChatGPT what a single-issue 5-stage pipeline on a CPU actually means. I wanted to know, in particular, whether "single-issue" meant that only one instruction is present in the pipeline at a time, or whether a new one gets shifted in on every clock cycle (if there is no hazard). It just couldn't answer it straightforwardly. It was also kinda hard to find the exact definition on the internet. I found it in a book from the 90s which was chilling on my bookshelf (Computer Architecture and Parallel Processing by Kai Hwang). Hint: single-issue just means that only one instruction can be in one stage at a time, but multiple still get processed inside the pipeline. The keyword is 'underpipelined'.
Yes, someone tested it on GPT-4 for me too and that actually gave a quite decent reply. Still, there are always some cases somewhere where it messes up.
I'll just keep an eye on AI progress, but will probably not make it my go-to for some time. Maybe later (whenever that is).
> I haven't. Because (free, as in free beer) ChatGPT is extremely slow, I have to make a rather extensive prompt to get the result I want, and then I still have to debug most of the code.
That's because you are comparing asking ChatGPT to write full code to searching for a question on Stack Overflow and adapting their answer (which is comparing apples and oranges).
Try using ChatGPT like you use Stack Overflow instead (i.e. the question is "How would I record an audio stream to disk in Python" rather than "write me an application / function which...").
As an aside, try "How would I record an audio stream to disk in Python"" in both GPT4 and searching for an answer on Stack Overflow and see what has the better answer! (Clue: GPT4, and if you don't like GPT4's answer just ask it to clarify/change it)
>Try using ChatGPT like you use Stack Overflow instead (i.e. the question is "How would I record an audio stream to disk in Python" rather than "write me an application / function which...").
That's my point though. I get that it can produce quite good results if you are specific enough. And for some applications it makes sense to take your time and describe things as much as possible.
Most of the time I just need some small snippet, though, and usually I can get that with just a few keywords in my favorite search engine, which is way faster. So the conclusion is: it's not one or the other. They should be used complementarily, or at least that's what I am doing (use the search engine for quick hints and ChatGPT for some more verbose stuff, like 'write me a parser for this csv in awk').
Personally ChatGPT generally gives me a quicker, better, simpler and ad-free result for the snippet (At least with GPT4).
Plus I can ask follow-up questions in a context-driven way ("Can I do this without importing a library?").
I'm aware that different people will have different feelings on this though and personal tastes will differ, but while search engines stagnate I suspect the needle will continue to shift towards AI.
I always read here that ChatGPT is amazing. Can you give a link on how to use it? Every time I try to google it, it returns lots of different results, and when I tried it, it's not even usable for basic things I want. Is the ChatGPT you're talking about on their website? Do I have to pay for it?
> Can you give a link on how to use it? Every time I try to google it, it returns lots of different results, and when I tried it, it's not even usable for basic things I want.
Here is an example of using it to write simple powershell scripts:
You need to pay for it if you want access to the latest version of the model, along with some beta features like plugins. Plugins are extremely useful and it is worth paying just to get access to them. For instance, you need to have a certain plugin to get it to read links.
In that JSON example you're honestly losing more time with ChatGPT than doing it yourself. It seems more like mentoring a junior than having a helpful assistant. Most of my interactions with it have been this way.
I knew/know very little about 3D printers and the fields didn't mean much to me so I didn't want to have to research every one of them. It wouldn't have been difficult, just tedious.
You're mostly right in your experience. I have spent quite a bit of time trying to get ChatGPT to be a worthwhile piece of my workflow, and I guess sometimes it is, but most of the time the basic code or config or content I try to generate, it gets very fundamental things incorrect. It feels like it's mostly just hype these days.
Can you give an example of a technical software question where you found it wasn't helpful? I'll see if I can get a good answer and post the permalink for you. I suspect you're not phrasing your questions well.
Can you please specify whether you use (paid) GPT-4? Would you kindly provide links to a few examples of very fundamental things incorrect?
My experience - the free version made up a lot of things but still felt very useful - enough to want to upgrade to the paid version. With the paid version, I notice very rarely that it hallucinates. It does make errors but it can correct them when I provide feedback. It is possible that I just do not notice the errors you would notice, it is also possible that we use it differently. I would like to know.
> Can you please specify whether you use (paid) GPT-4?
Paid.
> Would you kindly provide links to a few examples of very fundamental things incorrect?
No, definitely not.
> I notice very rarely that it hallucinates.
Unsure of what "hallucinates" means in this case. Some examples of things I've used it for: docker configuration, small blocks of code, generating a cover letter, proofreading a document, YAML validation, questions about various software SDKs. The outcome is usually somewhere on the spectrum of "not even close/not even valid output" to "kind of close but not close enough to warrant a paid service". When I ask for a simple paragraph and I get a response that isn't grammatically correct/doesn't include punctuation, I'm not sure what I'm paying for.
>> Unsure of what "hallucinates" means in this case
The term "hallucinations" is now commonly used for instances of AI making stuff up - like when I asked ChatGPT (before I had paid account) to recommend 5 books about a certain topic and two of the recommended books looked totally plausible, but when I tried to find them, I discovered there are no such books. This is where I see a big difference between GPT-3.5 and GPT-4.
>> I get a response that isn't grammatically correct/doesn't include punctuation
What punctuation? If you mean stuff like commas separating complex sentences, my English is definitely not good enough to spot that. But your mention of punctuation reminded me of problems that ChatGPT has with my native language... any chance you are using ChatGPT in a language other than English?
Anecdata: I've started asking Bing these questions instead of SO. E.g., it recently gave me a very helpful answer for debugging a Spring issue and cited its sources. What it didn't do was present me with a whole lot of moderation cruft.
I can ask for recommendations for tools and libraries, which IIRC SO disallows.
I also don't have to pray my question will get enough vote attention or worry that I posted it at the wrong time of day.
On the whole, going the GPT route has been more satisfying in all ways.
> I can ask for recommendations for tools and libraries, which IIRC SO disallows.
Bing Chat almost always is useless for me with these kinds of queries. A few days ago I asked for a tool that monitors to see if a website is up. I told it I needed the tool to be something I'd run locally - not an online service and not something I need to sign up for.
Small nitpick, but I thought it was just an icon and it turns out to be the button for switching light/dark mode. It would be great if you could replace it.
Sure, but that doesn’t mean it’s not true, and for many of us its truth is prima facie because it’s true of both our own usage and the people we work with and talk to.
Hopefully so. As I mentioned in my other comment somewhere here, my optimistic prediction would be that StackOverflow will eventually still keep operating, but only with questions that can't be solved by AI, hopefully leading to higher-quality discussions.
If the lights are on at SO then they must be in the process of training their own AI with their own dataset and documentation for the topics covered. That is what it would take for me to make SO my first stop again. It should be very doable for that talented group.
How is this being enforced? It's either bots banning bots in a digital game of whack-a-mole, or humans arbitrarily trying to assess whether something has been written by an LLM or a human.
It's human judgement. Definitely not perfect, but something has to be done to prevent SO from being overrun.
There are some subjective signs that a post is LLM-generated, like being overly verbose and making unrelated assumptions, or a mix of horrible and perfect grammar. Those bans are hard to justify because the false positive rate is high.
But other signs are pretty obvious. My favorite is the use of APIs that should exist but don't: passing parameters that neatly solve the problem but have never been accepted, or importing non-existent libraries. I'm happy to flag those.
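To make that concrete, here is a hypothetical illustration in Python. openpyxl's Workbook, append(), and save() are real, but the two calls flagged in the comments are deliberately invented to show the kind of too-convenient API an LLM tends to hallucinate:

    # Hypothetical illustration of the "APIs that should exist but don't" tell.
    # Workbook, append() and save() are real openpyxl calls; the two lines marked
    # "invented" are made up on purpose -- they look plausible and neatly solve the
    # problem, which is exactly the giveaway described above. This snippet does not run.
    from openpyxl import Workbook

    wb = Workbook()
    ws = wb.active
    ws.append(["name", "total"])
    ws.append(["widgets", 42])

    ws.autofit_columns()                        # invented: openpyxl has no such method
    wb.save("report.xlsx", compress_level=9)    # invented keyword argument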
Yeah, I've been using chatGPT quite intensively. And while pure LLM output is relatively easy to spot, human edited LLM output is almost impossible to detect. Most of my message above has actually been written by GPT4 (3 prompts + some light editing).
And I think Stack Exchange needs a new CEO. Maybe new owners, too - since 2021 that has been Prosus. My impression is that they don't understand what the purpose is for developers.
Interesting that this seems to have started around Spring 2021 when posts + votes started tailing off, followed by traffic starting to decline around spring of 2022.
I can think of a few theories that I don't think hold water:
1. The rise of ChatGPT to answer many questions that StackOverflow would previously have been used for. This seems unlikely, since the timing doesn't really work out.
2. The perennial complaints about StackOverflow's culture of closing everything as duplicates or offtopic. This seems unlikely as well, since those complaints have been common for a decade or more.
3. The prevalence of SEO-optimized scrape sites - the ones that pop up with a "blog post" merely reposting a Stack Overflow question + answer in a different font. I've seen these for a while, and anecdotally they feel more common than they used to, but I couldn't give any real timeline for that vague feeling.
4. StackOverflow internal politics? I've seen the occasional stack-overflow meta thread pop up periodically on HN or social media, but I don't recall anything earth-shattering recently.
5. Most questions have good answers now and there's less need for new ones. I'd have bought that answer 10 or so years ago when StackOverflow's pile of questions + answers reached maturity. I don't think it suddenly hit some sort of answer saturation point in 2021.
My guess is that it's a slow shift in the culture of the StackOverflow userbase:
- Being a top answerer confers some cachet and makes you more employable in some places
- People notice this and start looking for the most effective ways to become a top answerer
- The most effective way is fast, low-effort answers
- There's been a rise in such low-effort answers over the last 5 years or so
- As a result, the cachet of being a StackOverflow top answerer is a lot lower
- The really good, deep, technical answerers (as well as the mods) are leaving as that cachet goes away
- Post quality starts dropping around 2021 and views start declining as people react to that in 2022.
If I were building a Q&A site that genuinely wanted to encourage high-quality answers, I'd implement something like a 1-hour window where all answers are invisible immediately after the question is posted. This is to give people time to work on a good quality answer, without racing to be the first and gain those precious early upvotes. The UI could still indicate how many other people have answered / are answering, and if answer volume became a problem you could perhaps block new answers during that window after the first 20 or so have landed. When the hour is over, answers are displayed in random order and the existing site mechanics around upvotes etc kick in.
It feels like that could potentially address the problem of low-effort answers killing off the good ones.
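A rough sketch of what that mechanic could look like; all names, thresholds, and data shapes below are invented for illustration and are not how Stack Overflow actually works:

    # Rough sketch of the proposed "quiet window" mechanic.
    # All names, thresholds, and data shapes are invented for illustration.
    import random
    from datetime import timedelta

    QUIET_WINDOW = timedelta(hours=1)   # answers stay hidden this long after the question
    MAX_EARLY_ANSWERS = 20              # optionally stop accepting answers while hidden

    def can_submit_answer(question_posted_at, answer_count, now):
        """During the quiet window, cap how many answers can queue up."""
        in_window = now - question_posted_at < QUIET_WINDOW
        return (not in_window) or answer_count < MAX_EARLY_ANSWERS

    def visible_answers(question_posted_at, answers, now):
        """Hide everything during the quiet window; afterwards, show the queued
        answers in random order so vote-driven ranking starts from a level field."""
        if now - question_posted_at < QUIET_WINDOW:
            return []   # the UI could still display len(answers) as a counter
        shuffled = list(answers)
        random.shuffle(shuffled)
        return shuffled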
As much as SEO experts would love to do that for their clients, I don't see any way they could've realistically crossed that moat. If you had a programming question, and you went to Google to answer it and found some answers - would you choose Stack Overflow, or PythonIsSuperFun.com? I think Google searching for programming issues decreased overall, and I don't think Stack Overflow suffered more than any other site. Disclosure: I've nearly abandoned SO for ChatGPT myself.
> would you choose Stack Overflow, or PythonIsSuperFun.com
I would choose the first result - people are lazy. The problem comes and goes, but it is true: spam sites with content copied from SO are ranking higher than SO itself. And it is often hard to tell you are viewing a copy at first glance; the layouts are usually more like a forum.
I think Discord has had a huge impact on Stack Overflow as there are many communities created for specific technologies, languages, frameworks, or topics, where most users can get answers from direct contributors or people who have the best industry knowledge on the topic in question.
Not to mention how beneficial it is for brands and products to create such closely-knit communities for their users. There's no need to Google things or ask on SO, as one can chat in real time and go back and forth for solutions with other users.
Not to mention unsearchable, poor discovery, a chat platform is not a replacement for message boards. SO is well suited to the Q&A format and Discord just isn’t.
On Discord? Really? My experience has been that if I don't know what the popular community discord is and what channels to search, I will simply not find an answer. For me, my Discord experience really suffers from Discord not being indexed by search engines.
Nowadays my questions are much more likely to get closed and it is impossible to get them reopened. This takes a pretty big emotional toll, as I usually invest quite a bit of time to describe my problems.
People vote to close and then move on. I don't mind editing my questions to satisfy the moderator's demands, but not if it has zero effect and just wastes my time.
I think people should be required to confirm their close/down-votes after a question or answer was edited, or else the votes should be automatically reverted. This mechanism should probably only apply within a limited time period after the question/answer was posted (or until a certain period of inactivity has passed).
It saddens me deeply, but currently I prefer not to ask questions, because the experience is so jarring.
- do not post a "teach the man how to fish" answer
- closing this question as duplicate because I'm in a hurry and I didn't read that the poor poster has already mentioned the possible duplicates and explained why theirs is different
- do not post an answer with links to the original docs that have context, instead copy/paste 3 lines that don't explain enough here
I wonder how much traffic has been siphoned away from those copy-cats that mirror SO? Out of principle alone, I will immediately hit the back button and look for the genuine article, but I could believe many do not.
The other aspect that grinds my gears is the closing of duplicate questions. Fine in principle, except when the original was answered 10 years ago and all of the answers are jQuery.
With such a stark barrier to entry, why would I spend much time on there? Also, at the beginning of my career I got much more value from "explaining articles", and much less from ultra-specific answers.
Later on when I had more experience, I wanted to give back:
- try to answer -> "you need cred"
- okay, upvote -> "you need cred"
- try to comment -> "you need cred"
- Ask a question -> "Duplicate. You should RT(Free)M we built."
Where were these data sourced from? As far as I know, StackOverflow does not publicize internal analytics with such granularity. If these figures are real, are they leaked?
We can also see that lately there are more questions than answers, which shows that most experts are no longer that active, or that there are more beginners and fewer experts overall.
Stack Overflow used to release their data archives quarterly on BigQuery. Looking at the BQ datasets, they were last updated Nov 2022, which doesn't have the latest 2023 info in the submission.
Thanks for sharing, good to see alternative options popping up. My wish is that the Stack Exchange dataset could one day be provided as a streaming parquet or arrow table, as underfunded grads and post-grads could then more easily/selectively sample the datasets (similar to how Huggingface provides some of its datasets)[1][2].
The Huggingface repo unfortunately prefilters some of the tables/rows according to some criteria, making it less usable for general analytical queries that the BQ or SEDE datasets enable. If anyone knows of an 'XML-streaming' solution that directly samples from the Internet Archive's data dumps, I am all ears.
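As a rough illustration of the kind of selective sampling being wished for here, the Huggingface datasets library already supports streaming iteration; the dataset name below is a placeholder, not a claim that this exact dump exists:

    # Sketch of streaming/sampling a (hypothetical) Stack Exchange dump via the
    # Huggingface `datasets` library, so nothing has to be downloaded in full.
    # The dataset name is a placeholder; substitute whatever mirror you actually use.
    from datasets import load_dataset

    posts = load_dataset("some-org/stackexchange-posts",  # placeholder name
                         split="train",
                         streaming=True)                   # returns an IterableDataset

    # Take a small, shuffled sample without materializing the whole table.
    sample = posts.shuffle(seed=42, buffer_size=10_000).take(1_000)

    for row in sample:
        # Each row is a plain dict; field names depend on the dump's schema.
        print(row.keys())
        break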
The point of asking a question on stack exchange is to create a question answer pair as the whole goal of stack exchange is to create a repository of high quality question answer pairs. If it's not possible to select an answer to a question then it isn't a relevant question for stack exchange.
At least in software development, SO is/was very useful, as they provide a curated repository of Q/A where the answers follow the current state of tech.
Every time I started out learning a new framework, SO would be tremendously helpful, because in software development, you mostly have questions that have correct and incorrect answers.
I said SO but I should have said SE in my case - it just doesn't work for research fields because these especially are all about exploring unknowns, yet SE doesn't allow this...
Maybe software engineering is all solved and there's no more ambiguity left to be discussed.
I feel like the moderation culture of Stack Overflow became toxic and counterproductive years ago. It's like the worst parts of Wikipedia, but squared. Just endless arguments over what is or isn't a duplicate or is or isn't relevant, while every interesting question somehow ends up closed, good answers are downvoted, and spam drifts to the top.
I'm sure everyone involved actually wants the site to be useful and pleasant, but somehow the actual result is the exact opposite.
That being said, I don't think the culture or moderation has got worse recently, so I suspect the traffic decline is either a change in Google's algorithms or the impact of ChatGPT (or both).
I think moderation culture changed for a reason. About 8-10 years ago the quality of answers started to decline because HR started to source candidates from SO. This created an incentive for posting thousands of answers, moments after the question was posted, that were simply not right. More often than not, those wrong answers were upvoted anyway. And HR won't check the quality of answers; they are concerned only with your total score.
Once I was researching something and found a wrong answer on SO from some junior dev from India. The answer was wrong, and it was evident that the author hadn't even fully read the question. Out of curiosity I checked the author's profile and found out that it was a deliberate strategy: e.g. when there's a question about working with files, he'd post an answer with a link to the language's documentation page for "open".
I politely asked the author in the comments to stop posting wrong or useless answers. Instead, he wrote a very aggressive response, opened my profile, and downvoted as many of my answers as he could.
And there were many similar cases, probably still are.
Everything went downhill once people had real incentive (like actual money, not just internet cred) to game the system for reputation points. The mods had to come up with ever-evolving layers of rules on top of rules that lowered the answer quality and put up huge barriers for an average user to contribute.
Incentives really make or break this type of website. Often on the internet I find myself thinking "Thank God this [website specific metric] isn't worth anything so people have no reason to game it".
I think "Internet Karma" systems are inherently flawed. I know one technical website with karma, where one type of users eventually gained majority of the karma capital, downvoted all opposition into such deep negative karma, that they can't write anything anymore, and now there's strictly one point of view in all the articles and comments.
Which is aligned with the site owners' point of view, so they have no incentive to change anything. Though everyone notices that quality of articles and comments started to decline once the users with karma majority censored all others out.
HN implements some significant limitations to the model that make it work. Things like not being able to downvote replies, not being able to edit/delete your replies after certain criteria are met, limited upvoting until a certain amount of contribution, etc.
These are constraints that limit the types of people that are interested in remaining here. But HN is by no means "small".
I have a rather high-rep account, so I can revert mod decisions and sometimes do - e.g. reopen a marked-duplicate that isn't a duplicate on closer inspection.
I don't do it often (can't be bothered too much) but when I do, I sometimes almost feel the hate radiation over the web. Indeed, people on a rage-tantrum going through all my answers and questions and downvoting those. Lol.
I also have a similar account and you inspired me to vote to reopen maybe twenty questions in a row. All of them were perfectly good questions closed for stupid reasons.
Let's not forget that in India it is not considered wrong to "game" the system, even if it is considered unethical by Western standards. It is not a coincidence that all of these tech support scammers are based in India.
This brings up an interesting topic: how do we reconcile cultural differences on online platforms? Nobody can deny they exist, and they create clashes like the aforementioned one.
Do people not remember the "Hacktoberfest" fiasco? People were encouraged to make a PR on public repos in exchange for a free t-shirt, and _obviously_ this led to repos being overrun with spam PRs changing a single line comment.
with loads of people replying and thanking the author, excited for their t-shirt, not really knowing they're _creating unnecessary work_ for actual engineers with limited time.
Incentivized systems like this can work okay in small, self-selected communities, but as soon as it's opened up to the entire world, the implied courtesy of said system vanishes, and immediately we're pandering to the lowest common denominator.
I don't think you do. Your platform should have an explicit and expected code of conduct set up in advance, and people should be held to it. If they cannot, they aren't allowed to use the platform.
When it becomes apparent that there are cultural norms that make it difficult or impossible for people from those cultures to use your platform, you re-examine your code of conduct, of course, to make sure you're not unduly excluding people. If you are, you amend the code, you change the platform in ways necessary to support them, and everyone tends to end up better off as a result.
But some cultural norms will be antithetical to the point of the platform, like in this case: SO cannot function properly when people attempt to game the system. In those cases, the CoC remains, and people from that culture will either have to adapt to the platform or find (or make) a similar one that works for them.
The alternative, as we see here, is that your platform degrades until it's not useful to _anyone,_ this "problem culture" included.
Gaming the system is prevalent but not considered ethical in India. HR won't allow such candidates, for example. So there is no question of amending the CoC to make room.
There is a difference between something being prevalent and something being culturally accepted. The solution lies in naming and shaming. Making it clear that such people aren’t wanted on these platforms.
But I don't think anyone should chalk this up to "cultural norms". That's just being unfair to those of us who are just as fed up with this mindset.
I mean, neither do I, but I'm giving the poster above me the benefit of the doubt, and also kinda recognizing that -- even though I can't think of a concrete example right now -- there's almost certainly a situation where there IS a cultural norm that is otherwise acceptable, but antithetical to a particular platform, so the question is still valid, IMO.
This incentive was always there, from day one. It was a classic thing to put the most basic answer as possible in as quickly as possible just to get it out the door and then refine it later, because whoever posts the answer first would have theirs shown first presuming no upvotes.
This is exactly the reason I gave up on SO. It was so disappointing to see the "frist!" answer get votes, even when it was factually incorrect. By the time it took me to think and write a half decent phrase the original answer had been edited dozens of times. The site was also eager to interrupt me every time to tell me that an existing answer had been edited. They did everything possible to allow the system to be gamed and, well... they succeeded.
That matches my experience. I stopped using it in 2015 after trying to solve something with a new framework I was using and finding that all of the answers on SO were just wrong.
The exact opposite happened, contrary to this often repeated meme. Both SO and Wikipedia became what they were because of their strict moderation and RTFM-implied attitude. It's what drove professionals like me to the site. I'm just a regular user BTW, but it felt like SO people were my peers.
But SO (the company) wanted it to be more accessible, easier for newbies, "nicer", there was a huge uproar over them publicly blasting a moderator over a disagreement on a unilaterally imposed new code of conduct, and recently they even (again unilaterally) effectively reverted the ban on LLM-generated content. This has been going on for years, and moderators have less power than they ever had. Imho this whole thing started much earlier, I think it was 2017 when they tried the SO documentation project and let everyone keep their rep where I first thought they had jumped the shark.
The company has been on a bender for the last few years, and high-ish rep users like me just don't see the value in answering anymore. With recent blog posts it seems even more clear that they are on the direct path to enshittification, against their own actual users and moderators.
Recently it feels like all the actual professionals have left and what remains are "students" asking low effort questions in bad faith while an army of spambots tries to pounce on these.
While this means boosted engagement numbers short-term, it spells death of the site long-term. SO cannot survive on low quality spam, even if other sites (like Reddit) may be able to.
Agreed. If anything, the problem of SO is that it's not evolving strict measures fast enough to stay relevant. There's a constant stream of duplicate and low effort junk that gets only stronger over time, and obviously there are not enough people to moderate all of it. The original idea is great, but like any other place it needed to evolve barriers as userbase skyrockets and it mostly failed to do that so far.
> There's a constant stream of duplicate and low effort junk that gets only stronger over time,
There's a question on meta from a decade ago about what the Roomba (automatic deletion scripts) should be deleting. One of the answers had a bit of point in time information:
That makes it harder to find the original source, more difficult for google to chase, and clusters results.
> The original idea is great, but like any other place it needed to evolve barriers as userbase skyrockets and it mostly failed to do that so far.
Which upper management has been working against by encouraging a reduction of those barriers, disincentivizing people from moderating and curating, and trying to grow engagement from people asking questions.
Mistake, I think. Because more and more facts are already covered by existing answers, and more and more noobs (no judgement, just fact - we've all been there and I am still a noob in many regards) come into the industry as computing becomes more popular worldwide, it's unscalable unless the focus shifts towards getting people to find existing questions and answers instead of reducing SNR by asking and answering the same questions again and again in worse ways.
The scalability problem is indeed at the core of Stack Overflow's woes.
When it was small, it was easier to handle the questions of the day, guide new users, handle the "fun" questions (and answers) in a way that wasn't off-putting, and generally be a smaller community.
As SO grew, it lost control of the culture that had been established there before (much like Usenet (different thread)) and became a place for people to do hit and run questions - drop their question, come back later, get the answer and move on.
The majority of the users of the site had moved from "community of people sharing information - asking and answering" to "new users without any cultural attachment asking a question and not remaining."
The core group culture became more defensive of their ideals... and lots of friction between management wanting more engagement and new users just wanting people to answer their question ("if you don't like the question, just move on" being a frequent refrain).
From A Group is its own worst enemy:
> 2.) The second thing you have to accept: Members are different than users. A pattern will arise in which there is some group of users that cares more than average about the integrity and success of the group as a whole. And that becomes your core group, Art Kleiner's phrase for "the group within the group that matters most."
> The core group on Communitree was undifferentiated from the group of random users that came in. They were separate in their own minds, because they knew what they wanted to do, but they couldn't defend themselves against the other users. But in all successful online communities that I've looked at, a core group arises that cares about and gardens effectively. Gardens the environment, to keep it growing, to keep it healthy.
> Now, the software does not always allow the core group to express itself, which is why I say you have to accept this. Because if the software doesn't allow the core group to express itself, it will invent new ways of doing so.
As the software didn't provide sufficient and proper moderation and curation tooling, the core group expressed itself through more negative and ultimately toxic approaches. Snark and rudeness are the moderation tools of last resort.
A Group continues with:
> 3.) The third thing you need to accept: The core group has rights that trump individual rights in some situations. This pulls against the libertarian view that's quite common on the network, and it absolutely pulls against the one person/one vote notion. But you can see examples of how bad an idea voting is when citizenship is the same as ability to log in.
The current goals of upper management being advertising and engagement are not in alignment with the goals of the original founders (as idealistic as they were) and what remains of the core culture.
> It is by programmers, for programmers, with the ultimate intent of collectively increasing the sum total of good programming knowledge in the world. No matter what programming language you use, or what operating system you call home. Better programming is our goal.
Note that good is italicized in the above quote and is present in the original.
---
Getting people to be able to find existing questions and clean up the SNR of the content out there on SO would improve it... but that would likely make a lot of lines go down rather than up (deleting 10,000 duplicates of one question would show up).
Trying to get Stack Overflow back to a scalable model doesn't further the engagement and upper management goals directly.
Instead, they're focused on more engagement... with not unexpected responses.
The moderation culture is why I stopped engaging with stackoverflow and software.stackexchange. I noticed two problems.
The first is marking so many interesting questions and answers as off-topic or opinion-based, which vastly reduced the scope of interesting content, and discouraged daily visiting to learn instead of just using the site as a reference through google. Relying on google to feed you users is a fool's gambit. Websites need to create recurring visitors, and they do that by providing something that engages. Stackoverflow and software.SE's moderation policies made it far less engaging as they grew more heavy-handed. This left the sites open to changes in google's algorithm affecting their traffic.
The second major problem is duplicate policing and a lack of staleness policies. By not allowing the same question to be asked again, the content of the network has gradually grown more and more stale. Even my own old answers are now often wrong because they are simply outdated. Stackoverflow is filled with questions in the style "what is the right way to use technology X to do Y", and the right answer to that changes every 2 or 3 years, when technology X gets an update or when new insights into how to best do Y are formed. I tried updating a few here and there, when I saw people engaging with wrong answers, but overall the blame is on stackoverflow and its moderators for not working out a more effective mechanism to get rid of stale answers and to let new users answer old questions with answers that have a shot at rising to the top. This also means that as top-voted answers grow more and more stale there is a tipping point where google no longer sees the site as a useful resource and starts lowering it in the ranking, and this seems to be what has happened.
I wonder if it's just finally reached a tipping point of frustration with the moderation. I know I stopped going there years ago. At some point as all the good contributors leave you'd expect to get a snowball / downward spiral, even if the site coasted for years before that.
I was in the beta of stack overflow (my numerical id is <2000): it was always toxic and counterproductive, and I stopped seriously using it a year or so into its life (this may have been before it left beta, I don't remember).
It is still the best place to find a lot of answers (though their piece of the pie is rapidly shrinking), but participating in the system was never fun.
Maybe SO can have like super moderators / admin employees that are summonable by paying XXX score and they can bring the hammer down on problem answers and punish the toxic moderators etc.
I'd blame Google - at least for the initial drop. When you look at the "new visits" chart you can see that the drop happens in May, 6 months before ChatGPT.
My main issue, with the main Stack Overflow site (rather than some of the other stack sites), is that I don't get answers to my questions, possibly because no-one knows the answer. If they did, then it's probably been asked before, and since I always look first, I've found the answer to the question that was asked, and so don't even pose the question, though I might contribute to the answer I found in some way.
This is a large decline. It does not correlate well with the rise of LLMs, which have shifted my own habits away from Stack Overflow recently. Looking through the graphs, the most dramatic shift was the large drop in new visits in the first half of last year, by two thirds or more. That seems late to be a Covid effect. There were some political controversies, but it would surprise me if they had such a large effect.
IMHO people have tired of the smug dismissive assholes that dominate SO and its insipid gamification, and they have, gradually, found alternatives.
For some, it's more welcoming and knowledgeable communities in github issues+discussion.
For others it's special-purpose, interactive forums that provide more guidance and non-hostile support for users of certain platforms/tools (eg Posit Community).
For many, Copilot has been fulfilling that need. Does it always give "the correct" answer? No. Does it provide a sketch of a solution that gets you half-way there when dealing with tedious humdrum stuff that you just forgot because it's so boring? YES.
For yet others it's just plain old reddit where you can ask a question and whether it gets smacked down or not depends on how cool the community happens to be.
No, no one sane would. Just like no one sane would trust random SO authors with any authority either. You still have to verify the information somehow, but at least in the LLM case I instantly get replies 24/7, and it's been reading more papers than any of the authors on SO.
You don't necessarily need to trust them for them to be useful.
I can often evaluate a Copilot autocompletion (check that code looks right at first glance, check that it compiles, hover over method+type signatures to see their docs, run the code) in less time than it would take me to find+read a Stack Overflow answer.
ChatGPT has gotten quite a bit more accurate over the last few months. It does seem to have lost some creativity in the process, but it's much more true to the training data now.
I wonder if Google started de-ranking them in search results, or changed something similar that caused people to click through to Stack Overflow less frequently.
It is also possible there's a methodology issue. There's no source for where this data comes from.
I just googled a Microsoft tech question and first was a 3rd party provider plugin, then official MS Learn, then "people also asked...", and 4th was the SO block.
I scrolled down a lot and didn't see any scraper site links.
All, but all, my recent questions there were flagged and deleted immediately. My account there is 11 years old and has over 11k reputation - not much but not nothing. The site, for the past 2-3 years, seems moderated by nazis and bots. Dunno if my timeline matches these graphs but I noticed the decline quite early.
Imagine gatekeeping what and how you can ask questions because of some fear of redundancy. Who the fuck cares if the same question comes up each month; that means it's an important fundamental topic and can be handled differently than just shooting down the poster. The same for answers. It's a horrible website full of elitists. I never really used it often but I did roll my eyes every time I saw a shut-down thread that coincided with the question I wanted to find an answer for.
I would point directly at two culprits kind of under the same umbrella -- Google usurping StackOverflow answers in its search results (as in, displaying a Q&A format where google's 'answer' is just a portion of a StackOverflow accepted answer) and the general SEO poisoning of search results (ad networks / LLMs scraping and rehosting the content, Google failing to return relevant results etc.)
Until seeing this article, though, I hadn't thought about how much I had unconsciously changed my habits to work around the aforementioned issues -- I swapped to DuckDuckGo, started appending site:stackoverflow.com to searches, performing multiple searches around the same content (like grabbing a few keywords from a scraped article and then feeding that into a search of the official documents), and also more aggressively searching GitHub issues
With the caveat that I still use SO relatively frequently -- everything from 'I can't remember this specific syntax' to 'I have an intractable terraform/AWS/k8s issue and need to see if anyone else is experiencing it'
In the limited topics I took part in on SO, what frustrated me the most were the "users". I deliberately put the word users in quotes because I worked as a consultant and often saw guides on how to ask questions on SO (i.e. use an account with a female name and picture, deliberately misspell words/punctuation). There was this Magento company and I was a consultant for their (UK) client; one of the devs gave me a tour of how they use SO.
What was frustrating is that day in and day out, similar questions were asked - ones that are googleable.
One day I received a comment from such a "user", under a question I had answered and explained in detail. The "user" went on to discredit me, without arguments or links, and was pretty rude through all the misspellings he used to avoid some automatic triggers. I replied rudely as well, asking him to read other people's answers before discrediting them while providing no info.
The result was that I was sent to "cool off" for 2 weeks, and was given a speech by a mod as if I were a child. At that point SO changed their ToS and claimed that all content was theirs, including answers I provided.
And that's where I departed from SO, because in my experience I was a free debugging service for various agencies who went a little bit overboard with claims of their experience, and if someone was rude to me - I got no protection, but if I was rude to "users" then I got a slap on the wrist.
Working for free, being patronized, interacting with people who are too lazy to read - those were my reasons for abandoning SO.
Besides, when it started - all of the big questions about nearly every language were covered fast, and there was very little to do except write SQL for other people and get praise for it as a form of payment.
The Aviation Stack Exchange contains the only clear statement on the web about the power dissipation of jetliner engines.[1] Unfortunately, the value given is low by a factor of 2 due to a simple calculus mistake.[2] And even though my Stack Overflow account is more than a decade old, and I posted a slightly successful question on MathOverflow back in 2017, and signed up separately for Aviation, I was not able to comment on the answer or interact on the topic in any way.
It's pretty annoying: for almost a decade all JS questions have had jQuery solutions, and if you ask for one in React or, god forbid, vanilla JS, which has also gone through many API changes, it's still a duplicate.
Often it's the same for CSS, where grid and flexbox are now standard; the old questions with replies full of vendor prefixes and very hacky solutions should be deprecated or moved to an archive.
Btw, how do the blogs that copy SO questions monetize, and why doesn't Google shut them down? The pages don't even work; I thought Google would know better.
I still use SO as much as I ever did, but I rarely/never ask/answer/comment because it’s very often already answered. The answer to my question is often 10 years old. So my activity these days makes less traffic (I don’t need to visit to reply to a comment etc) but it’s still useful.
Another reason I rarely ask is of course as others have pointed out: the amount of pedantry is around 10x the amount needed to keep the quality of the site high, and often enough to make the quality worse. Often these days I ask a question, and get attacked despite it not being a dupe or off topic or low quality (I’m a top 1% account as I assume many of us are so I know my way around). Then a perfect answer comes and I can’t accept it because it’s already jumped on by mods. Somehow a useful exchange with a reasonable question and accepted answer has been completely drowned by mods. It’s infuriating.
I used to contribute a lot to Stack Overflow, but then I noticed that I’d get flamed for even asking simple questions; stuff that can really stand between actually getting somewhere with your code, and where a helpful comment could really make a difference.
At some point regular replies were replaced by administrators and mods who are seemingly far more interested in finding faults with your question rather than to actually answer it. The worst one was probably those times I’d ask a question, and then it got labelled “not a question.” Honestly, this unhelpful, bureaucratic, and down right nasty attitude has really disgusted me with the site.
Apparently new users are having even bigger problems, and they get flamed really hard for asking newbie questions. Often people are rudely asked to read the manual, when the entire cause of their problems is that the manual is so poorly written that it's impossible to make any sense of it—even if answering questions like that is the “raison d’être” of a place like Stack Overflow...
Now, with the prevalence of ChatGPT and services that will give you great answers to almost any code-related question, I honestly think that the fall of Stack Overflow is well deserved.
Same experience: a few months ago I wrote both the question and the answer (I used to do that a lot for tricky things I figured out after a lot of research).
My question got closed, with bullshit reasons. I kept asking to reopen it over a few months, just to see what would happen (because clearly it was a good question and a good answer). After 3 months it got voted to be reopened. Since then I've gotten badges for "popular question" and many upvotes there.
Pretty clearly it was closed by people who did not understand the question and probably had not even realized that I had answered it myself.
Another observation is that sometimes it feels like you get answers to your question in contempt. Someone writes an answer but at the same time assures you that your question is not good enough. Or they answer it but don't upvote the question. (If the question is not worthwhile, why answer it? Probably for rep.)
(I personally upvote questions because I had the same problem. I've never upvoted a question just because it was well written. It's more interesting to see "how many have this problem" than "is the question well researched / well written", I think. So I would still be happy, related to that, if I were you :-))
This specifically happened to me, the second time I asked a question. It may have been warranted, sure, but I put a lot of effort into that question and I got shut down hard. I never logged in again.
My first question turned into a well thumbed tumbleweed.
To claim that "all questions are already answered" is super arrogant. Then they could at least do the courtesy of linking to the already answered question. Granted, they do this sometimes, but not always.
Anyway, it's simply not true that "all questions are answered." Often when I've read the answer to a purportedly identical question, its nuances are nowhere near those of the original one, leaving your question never fully answered, which is extremely frustrating.
Often what you're looking for is simply a new way to explain an old concept, so that you better learn the inner workings of the problem. People are different, and so different people need different explanations. This is basic pedagogy. And that's also why RTFM does not work for all people. To think so is so arrogant that I'm at a loss for words.
For me, a few months ago, when I had a question, I used to google/bing it and try to find my solution between 2-3 different SO answers (even the questions themselves were often helpful).
Now, AI has kinda replaced that for me: it's Copilot first, then ChatGPT if Copilot is not enough, then google/bing if ChatGPT is not enough, and maybe finally SO.
A conservative figure from my side would be 15-25% less Stack Overflow on a regular day.
Not sure about the statistics, but an anecdote from me: I had a specific programming issue at work I needed to solve, and I tried googling for answers on Stack Overflow. There were some answers, but ultimately none of what I found could actually solve my issue. Then I asked ChatGPT to suggest a solution, and in about 5 seconds it gave me an answer. Not perfect, but with a little editing it was solved.
Now I'm sure that ChatGPT probably scraped StackOverflow, perhaps using the very same answers I got from StackOverflow, but combined with other answers from other sources, resulting in the instant answer. That does not necessarily mean AI will replace StackOverflow; it just means that people won't ask redundant questions on StackOverflow anymore, just questions that AI can't solve.
SO is weird. Or the people on it are weird. I once provided an answer but without code. Someone posted the same answer with code (effectively my answer); he got upvoted and since then I get downvoted. I get most of my reputation from a really, really old answer about what the right MIME type should be for JSON[0]. When I look at the question now there are more than a dozen answers added...and loads of edits by moderators.
I used to follow topics to try to answer questions. It's a great way to learn a topic in depth. But I find that the bulk of questions about {topic} have already been asked, and the stream of questions are like "How do I get {topic} to work with {other niche tool}?". The Venn diagram of people who know both topics is way smaller. If you want an answer, narrowing the question to a single topic really helps.
But I have asked a few questions, and the quality of answers has really declined, mainly because people are rushing to answer and not reading the question. They could address this by delaying voting on answers.
As an analysis, I think a lot of things could be improved.
I'm sceptical that having to choose between seeing upvotes or downvotes helps the reader draw conclusions. It would be nice to see both on the same chart.
It is hard to see how much voting and posting follow the overall traffic trend.
But, posting and accepted voting had a huge uptick in May 2022, seemingly without any change in traffic. That is interesting. What happened?
In the overall traffic chart, it would be interesting to mark some landmark events: the Covid shutdown in the US (assuming traffic is mostly US), the release of ChatGPT, etc.
I'm not sure what the histograms in the tables are supposed to tell us.
Comment: "You shouldn't do A, it's better to do B"
Closed, duplicate of "How to do C".
Yeah sure, in the beginning there were many more basic questions (how to increase a number by 1, how to get the division remainder, how to check if a file exists,...), and if you're coding in Perl, you can still find all the answers... if you're working in Python, you'll find answers for Python 2, some for Python 3, some specific to Python 2.6, etc... and if you ask again, it'll be closed as a duplicate.
I know it's anecdotal, but after a few bad experiences, people just decide not to use that specific site anymore.
My experience was that there was a long period where your complaint was quite true.
I believe that they have heard this complaint enough and have lightened up a little. There are still more and stricter rules than most sites, but the obsession with pruning duplicates seems to have cooled.
And my opinion as someone with high reputation is that letting in silly questions has ruined the site and made me stop using it. SO is not a support forum, it's not your teacher, it's a place for interesting knowledge sharing and if the knowledge is drowned in noise, well it's just noise then.
Stupid people don't realize they are stupid. When they can't understand a question they conclude there's something wrong with the question, not with them. And since they believe the question is silly, they happily close-vote the question.
This way they close questions which require specialized knowledge, experience, or deep understanding of the subject. Precisely the kind of questions I consider interesting to answer.
Sometimes I managed to undelete such questions, but SO made it easy to delete and hard to undelete.
The first question for example is what would be a low-quality question in SO, and could easily have been a homework assignment. It doesn't show any effort from your side, you just pose the question and expect other people to answer it fully, without explaining what you attempted, why it worked/didn't work, where you got stuck, etc.
> It doesn't show any effort from your side, you just pose the question
You probably assumed I asked that question? If so, the assumption was incorrect, I answered it.
> without explaining what you attempted, why it worked/didn't work, where you got stuck
I don't think any of that is possible to do.
The question, and my answer, are too simple to decompose into smaller parts. Not enough runway to start and get stuck.
> expect other people to answer it fully
My expectation was rather different. I wrote my answer on Jul 16, 2019, and expected it to stay there.
Instead, on the next day some people decided the question was bad and closed it. Then the next month, some other people deleted it, along with my answer.
To close it, 5 people clicked once/each. To undelete it, I spent quite a few hours. Sadly, that's not a rare exception: moderation on SO is horrible.
Ah sorry, if you answered it it's totally different, sorry for assuming the question was yours! Some of my best answers are also on "simple questions".
The question is asking about the best way of doing X, which has a clear substep of "any way of doing X". Not only that, it's asking about a code solution, not just maths/combinatorics. Not a single line of code in a question about finding the optimal (not just any) solution for doing some algebra in code is, IMHO, reason enough to close it.
Edit: for example, I'm not asking and so have no stakes in the question, but I can already try to think about brute-forcing it, which is already more effort than the author shows at attempting a solution.
To be fair, English is not my native language. But when I see a question “what’s the best way of doing X”, when it doesn’t have any criteria for what the best would be, and no other ways of doing X in the question, I consider the “what’s the best way” part a redundant figure of speech. I view such questions as equivalent to “what’s a good enough way of doing X”.
> it's asking about a code solution, not just maths/combinatorics
Please read this: https://stackoverflow.com/help/on-topic According to that article, questions about math which don’t imply a code solution are offtopic on stackoverflow.com. They should be closed, and possibly moved to other stackexchange sites. According to that article, the OP’s question is good. The question was about a specific programming problem, and is a practical, answerable problem unique to software development.
Again, they are asking how to solve a math problem, in code. That's a problem with two big steps, and no attempt to solve it on their own. Big problems:
- Does not show any attempt or willingness to try to solve the problem first on their own.
- Does not even give any indication of where the problem comes from, why it might be interesting, etc., it's just a "how to calculate X?", which could easily be a homework problem.
- It is about finding a (possibly) mathematical solution, and then implementing it in C++. Two very big and different problems, asking the audience to do them both. Again, no attempt to solve either of these on their own.
- I'll concede the optimal thing might be a language issue.
> I’m not sure that’s actually possible to do.
But that's not my point; my point is that I already showed more willingness to try to solve this problem than OP. And THAT is a big problem. It's not on topic on any of those points; in fact, if you remove the bit where OP is asking us to give them the full solution in C++, it could be a good question for the Mathematics SE!
> attempt or willingness to try to solve the problem first on their own… indication of where the problem comes from, why it might be interesting
None of that is required to ask questions on stackoverflow. For details, read “How do I ask a good question?” and “What types of questions should I avoid asking?” help articles. You’re inventing arbitrary restrictions.
Another thing is, “why it might be interesting” is subjective. Personally, I found the question interesting, that’s why I have answered it. You probably think otherwise, but note it only takes 3-5 votes to kill the question. Any question at all is guaranteed to have at least 3-5 people on that site who find it uninteresting, opinion-based, need more focus, duplicate, etc.
> Two very big and different problems, asking the audience to do them both.
Two big problems don’t have solutions which can both be explained in 3 short sentences. As you can see from my answer, the problem formulated in that question has such a solution.
> I already showed more willingness to try to solve this problem than OP
You have not. However, you have demonstrated willingness to delete interesting questions based on arbitrary and subjective criteria, despite the question being perfectly in line with the stackoverflow guidelines. Which BTW is very on-topic, because I think that’s the main reason for the fall of SO being discussed here.
> SO is not a support forum, it's not your teacher
This, to me, is the mistake that SO made. Developers helping developers is the engine that runs the site, that's why people come.
The body of interesting knowledge is an emergent property of the support forum/peer-to-peer teacher.
Eventually they tried to put the cart before the horse and traffic has dropped.
It's OK that you're done answering "how do I change font color with jQuery" for the thousandth time and are only interested in the occasional very interesting question, because there are people behind you who do want to answer that question. That will help them grow to get where you are.
If we don't allow new generations of users to go through that process we went through, then StackOverflow has an expiration date.
Perl is a good example because it's stagnant. Googling Python or Java? If the result is up to date, it's a low quality content farm. If it isn't from a content farm, it's a page from 2008 that is no longer relevant at all!
Echoes my experience perfectly. It's actually impossible to ask a question and get an answer, and also impossible to ask a clarifying question on an existing question/answer.
That makes it much, much less useful than it otherwise could be.
I found the people who took on Rust questions to be actively hostile on SO when I started learning Rust. The Reddit community was much better, so I’ve tended to hang out there for Rust Q&A. Most of my time in the Stack Overflow world these days is spent in the more specialized StackExchange sites. tex.se tends to be pretty good, as does latin.se. japanese.se seems dedicated to stomping out anything that is remotely a translation question.
Reminds me of being on IRC in '97 asking questions about assembly. When someone did finally respond it was "Did you read the manual yet?" It's ironic that SO became what it was replacing.
But did you even _try_ to read the manual? Did you lift a finger to figure out what the issue is? Some of the best answers are ones that directly quote the manual and then add clarity since manual language can be terse.
Yes I showed him the exact page that talked about my issue and he never responded. Still a silly response. He could have answered it in 30 seconds and saved me 10 hours.
I think ChatGPT is amazing for this, because you will rarely have to deal with anyone who has this type of attitude when you need help in the future.
Honestly still much better than asking a forum. The hostility from forum regulars always seems more intense--less masked as a "terse down to brass tacks" attitude and more-so blatant laziness, misplaced frustration, etc. Also, forum posters tend to waste much more time as every other post is likely to be a joke or a tangent idea, as if it were casual dialogue in a chatroom.
The same mechanisms that make SO kind of brutal have also helped revolutionize asynchronous online Q&A.
I have asked a lot of questions, got good and useful answers on some. I have also answered a lot. Some of my answers and questions have been edited afterwards; most of the edits were good.
My experience was not perfect. Some of my questions have been marked as duplicates when in fact they were not duplicates, but only superficially similar to other questions. Some of the edits people made to my answers were incorrect.
But overall, my experience with the site is still mostly positive. It saddens me to see that others are not enjoying the site as I did.
Stack Overflow was the
"what is wrong with the api?"
"What api to use?"
"How to improve api usage/syntax?"
place.
"Whats is wrong with the api?": moved to github, because its the source.
"What api to use?": stackoverflow is still the place to go (or chatgpt).
"How to improve api usage/syntax?" stackoverflow / chatpgt.
GitHub could really dominate the web with its own LLM, and Stack Overflow could regain some share with an integrated LLM.
PS: I don't like Github Copilot because of UX and code upload.
Sadly, I'm sure ChatGPT wouldn't be as good without SO but now SO is far outshone. The problem is that the community got mostly Eternal Septembered. So for things like OptaPlanner it's great because the guy uses it as his Q&A site (great idea) and he's obviously very knowledgeable and skilled.
But for everything else, it's mostly filled with folks who are not smart enough to comprehend the question and therefore often answer some other thing which they've pattern matched.
Interestingly, just a few weeks before ChatGPT launched, ZDNet published an interview [0] with StackOverflow’s CEO to discuss how the site became the world's most popular programming site. Unfortunately, that doesn’t seem to be the case anymore.
The absolute "best" answers are snarky that this problem is not applicable because they would be Facebook in scale or something if they had this problem. The person answering like that can do this because they've got lots of "Karma" and get tons of upvotes because other people want to feel superior to the one asking the questions. And the question or answer doesn't get removed, even when reported.
Stopped being active then. That was a few years ago.
The problem for me is not that SO became difficult as all popular/large systems inevitably become, it's the seeming lack of action to solve some of the issues involved.
The reputation thing is still nonsense. I can ask a noob question and end up with 10K reputation because of the likes it gets or I can answer a complicated question that takes all 25 years of experience to articulate well and get nothing. People can offer a bounty but it should also be possible to "reward" people who obviously know what they are doing and not reward people for asking questions.
The problem with duplicate questions is big, but again, the search isn't great at finding them, especially if you don't quite know how to ask the question.
They don't do much about people who just appear with 1 reputation asking something like "What does Null Reference Exception mean?". I ended up being allowed to moderate questions but the moderation queues get stuck constantly so I couldn't even help with that - after all, the crowd-sourcing is a great way to solve the volume of moderation needed and some of us are happy to help but as soon as you start seeing "Moderation queue full" after only moderating 2 posts, then you give up.
for FIVE YEARS, because of a rollback war on some of my own answers. some idiot was gaming the system and adding minor punctuation here and there, just to get review points. so yeah, good riddance.
I have since deleted my account on SO, which was actually a somewhat involved process. They did not however allow me to remove the answers that I provided.
thanks for the valuable comment. I realize now that my 1,800 answers mean nothing, and that the only important thing is my comment etiquette. I will try to do better, random stranger.
Could this be due not to SO’s demise, but to the fact that the tech has become more stable and boring, to the point that there are fewer new questions and answers?
I wonder if this is the canary in the coal mine for the internet as a whole.
Will the internet grow more and more quiet as folks chat with local AI bots?
But what will ChatGPT 10 be trained on, if no one is posting blog posts, answering questions on stack overflow, or updating Wikipedia? OpenAI will have to change the logo to a snake swallowing its own tail.
We're already losing a ton of indexable, searchable content from knowledgeable people due to the content being created and shared via private chatrooms, and now that chat bots can collect a handful of results and put them in a cute spoken-language format, I worry that this "information highway" is about to get seriously gate-kept and dwindle into a pay-to-play public library.
The culture of programming has been changed by much of these "quick answer" tools.
The first generation of SO people were folks who cut their teeth pre- (or early) internet, when it required quite a bit of effort to learn a technology. Mistakes without googlable solutions, patchy documentation, and bad or nonexistent internet forced people to work hard to figure things out, and it's in that effort where, in my experience, learning happens.
These people contributed to the early SO and genuinely enriched it. Made it a source of high quality topical information. Over the years though, it's become a source of cut/paste code and then perhaps cut/pasted code (driven by the gamification). Few people go there anymore to contribute well written deeply thought out answers. It's mostly fly bys I imagine.
I used to train freshers and over the years, I can distinctly see the decrease in quality and the increasing tendency to have SO, ChatGPT, whatever just "solve this problem for me" rather than "I want to learn this and get better at my job".
I think SO, for the most part, was a genuinely well-intentioned effort with really good outcomes.
Spolsky's earlier talks describing his philosophies about a good QA site, I thought, were insightful. To their credit, they did organically reach the top of Google rankings, and for a long time the quality of answers and insights on the site was really good. I learned a lot from reading answers to various questions by Alex Martelli and Raymond Hettinger (of Python fame).
I'm sure they made some mistakes along the way and those contributed to the current downfall. My larger point, however, is that this and other tools and sites which make things "easy" have contributed to a decline in the quality of (especially new) programmers and that has indirectly taken a toll on the site.
To extend your chef analogy, the rant would be "Here I am trying to create interesting dishes with modern ingredients, but people these days haven't tasted real food and can't digest anything other than a quick burger, canned soda and frozen fries."
I was promoted to one of those "mod" type people who reviews new answers, first posts, etc, a few years back.
I can tell you that I quit in the last 6 months. All new answers are from first-time posters answering questions that have good solid answers from 8+ years ago (mostly 10 years), and it is pretty obvious, they are just doing it to get points. The question/answer is not something that has changed in the last 8+ years; they are just doing it to get reputation or whatever.
Most good solid questions...I just do not see too many today. I am not saying this is the reason for the decline, but there is a deluge of posts answering very old questions with a slight modification of an old answer. I quit putting effort into it, as it was just non-productive.
I recall that about ten years ago there were reports that edits to Wikipedia were slowing precipitously. People wondered if this meant the imminent death of Wikipedia. With the benefit of hindsight, it seems more plausible that Wikipedia had merely shifted from its adolescent phase to its mature phase, and that all the low-hanging fruit had simply been harvested. If we look at the stats, Wikipedia's edit velocity maxed out in early 2007 (37 days per 10M edits), then fell to half that in late 2014 (73 days per 10M edits), and has actually increased a fair bit since then: https://en.wikipedia.org/wiki/Wikipedia:Time_Between_Edits
It's entirely possible that the same is happening to SO.
While I understand it's a lot of work, and some (most?) might be posting low quality answers, my main problem with Stack Overflow has been exactly that: over time, the top answer becomes stale.
The question is still relevant, but the best practice or library, platform, whatever, has changed... That is a problem the Stack Overflow model has a hard time adapting to, and it just gets worse over time.
From my purview, and again I am a tiny data point of one, the new answers to old questions are really:
Original accepted upvoted to +20 answer:
"Take Y and drive it into X, grep output for Z and place in AZ for processing with the -kombucha switch"
Answer posted yesterday by "NewUserXX123", a first post:
"Take A and drive it into B, grep output for C and place it in DD for processing with the -kombucha switch"
It gets pretty tiring responding with a custom comment and downvoting, only to have "NewUserXX123" then cuss you out in a DM about how you are a total @!#$%()&!@$%.
That's going to depend on your domain, because most frontend questions I come across end in "there is a jQuery plugin for that", which doesn't help modern sites without jQuery.
Why was there such a heavy handed approach to moderation in the first place though? Why not let the community ask the same question over and over again, and let the answers naturally shift with the time?
I'm just not sure what value there was in having a team of people volunteer their time to make sure there were absolutely zero duplicated questions/answers. What problem is that solving? If anything, it adds problems by not letting the content naturally evolve over time.
If people want to assess the quality of answers, it's better to have multiple data points, and SO could have invested those resources in algorithmically linking similar questions and making it easy to navigate by both answer reputation and time.
Along similar lines, it seems best to tackle the people trying to game the system algorithmically as well - if content is a word-for-word duplicate, that's a problem which can be solved by computers instead of people (detecting similar text is a solved problem; see the sketch below).
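For what it's worth, here's a toy sketch of the kind of near-duplicate check that comment means - nothing SO actually runs, just word shingles plus Jaccard similarity, with the shingle size and the cutoff threshold picked arbitrarily for illustration:

    package dupcheck

    import "strings"

    // shingles returns the set of k-word shingles in a text, lowercased,
    // so trivial rewording still produces overlapping shingles.
    func shingles(text string, k int) map[string]bool {
        words := strings.Fields(strings.ToLower(text))
        set := make(map[string]bool)
        for i := 0; i+k <= len(words); i++ {
            set[strings.Join(words[i:i+k], " ")] = true
        }
        return set
    }

    // jaccard computes intersection-over-union for two shingle sets;
    // 1.0 means word-for-word identical, values near 0 mean unrelated text.
    func jaccard(a, b map[string]bool) float64 {
        if len(a) == 0 && len(b) == 0 {
            return 1
        }
        inter := 0
        for s := range a {
            if b[s] {
                inter++
            }
        }
        return float64(inter) / float64(len(a)+len(b)-inter)
    }

    // NearDuplicate flags two posts whose 2-word shingle overlap exceeds
    // an arbitrary threshold; a real system would tune this and add hashing.
    func NearDuplicate(post1, post2 string) bool {
        return jaccard(shingles(post1, 2), shingles(post2, 2)) > 0.8
    }

Obviously at SO's scale you'd reach for minhashing or embeddings instead, but the point stands: flagging copy-paste duplicates is cheap for a machine and expensive for a volunteer.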
It seems the only use case for moderators on SO is for removing truly inappropriate content - it's wild to me that moderators were spending significant amounts of time actually removing technical questions and answers.
It reminds me a bit of reddit moderation, where reddit communities enforce these non-sensical rules and by extension require hugely heavy handed moderation to 'curate' their communities. Like, the headphones subreddit disallows pictures of headphones in boxes. Why? If the community is interested in headphones, what's the difference if it's a box or not? It's not like if you take it out of the box the headphones look different than any picture you can find on the internet.
Seems like many of the problems of moderation are artificial rules endlessly being enforced by real people which ends up just being pseudo 'make busy' work.
When it comes to software, it’s even less sensical to have one archival correct answer, given that software evolves over time, and what was an acceptable or necessary workaround 5-10 years ago could be a complete anti pattern today.
It would make more sense to establish lineage. Lock older questions from more answers after some point in time, and instead of closing the new one and linking to the original, do it the other way around. Keep new questions open and link to past variations. Then, in the old threads, indicate that newer guidance may exist and link forward.
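A rough sketch of what that lineage could look like as data - purely hypothetical, not anything SO exposes; the type and field names are made up:

    package lineage

    import "time"

    // Question models one variant of a recurring question. Old variants
    // get locked but stay linked, so readers can walk forward to the
    // newest phrasing instead of being bounced to a 2012 "duplicate".
    type Question struct {
        ID           int
        AskedAt      time.Time
        Locked       bool  // true once a newer variant supersedes it
        Supersedes   []int // IDs of older variants this one replaces
        SupersededBy int   // 0 if this is the current variant
    }

    // Newest follows SupersededBy links from any variant to the most
    // recent open one, which is what search results should surface.
    // (Assumes the supersession links form a simple chain, no cycles.)
    func Newest(byID map[int]Question, id int) int {
        for {
            q, ok := byID[id]
            if !ok || q.SupersededBy == 0 {
                return id
            }
            id = q.SupersededBy
        }
    }

The old threads would then only need a banner pointing at the newest variant, instead of the current one-way "closed as duplicate of" arrow that always points backwards in time.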
They talked about it in the Stack Exchange podcast years ago.
IIRC, the vision of the perfect question was one with a canonical answer. They didn’t want to be Quora. I think that concept made and makes a lot of sense, but doesn’t capture the “meta” issues surrounding how you litigate the form of the question, especially as questions get more nuanced.
> and it is pretty obvious, they are just doing it to get points
Gamifying reputation was one of the worst things we've done. I was part of this in previous positions and after a while it became apparent that it brought out the worst in people. There are better ways of incentivizing usage rather than stupid internet points.
Agreed, and I don't know this for sure, but the points lust, it seems to me, is not limited to questioners and answerers; some of the behaviour I've seen from mods seems to be driven by that too.
Just to repeat, I don't know that because I don't know what gets you points but the behaviour is hard to understand otherwise.
I had a question where I had finished with the one-word sentence "Thanks." A mod had removed that part of the question; when I looked at their record, they had an awful lot of edits which consisted of similar, tiny, non-substantive changes. It did make me wonder about their motivation.
> "I had a question where I had finished with a one word sentence "Thanks." . A mod had removed that part of the question, when I looked at their record they had an awful lot of edits which consisted of similar, tiny, non-substantive changes. It did make me wonder about their motivation."
"Do not use signature, taglines, greetings, thanks, or other chitchat.
"Every post you make is already “signed” with your standard user card, which links directly back to your user page. Your user page belongs to you, so fill it with information about your interests, links to stuff you’ve worked on, or whatever else you like!"
"Thanks and other statements of appreciation are unnecessary, and, like other chitchat, should not be included."
"If you use signatures, taglines, greetings, thanks, or other chitchat, they will be removed to reduce noise in the questions and answers."
----
Removing it doesn't get the editor any points; they are spending their time cleaning up your question to site standards for no reward at all.
What would you think if you clicked on a Wikipedia article about Mozart and instead of the current introduction which says:
> Wolfgang Amadeus Mozart[a][b] (27 January 1756 – 5 December 1791) was a prolific and influential composer of the Classical period. Despite his short life, his rapid pace of composition resulted in more than 800 works of virtually every genre of his time.
it opened with:
> Hello everyone, I studied classical music in college but have been out of the scene for a while and now I'm getting back in, I was wondering about the history of Mozart and I remember that his middle name was Allen or Almond or something? Can anyone help? Thanks for any information. || Hi, I learned about him in middle school and we all joked that his middle name was Armadillo but I think that's wrong, haha. || Greets all, it's in Olivier Hallengrunsch's Classical Composers as 'Amadeus', and that's well regarded. It also says he lived (27 January 1756 – 5 December 1791). I think we can all agree he was prolific, right? Thanks and regards, Jason [xxKiller; AMD Ryzen2 32GB RAM 2TB Western Digital SSD; BMW 330 2l aircooled] || Why's nobody mentioning how short that life was, smh || Good evening all and sundry, m'lady (tips hat), forsooth would anyone speak to how many works he is believed to have composed, all considered? Methinks such knowledge would be a most hearty addition to this esteemed gentleman's biography - Martin, [Fort Lauderdale TX Ren. Faire organiser 1997-1997] || etc.
StackOverflow isn't a forum, it's a collaborative reference work. Meta-chat would be edited out of a Wikipedia page and goes on a separate 'talk' page (equivalent: the meta stackexchanges or the stackexchange chat). What if you then went to the Wikipedia talk page and said "Is overzealous moderation of questions ruining Wikipedia? I want to be able to edit questions and greetings into pages but there are hordes of awful literal-minded jobsworths cruising the site just looking for the slightest reason to edit a spelling or grammar mistake or revert my changes. It just leaves me feeling unwelcome"?
Why would you expect to feel welcome when you're spoiling what others are trying to build up and insulting them for caring??
Stackoverflow has https://meta.stackoverflow.com/ and https://chat.stackoverflow.com/ (you see it when the comments on a question or answer go on too long, there's an automatic "comments are not for extended discussion, take it to chat" reply, or you can invite people to a chatroom about a question).
I wonder if this might be partly because they have not made an effort to make their software more intelligent about surfacing answers to questions, as well as evaluating answers. I would think that, at some point, it isn't necessarily the fault of the question authors or those responding.
One of the things I say with some frequency is that all failures are engineering problems. Blaming the users/customers, in the end, fails to take advantage of an opportunity to improve the product, learn from mistakes and improve process.
I hear your point, and thank you for your service as one. That said, how much do we think this is attributable to the existence of coding AIs like Copilot or ChatGPT? I hardly use Stack Overflow anymore thanks to OpenAI.
> answering questions that have good solid answers from 8+ years ago (mostly 10 years),
I hit upon those useless answers all the time. I have no need to know about solutions that might have worked a decade ago and are obsolete. Stack Overflow is useless now.
I couldn't figure out how to gain any points other than answering my own question, but I needed points to contribute in any way. It's like SO is set up to be difficult to use. I can't imagine why anyone keeps using it.
Well, there are a few issues. Some may be sour grapes on my part, but some were inevitable.
SO has been a great site for me. Not too long ago, it was always my first stop, when looking for correct solutions to difficult problems.
I'm a highly experienced engineer, and a first-class debugger. I always get my bug ... eventually.
What SO would give me, was a correct solution, very quickly. I would ask a question, and have two or three excellent answers, in a matter of minutes.
I could definitely have found the answer, myself, but I would have had to do stuff like set up playgrounds, or even full applications.
Nowadays, they require question askers to do exactly that. In The Days of Yore, I could ask a question, without having to have gone through an hour of debugging and prototyping, and have a great answer.
But, at its heart, it's another wiki documentation site, and wikis don't age well. The WordPress Codex is damn near worthless, and even the great PHP docs are showing their age.
I use stack overflow every other day and it's fine? It's imperfect advice but there are some gems on there and it's pretty easy for me anyways to spot when something is dated, not right, or whatever. Maybe it's less useful to me now because I understand the domains I'm in better?
If you select window size "7 days" and filter the Traffic section for "new visits", you see a massive and permanent two-thirds drop between end of April and mid May 2022. Apparently some change in the Google algorithm pushed down Stack Overflow heavily.
One of two things is happening here. I'm not sure if it's one or the other, or really both.
1. Copilot and LLMs trained on code are reducing usage of StackOverflow.
2. People are just coding less and searching less, with layoffs and a slowdown due to burnout.
But it's definitely connected to these points. The relative magnitude of each, though, is difficult to assess.
If it is mostly #1, though, that's the worst thing, isn't it? Because SO is weakened and then the LLMs gradually become less effective. We generate less curiosity and public discussion over time as people use AI and LLMs more. That's sad for software engineering as a whole. One of the best parts of this job is collaboration and problem solving - not just individually but as a community as well.
People have complained about bad moderation on StackOverflow since forever, so I don’t think it’s that.
What I see recently is that the answers are starting to be out of date, and Google sometimes bizarrely returns these weird copies of StackOverflow instead of the main site. Could it be that?
I've noticed I haven't used stackoverflow that much, I've always been a lurker, but the answers have gotten less and less highlighted. Now instead I am nearly always in the github issues of the project I am working with.
Also ChatGPT has really been a big help, as for most basic questions I can just ask it and get a fair result. This is more tricky to do when searching on google.
I.e.
chatgpt: How do I sort a slice of structs in golang with an inner field of "Name"
Usually somewhat correct the first time (see the sketch below for roughly the answer I mean).
google: golang sort slice struct (I hope to find a good answer, but I usually have to go through 1-3 pages)
This is more documentation based, but this is also what stackoverflow did in the past quite well, that is answers to common questions for a language.
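For comparison, the kind of answer that first query is fishing for - a minimal sketch using the standard library's sort.Slice; the Item struct is made up for illustration:

    package main

    import (
        "fmt"
        "sort"
    )

    // Item stands in for whatever struct the question is about;
    // only the Name field matters for the sort.
    type Item struct {
        Name string
        ID   int
    }

    func main() {
        items := []Item{{"zeta", 3}, {"alpha", 1}, {"mu", 2}}

        // sort.Slice sorts in place using the provided less function.
        sort.Slice(items, func(i, j int) bool {
            return items[i].Name < items[j].Name
        })

        fmt.Println(items) // [{alpha 1} {mu 2} {zeta 3}]
    }

Whether ChatGPT produces exactly this on the first try is hit or miss, but it's usually close enough to fix faster than digging the same snippet out of a ten-year-old SO answer.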
I guess I’ve always interacted with SO in a relatively passive way.
There’s just too much toxicity that’s plausibly deniable.
And with it, SO has lost much of its usefulness.
Every time I try to look for an answer against knowing better, I still get disappointed.
Either the question isn’t there or someone asked it but it was marked as duplicate with a link to a decade old (and for some reason never more recent) question that is vaguely of a similar topic if you squint hard enough.
It’s silly anyways because some of the frameworks I work with are extremely volatile in the sense that they fundamentally change from year to year, so the chances that an older question is stale are all but guaranteed.
I don’t even bother asking the question myself because I see the copious amounts of pedantry while I browse.
Never mind discussions about “the right way” of doing things, I’m talking about entire comment threads about the right terminology of the “well akschually” variety.
Think arguments about whether something should be called a method or a function or whether MVVM is possible with framework X because the author had intended X, Y or Z, when none of it is even remotely relevant nor helpful to the question at hand.
Other forms of pedantry are entirely rewriting the OP’s question because a variable name or function name was disliked, even though the original question as asked was not confusing in the slightest.
The single time I asked a question, it was a genuinely challenging and relatively low-level matter, or as close to it as you get in my field.
Crickets, aside from 1 extremely low quality answer that didn’t actually answer anything, that is.
I tried to focus on answering people’s questions instead, especially on a few relatively new frameworks that come with new conventions as those generate a lot of questions and I managed to become really proficient in them.
I figured it would be a good way to give back to the developer community as a whole since it was that very community that enabled me to become proficient in it in the first place.
But stopped doing that too because it’s useless. Either good questions get closed, mods fuck around editing things around or the high level clique just upvotes each other’s low effort drivel.
SO is all but completely useless to me, it has become my last resort and only if I’m really stuck, and every time I dread it like I dread pulling teeth because every time it proves to be an exercise in wasting my time.
I'm in the top 4% this year based on a Q&A I wrote in 2021, which gathers a few votes a month. The ranking is an indicator of how pointless SO has become if I'm globally in the top 4% this year having posted nothing.
For me, there's no incentive to answer questions. It takes too long to get an answer, and the reply is usually some snide comment because too many assume it's an XY problem. It's generally worse than asking ChatGPT even when the AI gets it horribly wrong.
As for answering questions, I'd rather do that on my blog. Posting to SO is a time investment with zero return.
See "new visits" in the first chart. You can reduce the window for extra clarity. Something very specific must have happened in that week, nothing that can be explained with a "grand theory of decay".
I was surprised to see big humps over early springtimes. As if springtime causes people to feel more willing to open up and communicate, similarly to how we associate springtime with heightened romantic drive.
Another cool thing I've discovered by reading this right after the Usenet thread[0] is that it is typical for large-scale social networks to stop being humanity's darlings after ~15 years of age. Which is also surprisingly similar to how people are supposed to become completely self-sufficient around 16, as if there's no reasonable expectation of substantial external care for them.
Usenet: born in 1979 [u1], stagnated in 1993, 14 years later.
XMPP: born in 1999 [x1], stagnated in 2013 [x2], 14 years later.
Stack Overflow: born in 2008 [s1], stagnation reported today here, 15 years later.
This is interesting. So, train LLMs on "everything", including SO, and then replace "everything", including SO. I think we'll see companies such as google, bing, etc... paying to scrape different websites.
But I stopped contributing seriously maybe 2 years ago. I just got so sick of SO behaving like complete jerks again and again, and lost any desire to keep enriching their shareholders.
This is 100% because of ChatGPT but the problem is that ChatGPT got all its information from StackOverflow... That means fewer people contributing to SO which means ChatGPT will deteriorate and people won't know why.
GPT-written code gets committed to GitHub (after human review, so it's working code only). GitHub data gets shared with OpenAI (Microsoft). So GPT continues to improve.
The recent deterioration is due to reduction in parameter count for GPT-4 to save money.
Stack Overflow ought to be Google's dream site since it deals directly with questions and answers in reasonably natural language.
But I find the moderation a little heavy-handed, and there's an unhealthy number of people using the platform to be a 'smart ar5e' while offering solutions based on their idea of best practice rather than directly answering the question.
Classic case is anything to do with a strict vanilla JavaScript requirement being solved by something like jQuery.
Or a PHP requirement being solved by yet another layer of abstraction.
I like SO but asking or answering questions there can be highly annoying. There's always some very pedantic gate-keeper who might be technically proficient but lacks any common social skills and nails you down for any slight errors. It's good people care but seriously, there's a difference between constructive criticism and just being an a-hole.
It's the hamburger feedback but without any buns around it. And I just wanted to contribute, on my own time often, to help somebody else with the same problem...
ChatGPT. Both because people use it instead and because some people try to game SO with answers generated by it, which has probably made moderating the site harder.
SEO is another factor. At least I don't get as many SO results as I used to. That may be for many reasons, but once a site's popularity starts to fall, Google Search might speed up the decline.
EDIT: I see now that the drop started long before ChatGPT became relevant. But it won't help SO I think. Github copilot and Google featured snippets may also be factors.
My half-baked opinion is that a majority of the site was effectively a JavaScript help board. With the advent of more advanced (and silly) web frontends, the ability to copy-paste a snippet is less viable for JavaScript developers, so a lot of the value of being a front-end snippet repository is lost: you now need a primer on the flavor of JavaScript, the build tools, and the package management, and it takes 200 LOC across three files to get your React bloat to work.
I don't understand why they killed their job board. We used to pay them a few hundred a month for a couple of job ads that we were quite happy with. But then they "decided to focus on enabling our community of developers and technologists to discover and learn about companies rather than just giving them a way to apply to an open role." Which meant they became a platform for recruiters who can spend $10K+ monthly. That killed Stack Overflow for me.
SO has been around so long a lot of the answers are no longer relevant. I don't place much trust that an answer from 2010 will solve my problem. You often see [2023 UPDATE], but that answer is usually buried or hidden within other answers.
I firmly believe that ChatGPT is the SO killer. You get your answer instantly, it's usually good enough, and you don't have to worry about a mod closing your question as low quality or a duplicate.
> SO has been around so long a lot of the answers are no longer relevant. I don't place much trust that an answer from 2010 will solve my problem.
I had this problem before too, and I just made my searches better by appending `after:<insert year here>` to my Google query to ensure I get the latest info.
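(For anyone unfamiliar: Google supports `before:` and `after:` date operators for restricting results to newer pages. A hypothetical example, with the year and keywords chosen purely for illustration, would be `site:stackoverflow.com after:2021 rust borrow checker error`.)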
Proactively asking on SO is a no-go for solving any problem. Better to dig for someone who has already answered your question, or to open an issue on the GitHub repo that hosts the library you use. SO used to be a place where people could ask anything they wanted, but now only difficult questions get answered; everything else is downvoted, marked as a duplicate, or answered by someone toxic. Definitely not a good beginner experience.
For me, I stopped contributing just before the downfall, when I was in the top <0.1%. Basically, the mods wrecked the whole thing for me: they obliterated a lot of really good answers from multiple people, and there is no way to challenge that at all. Not to mention the mods got into big, silly dramas with each other. So I just stopped. Now it feels like there is a lot of stale content on there.
The biggest reason beyond moderation is honestly that SO has an identity crisis.
Back in the day, SO was practically the ONLY place you would ask questions about programming or tech issues, but over time it has become more common to ask for help with specific frameworks or open-source products on those projects' GitHub or Discord.
And then you got LLMs being able to summarize an entire manual for you.
One of the bad side effects of this is the impact on "free" support channels like the GCP tags on SO. Questions rarely get answered, and you will often get an "answer" reminding you that this is a volunteer forum. This can be quite bad when the question is "what does error xyz mean and why is it crashing everything". It makes community-driven support a no-go.
SO by design gets its traffic from Google Search. If Google for whatever reason is sending less traffic, then SO will decline.
Yes, AI coding assistants will mean less frequent Googling during development, but also, when Googling is required it usually goes beyond SO and more towards primary sources, as SO's results strongly overlap with the AI assistants' capabilities.
There's still lots of good stuff on Stack Overflow. For example, I just ran into this answer while looking for something else. What a great answer to a beginner level Rust question:
As it ages, Stack Overflow is developing a problem where the accepted answers to questions are obsolete and no longer work.
APIs get added and removed, new languages change syntax, etc. This is particularly the case in mobile.
Stack Overflow has no good mechanism for ousting obsolete answers in favor of the current right answer.
Lots of people are discussing tech or quality reasons for the decline. I'd like to suggest something far worse: there is a huge decline in the number of junior professionals in our field due to global economic pressures. There are fewer people who need to ask questions, because the developers who remain already know the answers.
This downward trend makes me wonder where all that traffic is going. Surely developers are still searching for answers; where are they finding them? It can't only be ChatGPT, since the trend started in February 2022, but it surely is a really strong contender now.
For me at least, I often go directly to the official docs. They're much better in 2023 than they were in 2013, pretty much across the board. And if that doesn't work, it seems you're more likely to find your specific problem being asked in the issues of the official Github, Gitlab, etc. than you are on SO. Even Reddit and some language/technology-specific forums will have better results now.
The fun thing is... I think SO saw that coming. They made that huge push to be the central documentation hub for everything from huge projects to internal teams. It just wasn't well thought out and was expensive, iirc, and had an abysmal adoption rate as a result.
As a full-time dev, I'm on ChatGPT (paid) and use Copilot.
My team's first go-to is ChatGPT as well. SO is no longer 'that site' to go to (I've been on SO for 10+ years). I have zero love for that site. It was 'the resource' before; now it's just another site.
I mostly gave up a very long time ago. Craft an answer, update it to stay relevant and some mod turns it into a wiki. Another mod decides it’s time to move it to a different Stack site so it stops attracting cred on the site I posted it. Honestly I have better things to do.
I don't know whether the charts presented in the link are misleading, but I did notice a visible reduction in the quality of answers over a period of 5-6 years, as I can't find questions that relate to the issues I'm facing.
I can’t remember the last time I saw the accepted answer being up to date and relevant. Things change, and the things you needed to work around in Java or whatever ten years ago are now approached in an entirely different way etc, but SO is stuck in a time warp.
Do you think there may be secular trends here? We do see COVID-induced peaks in that data around 2020. I'm not suggesting that "fewer people are coding" - but just that lots of media-related sites are coming down from 2020/2021 peaks.
I haven't had an SO account in over a decade. In fact I had mine deleted per my request circa 2010 and it was Jeff Atwood himself who was handling that request. I still have that email interaction with him somewhere in my inbox :)
As long as OpenAI and others in the LLM space are for-profit, companies should start charging them heavy fees for their data. Surely there will be more novel knowledge produced in the world after 2023, and these companies will need it.
Asking questions on SO gets me only two outcomes: no answers, or the question being marked as a duplicate.
I haven't used `site:stackoverflow.com` in a long time, but I do use `site:reddit.com` to get more informative and generous answers.
The problem with Stack Overflow is simple: it eschews its core use case to be something unnecessarily autistic.
Instead of being a platform that encourages those with experience to mentor and lead, and those without it to seek experience without punishment, it takes the view that "dumb questions are the ones that have already been asked, and it's the responsibility of the newbie to know whether their question has already been asked." It seeks to be the training set for its replacement rather than be its replacement.
Then a new product comes around to support the core use case newbies want more directly:
Yeah, of course we fucking left. Because surprise, if you're constantly learning, you'll always be a newbie at another thing. If SO had been an actual community, it would have merged with an LLM rather than being eaten by one.
Let us all increase our questioning, answering, voting on Stack Overflow! With private chats like Discord and Slack, and chatbots like Bard and ChatGPT, a lot of the shared knowledge is not indexable. That's bad.
The technical qualifications of the responses are lacking.
Answerers usually seem to know only the magical incantations that get things to work, without any deep familiarity, because the really competent people aren't googling for answers like we are, so they never land on the page to give their 2¢.
There need to be different exclusions and barriers, but not through the wacky constitutional-sheriff system they are using.
Mailing lists are decent because you know you're going out on blast, so social norms kick in. If you mail, say, LKML, you imagine important kernel people taking their time to read your blabberings, and that likely stops those who can't improve on the silence.
SO, OTOH, seems too unbounded and freewheeling, a place where popularity matters more than responsibility. It's the wrong cadence between invitation and expectation.
StackOverflow lost 90% of the relevance to me the day they shut down their jobs section. It was brilliant and it worked really well, got me several jobs and possibly shaped my life for a few years.
Why can't we just Pair Program more? There is enough open source out there. Enough user groups meeting. Enough hack-a-thons. Just make them Pair Programming sessions. Please! Please, I beg of you!
I used to answer questions until they kept telling me that answers should not be short but long, with examples. I didn't want to waste my time, so I stopped answering altogether.
The comments I'm seeing here cite moderation practices and a scrutinizing culture as reasons for the downfall. But those aren't new. I think it's safe to say it's largely ChatGPT.
Also: ChatGPT can give you answers based on what's already on SO. In other words, based on the content that is a result of the very moderation practices being criticised.
I agree, but I'm afraid the downvotes are inevitable. SO for a long time had, by a very long shot, the best signal-to-noise ratio of any free-as-in-beer Q&A channel. Basically any live comms channel (IRC, Matrix, and so on) disincentivises clear, thoughtful responses in favour of streams of consciousness. And forums are where information goes to die, as every question turns into a huge back-and-forth until it is sufficiently refined to receive a useful answer, which is usually extremely specific ("run this command") rather than generally applicable ("you need to frobnicate the foon, for example by running this command").
Unfortunately anything popular is a target, and once they stopped trying to innovate to keep ahead of systems gaming they were going to lose eventually. It's amazing it took this long.
Really interesting, as this isn't like Digg or MySpace; it was built on information rather than interaction, and will probably fall like the Tower of Babel.
But I suppose it will still be a very valuable dataset for anyone looking to train a coding model. It's the single most valuable resource that could be archived for one.
I don't use SO sites anymore because of the incessant "accept our cookies" popups that aren't even relevant to me because I don't live where EU cookie laws or GDPR applies.
I used to rely on Stack Overflow to stay plugged in to the developer world. But now, with newsletters, developer influencers, and more forums with code-block formatting, it's just not necessary anymore. I think the status of having Stack Overflow points has been translated into Twitter followers.
ChatGPT offers an effective substitute for Stack Overflow, fully encompassing its role for simple questions.
For more complex problems, Stack Overflow was never a good solution; it's better suited to tackling simpler questions. But for simpler questions you now have ChatGPT.
One of the things that has changed is that Google is surfacing results from other social media platforms. I just noticed that my Medium articles about Flutter pop up in searches on Flutter topics.
"Every institution fails due to an excess of it's own first principle."
I have watched/experienced SO go from a super useful, helpful site with a balance that pitted people's desire to be recognized against their desire to be helped, to one slowly subsumed by those overly bent on organization and bureaucracy. And as it did so, it became less useful to me, as a questioner as well as an answerer.
Now I have to put up with stochastic parroting from GPTs to try and steer me in the right direction. Yay.
[0] https://i.imgur.com/qMj7Lge.png
[1] https://news.ycombinator.com/item?id=36856249