
Your first point seems to be most important.

> - google used to return really relevant results for SO, and it stopped doing so at some point a while ago

SO might be horrible now, but it still holds years worth of answers that were just fine a few years ago - so why aren't they showing up now? Google's current recommendation of going to w3schools or - even worse - geeks4geeks or any other content farm is and always will be worse than stackoverflow. I don't have a clue what their algorithm is doing but it's surely trying to kill Google search as fast as possible.

Another joke is the fact that searching for "[language] [symbol]" also brings me to these content farms instead of the documentation. You seriously can't find anything useful these days using Google.



This whole situation exposes everything we hear about SEO as a lie. Stack Overflow has the exact text, it loads incredibly fast (it should be commended more for this), doesn't require ten megs of Javascript to render, and as far as I know generally meets HTML standards.

These scam sites load megabytes of junk, load slowly, have text interspersed with ads and modals that render right on top of them; if you open devtools you'll often see pages of warnings about deprecations and/or invalid HTML. And despite having the same scraped text, they always score higher on Google.


It has long been a mantra in SEO land that user generated content sites in general and forums in particular are to be aggressively down ranked. The reason for this is that industrial strength spam farms otherwise spin up tens of thousands of forum domains to pass link juice to what they are targeting. This naturally penalizes real forums, which often contain the best content for a query.

This is why Google has basically surrendered and why so many search result categories are now dominated by whatever sites Google has arbitrarily declared the winner through editorial decision making. In many search categories, we are effectively back where we started with Yahoo directories and hand-picked search rankings. What you see on the first page SERP is the best that they can do under the circumstances based on the fundamentals of how search works.

The web was a fun idea while it lasted, but if you are using it as a primary information resource, you are wasting your time.


Funny, I just rented an introductory book about signal processing and was (re)amazed at how well the information is explained, with tons of examples and a real plan to guide you through the ton of knowledge you have to master.

I for one welcome back our new library overlords.


Come on now... Don't hold out... Citation?


Understanding Digital Signal Processing, 3rd edition

Published by Pearson (January 14, 2022) © 2021

Richard G. Lyons


I think there are a number of things that could be done to improve this, though I'm sure Google won't do them:

1. Heavily penalize the presence of ads. The content farms make their money with lots of shit ads; this would wreck their business model.

2. Just manually block or heavily penalize content farms and boost known good sites like SO and MDN.


Well, those sites are often running ads provided by Google, so it's understandable why Google doesn't really have a good incentive to follow your first suggestion.


They do penalize heavy ads. There have been shenanigans on this front also (penalizing non-Google ad networks and favoring Google ad networks). They do penalize some content farms and favor others. The issue they are concerned about with SO and MDN among other user generated content sites is covert seeding at scale for the purpose of manipulating search results.

There's just a lot of fraud on the internet related to search and advertising manipulation, but it's under-policed, in part because of its internationalized nature and because it is hard to bring fraud cases in the United States due to the particularized pleading standard. That should not stop the feds from bringing criminal cases, but generally the feds care about large dollar value frauds (as they probably should) rather than policing very large numbers of small dollar value frauds that have a major aggregate impact on the online economy. They like going after the guys who steal $100 million from deaf children with lupus rather than doing 200 $500k fraud prosecutions.


Except those ads are often provided by Google itself, so penalizing them would be self-harm for Google's revenues.


Anecdotal, but my website lost a lot of search traffic after Google's core update in March, which seems to have affected SO as well looking at the first chart.

If I look at Google's guidelines, my articles follow all of them: in-depth, well-researched, demonstrating personal experience, better than other articles appearing in the search results. And yet, they were "penalized" by this update for who knows what reason.

I looked into it and some other websites benefited from the update, so who knows what changes they made and why.


Again anecdotal, but I lost the majority of my SEO traffic late last year around the time of a core update. I've spent the best part of a year attempting to repair it, on the assumption I'd committed some heinous SEO crime. The more time that passes, I'm starting to think that the issue isn't mine so much as Google's. It's baffling. I wrote about it here:

https://johnnyreilly.com/how-i-ruined-my-seo


I don’t see what you did wrong, it must have been the algorithm change. My parents had a business that was killed by a Facebook algorithm change. My brother took a significant hit from an Amazon algorithm change. Building a business around any of the big tech companies seems very risky.

I think Google search has just declined a lot. I guess they’re losing the constant cat and mouse game with SEO. It seems worse than it has ever been, I’m relying more on ChatGPT and copilot now.

I can only imagine that LLMs will be the end of any content based search ranking. I don’t know how they’ll adapt to that.


Looking at your post and the HN discussion, I am also of the same impression.


Maybe Google (semi/permanently?) penalized your domain when spammers started using your GA4 tag?


Gosh that would suck. Sounds plausible though


> These scam sites load megabytes of junk, load slowly, have text interspersed with ads and modals that render right on top of them

Only if you're not googlebot. The crawler sees a much nicer site.


which should — in theory — get them penalized for cloaking. But obviously it doesn’t. Reinforcing GP’s point.


Google has gotten pretty lenient about that: https://developers.google.com/search/docs/essentials/spam-po...

"If you operate a paywall or a content-gating mechanism, we don't consider this to be cloaking if Google can see the full content of what's behind the paywall just like any person who has access to the gated material and if you follow our Flexible Sampling general guidance."

I wonder if they just gave up


Hypothesis: Search, being Google's oldest product, is no longer prestigious to work in. It's in maintenance mode.


Does Google run other indexers for the purposes of catching cloaking? Are there other strategies that can be used? One of the problems with SO is that most of the valid content is out there and easily available without having to scrape the site, which may mean penalizing for bad content is harder.


And the fact that google is not detecting those is damning (to google)


Does it even make sense to serve different content to a bot than what a human would see? Isn't the search engine trying to rank content made for humans?


It's an adversarial process. The search engine is, in theory, trying to rank by usefulness to the user, and the site owner is trying to maximize revenue by lying to the search engine. And the user.


I'm generally puzzled by Google's reluctance to do manual intervention in these cases. It's not like this is a secret. Just penalize the whole domain for 60 days every time a prominent site lies to the crawler.


There are very many sites where the content you see as a non-logged-in user is different from what you see if you have in your possession an all-important user cookie.


If Google's support is any indication, Google doesn't like to involve humans in their processes. There probably aren't enough humans to do this manual intervention you propose.


Then maybe the "crawler" should be an actual PC navigating to the page in a browser, taking a screenshot (or live feed) of the page and processing it with AI.


Eh, Google chose to be identifiable as googlebot and to obey robots.txt for other reasons of "good citizenship", because not everybody wants to be crawled.


It makes sense if you know your content isn't nice for humans (e.g. full of ads and tracking stuff) but you want it to rank high anyway.


I wonder what I will see if I change my browser's user agent?


Google is really failing hard in this regard, and I'm fairly sure it's intentional on their part. Searching "Typescript array" has obvious intent from the user, and an obvious "correct" first result. Google returns the documentation page in the 3rd result, but it's a link to a deprecated version of the page. The rest of the above-the-fold links are websites that contain Google ads.

DuckDuckGo returns the up-to-date documentation link 2nd and the MDN result 3rd, with W3Schools 1st. Bing returns actual content on the results page, describing exactly what you need to understand a TS array.

Google has an incentive to push the poor sites, because it earns revenue from doing so. Bing and DDG don't have that incentive, and return much more relevant and useful links. That doesn't feel like a coincidence.


I spent years learning a programming language well, then further years delivering a training course, iterating on it and then providing sections of the course free online on my website, both as advertising and to get new people started. Your "typescript array" search returns in the top 5 one of the sites that basically copy-pasted many of my articles via thesaurus. I checked, and it turns out they offer $50 for people to submit content for any language / technology. So you have someone in a cheap country paid to go copy content and reword it on that site. Then they rank higher than you, as they do this over many languages and thus seem more authoritative. Even more worryingly, with ChatGPT they won't even have to pay the $50 any more, so the whole internet may become like this. That leaves me little incentive to publish material except that which solely entertains myself. Mmm, facebook/twitter = not a good outcome.


I have a friend who does something similar, but he only does video, with the text gated behind a paid-only site. He makes pretty good money, and the exact reasons you listed are why the site is paid-only. They have a much harder time stealing (as in posting as their own content) the video.


Results on Kagi for comparison:

https://kagi.com/search?q=typescript+array&r=us&sh=Qa5cXHvwj...

We simply downrank sites which display a lot of ads on them, and also use community blacklists for dev site clones.


I also notice and appreciate that Kagi returns older results while Google continues to push newer webpages. I have found so many useful results from perfectly fine content on older webpages. At this point, I’d be extra happy if Kagi had a Web 1.0 filter that focuses on basic html websites.


For those who want this it exists on https://search.marginalia.nu


They should just add Google ads to all technical documentation pages. Problem solved.


Yes, Google search is nowadays, like everything else, run by AI. What nobody tells you is that the AI is trained to maximize Google's revenue. That's why it figured out it is better to put these ad sites on top.


> Google is really failing hard in this regard,

Failing at what though? Is it anything they care about, that they want to do?

If not, then it's not so much failure as it is a change of plans on their part. They don't want to do that anymore, and there's no one else to pick up the slack.


There are browser extensions for blacklisting domains from your Google searches. I've been so incredibly happy using one of them. If I see one of those despicable content farms I just blacklist it and move on. Often when I search on Google for technical stuff I only get 2 visible results on the first page, 1 SO and 1 documentation. Soooo relaxing.

The business reasons why Google doesn't take steps to remove the bad content and make their product pleasant to use again are so far from my understanding that it might as well be aliens running the company for all I know.


My understanding is that Google has an incentive to send people to content farms because those farms will show Google's ads. Stackoverflow doesn't. So they can increase ad exposure.

Thinking of it, it would be an interesting test to compare the ranking of two similar sites, one with google ads, another with ads from another provider. Might be good evidence for antitrust litigation. But what do you do if they just prefer sites with more ads? Because due to their market position, that benefits them, but it isn't anti-competitive against other ad-pushers.


Maybe you're correct. I've heard that explanation before but it just seems too incredible that they'd undermine their monopolistic global billion dollar business for a measly share of the revenue of geeks4geeks.


Google is a self playing piano with clueless leadership. There is probably no plan involved.

Just managers doing what they get more money for or devs hunting promotions by increasing ad revenue by 0.01% in the short term one sting at a time.


The way you phrase it there makes it sound minuscule, but scale that up to the size of the SEOified internet and the numbers are surely in the billions.


I was thinking the same. Taking into consideration the vast number of such SEO farms, there's surely a lot of ad money to be spent/earned if you prioritize the "right" sites.


Last time I saw, Google gets much more revenue from ads on search than from the entire 3rd party ecosystem.


a big number that's much smaller than a big number is still a big number


I don't think it's an intentional decision anyone has taken, or that they intentionally made the search engine the way it works now; it's more of a "there's nothing wrong here from our perspective, so what's there to fix?" kind of thing.


I've been using Kagi[0] for a while now and it's pretty great in general - but also has options to boost up / down / totally ignore certain domains. It also has "lenses" that let you set a context (example: I'm searching for code stuff so just include sites a,b,c).

It's really good and IMO more than worth the price.

[0] https://kagi.com


Yeah, my Kagi list of content farms / SO clones which are completely dropped from all results keeps growing. On the other hand, searching just SO from Kagi still seems to give decent results.


Your experience matches mine. Spend two or three weeks blacklisting sites as you hit them and they disappear.

Some people argue that Google possibly can't win the fight vs spam sites, but obviously it works perfectly fine manually blacklisting them.

It takes time to build ranking.

The underlying reason is probably that the spam sites use Google Ads (revenue which is tied to 1000s of PMs and managers bonuses) and that Google as an org is deeply dysfunctional at this point.


Yeah, but then they're editorializing.

And not in a generic “this kind of content is bad for our users” but very specifically “site x.com is bad for our users.”


>> Some people argue that Google possibly can't win the fight vs spam sites, but obviously it works perfectly fine manually blacklisting them.

> Yeah, but then they're editorializing.

> And not in a generic “this kind of content is bad for our users” but very specifically “site x.com is bad for our users.”

Except that shouldn't be a problem because I'm pretty sure Google already blacklists domains.


Surely, they do. But they reserve that for stuff that's really way beyond the line. For everything that might be legitimate they leave it to the ranking algorithm to sort out and it's a game of cat and mouse.


This is good to know. I refuse to click on a geeks4geeks result even if it looks like it has exactly the answer I want.


Do you have a suggestion for a Firefox extension to do that?


I have noticed that Wikipedia is often pushed to the bottom of the page compared to a few years ago where it would always be at the very top.


Anecdotally Wikipedia is often the top result for me ... with the twist that it's the Google widget, with the side bar and related videos.

Only way below this block (which takes about 120% of my whole screen's height) come the "organic" results, that aren't great, but probably match what Google assumed I wanted to see.


And it's always the result I want.


Then consider using DDG & '!w term', or some other method (searching Wikipedia, extension, I think Firefox has something engine-agnostic built-in) instead?


Oh I do use DDG. I think it might suffer from the same problem, actually, since now I'm wondering why I see Wikipedia results very far down in Google when I don't really use Google.


The comment above mentions using !w in your search to look only in Wikipedia, since you say it is nearly always what you are looking for.

I would say at this point you could just use wikipedia as your default search engine.


Yeah, I do do that for those things, but when I don't, I've noticed a decline in quality.


You can add a keyword search in Firefox.


I never really liked StackOverflow. The only questions they seem to allow are “How do I get the length of a string in Python?”. Most of the problems where I am scratching my head and really need the benefit of somebody’s experience are software selection problems that aren’t allowed.

The competing answers paradigm is also fundamentally broken, I don’t want to see 15 answers to “How do I get the length of a string in Python?” I just need to see

   len(x)
Programming splogs do better than SO does in this respect. In fact, even the Q/A paradigm is bad because the average SO post requires scrolling past at least one extensive code example that does not work.

For more than 10 years I have thought the world needed a search engine for programmers. You really ought to be able to upload your POM file or equivalent and have the system automatically search the correct versions of the documentation. (Any attempt to look up things in the Java manual has to be written like "JDK17 javadoc {className}"; JavaScript libraries like reactstrap, react-router and such often have a few wildly incompatible versions and I don't want to waste a millisecond with the wrong version's docs, …)
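
To make the idea concrete, here is a rough sketch, not a real product: read a Maven POM, pull each dependency's version, and restrict the doc search to version-pinned documentation URLs. The javadoc.io URL pattern and the helper name are assumptions for illustration.

    # Rough sketch: turn a pom.xml into a list of version-pinned doc URLs.
    import xml.etree.ElementTree as ET

    NS = {"m": "http://maven.apache.org/POM/4.0.0"}

    def versioned_doc_urls(pom_path):
        # Collect (group, artifact, version) for every declared dependency.
        root = ET.parse(pom_path).getroot()
        urls = []
        for dep in root.findall(".//m:dependencies/m:dependency", NS):
            group = dep.findtext("m:groupId", default="", namespaces=NS)
            artifact = dep.findtext("m:artifactId", default="", namespaces=NS)
            version = dep.findtext("m:version", default="", namespaces=NS)
            if artifact and version:
                # javadoc.io serves per-version docs for most Maven artifacts.
                urls.append(f"https://javadoc.io/doc/{group}/{artifact}/{version}/")
        return urls

    # A programmer's search engine could then limit results to these URLs
    # instead of whatever Google happens to rank first.
    print(versioned_doc_urls("pom.xml"))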

I wouldn’t mind searching answers from stackoverflow but I only want the best correct answer and I don’t want to read a long confused question, etc. As this would clearly save coders time maybe they’d pay for a subscription as they do for Jetbrains tools.


Years ago, Google announced they would crack down on content farms, and SEO advice was really like "you NEED to have this meta tag if you duplicate content from elsewhere, or else Google will fuck you over HARD!", but it seems they earn more money off of the content farms than off the sources.

This will hurt them long term I presume, but they won't care because they earned money.


Google's current recommendation is usually heaps of Pinterest randomness, and then they wonder why people start relying on ChatGPT. "Oh, it's not a search engine" - sorry folks, Google isn't one (anymore) either.


Google has gone down the drain. As I've written recently somewhere here, they could easily fix their search by hiring maybe a dozen people per country to moderate common search request results or to, hell, listen to users like here, and respond by booting the scammers.

The problem is, they won't, because active moderation beyond responding to legal (DMCA, right to be forgotten, anti-CSAM) demands would massively endanger their "we are an impartial search engine" defense.


It's been over 10 years and it still endlessly frustrates me that searching for any Ruby or Rails documentation will send you to an APIDock page for Rails 3.2, and you basically have to goad Google into giving you the official documentation for either.

I suppose the real frustration is that Google became so pervasive that bookmarking a website and using its own search functionality is a total afterthought.


Google search results have a filter for time, so you could potentially improve the results by changing the date range back to 3 years ago.


Try Kagi ( kagi.com/ ). SO answers are almost always the first ones for my geeky questions (as they should be in most cases), and it also extracts and displays the official answer to the question that best matched your search.


Try out Kagi Search. You can manually increase a website's weight and completely block others. E.g. I have increased Stack Overflow's weight and blocked those stupid content farms. Works great.


Conspiracy theory: Bad initial search results forces people to search more often, hence allowing google to show more ads. Since few people switch away as a result, they continue doing this.


this is a bit like the cosmetics industry. there are very clearly probiotic solutions to body odour that could be developed with the coins down the back of P&G's sofa cushions, but if you fix everyone's body odour, then how are you gonna sell them anti-perspirant from now to the end of time?

now, in an ideal world competition would solve this problem, but the cosmetics companies heavily collude and anti-compete to prevent this


This is where I want to remind you, Stackoverflow is a Q/A site that sometimes contains stolen content from, as you put it, so-called "content farms" and the official resources.

Now, I do have a Stackoverflow account as well, but I actually prefer publishing my ideas on my own site rather than helping build someone else's content farm for free. Stackoverflow is, itself, a content farm, and it can be very hard for new users to join the site. You can not even post comments without first earning enough points. For a very long time I resisted joining the site for that reason. I have only recently earned enough points to comment.

Now, I happen to own a so-called "content farm" too, and the choice is either to create a standard blog with very little traffic or to try and cover everything you can possibly think of in order to compete with other "content farms" in your niche. It is very difficult, if not near impossible, for a single individual to create a valuable resource and maintain it, and it is simply not sustainable if you have paid authors working on it as well. There is no way you can monetize it decently. Stackoverflow probably found a way around this problem by simply leaning back and monetizing their users' content.

Once your site grows big enough, you also deal with a ton of spam and hacking attempts. Everything combined just requires an inhuman amount of time to deal with.

Of course, authors are desperate because of how difficult it is, and perhaps especially authors from poor countries that might not have other sources of income. Their basic business model seems to be: create a content farm with ads, fill it with copied and reworded spam, and hope Google indexes it. Often these sites even have multiple authors, which is quite baffling given the extra expense it must create for them. But I do not think they have actually thought the idea through – because it is just not profitable.

Weirdly it's often in the technology niche, which they are clearly not proficient in, and the content is more or less stolen solutions with little original material added.

I have seen a few sites like this, rife with some of the nastiest grammar too. It is interesting that they are able to rank simply based on their volume. Of course they must be using blackhat techniques, including link building if you analyze their link profiles, because there is no way that something so poorly designed and maintained gets that much attention compared with official sources or Stackoverflow.

For those of us who own blogs, such sites are often easily outranked simply by writing a comprehensive article on whatever tiny topic they have posted about.


Yes, if you cite a solution the mods there get angry when you don't copy-paste the third-party site's content instead of just linking to it. The stated reason is to make sure the content isn't lost. In other words, to ensure the content is duplicated on SO.

I have no allegiance to SO ownership so when the fake SO sites show up in results instead of SO, usually reading them will just give me the answer more quickly than finding the actual SO source.


They want enough of an excerpt so the answer doesn't become useless years later when someone redesigns their blog URL schema or shuts it down. That's reasonable, and probably falls within fair use.


That’s what I said


>mods there get angry when you don’t copy paste the third party site content instead of just link to it...

There's a good reason for that. Sites come and go and as a result links to solutions die and you wish someone had just answered the question instead of just linked to it.


That's what I said


Absolutely. Google search results quality has declined and I often find myself prefixing search queries with "site:reddit.com".


> it still holds years worth of answers that were just fine a few years ago - so why aren't they showing up now?

Are they just fine today, too? To judge that, you have to look at the date of the question and its answers, make an educated guess at what OS/language/library versions they are about, judge whether that makes a difference for the version(s) you're using, and only then evaluate whether the reply was even correct at the time (it may have had a thousand upvotes, but still be dated).

I think a really good Q/A resource would require posts to be tagged with version info. Most people think manual tagging isn’t fun, though, so it’s hard to get such a set from volunteers.

An alternative would be to require test cases that the site can run to check what version(s) replies are valid for, but writing such tests that do not break over time is hard, and, again, in general volunteers don’t like writing them.
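
Purely as an illustration of what a machine-checkable, version-tagged answer could look like (hypothetical; no Q/A site runs anything like this today):

    import sys

    MIN_VERSION = (3, 7)  # the claim below only holds from Python 3.7 onward

    def answer_still_holds():
        """Claim under test: plain dicts preserve insertion order."""
        if sys.version_info < MIN_VERSION:
            return None  # answer does not apply to this interpreter version
        d = {}
        d["b"] = 1
        d["a"] = 2
        return list(d) == ["b", "a"]

    if __name__ == "__main__":
        print("valid on", sys.version_info[:2], "->", answer_still_holds())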

That leaves generating tags or test cases. I don’t think we’re there, quality wise, to do that.


Stack Overflow committed the cardinal sin of running its own ad network on its sites; it's not much of a mystery that it got downranked.


I was involved in SEO-related projects some time ago, not that I'm an expert. I've heard Google understands when a site is itself a search engine and does not index it. However, it should be smarter about this: don't index SO's search pages, but do index the question pages, because the original content is there. SO might have run out of the crawl budget which Google assigns to each site, and/or Google prioritizes fresh content.

But I agree with the sentiment: what we know as SEO is nothing more than playing games with Google's indexing algorithms, based on rumours about recent changes to them, or improving page performance beyond reasonable boundaries. The other day I was looking at apple.com internals and spotted a few things which we were "fixing" on our pages. I asked SEO experts "what is the point of doing X, since there are examples of a well-indexed page having that same problem?" And the answer was like "when we are as big as Apple…"


uBlacklist can help by culling the spam results. While it's mostly a manual thing, it's fast and easy, and I've found a little effort goes a long way.
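
For anyone wondering what the rules look like: uBlacklist takes one rule per line using browser-style match patterns, roughly like this (the sites listed are just examples of the kind of farms people block, not a recommendation):

    *://*.geeksforgeeks.org/*
    *://*.w3schools.com/*
    *://*.tutorialspoint.com/*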

Unfortunately it still doesn't solve the issue that sometimes the good results are still buried pages away, or simply don't come up at all due to Google's shitty algorithm.

I really need to look into SearXNG or something.


Is that actually true though? So, I literally just went to Google this morning for a toddler python question (very much not my first language, heh).

"how to load a file all at once in python" returns a first hit pointing to a blog post answering the question correctly, a second pointing to a SO answer that is actually for a slightly different problem but contains the correct answer, answer #3 is a youtube video that probably answers the question correctly.

Geeks4geeks doesn't show up until #4, well below Stack Overflow. (FWIW, their answer was fine too).

> You seriously can't find anything useful these days using Google.

That really feels more like a meme than reality. Are there other subareas where the SEO is doing better than this one? It seems like a pretty representative question.


The answer is that the content farms are doing a better job of interacting with Google's algorithm than SO. Of course it is a problem with Google search, but search was always hackable. The made-for-Google sites know very well how to play the game.


I wonder if Google should make their SEO prevention worse but simpler. Everyone has always wanted to SEO for Google, as long as Google has been around. It has seemed like only recently that good sites predictably lose.

Perhaps 10 years ago Stack Overflow was able to do some minimal SEO and then get by on content strength. Perhaps nowadays Google is doing a good job preventing basic stuff from working, so the only people to get good results are SEO-ologists that only know about exploiting SEO, and have nothing interesting to say on any other topic.


I think the answer is simpler. To rank well on Google you need to integrate with Google (search console, analytics and similar). I guess SO is not giving all their data to Google, so they cannot "optimize" for the site in the way that content farms are willing to.


I think they need to bypass Google somehow to keep it going. Embracing LLMs could be a way out.

I already go to ChatGPT to cut through the SEO-optimized crap that Google offers me in the first couple of result pages. I would bet that a lot of the responses given by ChatGPT come from Stack Overflow.

Now, what if we had StackGPT, which offered me similar functionality to ChatGPT, but better? E.g. respond with some code and an explanation, but also link to the sources (which are probably within their site, so they have prime access to them). Or offer, as an explicit option, to respond using sources other than their archive, but perhaps without citing sources.


My theory these days is that indexing services like Google are now too big to work properly. There's more and more noise added every time new information is indexed, to the point where strong bias is necessary for it to return relevant results to the average user.

Maybe there's a point where the internet, with decades-old information piling up, becomes unbearably big for indexing services to handle all of it in an efficient manner. Hence the recent "optimizations" that companies swear haven't worsened searchability.


This is what I want from a new search engine:

1. Respect exact match searches - this used to work by enclosing the search terms in "" quotes, but it no longer works. If there are no exact match results, return nothing.

2. Allow blacklisting or removing results from certain websites entirely, e.g. I want to be able to configure geeks4geeks to never show up in any results ever (a rough sketch of both behaviours is below).

If someone could make this new search engine they would have a good shot at replacing Google :)
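
In rough code, the behaviour I mean is something like this, just a sketch with hypothetical data shapes:

    BLACKLIST = {"geeksforgeeks.org"}  # user-configured, per point 2

    def filter_results(results, exact_phrase):
        # results: iterable of dicts with "domain" and "text" keys (assumed shape)
        kept = []
        for r in results:
            if r["domain"] in BLACKLIST:
                continue  # point 2: blacklisted sites never show up
            if exact_phrase.lower() not in r["text"].lower():
                continue  # point 1: no fuzzy fallback; zero results is acceptable
            kept.append(r)
        return kept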


Both features exactly as described already exist in Kagi search [1] (founder here).

We are not trying to replace Google though, but to offer an alternative to people who care so much about the quality of their search experience that they are willing to pay for it.

[1] https://kagi.com


You won me over by summarizing listicles to a short list :-)

To be honest I think your pricing is too high. $25 for unlimited queries might be fine for somebody who needs a good search to work and earn appropriately.

But as a (former) PhD student I ran through the 100 free queries in 2 or 3 days and just would not have been able to afford 25€.

I would gladly pay 10€ (for unlimited searches) or 15€ (for an unlimited family option). But to me, 25€ just seems too high. That's 5 meals at my workplace's canteen right now (Germany, NRW).

(I assume you are aware of pricing issues as pricing options have changed at least once while kagi is on my radar)


Thanks for the kind words. There are many things like grouping listicles you can do to improve search experience, once the incentives are aligned.

Unlimited for $10 is something we are working towards.


Thanks for listening! At $10 per month unlimited searches I'll immediately switch.

Also thanks for creating Kagi. Kagi was the first "alternative" search that convinced me that there can be competition to Google. YaCy just does not work, and most competitors (DDG, etc) just repackage the big engines. I use Presearch as my daily driver right now, but am somewhat put off by the NFT shenanigans behind it. Kagi looks like the only engine that stands on its own and is definitely something worth paying for.


Can confirm.

And if you find a result that got included despite not being an exact match you can report it and see it get fixed in a few days.


I think Brave search has those features. (I haven’t tried it, though.)


https://search.brave.com/

probably my new default search engine. Thnx


I'm sure everyone has thought of this, but is any search engine trying to add LLMs to the crawler pipeline? That might be more useful than at the user side (like Bing) where the index is already polluted.


> but it still holds years worth of answers that were just fine a few years ago

The flip side of that is that a large proportion of those answers are no longer fine or applicable.


That's one of the nice things about Kagi: you can lower or block content farms, and elevate sites like Stack Overflow.


>SO might be horrible now, but it still holds years worth of answers that were just fine a few years ago - so why aren't they showing up now?

SO had an expert sexchange.



