These figures show an extremely precipitous (and permanent) decline in traffic over the course of a few days in May of 2022 [0], during which the number of daily new visits dropped from ~1M to ~300K, the number of total daily page views dropped from ~20M to ~14M, and the number of daily sessions dropped from ~9.4M to ~6.1M.
However, there is no commensurate decrease in posts/votes during the same time period. Posts/votes remained relatively constant through 2022 (modulo normal seasonal fluctuations), until February 2023 when both fell off a cliff (I assume due to the rise of LLMs). Traffic data are sourced from Google Analytics, while post/vote data are computed internally by StackOverflow [1]. I wonder if the apparent precipitous drop in traffic in May 2022 is simply an artifact of Google Analytics suddenly changing how it tracks traffic/visitors.
From the comments on this answer (https://meta.stackexchange.com/a/391625/136010), it is suggested, and agreed by staff, that the change in May 2022 was the rollout of a proper cookie consent form. If you don't have performance cookies, SO can't work out the analytics.
From staff member Catija:
"@JourneymanGeekOnStrike Yeah, if you go back further, the "traffic" numbers see a 40M/week drop between April and May 2022, which is when the cookie tracking changed, and then normalizes again until December. So, prior to the cookie changes, traffic was about 140-150M per week. But, to be clear - this is stuff we're aware of and have "corrected" for, I guess."
The post doesn't explain where they got these traffic numbers, and it seems unlikely they have access to Stack Overflow's real traffic stats. They're using some sort of estimation here. There's always a chance that their estimates are wrong, especially if they're showing implausible shifts like this.
That makes sense. "New visits" are first time users, likely young coders who are looking up answers to things on a search engine, find what they're looking for on Stack Overflow, maybe click on an ad, and leave. They probably don't vote or post much. A sudden die-off there suggests something very bad happened to organic traffic (change in Google? Terrible new SEO scheme? A sudden stop in ad buys?)
The new content rate has been dropping at a dismally constant rate for a long time, but the first few months of 2023 were awfully grim. I wonder what might've corresponded to that.
If SO was worried about that drop I think they would have bought back some of that traffic. More likely something has changed how they count the visits or they blocked some bad traffic. Traffic data is often sampled as well.
The fall in the beginning of 2023 may be the introduction of ChatGPT. A more worrying idea is that the numbers reflect not just the decline of SO but a decline of the whole IT business.
Which would make sense, right? You are more likely to get an answer on StackOverflow for questions that touch very common technology (because more people are likely to answer). And that is exactly where Copilot probably shines too (I don't use it): because that is where there is a lot of training data.
I personally used to like StackOverflow as my last recourse: I grew up in those years where we had to RTFM, and I kept the habit. So if I go ask on StackOverflow, it is a tricky question. It used to be fine, and I was getting an answer eventually (sometimes after adding a bounty).
But in the last few years, I have had legit questions downvoted or even closed, and it was obvious that the people voting to close them did not even understand them. I agree that the moderation culture on StackOverflow is toxic. If every time I contribute something I have to fight not to get downvoted or closed, then I will slowly stop contributing.
The most help I ever got from SO for questions not already there, was because of their (perceived) strictness. The process of writing a high-quality question, with a minimally viable example, clearly lined out thought-processes, and other things tried, solved the question for me in most cases without me ever having to post it.
> The process of writing a high-quality question, with a minimally viable example, clearly lined out thought-processes, and other things tried, solved the question for me in most cases without me ever having to post it.
Nevertheless post the question and provide an answer. Everybody wins: you reap the upvotes, and everyone else benefits from the shared knowledge.
You don’t mark answers as duplicates, you mark questions as duplicates. And if it’s a duplicate question, the new answer should be posted to the old question. So it’s correct to mark the question as a duplicate. Otherwise all the people arriving at the original question won’t see the new answer.
My SO account is almost 12 years old, with just over 2k reputation, and I don't really care. Even now I still somewhat help answer basic questions in the mobile development tags; my only gripe with SO is the hostile nature of some mods with large reputation. Some seem to get a kick out of it and forget that reputation does not translate to expertise.
For 12 years, they have not figured this one out. New users will ask a very valid question and then won't respond anymore. I have seen this play out every single day. Back in the day, users were generous with upvotes even for a simple, basic question; this is not the case anymore today.
I think that with the rise of push notifications, no one really goes to a site to check notifications anymore. So the new user may have not developed the muscle memory to go back to SO and participate. I suspect this also has something to do with the decline of forums. Reddit still works because the app sends 200 notifications a day, but without it, I don’t think it would be as popular.
Also SO is participation hostile unless you’re a pro, so as a newbie I’m not going to do anything other than ask and lurk, because I’m not worthy
At least part of the reason for the hostility is that SO is a game. You get points, but you can also prevent others from getting points by voting down or removing their questions and answers.
On SO this hostility is pronounced because participants believe that if they get a lot of points they have easier time finding a well-paying job.
I don't know if we all do it the same way. I barely use push notifications for anything, because I don't want to be disturbed by random sites (least of all LinkedIn or SO).
Since SO is often used in a professional capacity, that problem could have easily been fixed by dev tools providing a formal way to link to SO traffic for topics that are relevant to the team.
It's just been a while since anyone has started trying to integrate tools with each other, outside of the established players.
These peasants with high reputation think they are Jon Skeet.
Reputation is meaningless and bloated on Stack Overflow now; there are many 100k-reputation people simply from asking or answering basic shit about javascript/python/pandas/git.
Every time I post on SO (or other SE sites), I have to clarify my question with something like "I know it's probably not a good idea to do A, and I understand B could be a better solution, but in my specific situation I really want to do A."
Then people will still try to close my question because it's a duplicate of B.
I've literally included the search terms I used to ensure it wasn't a duplicate. Other times I've explained why it is clearly not a duplicate. Nope. Closed for being a duplicate.
Sometimes it can be as simple as "version 2 of this software does things this way, but I'm using version 14, how can I do this?". "Closed as duplicate: [question from 12 years ago]".
I think the problem is Google losing the fight with spammers.
It's been a while now that I've had to put "stackoverflow" in the search query to avoid sites with scraped content.
Google is not "losing" any fight. Google is deliberately letting spam thrive because that spam may contain Google Ads/analytics and increases engagement on the SERP, as people who click on the spam go back to try something else (potentially one of the sponsored results). All of this contributes to Google's bottom line.
Problem is that in addition to people whose salary depends on it, there seems to be plenty of people out here defending Google and spreading misinformation despite having no obvious profit motive.
That's my guess too; I'm sure Google drives the overwhelming majority of SO traffic.
A few years ago, my programming-related queries would hit Stack Overflow as the first or second result. Now it's very frequently spammy garbage in the top 2-3 slots.
What kind of spam do you get when searching something specific and technical? Who is trying to SEO their way to the top for "how to set redis max memory"? A lot of comments here saying the spam is beating out SO, but what spam and from who and why??
Especially YouTube links. Sadly, it would not surprise me if these people are earning decent enough money from ads to make it worth their while to be "content creators" solely off search results from Googs.
It is both infuriating and sad that Google can't figure out a way to compensate for this SEO spam. Is there an easier problem than doing it for SO? (And yes, coding is a big enough problem for Google, imho, to be worth investing a little here.)
Which means they aren't applying any sort of primacy to the information.
If three segments of the internet think the same piece of information is relevant, that should affect the score of all 3 copies, not just the largest segment.
I'm not sure I'm reading you right--you're suggesting it should work this way?
When content republished on some bizarre/sketchy/unaccountable ~adfarm outranks the site where it first appeared, users of Google's search service end up at higher risk of getting phished or infected with malware.
Is there some benefit you see here that outweighs this downside risk?
Applying SEO to a copy of someone else's content gets you highly ranked on Google. I'm saying that at this point Google is doing enough processing that they should be able to detect duplicates after a fashion, and weight the oldest copy more heavily than duplicates.
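To sketch what "detect duplicates after a fashion" could look like (a toy illustration of mine, not a claim about what Google actually runs), near-duplicate pages can be caught with something as simple as word shingles plus Jaccard similarity, and the earliest-crawled copy in a near-duplicate cluster could then get the ranking weight:

    # Toy near-duplicate check via word shingles and Jaccard similarity.
    # Purely illustrative: not what Google actually does.
    def shingles(text, k=5):
        words = text.lower().split()
        return {tuple(words[i:i + k]) for i in range(max(1, len(words) - k + 1))}

    def jaccard(a, b):
        sa, sb = shingles(a), shingles(b)
        union = sa | sb
        return len(sa & sb) / len(union) if union else 0.0

    # Pages scoring near 1.0 are near-copies; the oldest copy in the
    # cluster would be the one that keeps the ranking weight.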
Well, if they served up a high-quality site, you'd just go there, and it might not even have ads. Whereas the dozen SEO garbage sites they do serve up are all hosting ads Google gets a cut of.
That's a very short sighted business strategy if true. Simply liquidating their reputation. Those junk AI results have certainly led to me using Google less.
> That's a very short sighted business strategy if true. Simply liquidating their reputation.
Why should they care? They're too big to fail…
Google controls almost the whole end-user realm through Chrome & clones, and Android, the dominating end-user OS by a wide margin.
At the same time, end users are completely helpless and can't do anything against Google's liking because they don't understand anything about IT tech.
Computers are black magic to most people, so they're trapped. This never changed! Especially millennials and gen-z are completely clueless, as they never had the chance to use personal computers, where you had at least some control over the device and needed to know at least some basics about its inner workings. All the younger people know are the tightly sealed black-box devices you don't have any control over, called mobiles, which are fully operated by big tech. Google search + Android apps are "the internet" for most people. They mostly don't even know there is something beyond that, so Google can do whatever they want, and by now this has exactly zero consequences for them.
Google's move to roll out "browser DRM", the next "trusted computing" initiative, regardless of what anybody thinks about it, is very telling.
Now they will violently reap the fruits of their monopoly, and likely nobody will be able to stop them in the next decade. People were warned about the consequences of this monopoly for many, many years. Nobody cared. Now it's payout day for Google.
When do you think millennials were born? The very youngest millennials were in their tweens when the first iPhone came out, and the oldest were pushing 30. They definitely experienced pre-smartphone computing. In fact, it's probably the defining characteristic of the generation: millennials grew up with modern computing, but before the smartphone. Gen Z grew up in a world where smartphones were ubiquitous.
I think we're rapidly approaching a point where any content that comes with ads is suspect. The fact that only Wikipedia has managed to largely escape deterioration (or as some call it, "enshittification") is a testament to this. A search engine that can selectively search non-sponsored content or soft-paywalled content would potentially be quite popular. However, monetising such a service without ads will be a challenge.
What’s interesting is that Google was known for how hard it was to figure out the Google algorithm.
Remember when people were hired because they knew the secret sauce for getting the best Google ranking? Google experts?
Well, it turns out that the person at Google that was responsible for keeping the algorithm fresh and the search results fresh retired and everything went to shit when they left.
Actually, I'm betting that person did leave the company, but the real damage happened when someone came along and convinced everyone they knew the real trick to better search results, and now we have the shit that is Google. Nice work, new guy! Let me rephrase that. Nice work to the guy who thinks they are smarter than everyone else and still thinks their approach is the best, despite evidence to the contrary.
Really sounds as if your made-up story is deeply rooted in your own experience. I am sorry if something like a new guy taking your position and claiming to be smarter has happened to you, but creating imaginary stories is not quite what this comment section needs, and you'd probably be better off dealing with this in a different way.
Not at all. Nothing personal. Although, it looks like you are the one self-projecting here.
It’s simply how times change and people with it. Knowledge is lost when people move on and the reasons why certain decisions were made are not transferred.
At any rate, I imagine people at Google are trying to figure out why there is such a negative opinion on their search results lately.
Matt Cutts was instrumental in community outreach and helping SEO differentiate from spam. When he left, Search pivoted to stuff like using Twitter data and lifting content directly from websites into results. While it’s probably hard to attribute all the changes to one person, Matt Cutts made a huge impact on the product.
I left SO because I was downvoted to oblivion for an answer that took me 2 minutes to write - but I had answered a similar question several years before (which I actually didn't remember). Searching for my own answer would have taken way more time than it took to write a new one.
When I pointed out that it's not the responsibility of the one answering to search for dupes, but for the one asking, I was told that I should still invest the time or otherwise don't answer at all.
Yes, especially if you know you have answered the same thing before: you look for your original.
Remember, all users are moderators. There are some explicit moderators, but they don't close or downvote often; they deal with other problems, or on smaller sites just use normal user powers to vote and close.
Then SO's reasons for this policy need to be explained more.
The aim of SO is to provide answers to a question.
You do not want many questions with the same answer, because if you have a new answer, or a comment on one of the duplicated answers, you then need to add it to all the questions. Thus we want to collapse all these multiple questions into one.
Also, the person I was replying to did not seem to understand that they were a moderator; moderators are not a separate set of people from users.
Citation needed that it actually provides better, more consistent data. All it leads to is a flood of closed questions in Google search results. No curation whatsoever.
I'd also like to highlight "non-hostile" as a reason why folks might prefer ChatGPT.
Stack Overflow has a lot of stridently opinionated jerks contributing to it, and if I can just ask ChatGPT a question and get an answer that works rather than having to deal with being belittled by those people, then I'm probably having a much better day as a result.
This post, to me, is about the rise of ChatGPT — but I do think over-moderation is a huge problem.
I had a hard moment on the gamedev stackexchange where I was stuck trying to learn how to do something in OpenGL. A moderator immediately closed my question as a duplicate because there was a similar question about OpenGL ES, which is a (related but) different API. I tried to plead my case, but was shut down.
Shortly after that, I gave up on the game I'd been working on for a couple years. The mod's decision contributed to that.
I felt stuck by a wall between me and answers to some of my game programming questions. Over-moderation is more than an inconvenience. It can destroy the ability of users to get things done.
The graphs in the post show the traffic decline starting around May 2022, months before ChatGPT was available. I'd wager the cause is a change in Google's algorithm. Most of the time I end up on Stack Overflow, it's because I've typed a question into a search engine.
The top search results used to be either an SO answer, a forum post, or the actual docs containing the answer to the question. These days it's either a dupe site copy-pasting it verbatim, a regurgitated and slightly modified variant of the former, or an "AI"-generated answer, all full of ads. And to make it worse, none of them are useful, as they obfuscate the answer or are simply wrong.
It looks like what mostly changed is that Google started to prioritise ads even more over actually useful results.
This is a problem on the other side of the experience spectrum too. Sometimes I want to ask an advanced question and interact with other experienced users on SO. However I have to battle the mods (who clearly don’t understand my question) to keep it open.
My questions usually go unanswered for years with several "me too"s and "did you ever figure it out?"s nailing my inbox.
I do typically self-answer if I figure it out, but you know, if I'm going to be ignored maybe it should be a github issue so I can get the sweet zero replies and that juicy 90 day auto-close from inactivity.
I remember trying to learn front end development around 2013, was fascinated by responsive web design and twitter bootstrap. Asked some questions on that site, was mostly ridiculed for my amateur questions several times, never touched the site again and also never learned front end. So this is my story with that site.
Same. I got put in Stack Overflow jail for posting my contribution as an answer because I didn't have enough karma to post a comment on a previous answer (or maybe it was the other way around, I forget). Never mind that I was earnestly trying to help the original poster and pointed out a legitimate mistake in one of the answers. I broke protocol and had to be punished.
Not quite true, though indeed getting upvotes seems not to be as easy as it once was. I still get an upvote on some of my relatively recent answers once in a couple months, although it depends on the answer.
The quick way to accumulate reputation is through bounties. However, it's very much a lottery: bountied questions are often about ultra-specialised niche topics. You may need to hunt for a long time to find something you actually know something about.
You don't need any karma to answer; any logged-in thing can answer. I say "thing" because ChatGPT is being used to produce a load of crappy, wrong answers now.
Thus I don't understand your issue. This is an XY problem :) I know enough about the subject to know that your issue is not the actual issue, since anyone can answer; if you had issues, then something else was going on.
Ask a dumb question about a trendy JS framework and you get hundreds of votes.
Answer a difficult question on a barely documented part of software (e.g. low-level) and you'll get a couple of votes, at most. And you're lucky if the answer gets accepted.
There are a few unsung heroes on certain hard/obscure SO tags. They dedicate a lot of time and get little reward. Whatever follows SO should find a way to fix this.
The problems I’ve noticed with Stack Overflow are a few and hard for me to narrow down but basically:
- google used to return really relevant results for SO, and it stopped doing so at some point a while ago
- moderation on SO has gotten progressively more horrible. can’t tell you how many times I found the exact, bizarre question I was asking only to see one comment trying to answer it and then a mod aggressively shutting it down for not being “on topic” enough or whatever.
- because of the previous bullet, oftentimes the best answer is buried in comments and has very negative feedback despite answering the exact question
Due to a combination of these things, filtering against the noise for what I wanted became increasingly more difficult and often the solution to my problem was easier found searching github comments or random blogs.
> - google used to return really relevant results for SO, and it stopped doing so at some point a while ago
SO might be horrible now, but it still holds years worth of answers that were just fine a few years ago - so why aren't they showing up now? Google's current recommendation of going to w3schools or - even worse - geeks4geeks or any other content farm is and always will be worse than stackoverflow. I don't have a clue what their algorithm is doing but it's surely trying to kill Google search as fast as possible.
Another joke is the fact that searching for "[language] [symbol]" also brings me to these content farms instead of the documentation. You seriously can't find anything useful these days using Google.
This whole situation exposes everything we hear about SEO as a lie. Stack Overflow has the exact text, it loads incredibly fast (it should be commended more for this), doesn't require ten megs of JavaScript to render, and as far as I know generally meets HTML standards.
These scam sites load megabytes of junk, load slowly, have text interspersed with ads and modals that render right on top of them, and if you open devtools you'll often see pages of warnings about deprecations and/or invalid HTML, yet despite having the same scraped text they always score higher on Google.
It has long been a mantra in SEO land that user generated content sites in general and forums in particular are to be aggressively down ranked. The reason for this is that industrial strength spam farms otherwise spin up tens of thousands of forum domains to pass link juice to what they are targeting. This naturally penalizes real forums, which often contain the best content for a query.
This is why Google has basically surrendered and why so many search result categories are now dominated by whatever sites Google has arbitrarily declared the winner through editorial decision making. In many search categories, we are effectively back where we started with Yahoo directories and hand-picked search rankings. What you see on the first page SERP is the best that they can do under the circumstances based on the fundamentals of how search works.
The web was a fun idea while it lasted, but if you are using it as a primary information resource, you are wasting your time.
Funny, I just rented an introductory book about signal processing and was (re)amazed to see how well the information is explained, with tons of examples, and a real plan to guide you through the ton of knowledge you have to master.
Well, those sites are often running ads provided by Google, so it's understandable why Google doesn't really have a good incentive to follow your first suggestion.
They do penalize heavy ads. There have been shenanigans on this front also (penalizing non-Google ad networks and favoring Google ad networks). They do penalize some content farms and favor others. The issue they are concerned about with SO and MDN among other user generated content sites is covert seeding at scale for the purpose of manipulating search results.
There's just a lot of fraud on the internet related to search and advertising manipulation, but it's under-policed, in part because of its internationalized nature and because it is hard to bring fraud cases in the United States due to the particularized pleading standard. That should not stop the feds from bringing criminal cases, but generally the feds care about large-dollar-value frauds (as they probably should) rather than policing very large numbers of small-dollar frauds that have a major aggregate impact on the online economy. They like going after the guys who steal $100 million from deaf children with lupus rather than doing 200 $500k fraud prosecutions.
Anecdotal, but my website lost a lot of search traffic after Google's core update in March, which seems to have affected SO as well looking at the first chart.
If I look at Google's guidelines, my articles follow all of them: in-depth, well-researched, demonstrating personal experience, better than other articles appearing in the search results. And yet, they were "penalized" by this update for who knows what reason.
I looked into it and some other websites benefited from the update, so who knows what changes they made and why.
Again anecdotal, but I lost the majority of my SEO traffic late last year around the time of a core update. I've spent the best part of a year attempting to repair it, on the assumption I'd committed some heinous SEO crime. The more time that passes, I'm starting to think that the issue isn't mine so much as Google's. It's baffling. I wrote about it here:
I don’t see what you did wrong, it must have been the algorithm change. My parents had a business that was killed by a Facebook algorithm change. My brother took a significant hit from an Amazon algorithm change. Building a business around any of the big tech companies seems very risky.
I think Google search has just declined a lot. I guess they’re losing the constant cat and mouse game with SEO. It seems worse than it has ever been, I’m relying more on ChatGPT and copilot now.
I can only imagine that LLMs will be the end of any content based search ranking. I don’t know how they’ll adapt to that.
"If you operate a paywall or a content-gating mechanism, we don't consider this to be cloaking if Google can see the full content of what's behind the paywall just like any person who has access to the gated material and if you follow our Flexible Sampling general guidance."
Does Google run other indexers for the purposes of catching cloaking? Are there other strategies that can be used? One of the problems of SO is that most of the valid content is out there and easily available without having to scrape the site which may mean penalizing for bad content is harder.
Does it even make sense to serve different content to a bot than what a human would see? Isn't the search engine trying to rank content made for humans?
It's an adversarial process. The search engine is, in theory, trying to rank by usefulness to the user, and the site owner is trying to maximize revenue by lying to the search engine. And the user.
I'm generally puzzled by Google's reluctance to do manual intervention in these cases. It's not like this is a secret. Just penalize the whole domain for 60 days every time a prominent site lies to the crawler.
There are very many sites where the content you see as a non-logged-in user is different from what you see if you have in your possession an all-important user cookie.
If Google's support is any indication, Google doesn't like to involve humans in their processes. There probably aren't enough humans to do the manual intervention you propose.
Eh, Google chooses to be identifiable as Googlebot and to obey robots.txt for other reasons of "good citizenship", because not everybody wants to be crawled.
Google is really failing hard in this regard, and I'm fairly sure it's intentional on their part. Searching "Typescript array" has obvious intent from the user, and an obvious "correct" first result. Google returns the documentation page in the 3rd result, but it's a link to a deprecated version of the page. The rest of the above-the-fold links are websites that contain Google ads.
DuckDuckGo returns the up-to-date documentation link 2nd and the MDN result 3rd, with W3Schools 1st. Bing returns actual content on the results page, describing exactly what you need to understand a TS Array.
Google has an incentive to push the poor sites, because they earn revenue from doing so. Bing and DDG don't have that incentive, and return much more relevant and useful links. That doesn't feel like a coincidence.
I spent years learning a programming language well, then further years delivering a training course, iterating, and then providing sections of the course free online on the website, both as advertising and to get new people started. Your "typescript array" search returns in the top 5 one of the sites that basically copy-pasted, via thesaurus, many of my articles. I checked, and it turns out they offer $50 for people to submit content for any language/technology. So you have someone in a cheap country paid to go copy content and reword it on that site. Then they rank higher than you, as they do this over many languages and thus seem more authoritative. Even more worryingly, with ChatGPT they won't even have to pay the $50 any more. So the whole internet may become like this, leaving me little incentive to publish material except that which solely entertains myself. Mmm, facebook/twitter = not a good outcome.
I have a friend who does something similar, but only does video, with the text gated behind a paid-only site. He makes pretty good money, but the exact reasons you listed are why the site is paid-only. They have a much harder time stealing (as in posting as their own content) the video.
I also notice and appreciate that Kagi returns older results while Google continues to push newer webpages. I have found so many useful results from perfectly fine content on older webpages. At this point, I’d be extra happy if Kagi had a Web 1.0 filter that focuses on basic html websites.
Yes, Google search is nowadays, like everything else, run by AI. What nobody tells you is that the AI is trained to maximize Google's revenue. That's why they figured out it is better to put these ad sites on top.
Failing at what though? Is it anything they care about, that they want to do?
If not, then it's not so much failure as it is a change of plans on their part. They don't want to do that anymore, and there's no one else to pick up the slack.
There are browser extensions for blacklisting domains from your Google searches. I've been so incredibly happy using one of them. If I see one of those despicable content farms, I just blacklist it and move on. Often when I search on Google for technical stuff I only get 2 visible results on the first page: 1 SO and 1 documentation. Soooo relaxing.
The business reasons why Google doesn't take steps to remove the bad content and make their product pleasant to use again are so far from my understanding that it might well be aliens running the company for all I know.
My understanding is that Google has an incentive to send people to content farms because those farms will show Google's ads. Stackoverflow doesn't. So they can increase ad exposure.
Thinking of it, it would be an interesting test to compare the ranking of two similar sites, one with google ads, another with ads from another provider. Might be good evidence for antitrust litigation. But what do you do if they just prefer sites with more ads? Because due to their market position, that benefits them, but it isn't anti-competitive against other ad-pushers.
Maybe you're correct. I've heard that explanation before but it just seems too incredible that they'd undermine their monopolistic global billion dollar business for a measly share of the revenue of geeks4geeks.
The way you phrase it there makes it sound minuscule, but scale that up to the size of the SEOified internet and the numbers are surely into the billions.
I was thinking the same. Taking into consideration the vast amount of such SEO farms, there's surely a lot of ad money to be spent/earned if you prioritize the "right" sites.
I don't think it's an intentional decision anyone has taken, or that they intentionally made the search engine work the way it does now; it's more of a "there's nothing wrong here from our perspective, so what's there to fix?" kind of thing.
I've been using Kagi[0] for a while now and it's pretty great in general - but also has options to boost up / down / totally ignore certain domains. It also has "lenses" that let you set a context (example: I'm searching for code stuff so just include sites a,b,c).
It's really good and IMO more than worth the price.
Yeah, my Kagi list of content farms / SO clones which are completely dropped from all results keeps growing. On the other hand, searching just SO from Kagi still seems to give decent results.
Your experience matches mine. Spend two or three weeks blacklisting sites as you hit them and they disappear.
Some people argue that Google possibly can't win the fight vs spam sites, but obviously it works perfectly fine manually blacklisting them.
It takes time to build ranking.
The underlying reason is probably that the spam sites use Google Ads (revenue which is tied to thousands of PMs' and managers' bonuses) and that Google as an org is deeply dysfunctional at this point.
Surely, they do. But they reserve that for stuff that's really way beyond the line. For everything that might be legitimate they leave it to the ranking algorithm to sort out and it's a game of cat and mouse.
Anecdotally Wikipedia is often the top result for me ... with the twist that it's the Google widget, with the side bar and related videos.
Only way below this block (which takes about 120% of my whole screen's height) come the "organic" results, that aren't great, but probably match what Google assumed I wanted to see.
Then consider using DDG & '!w term', or some other method (searching Wikipedia, extension, I think Firefox has something engine-agnostic built-in) instead?
Oh I do use DDG. I think it might suffer from the same problem, actually, since now I'm wondering why I see Wikipedia results very far down in Google when I don't really use Google.
I never really liked StackOverflow. The only questions they seem to allow are “How do I get the length of a string in Python?”. Most of the problems where I am scratching my head and really need the benefit of somebody’s experience are software selection problems that aren’t allowed.
The competing answers paradigm is also fundamentally broken, I don’t want to see 15 answers to “How do I get the length of a string in Python?” I just need to see
len(x)
Programming splogs do better than SO does in this respect. In fact, even the Q/A paradigm is bad, because the average SO post requires scrolling past at least one extensive code example that does not work.
For more than 10 years I thought the world needed a search engine for programmers. You really ought to be able to upload your POM file or equivalent and have the system automatically search the correct version of documentations. (Any attempt to look up things in the Java manual has to be written like “JDK17 javadoc {className}”; Javascript libraries like reactstrap, react-router and such often have a few wildly incompatible versions and I don’t want to waste a millisecond with the wrong version doc, …)
I wouldn’t mind searching answers from stackoverflow but I only want the best correct answer and I don’t want to read a long confused question, etc. As this would clearly save coders time maybe they’d pay for a subscription as they do for Jetbrains tools.
Years ago, Google announced they would crack down on content farms, and SEO advice was really like "you NEED to have this meta tag if you duplicate content from elsewhere, or else Google will fuck you over HARD!", but it seems they earn more money off of content farms than off the sources.
This will hurt them long term I presume, but they won't care because they earned money.
Google's current recommendation is usually heaps of Pinterest randomness, and then they wonder why people start relying on ChatGPT. "Oh, it's not a search engine" - sorry folks, Google isn't one (anymore) either.
Google has gone down the drain. As I've written recently somewhere here, they could easily fix their search by hiring maybe a dozen people per country to moderate common search request results or to, hell, listen to users like here, and respond by booting the scammers.
The problem is, they won't, because active moderation beyond responding to legal (DMCA, right to be forgotten, anti-CSAM) demands would massively endanger their "we are an impartial search engine" defense.
It's been over 10 years and it still endlessly frustrates me that searching for any Ruby or Rails documentation will send you to an APIDock page for Rails 3.2, and you basically have to goad Google into giving you the official documentation for either.
I suppose the real frustration is that Google became so pervasive that bookmarking a website and using its own search functionality is a total afterthought.
Try Kagi (kagi.com). SO answers are almost always the first ones for my geeky questions (as they should be in most cases), and it also extracts and displays the official answer to the question that best matched your search.
Try out Kagi Search. You can manually increase a website's weight and completely block others. E.g. I have increased Stack Overflow's weight and blocked those stupid content farms. Works great.
Conspiracy theory: Bad initial search results forces people to search more often, hence allowing google to show more ads. Since few people switch away as a result, they continue doing this.
This is a bit like the cosmetics industry. There are very clearly probiotic solutions to body odour that could be developed with the coins down the back of P&G's sofa cushions, but if you fix everyone's body odour, then how are you gonna sell them anti-perspirant from now to the end of time?
Now, in an ideal world competition would solve this problem, but the cosmetics companies heavily collude and anti-compete to prevent this.
This is where I want to remind you that Stack Overflow is a Q/A site that sometimes contains content stolen from, as you put it, so-called "content farms" and from the official resources.
Now, I do have a Stack Overflow account as well, but I actually prefer publishing my ideas on my own site rather than helping build someone else's content farm for free. Stack Overflow is, itself, a content farm, and it can be very hard for new users to join the site. You cannot even post comments without first earning enough points. For a very long time I actually resisted joining the site for that reason. I have only recently earned enough points to comment.
Now, I happen to own a so-called "content farm" too, and the choice is either to create a standard blog with very little traffic or to try and cover everything you can possibly think of in order to compete with other "content farms" in your niche. It is very difficult, if not near impossible, for a single individual to create a valuable resource and maintain it, and it is simply not sustainable if you have paid authors working on it as well. There is no way you can monetize it decently. Stack Overflow probably found a way around this problem by simply leaning back and monetizing their users' content.
Once your site grows big enough, you also deal with a ton of spam- and hacking attempts. Everything combined just requires an inhumane amount of time to deal with.
Of course, authors are desperate because of how difficult it is, and perhaps especially authors from poor countries that might not have other sources of income. Their basic business model seem to be: create a content farm with ads, fill it with copy-written spam and hope Google indexes. Often these sites even have multiple authors, which is quite baffling given the extra expense it must create for them. But I do not think they have actually thought the idea through – because it is just not profitable.
Weirdly, it's often in the technology niche, which they are clearly not proficient in, with the content more or less consisting of stolen solutions and little original material added.
I have seen a few sites like this, rife with some of the nastiest grammar too. It is interesting that they are able to rank simply based on their volume. Of course they must be using blackhat techniques, including link building if you analyze their link profiles, because there is no way that something so poorly designed and maintained gets that much attention compared with official sources or Stack Overflow.
For those of us who own blogs, such sites are often easily outranked simply by writing a comprehensive article on whatever tiny topic they have posted about.
Yes, if you cite a solution, the mods there get angry when you don't copy-paste the third-party site's content instead of just linking to it. The stated reason is to make sure the content isn't lost. In other words, to ensure the content is duplicated on SO.
I have no allegiance to SO ownership so when the fake SO sites show up in results instead of SO, usually reading them will just give me the answer more quickly than finding the actual SO source.
They want enough of an excerpt so the answer doesn't become useless years later when someone redesigns their blog URL schema or shuts it down. That's reasonable, and probably falls within fair use.
>mods there get angry when you don't copy-paste the third-party site's content instead of just linking to it...
There's a good reason for that. Sites come and go and as a result links to solutions die and you wish someone had just answered the question instead of just linked to it.
> it still holds years worth of answers that were just fine a few years ago - so why aren't they showing up now?
Are they just fine today, too? To judge that, you have to look at the date of the question and its answers, make an educated guess at what OS/language/library versions they are about, judge whether that makes a difference for the version(s) you’re using, and only then evaluate whether the reply even was correct at the time (it may have had a thousand upvotes, but still be dated)
I think a really good Q/A resource would require posts to be tagged with version info. Most people think manual tagging isn’t fun, though, so it’s hard to get such a set from volunteers.
An alternative would be to require test cases that the site can run to check what version(s) replies are valid for, but writing such tests that do not break over time is hard, and, again, in general volunteers don’t like writing them.
That leaves generating tags or test cases. I don’t think we’re there, quality wise, to do that.
I was involved in SEO-related projects some time ago; not that I'm an expert. I've heard Google understands that the site is a search engine and does not index it. However, it should be smarter: do not index SO's search pages, but do index question pages, because the original content is there. SO might have run out of the crawl budget which Google assigns to each site, and/or Google prioritises fresh content. But I agree with the sentiment: what we know as SEO is nothing more than playing games with Google's indexing algorithms, based on rumours about the recent changes in them, or improving page performance beyond reasonable boundaries. The other day I was looking at apple.com internals and spotted a few things which we were "fixing" on our pages. I asked SEO experts "what is the point of doing X, since there are examples of a well-indexed page having that same problem?". And the answer was like "when we will be as big as Apple…"
uBlacklist can help by culling the spam results. While it's mostly a manual thing, it's fast and easy, and a little effort goes a long way, I've found.
Unfortunately, it still doesn't solve the issue that sometimes the good results are buried pages away, or simply don't come up at all due to Google's shitty algorithm.
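In case it helps anyone: uBlacklist rules are just match patterns, one per line, so a ruleset is nothing more than a few lines like the sketch below (the domains here are placeholders for whichever farms you keep hitting, not a recommendation):

    *://*.example-content-farm.com/*
    *://*.scraped-answers.example/*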
Is that actually true though? So, I literally just went to Google this morning for a toddler python question (very much not my first language, heh).
"how to load a file all at once in python" returns a first hit pointing to a blog post answering the question correctly, a second pointing to a SO answer that is actually for a slightly different problem but contains the correct answer, answer #3 is a youtube video that probably answers the question correctly.
Geeks4geeks doesn't show up until #4, well below Stack Overflow. (FWIW, their answer was fine too).
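(For reference, the one-liner all of those results boil down to; a minimal sketch, with the filename as a placeholder:)

    # Read an entire file into a single string in one go.
    with open("example.txt", encoding="utf-8") as f:
        contents = f.read()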
> You seriously can't find anything useful these days using Google.
That really feels more like a meme than reality. Are there other subareas where the SEO is doing better than this one? It seems like a pretty representative question.
The answer is that the content farms are doing a better job of interacting with Google's algorithm than SO. Of course it is a problem with Google search, but search was always hackable. The made-for-Google sites know very well how to play the game.
I wonder if Google should make their SEO prevention worse but simpler. Everyone has always wanted to SEO for Google, as long as Google has been around. It has seemed like only recently that good sites predictably lose.
Perhaps 10 years ago Stack Overflow was able to do some minimal SEO and then get by on content strength. Perhaps nowadays Google is doing a good job preventing basic stuff from working, so the only people to get good results are SEO-ologists that only know about exploiting SEO, and have nothing interesting to say on any other topic.
I think the answer is simpler. To rank well on Google you need to integrate with Google (search console, analytics and similar). I guess SO is not giving all their data to Google, so they cannot "optimize" for the site in the way that content farms are willing to.
I think they need to bypass Google somehow to keep it going. Embracing LLMs could be a way out.
I already go to ChatGPT to cut through the SEO-optimized crap that Google offers me in the first couple of result pages. I would bet that a lot of the responses given by ChatGPT come from Stack Overflow.
Now, what if we had StackGPT, which offered similar functionality to ChatGPT, but better? E.g. respond with some code and an explanation, but also link to the sources (which are probably within their site, so they have prime access to them). Or offer, as an explicit option, to respond using sources other than their archive, but perhaps without citing sources.
My theory these days is that indexing services like Google are now too big to work properly. There's more and more noise added every time new information is indexed, to the point where strong bias is necessary for it to return relevant results to the average user.
Maybe there's a point where the internet, with decades of old information piling up, becomes unbearably big for indexing services to handle all of it in an efficient manner. Hence the recent "optimizations" that companies swear haven't worsened searchability.
1. Respect exact match searches - this used to work by enclosing the search terms in "" quotes, but no longer does. If there are no exact match results, return nothing.
2. Allow blacklisting or removing results from certain websites entirely e.g. I want to be able to configure geeks4geeks to never show up in any results ever
If someone could make this new search engine they would have a good shot at replacing Google :)
Both features exactly as described already exist in Kagi search [1] (founder here).
We are not trying to replace Google though, but offer an alternative to people who care so much for the quality of their search experience, that they are willing to pay for it.
You won me over by summarizing listicles to a short list :-)
To be honest, I think your pricing is too high. $25 for unlimited queries might be fine for somebody who needs good search to work and earn appropriately.
But as a (former) PhD student I ran through the 100 free queries in 2 or 3 days and just would not have been able to afford 25€.
I would gladly pay 10€ (for unlimited searches) or 15€ (for an unlimited family option). But to me, 25€ just seems too high. That's 5 meals at my workplace's cantina right now (Germany, NRW).
(I assume you are aware of pricing issues as pricing options have changed at least once while kagi is on my radar)
Thanks for listening! At $10 per month unlimited searches I'll immediately switch.
Also, thanks for creating Kagi. Kagi was the first "alternative" search that convinced me that there can be competition to Google. YaCy just does not work, and most competitors (DDG, etc.) just repackage the big engines. I use Presearch as my daily driver right now, but am somewhat put off by the NFT shenanigans behind it. Kagi looks like the only engine that stands on its own, and is definitely something worth paying for.
I'm sure everyone has thought of this, but is any search engine trying to add LLMs to the crawler pipeline? That might be more useful than at the user side (like Bing) where the index is already polluted.
> moderation on SO has gotten progressively more horrible. can’t tell you how many times I found the exact, bizarre question I was asking only to see one comment trying to answer it and then a mod aggressively shutting it down for not being “on topic” enough or whatever.
Related: I've often been looking how to do X, find an SO question asking that, but the answerers there refused to answer until the person explained why they wanted to do X, and then all the answers (correctly) told the person that they actually needed to do Y and explained quite well how to do Y.
I actually need to do X, so those answers are useless to me.
Then I find another question on how to do X, and the mods close it as a duplicate of that earlier question. Even when the questioner specifically notes in their question that it is not a duplicate of that earlier question because they really need to do X the idiot moderators close it.
I ran into that a lot when I was doing low level firmware programming. The answers to someone's question would be something like "that feature is only intended for ultra-specialized low level programmers". And it's like "in this case, I am an ultra-specialized low level programmer".
I'm thinking of things like assigning constants to pointers in C and/or manipulating pointers directly.
> Related: I've often been looking how to do X, find an SO question asking that, but the answerers there refused to answer until the person explained why they wanted to do X, and then all the answers (correctly) told the person that they actually needed to do Y and explained quite well how to do Y.
> I actually need to do X, so those answers are useless to me.
I know what you mean. Whenever I (rarely) ask a question on Stack Overflow, I always have to defensively load it up with language anticipating misinterpretations and instructing people to answer my question and not some other one.
Otherwise, internet-point-chasers will come out of the woodwork giving easy, worthless advice. Even with all the defensive language, a few always show up.
I've never really thought about how contributors trying to avoid the XY problem really stand at odds with StackOverflow's mission of being a repository of answers rather than a helpdesk. Not all Ys present as X, and not all Xs are actually Ys. Sometimes it's an XZ problem.
The best you can hope for is some answer down the page that says something like "to answer the actual question..."
The mods are so ubiquitous and so busy on SO, I wish they'd spend some of their time silencing the "let's figure out what your real question is" pseudo-trolls.
I call them pseudo-trolls because I think they are well-meaning, but they function as trolls: overrunning a web site, hijacking discussions with repetitive and irrelevant content, and making most potential users feel that participating isn't worth the time and effort of interacting with them.
Even if X isn't the right solution to my use case I still often want to know _why_ X (or my implementation of X) doesn't work. The answer to that might be a really valuable learning independent of the problem at hand.
The very popular Zalgo answer is a perfect example of this problem [1]
The user asks a question that can be answered quite easily, and dozens of people post answers claiming that this is the wrong way to do it and that they should use some other tech to solve the problem.
Some people on Stack Overflow care more about showing off how smart they are rather than answering questions, and I think the point system attracts these people.
Hilariously, the accepted (I assume by default, not by the asker) answer is flagrantly breaking, like, half a dozen rules and guidelines… but because it’s cynically and unhelpfully crapping on a newbie, it stays up. Or maybe there’s a good reason for it to stay up, but at a glance it sure isn’t a good look.
I actually think SO is a great site and resource, but I also think a lot of that is despite the bitter old timers in the community, not because of them.
That answer is only there because it's really old, from the early days of S.O. where people were allowed to ask questions that weren't super serious binary yes/no style. It'd get moderated and deleted in a heartbeat today. A forlorn monument to the cool place that S.O. once was
Maybe I’m just a fun-hating asshole but personally I find this kind of thing annoying, not cool. People are just trying to get work done, not see someone’s attempt at cringey “nerd culture” humor.
That is not what most of us complain about I think.
I, and I think many others, are sad that S.O. removes many serious work-related questions (I have lost count of how many times I saw the perfect question with the perfect answer, with a note that this isn't what Stack Overflow is made for and that the question only exists for historical reasons).
Oh come on, it's not crapping on a newbie. It's a funny comment that serves as a reflection of the days and weeks this guy spent debugging these kinds of systems.
As someone who was a newbie at the time when it was posted, who was looking for a way to parse HTML, I took away that it's just really the wrong way to go about it. I didn't feel crapped on at all.
The question is about tokenizing XHTML, not parsing it into a tree structure like a DOM, which is a critical distinction. Regular expressions are a perfectly valid way to tokenize. This is why the snarky answer does not suggest a better solution - there isn't one!
If you scroll down long enough, you will see answers explaining that. But they aren't upvoted as much as the answers suggesting the questioner is an idiot.
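To make the tokenize-vs-parse distinction concrete, here is a rough sketch of mine (not from the thread): a regex is perfectly capable of splitting markup into tag and text tokens; it is only reconstructing the arbitrarily nested tree that regular expressions can't do.

    import re

    # Toy lexer: split XHTML-ish markup into tag tokens and text runs.
    # No attempt is made to build or validate a nested tree.
    TOKEN = re.compile(r"<[^>]+>|[^<]+")

    def tokenize(markup):
        return TOKEN.findall(markup)

    print(tokenize('<p class="x">hello <b>world</b></p>'))
    # ['<p class="x">', 'hello ', '<b>', 'world', '</b>', '</p>']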
The question isn't looking to parse HTML (or XML). Regexes are inappropriate for HTML because they can't adequately match the starting and ending tags, not because of black magic. The OP isn't looking to do that, so regexes look like a perfectly acceptable way to go.
But the asker is very clearly a newbie. The question does not contain further context. The asker's suggestion is wrong (I think). And we've all worked with junior engineers who try to use the wrong tool.
The answer is a whimsical way of making an appropriate suggestion in this inferred context.
Also, to be fair, I think it's not mathematically impossible to use dark regex magic (with look-behinds and such) to parse HTML, but that's a discussion for another day...
The answer can only ever be accepted by the asker, not even mods can change that. It's actually not that rare that the accepted answer is not the one with most votes in which case the accepted answer is somewhere further down, not the first one on top.
The question explicitly invites the kind of witty reflection shown in the accepted answer, by adding: "and what do you think?"
As mentioned elsewhere, this is an old question and both the kind of question and answer wouldn't be allowed these days.
However, I also fundamentally disagree that questioning the assumptions in a question is unhelpful. You want to solve a problem, find an approach, and want help because you have problems with that approach? What if the approach you took _is_ wrong? It's very helpful, especially for advanced beginners or at the intermediate level, to be given a different way of solving the problem, even if that is not what you asked.
It depends on context if this is just pedantry or genuinely helpful. The best answers I found start with answering the question that was stated, but then proceed in showing how the problem behind the question can also or better be solved.
“You didn’t actually want to do X, here’s how to do Y instead” may indeed be helpful for the beginner who initially asked the question, but it’s very unhelpful for me who finds the page years later actually wanting to do X.
> “You didn’t actually want to do X, here’s how to do Y instead” may indeed be helpful for the beginner who initially asked the question
Stack Overflow isn't a site for beginners, it's for "professionals". At least, that's what all the Stack Overflow defenders tell me every time I criticize the snarkiness, rudeness and patronizing manner of many answers/comments you receive on Stack Overflow.
> "Some people on Stack Overflow care more about showing off how smart they are rather than answering questions, and I think the point system attracts these people"
> and dozens of people post answers claiming that this is the wrong way to do it and that they should use some other tech to solve the problem
Yes, and they are factually correct in doing so. The correct answer to the question: "How do I use a hammer to tighten a screw" isn't a lengthy description of how to perform some magic with a hammer. The correct answer is: "You don't. Use a screwdriver."
HTML isn't parseable with regex. The various answers under the question explain in great detail why [1] that is the case.
SO isn't a help forum, it's a question archive. The purpose of an answer isn't to solve one guy's specific question, but to provide an answer that is useful to all people who ever stumble upon this question.
Your response and the other responses are proving our point. It wasn't about context-free grammars, level 2 or level 3, etc. It was a very limited subset of a problem. The answer should have been: "while I don't recommend doing it the way you want to do it, that should work for your limited subset".
Yes, and answers on that very page, with lots of upvotes, do exactly that. People looking for answers online can reasonably be expected to scroll down a page with results.
Poster is not asking for this. He is asking how to parse a specific subset of HTML. And it is demonstrably parseable.
> The correct answer to the question: "How do I use a hammer to tighten a screw" isn't a lengthy description of how to perform some magic with a hammer. The correct answer is: "You don't. Use a screwdriver."
It is not the appropriate way to tighten a screw, but there likely is a correct way to do it with a hammer.
It is fine to point out that there are better ways to parse that HTML, but it is not wrong to do it with regexes.
Sorry to be blunt, but having coworkers like you make the job really annoying. I'm not a newbie, but a seasoned programmer. If I'm asking a question and am in a domain with a fair amount of experience, don't give me patronizing answers.
Poster is not the one answers are for. Answers are for everyone who stumbles upon this question in the future, and the general topic of the question is very much about parsing some HTML with regex.
Again: SO != Help Forum
> but there likely is a correct way to do it with a hammer.
No, there isn't. Because the correct way is to use a screwdriver. There is certainly a way to do it with a hammer, same as there is a way to write a webserver in brainf__k. Doesn't mean that way is good or should be done.
> Sorry to be blunt, but having coworkers like you make the job really annoying.
Bluntness is fine. I will be blunt as well: Having to fix code full of hammers used to tighten screws is a lot more annoying than having colleagues who try to prevent a codebase full of hammers in the first place.
> The correct answer to the question: "How do I use a hammer to tighten a screw" isn't a lengthy description of how to perform some magic with a hammer. The correct answer is: "You don't. Use a screwdriver."
My car broke down in the middle of the desert because of a screw that came loose and all I have is a hammer. You have just condemned me to death because you assume you know better.
Nothing in that question makes me think the person asking it wants to parse HTML. Most HTML parsers will never give the result the question described. And unless you want to dig into the tag structure, solving that question is an essential part of creating a parser.
The funny thing is that the person who wrote the top-voted answer is not smart at all. He might look smart to a newbie, but the question is about tokenization and not about parsing.
So here is the reason: the top voted answers are wrong.
> If you parse HTML with regex you are giving in to Them and their blasphemous ways which doom us all to inhuman toil for the One whose Name cannot be expressed in the Basic Multilingual Plane, he comes.
To be fair, that question was written during a period when "how do I parse this html with regular expressions" was asked multiple times per day. And "regular expressions are not a reasonable tool to do that, use a parser" was the correct response to 99% of them. And, at some point, someone decided to throw out a more amusing version of that response. It _was_ funny at the time.
The thing is that oftentimes people who want to do 'X by Y' are actually asking how to accomplish Q. They think 'X by Y' is the solution and get hit by a roadblock, not knowing that it will not help them and they are wasting time.
This is called the XY problem and is extremely common on tech related forums and mailing lists.
Sure, but the issue is that SO was used largely for people working in companies with arcane rules. I can’t tell you how many times I’ve gotten one of these annoying “don’t do X, do Y” when I already know this. I have to do X for some reason, I don’t know how to do X because I do Y when given a choice and now no one will answer how to do X because someone killed interest in the question by apparently answering it. I use whatever points I get to downvote these answers.
The thing people don’t get is: when you answer on SO you’re not answering that poster. You’re answering anyone who will ever have this question. It’s quite arrogant to assume it will be an XY for every single person forever more.
The proper way to answer is to answer the question exactly as asked and then insert your "but you probably should be doing Y instead" at the end.
Again, you’re not answering the person who asked but every person who ever will. Some of them will be asking because the “right way” is not an option in their situation.
And those people can look for questions where the "right way" is justifiably unusable, or pose those questions themselves (and find out if they really have to avoid it.)
Because you're answering every person who ever will ask, a lot of the people who pass through your question & answer will be people who don't know the difference between the right way and the wrong way. If they want to know how to do something the wrong way, because they don't know what the right way is, an answer that simply tells them how is a bad resource.
It's not enough to tag caveats onto such dangerous answers, because people can't read. Instead, newbies should have to overcome a sufficient amount of opposition to filter out those who don't know why they're doing what they want to do, and the rest can make the little effort of being very explicit about why they want to do something the wrong way.
Exactly. I've seen precisely this "documentation antipattern" occur many times. "How do I do X with Y"? "You probably want to do Z instead". Upvoted, question answered, all other related questions of "no, really I do want to do Y" get closed as duplicates.
Then Googling for doing X with Y gets you a bunch of closed questions and a labyrinth of links all leading to a question that was answered 10 years ago on a different software version where Z possibly was the right way to do it but now isn't.
And of course there's no way to reopen the question because it has been closed by a level 15 Magister Templi moderator and a lowly level 3 apprentice moderator like yourself needs to either answer 146 more questions or moderate 192 other questions to clear enough arbitrary hurdles to achieve holy question reopening powers.
And there's possibly an appeals process but that involves recruiting 13 moderators who you have to convince to give this question special treatment and declare that one of their number of sacred moderators made a mistake.
Yes. StackOverflow mods frequently mark questions duplicate that are not. That is something that has been observed by many many people.
Some of it is that SO has gamified shitting on and suppressing the question/asker instead of gamified providing the answer, and built a culture of toxicity that tolerates the abuse of the tools in this fashion.
And when the CEO asked them to tone it down maybe 5 years ago they basically did a collective “am I so out of touch? no, it’s the askers who are wrong”. Extremely funny to read the meta responses to that at the time.
(admittedly "women and people who don't speak english well are particularly unlikely to adopt to the pedantic neckbeard culture we've built" is a spicy take for your average SO'er, or wikipedian, but it's also not actually a wrong one either. SO's culture problems probably do disproportionately chase away users with marginal engagement, nobody likes putting up with formalized neckbeard culture and those users have absolutely encountered it before and absolutely have an aversion/revulsion to entering yet another online neckbeard nest. I think this is a case of “he’s probably right but the medicine would have gone down better with the manchildren if he hadn’t mentioned women and minorities”, and he’s also right that those issues have continued to bury SO over the last 5 years.)
> Because you're answering every person who ever will ask, a lot of the people who pass through your question & answer will be people who don't know the difference between the right way and the wrong way.
Then you have to do two things in your answer:
1. Correctly answer the question as asked.
2. Add your opinion about the "right way" to do it.
If you only do #2, you are failing "every person who ever will ask."
Again, I don't think this is enough - because it's a well-acknowledged fact that people can't read[0] (as I said in my comment.) How many newbies are going to see a working solution, try it out, and immediately skip all the extra text that they don't think they need?
> Again, I don't think this is enough - because it's a well-acknowledged fact that people can't read[0] (as I said in my comment.) How many newbies are going to see a working solution, try it out, and immediately skip all the extra text that they don't think they need?
You know that's not your responsibility. If some newbie makes a mistake, that's their responsibility (and a learning experience for them).
And frankly, I think you greatly overestimate how valuable and essential your non-responsive "you're asking the wrong question" answer is.
>That link is about users. You're misapplying its lesson if you're using it to justify not answering a developer's development question.
Why do you think "users" is an inaccurate description of the role question askers have on a developer Q&A board?
Put another way - when was the last time you used a development tool, or a library, or some other resource, and sat down to read the full documentation of it? I would posit that that's very rare as an activity, even for developers who need to develop a deep understanding of what they're using.
It's much more common to learn by doing, and the limit of that learning is very often what the developer can't do. Answers which easily enable developers to do something are overwhelmingly likely to lead to developers doing that thing - much in the same way that a long page of library documentation which gives an example is likely to lead to developers repeating that example, even if at the end of the docs, there's a little caveat saying that you shouldn't follow the example for so-and-so reason.
>If some newbie makes a mistake, that's their responsibility (and a learning experience for them).
But is it a good experience? Sure, maybe they'll learn that they always have to read the whole answer before they use any part of it. But we sensibly have abandoned this no-guardrails approach to teaching in almost every arena where it's been used, because it's not really suited to the way people do things in real life - and in real life, people often end up affecting others with their mistakes.
Does a junior developer who learns how to glue SQL strings together in their favourite programming language, and makes the "small mistake" of not learning anything about SQL injection in the process, benefit from the learning experience when they cause a data leak? Do their customers? Or should the learning resources they access maybe use the pedagogical tools available to make sure those kinds of mistakes are really hard to make, even if it occasionally inconveniences a seasoned pro?
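As a concrete illustration of that "small mistake" (my own sketch, not from the comment; the table and data are made up), here is the string-gluing habit next to the parameterized query that guards against it, using Python's built-in sqlite3:

    import sqlite3

    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (name TEXT, role TEXT)")
    conn.execute("INSERT INTO users VALUES ('alice', 'admin')")

    def find_user_unsafe(name):
        # Gluing strings together: input like "' OR '1'='1" rewrites the query.
        return conn.execute(
            "SELECT * FROM users WHERE name = '" + name + "'").fetchall()

    def find_user_safe(name):
        # Parameterized query: the driver treats `name` as data, never as SQL.
        return conn.execute(
            "SELECT * FROM users WHERE name = ?", (name,)).fetchall()

    print(find_user_unsafe("' OR '1'='1"))  # every row leaks out
    print(find_user_safe("' OR '1'='1"))    # no rows: the payload is just an odd name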
> Why do you think "users" is an inaccurate description of the role question askers have on a developer Q&A board?
There are a lot of different kinds of "users," and I think the kind of thinking in that article is totally inappropriate when applied to a developer Q&A board.
To be perfectly blunt: the result of what you're advocating is to condescendingly treat experienced people as newbies so dumb that their question should not be answered, because you think they're so dumb the real answer might distract them from the lecture you want to condescendingly give them.
People like that are super annoying and almost always unhelpful.
Every single fucking question I ask on SO has some lazy condescending dude chiming in to answer the easy question he thinks I should have asked, after he totally failed to understand the constraints that made my question hard. Of course, lazy condescending dude always thinks he knows better.
No, the best thing is not to assume you know better than anyone who will ever ask this. It's good to mention what the right way is and why, but your answer should always include the answer to the question exactly as asked, at a minimum.
I agree, but sometimes the answer to the exact question leads to wrong things. So sometimes you don't provide the answer to the exact question but include the reason why the exact question is not good. This gives the questioner the option to comment on why the exact answer is needed.
My experience with less experienced developers is that they ask the exact question because that is where they are stuck, but they are ignorant of the better ways.
I do tend to answer differently depending on the questioner's reputation. If they have a higher rep then I can assume they know what they are doing.
You know what's at least as common on SO? "I don't understand the thing you're asking, so I'll pretend it's the XY problem and tell you about something I do understand"
I’m not convinced anyone interacting on SO can diagnose something like this. The act of breaking down a problem to a tiny part so you can post it kinda guarantees this scenario.
But I think it will always be up to the user of SO (not the poster or answerers) to make the real judgement on what is useful.
Often I think SO is useful to use as a bunch of puzzles folks solved. You gotta decide if they are relevant.
SO is at its best when it's actual error debugging IMO. When you google some specific error, whoever else has had a similar error, it's right there. I feel like GitHub is replacing this more and more though - I often get the specific GitHub issue ranked higher than Stackoverflow these days. Usually you get better discussions on the GitHub issues too, for a multitude of reasons. Two off the top of my head:
1. all of the people working on the stuff related to the issue are very close by
2. the moderation is not nearly as heavy handed as SO.
ChatGPT is also much better than SO if you can give it enough context and the thing you are working on wasn't built on stuff released after 2021.
I also really like Stackoverflow for current event type stuff, like black swan type events. One recent example is when google’s Paris data center was on fire and infra guys were helping each other out trying to get systems online.
All of this combined means that StackOverflow the forum is probably on its way out though. They made the mistake of taking VC money and the model hasn’t really proven profitable so they have really made some poor decisions to please the vc overlords.
I won’t miss Stackoverflow much other than nostalgia unfortunately - better alternatives have arrived. Seeing the decline of all of the other Stackexchange sites kind of sucks though. There aren’t better alternatives for many of those
Just out of curiosity, what are the alternatives?
I still find the moderation approach well made, even if it looks heavy handed. It’s important to create information for the future, not just for now
ChatGPT is not in any way better than SO - see the current moderation strike.
Both sides can identify ChatGPT answers as being wrong. The question is how they can be deleted. The moderators say they can delete a lot by manual inspection. SO says that the AI tools were deleting the wrong ones.
My biggest problem with github issues is similar to the problems with SO:
Bots closing issues because someone doesn't spam the page. Closing as duplicate of (non related bug). A slew of random solutions that are only tangentially related and don't really solve the problem.
The issue is that sometimes people just want 'X by Y'. To get a question answered, you shouldn't have to list every constraint and design decision that led you to that point.
Comes up all the time when people ask how to do things in bash/sh. I know there are better tools for the job, but this is the one I have.
Oh god, that just reminded me how often people ignore the question asking for POSIX shell or "/bin/sh" or some other specific shell scripting language… and proceed to answer the question using bash, zsh, Perl, Python, or even the slightly less wrong option (because it is kinda weird to be shell scripting without a moderately normal Unix environment) of using a bunch of Unix binary programs to do the requested job without actually solving the core problem of the question, because the tools made it easier…
And then to ice the cake you find the question because your question has been marked as a duplicate of the older question where they answered using Unix binary tools… and you specifically asked about doing something in "pure shell script" or something similar to that phrase.
Stack overflow is fundamentally a system design that breaks down at scale due to misalignment of incentives that are necessary for it to work well at smaller scales (as can be seen in the successful operation of various smaller Stack Exchange sites for various topics such as Law, Aviation, Physics, etc)
The bash-not-sh issue is due to ignorance on the part of the answerers, not an XY problem. Also, shell can't do everything - you are in a POSIX environment, so use POSIX tools. The Unix environment is about putting together many small tools rather than just using one, so any shell script can call POSIX tools as a minimum. So writing a complex pure-shell script rather than using those tools does need an explanation of why.
I must admit that I don't buy into that philosophy and like using one tool, so for scripting I would do it all in Python, so I would not be answering that question.
Many Linux users think that their way is the only one, and that means bash as the shell and many other things like GNU coreutils, gcc, etc. I am a macOS user and my professional career includes several non-Linux Unixes, so I know bash is not the only shell - try csh for fun - which is partly why I use Python, or previously Perl, as they are the same on all machines.
Since I apparently did a terrible job explaining it (the link does a much better one) -- it is when someone has a problem that they are trying to solve in a way which will not solve their problem adequately or at all -- it is not when they are using the perceived wrong tool for the job.
And what all the replies are telling you is that the most XY problems are misdiagnosis.
Explaining what the XY problem is to people who are telling you about it's high false positive identification is, itself, an XY misdiagnosis.
Your reply is an example of what people are complaining about - you are addressing the issue you wished was asked, not addressing the issue you were presented with.
Sometimes people ask questions like "how do I shoot myself in the foot and still have a working foot", though. Questions are not always reasonable. A question is not always "pure" either, but can embed incorrect assumptions.
> Sometimes people ask questions like "how do I shoot myself in the foot and still have a working foot", though. Questions are not always reasonable. A question is not always "pure" either, but can embed incorrect assumption
But those are correctly diagnosed XY problems. No one is complaining about those.
My parent was told the issue is too many incorrectly identified XY problems, and responded with an explanation of what the XY problem is.
That is the example of a misdiagnosed XY problem, which was kinda my point. This sort of behaviour makes the actual experts leave the site in droves.
If, when answering a question, one were to discard the answer the minute they write "Why would you want to do this?", you'd get much fewer incorrectly diagnosed XY problems.
As I said in a different thread, ChatGPT sometimes does this as well, but at least with ChatGPT, when it is answering a question that was never asked, it doesn't also act like a condescending jackass. There are no "Why would you want to do this?" type of questions.
Asking why does not have to be condescending. I agree that some responses can read that way, or seem in some other way hostile. In text, or with any reasonable spoken tone, I would not assume that a person asking me why is condescending.
But on second consideration, I suppose you would not, either, and I suppose you are specifically talking about responses which, each taken as a whole, are easily interpreted as some form of hostile.
I do see both of your points of view. There are some good answers on SO that capture both. They first explain why it’s infeasible, talk about a better approach, then lastly give pointers on how to achieve what is asked regardless using their best reasoning.
Think it just depends on the quality
> And what all the replies are telling you is that the most XY problems are misdiagnosis.
I responded to the person above me when there were literally two other comments in this thread.
> Your reply is an example of what people are complaining about
Defining something is not an example of an XY problem.
> - you are addressing the issue you wished was asked, not addressing the issue you were presented with.
As I am not much of a programmer but work on electronics and computer hardware I deal with different types of people than would be on SO, so I am not addressing anything but my own experiences.
> So sorry then, but listing every constraint and design descision it is.
You don't need to do that, a simple "I know what the XY problem is, and this isn't it" prefixed to every question you ask should be enough to stop the race to tell you all about the XY problem.
I mean, at this point it's clear that more people know about the XY problem than people who don't.
> you shouldn't have to list every constraint and design descision that led you to that point
True, but that logic goes both ways: unless told otherwise, whoever reads the question isn't required to assume that there is a constraint.
If I get asked how to water plants with a sieve, without being told why getting a watering can is impossible, "You don't, use a watering can" is a perfectly acceptable answer.
Especially when the question is asked in a question archive, rather than a help forum.
If specific constraints apply to a question, then they should be a part of the question.
The trouble with SO that I've seen is that there are more false positive identification of the XY problem than false negatives.
IOW, any time you think you have spotted an XY problem, you're probably wrong.
And that's the problem with SO moderators and regulars. They classify everything as an XY problem because it allows them to answer the question they know the answer to rather than answer the question that was asked.
Part of this is because problems (X) are complicated and people just as commonly demand the "simplest possible example" (Y) that demonstrates the problem. So people ask how to do Y, and then others ask why on earth they would do that.
One common example I've run into lately, as I've been reading about state machines, is people asking how to implement a simple react component as a state machine, and others objecting to the premise of the question since using a state machine for a simple react component is obviously a bad idea.
Telling people they're doing the wrong thing is extremely common. People actually wanting the wrong thing is common, but not _extremely_ common, and the mismatch is one of the more rage inducing things on the internet.
Alternatively, people might have to use X and Y to accomplish Q because of their organization or team. If it's technically doable, there should be a solution and explanation for that X and Y problem somewhere.
Like how to use a database as a queue, which generally works much better than any queuing system I’ve ever used, except ones that use redis as the database engine for the queue.
I’m sure if you’re twitter Kafka is actually a better solution, for everyone else, it isn’t.
Want to host videos on a laptop (which has a big SSD) and stream them to a Pi (which is attached to a big screen) over a LAN? Hey, here's a post about how to host videos on a Pi and stream them to a laptop! Upvote and share! My point is, you don't even have to be trying to do something all that strange for people to apply the XY Problem logic and refuse to help you.
(Solution: NFS mount and a patient understanding that the Pi cannot play certain kinds of video, so you'll need to transcode some of them first. See? Nothing bizarre, but surprisingly outside-the-box given what I could find online at the time.)
My assumption whenever I see that behaviour in a response is that the responder simply does not know the answer to the question asked.
It’s fine, I think, to answer a question and then suggest a better method. It’s presumptuous in the extreme to dismiss a question with some pseudoacademic neologism.
If doing X by Y is possible and will achieve Q, then the best way to respond to these questions is either of:
1. To do Q that way, you would do <solution, or at least pointers to how to find the solution elsewhere>, but you will likely find it far more efficient/easy/whatever to do <something else> instead.
2. Q can be achieved far more efficient/easy/whatever by doing <alternative>, but if you are stuck with using Y then try <solution or pointers as above>.
Of course this relies on you correctly deriving that they are trying to achieve Q, or them explicitly stating the fact. Maybe instead they are trying to get to K.
Everybody is aware and has read esr etc. The point is that people who are asking X by Y want to learn Y, with accomplishing X as a side goal at best. It's funny that you call wanting to learn Y a waste of time. Is it because you believe Z is a superior way of doing X? Why do you believe that? Experience, science, or mathematical proof?
Often the better answer is to just explain what's being asked by strongly encouraging them in the Z direction. Sometimes people just want to understand what's happening behind the scenes rather than just looking to solve for Y.
But that's a different question. And it should be asked differently.
"I'm trying to learn foo.js and attempt to do so by porting Tetris to it. I know foo.js is a terrible option for a game. So. How can I write to the canvas from foo.js?"
Is often answered properly, if only because to many it's a nice puzzle.
"How can I write to the canvas from foo.js?" is different in that it will attract a lot of people explaining that foo.js deliberately did not allow writing to the canvas, because Z.
This is part of the problem. You know your constraints, other people do not, but like to assume they do. So you end up having to write a defensive argument about your problem rather than just plainly asking your real question.
This may help less experienced engineers who don't understand their problem set, but for a more experienced engineer it's absolutely obnoxious to think of all the ways to defend my question so I don't have to deal with a rush of "oh but you should do this instead" answers getting upvoted that don't actually answer my question, and then being asked to accept the answer.
To one set of users it's possibly helpful, to another it's useless if not also condescending, often condescending to both sets.
The bottom line is that, by now, SO is so filled with these types of responses that I can't expect to get a very specific question answered, which is really the only reason I'd ask a question in the first place, so why use it?
There are plenty of chat groups now via Slack and Discord in my field where I can get much more direct answers. People aren't worried about getting downvoted for a bad question, and people aren't giving low-quality answers to boost their points. So for me, SO is practically dead except for the occasional obscure error message that I can query for there.
> So you end up having to write a defensive argument about your problem rather than just plainly asking your real question.
My experience is that it's not so much a defensive argument, but context. My example was poor in that it could be misread as a defensive argument, sorry about that.
I meant it to show how adding some context changes the question. Because, in programming, it is all about context. AKA that "It Depends" meme.
That's unkind, and doesn't really address the nature of the problem at hand.
What's the name for the "I don't want to use a sledgehammer to solve a problem that should be solvable with a screwdriver" problem?
Or in this case, the "spinning up 300 lines of code to integrate an XML parser vs. a dozen lines of code based on regexes" problem. For reasons that are unclear to me, XML parser libraries tend to be painfully difficult to use (speaking from personal experience with 4 different XML parser libraries).
I don't think it's a surprise to anyone involved that an XML parser is going to solve the problem.
LOL. It's not "called the XY problem" just because some Dunning-Krugerite decided to make a website on a budget TLD.
Here, I'll coin a name for a problem I see much more often, which is called the XX problem:
1. User has problem X, and asks for a solution for it.
2. People viewing the question decide that the original user actually has problem Y.
3. Those people tell the original user that they actually want to solve problem Y, condescendingly flame the original user for not asking about problem Y, and if they have the power to do so, edit the original user's question to be asking about problem Y.
4. Those people use poorly-thought-out pop-social-psychology to justify their shitty behavior.
5. The original user still doesn't have a solution to problem X, and they really needed a solution to problem X all along.
It is not without irony that examples of the XX problem are sometimes also examples of mansplaining.
I understand your frustration but honestly the tone and style of your comment is dismissive and condescending. It strikes me that you are complaining about people treating others high-handedly and without understanding by epitomizing that attitude in your own post.
"You're saying your problem is (X) people don't answer questions, but have you considered that your problem is actually (Y) that your tone is condescending and dismissive?"
You realize that it's dismissive and condescending to ignore the problem I'm describing and respond with an assumption that I'm unaware of the tone of my post, right? Pot, meet kettle.
I'm not ignoring anything, and this conversation has gotten strangely emotional for people responding to me. I joined this thread when it had a few comments and explained what an XY problem was. I don't use stackexchange and I am a little bewildered why people are accusing me of being condescending.
> I don't use stackexchange and I am a little bewildered why people are accusing me of being condescending.
Because you're propagating the idea that you (or anyone asking questions) knows better what a person asking needs than the person asking does. Telling people you know what they need better than they do is pretty close to the definition of condescending.
People are emotional because nearly everyone who asks a question on the internet has to deal with people telling them "You don't actually want an answer to your question, you want an answer to this other question." By boosting the signal of the horrible XY problem idea, you're contributing to that problem.
I'm not saying this kind of miscommunication never happens, but the opposite is actually far more common.
I ask follow up questions and get at what the person really wants, confirm it, and help them with it. I'm sorry if I am making troubleshooting harder -- I think that the answerer of the questions should be obligated to solve the problem or they shouldn't be helping.
Unfortunately I had no idea about the perverse incentives for question answering and the terrible moderator practices on SO, so I walked into a minefield giving an answer here that I thought was helpful based on my experience as a hands on technician working directly with people -- but it turned out to be a lightning rod for people's frustrations on these issues.
Is it SO's purpose to make assumptions about the intention or context of questions? I would say no, and that it's even harmful, as it prevents the question-asker from learning the flaws in their solutions on their own.
I actually finally made an account this year to leave a comment on this question[0]. I'm not familiar with how or why curricula are designed the way they are, so I can't really answer the question, but as a math teacher I thought it'd be helpful for the OP to know that calculus and linear algebra are deeply related; the derivative of a function is the best linear map (i.e. matrix) that approximates it! For practical purposes, calculus is a toolbag for turning (intractable) nonlinear problems into (tractable) linear ones.
But apparently I can't comment as a new user so I guess the discussion will just be "because calculus classes use matrices in examples".
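For readers who want the "best linear map" claim above spelled out, this is just the standard definition of the total derivative (nothing specific to that SO question), written in LaTeX:

    % f is differentiable at a when a matrix J_f(a) -- the Jacobian -- makes
    % the linear approximation error vanish faster than ||h||:
    f(a + h) = f(a) + J_f(a)\,h + o(\lVert h \rVert) \quad \text{as } h \to 0,
    \qquad
    J_f(a) = \left[ \frac{\partial f_i}{\partial x_j}(a) \right]_{i,j}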
This is precisely what kept me away from asking anything on SO. It is so much easier and quicker to search or figure things out by myself than to prepare a question defensively and then answer all the questions about why I'm asking about that thing and how I got there.
When I find a straight and relevant answer that is great, and I like browsing SO, but I wouldn't dream of asking a question just to find myself in a word duel with some very eager contributor steering me into his favourite domain. I want to get ahead with a task, not be on a quest to find The Ultimate Solution or have a social life over interesting and related topics.
Sprinkle in a 100K+ reputation user commenting "this is easily answered with `man Y`" — oblivious to the fact that Y's manpage is a 188-page monster, and that manpages are awkward to search in.
Here's a question maybe someone can answer for me, or maybe I'm just stupid and other people don't need it, but...
What's up with the "man Y" aversion towards the usage of examples in the man pages? I don't expect some behemoth like ffmpeg to have a billion examples in the man files, but damn, every other reasonably sized CLI app would be so much easier to use, if you had just a dozen or two of examples at the end of the man file.
I may be missing something, but the top answer is pretty good. And the comments even explain why searching for an answer on google might be difficult, and suggests how to.
Quote from the actual answer (which is not the bogus accepted one):
The question wasn't "should it be done?" But, for the same reason men climb mountains, "could it be done?" The answer, [...] is yes. Thus, we introduce [...]
This reminds me of my last ever SO question. It was many years ago, so I might be misremembering some details, but:
I wanted some information on figuring out what texture serialisation was supported on a given client for a WebGL app. I needed to know this because I was optimising the client and had to deal with very large textures (it was an AR / augmented reality context).
Cue a barrage of comments along the lines of "you shouldn't need to do this" "the abstraction means you can simply assume the GPU has infinite texture memory" (!) "just provide all the formats and let the GPU bridge figure it out". Then the question had a downvote and that was that, cast to the bottom of the pile.
It seemed to me everyone responding to my question had this assumption "questioners are morons". A moron asking about texture serialisation is a paradox, ergo the question must have a faulty premise.
Should get rid of the reward system -- or at least reform it drastically to discourage pettiness. An accepted solution should not result in a reward to anybody. Somebody just starting to learn a topic posts a basic question and gets an answer from an old timer - and they both get rewarded! For what? There seems to be an army of "moderators" who are ready to pounce on easy questions, and they end the answer by reminding the OP that he should upvote or accept the answer! Did any of them add anything to the knowledge base? You can get excellent answers to basic questions from ChatGPT anyway. Also, they won't allow new members (perhaps experts) to comment until they have picked up enough credits. The whole system is downright silly.
SO changed its policies a long time ago already... they don't want to answer your questions. They hate your questions (and by extension, they hate you!).
It's ok, you deserve the hate. After all, you're asking the wrong questions.
They want questions that are "textbook-worthy" or possibly "encyclopedia-worthy". They're on record as having said this officially. If programming and technology is just messier than that, if it's more complicated than that, and you still have questions that don't fit because of it... well, fuck you. They don't exist to answer your questions, they exist to answer great questions that they can use to build up this pretty little site that now has some purpose other than whatever it was you thought you were using it for.
The Z guy, he's playing their game. He's awesome. You, you need to be punished until you comply.
My experience is that the "real" answer depends on the situation. Sometimes one really should use the alternative (e.g. cleaner, more general solution, most updated API), while other times they should address the original question as is (e.g. avoid additional dependency/third-party library).
The question is very clearly formed. The accepted (and currently top) answer does a good job. But at one point the other answer which is badly worded and confusing was at the top. And I don't think it should ever get the top position of the answers.
Or a befuddled asking of the same "Why?". Or an honestly curious asking.
Yes, there are plenty of rude replies, but sometimes it helps to assume what somebody really means is they failed to do a context switch. Exchanging more details can help turn a 'madness' into a method. Even if they were outright rude, doing this can lead to an answer that might not otherwise be given (even though that answer may be provided by someone else).
They're talking about the XY problem-problem. The problem when other programmers mistakenly think you have an XY problem and ignore your words.
Sometimes, the original poster is not mistaken. Especially when an expert asks a question, there's a reason for it. Assuming the expert to be a beginner who hasn't tried easier solutions is degrading, and forces experts off the site.
Actively insulting your userbase, especially your expert-level userbase, is a bad idea. It leads to StackOverflow falling and collapsing over the years.
I never saw this term before, but it is perfect. So many times I have experienced it when asking questions in mature tech domains. It is so frustrating. Frequently, my question is asked poorly from the view of an expert, so they sweep it aside as not a real problem. Only after getting help (from comments) to provide more info or improve the writing, does the expert suddenly agree it is an issue.
This is also why I no longer waste my time raising bug tickets for open source projects. You just get shouted down and feel terrible about yourself. I raised many "WONTFIX" bugs in my career against open source projects. What a waste of my time and a harm to my self esteem.
I was active on SO years ago, and my answers and questions typically were either upvoted to a large number or downvoted to a large number, despite my best attempt at explaining the context.
So I do know this problem (problem-problem) very well and I've since stopped asking or answering on SO.
I was learning Golang nearly a decade ago while participating in a performance competition. Asking questions about unsafe in Golang Nuts was nearly rage-inducing.
IMHO Copilot Chat has picked up on these bad habits.
I've noticed a similar tendency in ChatGPT (answering a question that was never asked), but at least in ChatGPT it doesn't act all condescending towards you.
I literally cannot remember the last time SO was useful, due to the false-positive identification of the XY problem.
At least with ChatGPT, it's much faster to get it to answer the question asked and not the question that was not asked.
That is incredibly annoying to find when you search for help with your problem. While maybe the OP could/should do something else, you have great reasons to do X by Y. Of course, if you make your own question about doing X by Y, you get referred to the other question and have yours closed.
> How do I shoot myself in the foot? I’m pointing the gun at my foot and pulling the trigger, but nothing is happening.
Is it:
> A common reason for guns not firing is that they aren’t loaded. Try loading the gun and trying again.
…or is it:
> Woah, hang on a sec! What are you trying to achieve exactly? It seems like you are doing something very wrong here. I’m sure there’s a better way to do whatever it is you are trying to do.
There are many technical questions that give the very strong impression that somebody is asking how to shoot themselves in the foot. It’s not responsible to blindly answer the question regardless of the consequences. Yes, people sometimes overcorrect for this which can be annoying, but they are only trying to steer newbies away from shooting themselves in the foot.
When a user is describing exactly how they're aiming at their foot and pulling the trigger but nothing happens, I'd assume they actually want to shoot themselves in the foot...
Thus, the first response is the correct one.
On preventing people from doing what they intend to do... I think the main issue is that the world is a large place, people face many situations, and most advice given in good faith only has an extremely limited view of what people might need to do. And it's kinda awkward to ask for someone's whole life story to decide if they are right in wanting to straight-up rename a table column on their live production DB.
Something could be a bad idea 99% of the time. But that leaves millions of people in the 1% for whom it's the best course of action.
> There are many technical questions that give the very strong impression that somebody is asking how to shoot themselves in the foot.
Sure, but some of the stuff I've looked up in the past is answered with helpful comments like "You won't need to do this, your CA will do this for you" or "this is handled by your certificate verification stack, you don't need to be involved in this stuff".
Well, thanks, but I'm actually implementing both of those things right now and I'm having an annoying issue with figuring out how the API for the (very popular but IMHO poorly documented) library I'm using for part of it hangs together.
> Yes, people sometimes overcorrect for this which can be annoying, but they are only trying to steer newbies away from shooting themselves in the foot.
I think sometimes it's because, comparatively, the respondents are newbies and are repeating received wisdom.
How I usually try to answer these kinds of questions is "Shooting yourself is generally not a good idea, for these reasons. If you really want to do it, you can try loading your gun and pulling the trigger again, but a better alternative might be to take off your shoe by untying your shoelaces and carefully pulling on your shoe, following the curve of your foot."
But it's hard! On the one hand, you don't want to teach people to do the wrong thing (especially not on SO, where answers are often copied wholesale and pasted into production code), but also not answering the question as asked usually doesn't help the OP at all.
If it's not clear what they're trying to achieve I usually leave a comment asking for clarification rather than answering, though.
The correct answer is obviously the first one. I come to StackOverflow because I want to learn some piece of information about some specific technology, not to be lectured by someone who thinks they know better than me about what I should or shouldn’t be doing (and who is virtually always wrong).
ChatGPT has replaced most of my usage of StackOverflow/Google.
It probably won't last forever without people generating new answers somewhere else, but it answers a lot of things correctly, and the things it gets wrong are easy enough to verify.
I really hope ChatGPT causes StackOverflow to change.
> I really hope ChatGPT causes StackOverflow to change.
It already has, and not for the better. SO is currently awash with nonsensical answers that are clearly the result of feeding the question into ChatGPT. Not to mention questions like this: https://stackoverflow.com/questions/76748781/how-pythons-bui...
Well, perhaps that's what moderation should be focusing on blocking then, rather than driving away humans (as was mentioned a lot by other answers in this thread).
The impression I get from commenters is that somehow they think their careers are tied to StackOverflow points. Trying their best to downvote correct answers and promoting their own in order to further their agenda.
SO ran a careers site for a while, and as a part of that, provided a resume/profile page upon which you were encouraged to share your score and/or your own questions and answers. I wonder if there was actually something to that…
Yes, it's stupid af. Better to keep silent than to answer the question indirectly. Or even if you want to suggest "another problem", keep it indirect (not the accepted answer).
For me personally it’s the rise of good documentation. 12 or 13 years ago I needed to build something for our sharepoint 2010 in a world where sharepoint 2013 was out.
I’m not a share point developer, at all, it was my first time with it. I am of the old school however, so figuring out how systems work by reading the manual or specification isn’t foreign to me. I’ve worked with the bitmap format, I’ve worked with solar inverters, I’ve worked with embedded construction software and so on, so sharepoint should’ve been easy. But I couldn’t have done it without StackOverflow.
Fast forward to 2023 and I need to build something for sharepoint again. I haven't touched it since, so I was sort of dreading it. Only this time the official documentation made it so easy I never needed anything else. I'm sure StackOverflow could have helped me, but I didn't need it.
Those are very isolated examples, but really, that is my personal experience with almost everything I work on these days. Yes, I've also got 10 more years of experience under my belt, but I do think it's because we as an industry have become much better at working through the official channels. I mean, when you have a problem with something today, do you go on StackOverflow or do you go to the GitHub issues (or whatever else they have)?
Out of interest, do you think your recent experiences with SharePoint were because Microsoft decided to unify pretty much all of their services behind the common Microsoft Graph API? SharePoint used to suffer from having multiple different overlapping styles of APIs depending on what API style was fashionable at the time that feature was developed - which could make development a nightmare... (OK, an even bigger nightmare).
I sort of like ODATA on the client end of things. It's an absolute nightmare on the server side, at least with .NET, but for clients it's usually fairly easy to consume. This is true for every part of the Microsoft Graph API that I've worked with, except for the SharePoint part, which for whatever reason works differently when you query things. It's sort of hard for me to answer though. I've spent quite a lot of time in enterprise organisations, and I've integrated with basically everything, and it's almost all bad in one way or another. On that scale I think the Microsoft Graph API in general is an 8/10, maybe even a 9/10, but the SharePoint parts of it are at most a 5/10.
I'm not 100% sure if this is the fault of the API, if it's because of SharePoint's indexing, if it's because of the 3rd-party metadata indexing that we buy for our document library, or if it's because of how terrible our own architecture for the data flow is. But I've had to build a lot of redundancy and caching into the terrible piece of gaffa tape which is our integration, because SharePoint won't give me every document every time I ask for them.
So, a bad experience? But at least it's not "FTP (yes, not SFTP) pulling different file formats from 9000 solar plant inverters" bad.
Yes, and documentation that is actually usable. I don't enjoy the standard JavaDoc pages but I do enjoy the, for example, GoDoc pages. I think the programming community did a good job in establishing user-friendly documentation.
A typical pattern I've repeatedly seen in various shapes and forms:
Q: [very explicit title about F in context X] I'm specifically not asking about F in context Y which I know about but is irrelevant; I'm specifically not asking about libT which may accept it in accordance with standard U; instead, in context X, is F working for libS?
A1 [100+] [+50 bounty] [auto-accepted] [date d]: "it works" + long winded answer about decontextualised substring of F proposition working for libS in context Y
Comment 1: this is not the correct answer, see A15
Comment 2: complaint that A15 note about U is from a second hand site so answer is warrant of discredit [even though authoritative docs about U are not publicly available]
A2 through A7 [30+]: "it works" (rehash of A1, posted within d+[1..365])
A8 [30+]: Q is a dupe, see this answer [link to question and answer about libS in context Y]
A9 through A14 [30+]: "it works" (gives example for libT, which behaves differently)
A14 through A17 [30+]: "it works" (gives example of standard U, which libS does not comply with)
A15 [5] [d+5]: Q being explicit about F for libS in context X and explicitly _not_ about context Y, let's address it for libS in context X: *it does not work* because foo (link to libS doc || source code || example). Tangent for the sake of completeness, context X is irrelevant because bar is orthogonal to foo + context Y. standard U does not apply because libS is not compliant [quote or link about U]
Comment 1: this is the correct answer
Comment 2 [d+1095]: answer is invalid because libS vN+1 has just been released and adds F in context X
A16+ [<0]: completely haphazard lottery answers
On meta.so this has been asked repeatedly and the consensus is that no, it cannot be done and furthermore, "accepted" doesn't mean "the best", it just means whatever the original asker marked as helpful for their situation, so "auto-accepting" would be meaningless. Also see: https://meta.stackoverflow.com/a/262915/147346
As for upvotes: if a question is upvoted that means the community upvoted it. The "community" in SO is just regular people, programmers both knowledgeable and newbie. I don't know that any "open access" platform can solve the problem of people upvoting the "wrong" answers; it's not a tech problem. It happens here on HN, too!
I have two opinions about your post. One: I agree. Two: I disagree. My point: I see both.
I see "answer as comment" in my most mature tech subjects because people are afraid of downvotes. In my experience, the most mature tech subjects are carefully guarded by a small community of very unfriendly "fastest gun in the West" types. The C++ community is incredibly unwelcoming and negative towards most questions. Note: Comments can only receive upvotes. (In theory, comments can be flagged as off topic, or offensive, but assume -- for my commentary here -- they are not.) Examples of more mature tech subjects: Python, Ruby, ASP.NET, Java (language/foundation libraries), C# (language/founding libraries), C (but not yet C++!), Win32, etc.
For less mature (or faster moving) tech subjects, you see many more answers. Examples of less mature tech subjects: Python AI/ML libraries, Java/C# open source libraries (Spring, etc.), C++, Qt, Gtk+, Zig, Swift, Golang.
People are afraid of downvotes because people will throw out downvotes without even reading your answer, and once you get the first downvote, everyone else will pile on, also without reading your answer.
The worst part about XY problems is that even if the user could use a different approach, often the actual answer to the question would be very interesting. But because of SO, it's an XY problem, marked as duplicate, and there is no answer to the original question.
Owners of the SO sites are control freaks. Mods started ruining the site around 10 years ago. Now, when I do a fresh registration I can't do basic things like upvote an answer or write a helpful comment, because I need X reputation.
So I can't add relevant info/feedback to a topic I landed on via Google, because I don't have reputation, so in essence the site strangles itself. Because in 2023 I'm not going to farm(!) 50 points just to share my info. All this because a few bad apples tried to game the system and someone came up with this lame idiocy of "you must have X reputation"...
But you can ask questions.
But if no one upvotes your answer because it's a hard one, and/or no one knows the answer to your problem, then you are stuck. So should you ask some blatant question, like "is the sky blue and why?" But it will be a duplicate. No rep for you. So it's a totally braindead system; they deserve to die.
Why don't they introduce a system where you can say, I need this answer for 5 bucks. But that will not be implemented, because it would set a dangerous precedent where you could harness others' valuable knowledge almost in an instant.
> Why don't they introduce a system where you can say, I need this answer for 5 bucks.
I think the problem with that is how do you prove it's a good or bad answer? Say you offer a bounty and I generate a great answer. You look at the answer and use it and then flag the answer as bad so you don't have to pay.
Someone reading your profile will have to parse whether the not-paid answer actually answered the question, before they can determine whether it's a "non-paying customer" or just someone refusing to pay for an irrelevant non-answer.
You can show something akin to ebay's reliability value based on user feedback. People can establish payment conditions before doing any task, I don't think that would be a problem.
This issue of having the answer in the comments definitely wasn't there 5-10 years ago, and it makes the site much more difficult to skim and read, hence provoking more hapless questions.
Maybe they should expire comments or remove them altogether.
"I was asking only to see one comment trying to answer it and then a mod aggressively shutting it down for not being “on topic” enough or whatever."
Stackoverflow once set out - with the very best intentions - to be better than expertsexchange et al.
I think that's why they are so strict with their moderation.
The issue, of course, is that this is a hard problem. To moderate a community of experts you need to be an expert yourself. With Stackoverflow I feel most moderators are just overwhelmed with the task, so they aggressively shut down everything they don't understand.
Many of them are probably aware of that and just don't know how to handle it better. Others are so clueless that they do not even recognize the good questions.
I just realized how much SO's role has changed for me as a dev. Like most devs, I used to use it as a common reference. However, I realized just now that it's been months since I've looked at SO.
That's not because I was intentionally avoiding it, but a natural side-effect of it becoming much less useful, largely because of the thing you're citing here. The quality of answers has fallen quite a lot, and SO seems to actively avoid hard questions (and if I'm searching a site like SO, it's because I have a hard question.)
Also, the site itself - the ethos of the owners. I used to have an account, became disenchanted with the mods, and decided to delete my content and close the account.
I then discovered:
1. You cannot delete answers which have been accepted.
2. The mechanism for deleting your own content allows you to delete a max of something like five replies per day.
I then took the time to look at the T&C, and as I remember it, SO simply makes all of your work their property.
I deleted everything I could, over the course of a significant number of days, and left.
I think not allowing answers to be deleted is OK; it's something like asking Wikipedia to delete content you contributed. At some point it isn't really your content anymore, it's part of an article. If your answer is accepted, others are discouraged from posting another answer, so deleting content retroactively is really damaging.
Besides, none of the answers should contain personal information, so I don't really see a reason to delete them.
Read the T&C. The answers are all licenced under creative commons with attribution so anyone can copy them. They are not owned by SO. This is why sites can copy all of SO to get better Google SEO also this is why ChatGPT etc can build their answers based on SO.
so regardless of who owns them, they can be reproduced, and probably are. So in many senses there is no deleting them.
Also, don't understand why you wanted to delete. Despite its problematic policies, SO is an important public resource and I assume so were your answers on it...
On point 2, even worse is when you ask a question and the mods close it and point to another post that isn't the same. Or they mark the question as too open ended when it's not, or tell you to split up the questions into multiple posts even though that's dumb because they're dependent on each other's answers.
I understood what you meant. But first of all, many languages have libraries relevant for several decades; second, asking questions about legacy code is often valid (and often where you need the most help); third, some people even still use jquery! (Poor guys)
I recently was looking into why wget --quiet --content-on-error didn't output anything with a 403. Turns out it's an ancient bug they never fixed... it's not documented anywhere, except some crusty old thread on SO.
I am old. I find what you wrote resonates with my SO experience, and, in my antiquity, I wonder, grumpily, why "RTFM" isn't faster than Stack Overflow. I think the answer is that the quality of the manuals to 'freaking' read has decreased. I'm just old enough to realize the Unix manpages were excellent in proprietary Unixes, and then the Linux manpages got complicated by some Emacs-based reader. Anyway, Unix manpages used to be a quality go-to for answering questions - BSD and Sun (BSD and SysV variants) and NeXT and Sequent etc. I'm still inclined to do that, but it's not always adequate.
I responded to a question on SO once where I elaborated on the accepted answer - my solution was more functional, as it provided an assignment operator override.
Some hot shot user with an absurd amount of badges told me, in a condescending tone, that my solution did not answer the question and linked me to some TOS article or something about staying on topic. This was the first time I had ever answered a question on SO.
My takeaway from the situation is that SO is full of accounts that farm badges/rep. To what end, I do not know - perhaps they reference it on their resume or portfolios.
I called the guy out; it seems he has since deleted the comment.
I honestly feel that Google and increasingly LLMs are the challenge here.
Behaviour of users on SO has always been, well, as all the complaints here suggest. It's never really bothered me that the top answer is someone going off on their favourite approach; most questions have multiple answers, and I've learned a lot about coding from reading the top few and trying to understand where the answers are coming from.
The discussion's as important as the answers sometimes, and that's why a potential collapse is a problem.
There is also a bizarre thing that I noticed only when I enabled RSS for some of the tags I wanted to watch. My RSS reader routinely had questions that were nowhere to be found on the site, even though they did not seem to be off-topic or have some other kind of structural problem.
Of course the users themselves could have deleted the questions, but regardless, I did not expect to see that.
It could be a nice experiment to enable RSS for some niche topics and check automatically after a number of days to see which questions are gone.
IME moderation was bad after a year or so of SO. Something about the moderation role in general seems to attract certain personalities that have a negative effect on the content they are supposed to be moderating.
That said I find SO still very useful, you can always ignore the moderators and caustic comments. Often a question has a high quality answer that I can use even though the question is locked for being off topic or the comments are filled with irrelevant argument.
This. Stack overflow contains many diamonds. We just can't find them anymore because Google randomly decided to stop being a proper search engine at some point.
moderation on SO has gotten progressively more horrible. can’t tell you how many times I found the exact, bizarre question I was asking only to see one comment trying to answer it and then a mod aggressively shutting it down for not being “on topic” enough or whatever.
Yup. And that's why I stopped using SO quite a while (years) back.
I'll add one more with moderation. I've seen far too many cases where a person asks "how do I solve X" and it's closed as a duplicate of a question asking "how do I do Y". Two unrelated questions, but they have some similar wording, so the moderator shuts it down.
I'm glad SO is going down, because it is a really nasty site. Every time I click on a page it asks again for cookie permission. Probably they want me to register to the web site, which I refuse to. So, no wonder people who are not already members think it is not a good search result.
> google used to return really relevant results for SO, and it stopped doing so at some point a while ago
Oh, so maybe DDG is ahead of Google in terms of quality alone in at least one domain now? Because it definitely still gives answers from SO at the top.
My least favorite thing is how many highly upvoted answers are telling the OP to "do something else" instead of actually answering the question. Sometimes requirements come from above and we don't have a choice!
The sweet spot for SO for me is workarounds for esoteric problems, not for more common issues because the solutions tend to be out of date (but correct for an older version of the library or whatever).
The last two reasons are why I stopped posting on SO 10 years ago... I tried to explain something and some mod kept criticizing my English and closing it for being "off topic".
> google used to return really relevant results for SO, and it stopped doing so at some point a while ago
I haven't noticed that at all. I also haven't noticed any degradation in Google's results overall, contrary to what people claim.
What I have noticed though is that every time someone complains about search results and shares the way they search on Google, it becomes obvious that the problem is between the keyboard and the chair and not in Google.
For example, my girlfriend always gets completely unrelated results when searching. But she searches for an unintelligible mess that even a human wouldn't understand.
I’ve been using stackoverflow for 15 years. Search results were fine for at least 10 of those. But yea, I’m sure it’s me and all the other people agreeing search got worse are also dumb like your girlfriend, thanks for the condescending and helpful answer, mr google turfer
Really? For programming they are immensely helpful! ChatGPT 3.5 even rewrote Rust code for me that wouldn't compile because of borrowing issues, replacing iter() with iter_mut() and other things. I also had a question about a specific configuration with an old version of Chart.js that I could not find an answer to on Google, and ChatGPT figured it out.
I had it write an Ansible job to enable a service with a specific name, and I got what I expected: something that looks correct at first glance, but with some subtle errors.
Rather than ask it to write a full ansible job, try asking it the question you would ask on Stack Overflow (i.e. you probably wouldn't ask Stack Overflow to write the full job for you).
I wanted good ways to deal with pointers to 2D arrays in C, which aren't hard, but you need to remember that [] has precedence over *, so (*pointer)[x][y] is needed to dereference. It's not that hard to mess up. Ultimately GPT had an OK answer, but... it had that immediately, without searching, and I didn't have to craft some well-written example; it picked up my dirty explanation no problem.
I have the 2D array typedef'ed now, but it's still confusing to read and hard to work with. I'll search for it tomorrow.
Asking anything C or C++ on SO has become an invitation to a ritual hazing. If, after dealing with everyone's ego, you can successfully demonstrate that you already know enough and you're not asking to pass a CS test, the Wise Ones may - may - deign to bestow you with an answer. Which usually won't relate to your specific issue.
Also, in C++, you can set your clock to someone commenting that you should just use smart pointers. It doesn't matter if the question is entirely unrelated.
As someone who spent a decade helping people answering code questions on Flashkit and later SO, I find the SO community and moderation now to be so off-putting that I avoid asking anything there if I can. I still give answers sometimes, but I'm much less likely to be on the site at all.
Not to mention that you have no rights over the content you create on SO. You can't take your content off the site if it happens to be an answered question or an accepted answer. Their reasoning is "but the content would be gone and would stop helping people". The way it completely dismisses your efforts and your emotional connection with your own creation is the greatest indictment of how SO lacks the human perspective. No other platform, not even post-Elon Twitter or Facebook, does this. The only exception could be Wikipedia, and it communicates its content format and collective editing mechanics very clearly. SO just isn't that, but acts as if it were whatever happens to work for them.
I do because I'm not a robot? I wrote answers that made it to HN's frontpage (this one: https://news.ycombinator.com/item?id=5243389). Later, that answer was edited by automod to remove the "Hello" line, because somebody hated that people greeted each other on SO.
Something long enough to be a blog post like that isn't what the site is for. I'm not saying you shouldn't care about things like that.
And I wasn't talking about the issue of mods deleting entire answered questions. I was talking about single contributions, and mods deleting things is the exact opposite of what my post was addressing!
On the other side, that compiler post is fun but it's not some big effortful thing. Caring about it some makes sense, but I wouldn't care about it that much.
> Something long enough to be a blog post like that isn't what the site is for
It wasn't a blog post on the site; I extended it to a blog post after mods closed it and flagged it for deletion. As a top 0.5% or whatever contributor of SO, believe me, I know what the site is for.
> but it's not some big effortful thing
It doesn't have to be. It's my thing. It sounds like me; it carries my spirit. Nobody else would have written it then. It reflects a period of my life, how I perceived things; and more importantly, its continued existence has an impact on me regardless of how it's licensed. Say, if my writing style, or my choice of words, or my tone were associated with a traumatic event in my past, SO's insistence on keeping it up would be explicitly abusive, don't you agree?
Even if no such event had occurred, SO's or HN's insistence on keeping my content online against my wishes is also abusive; it deprives me of control over my thoughts and my words. Is it legal? Yes, 100%. But, is it ethical? No, I don't think so. And for what? For keeping the answer for "how can I simulate a click on a DOM element?" online, as if that problem immediately becomes "unsolvable" when that comment, heck, the whole SO web site goes down. What a pretentious excuse to keep your income stream steady.
> Say, if my writing style, or my choice of words, or my tone were associated with a traumatic event in my past, SO's insistence on keeping it up would be explicitly abusive, don't you agree?
No, I'd say your trauma is causing you to make an unreasonable demand. The writing style in a short technical post existing somewhere should not be harmful.
> SO's or HN's insistence on keeping my content online against my wishes is also abusive; it deprives me of control over my thoughts and my words. Is it legal? Yes, 100%. But, is it ethical? No, I don't think so. And for what?
It's supposed to be a collaboration, especially SO, and keeping things intact is important for that.
Just yesterday I found a helpful guide on reddit where half the posts were "." It's pretty clear how a site where the primary purpose is guiding people would do a worse job if it worked that way.
> And for what? For keeping the answer for "how can I simulate a click on a DOM element?" online, as if that problem immediately becomes "unsolvable" when that comment, heck, the whole SO web site goes down. What a pretentious excuse to keep your income stream steady.
If it would affect the income stream, then it's something that makes the site bad for users. You can't argue both sides of that at the same time.
> As a top 0.5% or whatever contributor of SO, believe me, I know what the site is for.
Maybe? I don't think most contributors would be anywhere near as upset about an inability to delete posts.
SO wants text like you gave them, I don't think they want the level of emotional investment in those pieces of text. (They might want emotional investment into the site itself, but that's a different thing.)
> I don't think they want the level of emotional investment in those pieces of text
Of course they don't. They want you to be a free ChatGPT as much as possible. The less human you are, the better for them. That doesn't mean that what they are doing is okay or harmless.
The question was about whether Hacker News allows you to remove your content. It does not. You can send an email to have your content anonymized, but it will remain forever.
I happen to have tried this multiple times. Every now and again I want to erase all trace of me on the internet. HN doesn't allow it. They tell you that you can give a spreadsheet with comment IDs and how you want to rephrase things in case you've given TMI about yourself (in order to anonymize). I just don't have the time to go through my entire history to do this. So I attempt to post less, knowing I don't own my comments and I can be tracked.
Perhaps if HN eventually improves in this area then we will have all the proof we need that it is in fact becoming more and more like Reddit. Because it seems like a lot of the discourse is already checking a lot of those other boxes.
The content is licensed under the Creative Commons license, which ensures that everyone can take it and make something useful out of it - for example, in case Stack Overflow turns evil, adds a paywall, or something like that.
This is a lot better than almost every other comparable site with user-created content. In most cases the company has the rights there and can do whatever they want with it, and users have no right to reuse the content of the site.
Well, I must be the only one left to think Stack Overflow is great then.
Sure, you may encounter the occasional rough moderator but having to correspond with random people on the internet in writing as my daily job, I can't blame them. If you don't have a rep, it's up to you to gain some by crafting the best question, linking to the relevant docs, transcribing screenshots, correcting typos etc.
The only issue with SO imho is that it's getting too big and there should be a lot to gain from further splitting to other StackExchanges like computer science, computer graphics, databases, infosec, Vim etc.
It would be great if there were a way to transfer an SO question over to the more relevant community, I guess.
Those rough moderators would close off discussions for the most trivial reasons, yet SO still allowed those discussions to be indexed by Google (despite rel="nofollow" having been a thing since 2005 or so). Presumably this was because SO still wanted the Google juice.
It was irritating as hell to have the top several Google links go to SO discussions that had been shut down by the moderators.
> I can't blame them
I'm sorry, this is like saying you can't blame cops for beating up the occasional suspect because they encounter a lot of genuinely bad people every day.
If you can't do the job in the presence of shitheads, you have no business being a cop. Or a forum moderator.
> Those rough moderators would close off discussions for the most trivial reasons, yet SO still allowed those discussions to be indexed by Google (despite rel="nofollow" having been a thing since 2005 or so).
At least for "closed as duplicate" that is working as intended: "Closed" doesn't necessarily mean "you shouldn't have asked that question". In the ideal world, a post being closed as duplicate just allows people that phrase the duplicated question differently to also find it.
For police and moderators, having a false positive rate that equals 0 seems impossible.
There are lots of ways to respond to a positive indicator that could reflect a self-awareness of your own false positive rate. Being trigger-happy probably isn't the correct response.
It isn't clear how SO moderates the moderators; do they have an internal affairs (IA) team that investigates corruption?
Except the moderators are unpaid and they don't hurt much more than egos?
Also, Stack Overflow is not supposed to be a public service (unlike the police), so the moderators are not accountable to you, only to the site guidelines (which you can appeal to, I guess).
Given the value it gives to society, it should be turned into a public service before some rich prick takes it over and turns it into a cow to be milked or a self-aggrandizing venture.
> Except the moderators are unpaid and they don't hurt much more than egos?
Read again. It has nothing to do with "ego". By allowing their "moderated" threads to continue polluting search engine search results, they are wasting my time. They are wasting the time of everyone else who has a similar question and makes the mistake of clicking through an SO link.
I would not be at all surprised if they got downranked because people at Google were tired of SO wasting their time.
I joined SO in 2013, have been asking questions since then, and have had no problem with moderation. But it takes effort to ask a question: I always aim to gather all the bits of information I already know and, if possible, write a minimal working example (many times it was not possible, like when I had a problem with Apache and Kerberos). Most of the time I ended up answering my own questions a few weeks later, but that's because not many people faced the problems I was facing.
I have the same feeling. I recently started contributing more, even after such articles were already present, and I have to say it's a fun experience and I feel like I learn a lot, despite the occasional hang-ups I mentioned in a sibling comment to parent.
Especially nice to be able to link my own answer to things.
Having a high reputation shouldn't be a free pass to be an obnoxious jackass, but often that is what ends up happening. There is virtually no consequence to coming into a valid question (or answer) and just closing it like a dickhead. No, in fact this behavior is rewarded.
If you are either someone who likes wielding power recklessly or autistic then becoming a power user isn't a problem. But demanding users "prove" they are trustworthy in a system where blatant abuses of authority go unchallenged is farcical.
Even though StackOverflow in the common use case has been taken over by ChatGPT, I sincerely hope it keeps operating, stays strict (even if it causes collateral) and keeps its ban on LLM-generated content.
The wheels of this kind of stuff turn slowly, but obviously ChatGPT was trained partly on data only gainable from a healthy StackOverflow-like site, with users actively asking unique questions and enough people answering those unique questions with well-thought-out answers. The shitty future outcome is that StackOverflow goes out of business and LLMs stagnate on this front: while still capable of answering in fuwwy speawk when prwompted, they would be limited to / biased towards older versions of libraries, software, languages, tools etc.
If you loosen up your definition of LLM, the moderators and posters are really just LLMs that have been jailbroken to insult you and close your post.
Stack Overflow is the programmer's internet bloodsport.
It has for me, and several developer friends, and considering the fame ChatGPT has gotten, and that StackOverflow's fall has accelerated, it's obvious the milkshake's migrating. Not all of it of course, as I stated, "for the common use case"
It's painfully slow. I can Google the question, click one of the top results, skip to the relevant part and read it faster than GPT can generate two sentences. You also have to build an elaborate prompt instead of throwing two/three keywords into it.
It doesn't help that GPT is insistent on replying in the three paragraph format, meaning that the first 30-40 words it creates are just trash to be ignored.
I found it useful once - when I had to write an essay about ISO 27001 for college and just wanted it to go away. Took what it generated and spent 20 minutes editing it to look closer to my style. For real work it isn't as useful.
> I can Google the question, click one of the top results, skip to the relevant part and read it faster than GPT can generate two sentences.
Ironically, this is why people like me prefer LLMs (when they're accurate). With Google, about 50% of the time the top SO hit is not answering my question. So I have to click 5-10 SO links and parse each one to see if:
1. The question being asked is relevant to my problem.
2. The answer actually answers it.
I may be able to do it quickly, but it is a tedious burden on my brain. While GPT doesn't always work, the nice thing about it is that when it does work, it has taken care of this burden for me.
Also, GPT's pretty much memorized a lot of the answers. I once asked it an obscure question involving openpyxl. It gave a perfectly working answer. I wondered: Did it reason it and generate the code, or is there a SO post with the exact same answer? So I Googled it, and sure enough, there was an SO question with the same code!
Except GPT's solution was superior in one tiny respect: The SO answer had some profanity in the code (in a commented line). GPT removed the profanity :-)
I find it incredible you find an LLM slower and less full of useless chitchat about a question than Stack Overflow.
I don’t even open SO anymore; if it has a direct answer to your question, the LLM almost certainly does too; and asking new questions on SO is basically impossible.
If you manage to survive the gauntlet of "too specific, already answered, not general interest, arbitrary moderator activity", getting an answer that actually addresses your question can take forever; most likely you'll get a stupid answer that doesn't answer it, upvoted by idiots who don't understand that it's not an answer to the actual question, and, ultimately, because the question "already has an answer", it is ignored, never to receive a real one.
Maybe one day, a passing savant will answer in a comment.
…and yet, you find it faster and more reliable?
You, and I, have had different experiences on stack overflow in the last two years.
I think maybe you haven't been using GPT4 (the one where you have to pay money). Or else you're coming at it with a very strong prior, or you're not asking it about software engineering questions, or you're not phrasing your questions carefully. GPT4 is demonstrably extremely useful for technical questions in the realm of software engineering, and in addition to surfacing useful answers, it (obviously) presents a completely unprecedented conversational interface.
Can you give an example of a technical software question where you found it wasn't helpful? I'll see if I can get a good answer and post the permalink for you. I suspect you're not phrasing your questions well.
I have 110% replaced it with ChatGPT. Perhaps SO would still have a chance back in its glory days but there's no comparison to having a direct, specific, instant answer vs having to fight against SEO or moderators for hours.
I haven't. Because (free, as in free beer) ChatGPT is extremely slow, I have to make a rather extensive prompt to get the result I want, and then I still have to debug most of the code.
That's not very convenient, at least for now. I got so used to search engines by now that it only takes a few keywords to get the expected result, be it an SO answer or a documentation page. And as people have mentioned, ChatGPT was trained on the stuff that's on the internet, so if there is never any new stuff, because people just use AI, then it will not learn and won't answer your new questions. For some edge cases I might try AI here or there, but usually it's not for me.
Hell, an example even comes to mind. I recently asked ChatGPT what a single-issue 5-stage pipeline on a CPU actually means. I wanted to know, in particular, whether "single-issue" meant that only one instruction is present in the pipeline at a time, or whether a new one gets shifted in on every clock cycle (if there is no hazard). It just couldn't answer it straightforwardly. It was also kinda hard to find the exact definition on the internet. I found it in a book from the 90s which was chilling on my bookshelf (Computer Architecture and Parallel Processing by Kai Hwang). Hint: single-issue just means that only one instruction can be in one stage at a time, but multiple still get processed inside the pipeline. The keyword is 'underpipelined'.
Yes, someone tested it on GPT-4 for me too and that actually gave a quite decent reply. Still, there are always some cases somewhere where it messes up.
I'll just keep an eye on AI progress, but will probably not make it my go-to for some time. Maybe later (whenever that is).
> I haven't. Because (free, as in free beer) ChatGPT is extremely slow, I have to make a rather extensive prompt to get the result I want, and then I still have to debug most of the code.
That's because you are comparing asking ChatGPT to write full code to searching for a question on Stack Overflow and adapting their answer (which is comparing apples and oranges).
Try using ChatGPT like you use Stack Overflow instead (i.e. the question is "How would I record an audio stream to disk in Python" rather than "write me an application / function which...").
As an aside, try "How would I record an audio stream to disk in Python"" in both GPT4 and searching for an answer on Stack Overflow and see what has the better answer! (Clue: GPT4, and if you don't like GPT4's answer just ask it to clarify/change it)
>Try using ChatGPT like you use Stack Overflow instead (i.e. the question is "How would I record an audio stream to disk in Python" rather than "write me an application / function which...").
That's my point though. I get that it can produce quite good results if you are specific enough. And for some applications it makes sense to take your time and describe things as much as possible.
Most of the time I just need some small snippet, though, and usually I can get that with just a few keywords in my favorite search engine, which is way faster. So the conclusion is: it's not one or the other. They should be used complementarily, or at least that's what I am doing (use the search engine for quick hints and ChatGPT for some more verbose stuff, like 'write me a parser for this csv in awk').
Personally ChatGPT generally gives me a quicker, better, simpler and ad-free result for the snippet (At least with GPT4).
Plus I can ask follow-up questions in a context-driven way ("Can I do this without importing a library?").
I'm aware that different people will have different feelings on this though and personal tastes will differ, but while search engines stagnate I suspect the needle will continue to shift towards AI.
I always read here that ChatGPT is amazing. Can you give a link on how to use it? Every time I try to google it, it returns lots of different results, and when I tried it, it's not even usable for basic things I want. Is the ChatGPT you're talking about on their website? Do I have to pay for it?
> Can you give a link on how to use it? Every time I try to google it, it returns lots of different results, and when I tried it, it's not even usable for basic things I want.
Here is an example of using it to write simple powershell scripts:
You need to pay for it if you want access to the latest version of the model, along with some beta features like plugins. Plugins are extremely useful and it is worth paying just to get access to them. For instance, you need to have a certain plugin to get it to read links.
In that JSON example you're honestly losing more time with ChatGPT than doing it yourself. It seems more like mentoring a junior than having a helpful assistant. Most of my interactions with it have been this way.
I knew/know very little about 3D printers and the fields didn't mean much to me so I didn't want to have to research every one of them. It wouldn't have been difficult, just tedious.
You're mostly right in your experience. I have spent quite a bit of time trying to get ChatGPT to be a worthwhile piece of my workflow, and I guess sometimes it is, but most of the time the basic code or config or content I try to generate, it gets very fundamental things incorrect. It feels like it's mostly just hype these days.
Can you give an example of a technical software question where you found it wasn't helpful? I'll see if I can get a good answer and post the permalink for you. I suspect you're not phrasing your questions well.
Can you please specify whether you use (paid) GPT-4? Would you kindly provide links to a few examples of very fundamental things incorrect?
My experience - the free version made up a lot of things but still felt very useful - enough to want to upgrade to the paid version. With the paid version, I notice very rarely that it hallucinates. It does make errors but it can correct them when I provide feedback. It is possible that I just do not notice the errors you would notice, it is also possible that we use it differently. I would like to know.
> Can you please specify whether you use (paid) GPT-4?
Paid.
> Would you kindly provide links to a few examples of very fundamental things incorrect?
No, definitely not.
> I notice very rarely that it hallucinates.
Unsure of what "hallucinates" means in this case. Some examples of things I've used it for: docker configuration, small blocks of code, generating a cover letter, proofreading a document, YAML validation, questions about various software SDKs. The outcome is usually somewhere on the spectrum of "not even close/not even valid output" to "kind of close but not close enough to warrant a paid service". When I ask for a simple paragraph and I get a response that isn't grammatically correct/doesn't include punctuation, I'm not sure what I'm paying for.
>> Unsure of what "hallucinates" means in this case
The term "hallucinations" is now commonly used for instances of AI making stuff up - like when I asked ChatGPT (before I had paid account) to recommend 5 books about a certain topic and two of the recommended books looked totally plausible, but when I tried to find them, I discovered there are no such books. This is where I see a big difference between GPT-3.5 and GPT-4.
>> I get a response that isn't grammatically correct/doesn't include punctuation
What punctuation? If you mean stuff like commas separating complex sentences, my English is definitely not good enough to spot that. But your mention of punctuation reminded me of problems that ChatGPT has with my native language... any chance you are using ChatGPT in a language other than English?
Anecdata: I've started asking Bing these questions instead of SO. E.g., it recently gave me a very helpful answer for debugging a Spring issue and cited its sources. What it didn't do was present me with a whole lot of moderation cruft.
I can ask for recommendations for tools and libraries, which IIRC SO disallows.
I also don't have to pray my question will get enough vote attention or worry that I posted it at the wrong time of day.
On the whole, going the GPT route has been more satisfying in all ways.
> I can ask for recommendations for tools and libraries, which IIRC SO disallows.
Bing Chat almost always is useless for me with these kinds of queries. A few days ago I asked for a tool that monitors to see if a website is up. I told it I needed the tool to be something I'd run locally - not an online service and not something I need to sign up for.
Small nitpick, but I thought it was just an icon and it turns out to be the button for switching light/dark mode. It would be great if you could replace it.
Sure, but that doesn’t mean it’s not true, and for many of us its truth is prima facie because it’s true of both our own usage and the people we work with and talk to.
Hopefully so. As I mentioned in my other comment somewhere here, my optimistic prediction would be that StackOverflow will eventually still keep operating, but only with questions that can't be solved by AI, hopefully leading to higher-quality discussions.
If the lights are on at SO then they must be in the process of training their own AI with their own dataset and documentation for the topics covered. That is what it would take for me to make SO my first stop again. It should be very doable for that talented group.
How is this being enforced? It's either bots banning bots in a digital game of whack-a-mole, or humans arbitrarily trying to assess whether something has been written by an LLM or a human.
It's human judgement. Definitely not perfect, but something has to be done to prevent SO from being overrun.
There are some subjective signs that a post is LLM-generated, like being overly verbose and making unrelated assumptions, or a mix of horrible and perfect grammar. Those bans are hard to justify because the false positive rate is high.
But other signs are pretty obvious. My favorite is the use of APIs that should exist but don't: passing parameters that neatly solve the problem but have never been accepted, or importing non-existent libraries. I'm happy to flag those.
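To make that concrete, here is a hypothetical illustration in Python. openpyxl's Workbook, append(), and save() are real, but the two calls flagged in the comments are deliberately invented to show the kind of too-convenient API an LLM tends to hallucinate:

    # Hypothetical illustration of the "APIs that should exist but don't" tell.
    # Workbook, append() and save() are real openpyxl calls; the two lines marked
    # "invented" are made up on purpose -- they look plausible and neatly solve the
    # problem, which is exactly the giveaway described above. This snippet does not run.
    from openpyxl import Workbook

    wb = Workbook()
    ws = wb.active
    ws.append(["name", "total"])
    ws.append(["widgets", 42])

    ws.autofit_columns()                        # invented: openpyxl has no such method
    wb.save("report.xlsx", compress_level=9)    # invented keyword argument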
Yeah, I've been using chatGPT quite intensively. And while pure LLM output is relatively easy to spot, human edited LLM output is almost impossible to detect. Most of my message above has actually been written by GPT4 (3 prompts + some light editing).
And I think Stack Exchange needs a new CEO. Maybe new owners, too - since 2021 that has been Prosus. My impression is that they don't understand what the purpose is for developers.
Interesting that this seems to have started around Spring 2021 when posts + votes started tailing off, followed by traffic starting to decline around spring of 2022.
I can think of a few theories that I don't think hold water:
1. The rise of ChatGPT to answer many questions that StackOverflow would previously have been used for. This seems unlikely, since the timing doesn't really work out.
2. The perennial complaints about StackOverflow's culture of closing everything as duplicates or offtopic. This seems unlikely as well, since those complaints have been common for a decade or more.
3. The prevalence of SEO-optimized scrape sites - the ones that pop up with a "blog post" merely reposting a Stack Overflow question + answer in a different font. I've seen these for a while, and anecdotally they feel more common than they used to, but I couldn't give any real timeline for that vague feeling.
4. StackOverflow internal politics? I've seen the occasional stack-overflow meta thread pop up periodically on HN or social media, but I don't recall anything earth-shattering recently.
5. Most questions have good answers now and there's less need for new ones. I'd have bought that answer 10 or so years ago when StackOverflow's pile of questions + answers reached maturity. I don't think it suddenly hit some sort of answer saturation point in 2021.
My guess is that it's a slow shift in the culture of the StackOverflow userbase:
- Being a top answerer confers some cachet and makes you more employable in some places
- People notice this and start looking for the most effective ways to become a top answerer
- The most effective way is fast, low-effort answers
- There's been a rise in such low-effort answers over the last 5 years or so
- As a result, the cachet of being a StackOverflow top answerer is a lot lower
- The really good, deep, technical answerers (as well as the mods) are leaving as that cachet goes away
- Post quality starts dropping around 2021 and views start declining as people react to that in 2022.
If I were building a Q&A site that genuinely wanted to encourage high-quality answers, I'd implement something like a 1-hour window where all answers are invisible immediately after the question is posted. This is to give people time to work on a good quality answer, without racing to be the first and gain those precious early upvotes. The UI could still indicate how many other people have answered / are answering, and if answer volume became a problem you could perhaps block new answers during that window after the first 20 or so have landed. When the hour is over, answers are displayed in random order and the existing site mechanics around upvotes etc kick in.
It feels like that could potentially address the problem of low-effort answers killing off the good ones.
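A rough sketch of what that mechanic could look like; all names, thresholds, and data shapes below are invented for illustration and are not how Stack Overflow actually works:

    # Rough sketch of the proposed "quiet window" mechanic.
    # All names, thresholds, and data shapes are invented for illustration.
    import random
    from datetime import timedelta

    QUIET_WINDOW = timedelta(hours=1)   # answers stay hidden this long after the question
    MAX_EARLY_ANSWERS = 20              # optionally stop accepting answers while hidden

    def can_submit_answer(question_posted_at, answer_count, now):
        """During the quiet window, cap how many answers can queue up."""
        in_window = now - question_posted_at < QUIET_WINDOW
        return (not in_window) or answer_count < MAX_EARLY_ANSWERS

    def visible_answers(question_posted_at, answers, now):
        """Hide everything during the quiet window; afterwards, show the queued
        answers in random order so vote-driven ranking starts from a level field."""
        if now - question_posted_at < QUIET_WINDOW:
            return []   # the UI could still display len(answers) as a counter
        shuffled = list(answers)
        random.shuffle(shuffled)
        return shuffled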
As much as SEO experts would love to do that for their clients, I don't see any way they could've realistically crossed that moat. If you had a programming question, and you went to Google to answer it and found some answers - would you choose Stack Overflow, or PythonIsSuperFun.com? I think Google searching for programming issues decreased overall, and I don't think Stack Overflow suffered more than any other site. Disclosure: I've nearly abandoned SO for ChatGPT myself.
> would you choose Stack Overflow, or PythonIsSuperFun.com
I would choose the first result - people are lazy. The problem comes and goes, but it is true: spam sites with content copied from SO are ranking higher than SO itself. And it is often hard to tell you are viewing a copy at first glance; the layouts are usually more like a forum.
I think Discord has had a huge impact on Stack Overflow as there are many communities created for specific technologies, languages, frameworks, or topics, where most users can get answers from direct contributors or people who have the best industry knowledge on the topic in question.
Not to mention how beneficial it is for brands and products to create such closely-knit communities for their users. There's no need to Google things or ask on SO, as one can chat in real time and go back and forth for solutions with other users.
Not to mention unsearchable, poor discovery, a chat platform is not a replacement for message boards. SO is well suited to the Q&A format and Discord just isn’t.
On Discord? Really? My experience has been that if I don't know what the popular community discord is and what channels to search, I will simply not find an answer. For me, my Discord experience really suffers from Discord not being indexed by search engines.
Nowadays my questions are much more likely to get closed and it is impossible to get them reopened. This takes a pretty big emotional toll, as I usually invest quite a bit of time to describe my problems.
People vote to close and then move on. I don't mind editing my questions to satisfy the moderator's demands, but not if it has zero effect and just wastes my time.
I think people should be required to confirm their close/down-votes after a question or answer was edited, or else the votes should be automatically reverted. This mechanism should probably only apply within a limited time period after the question/answer was posted (or until a certain period of inactivity has passed).
It saddens me deeply, but currently I prefer not to ask questions, because the experience is so jarring.
- do not post a "teach the man how to fish" answer
- closing this question as duplicate because I'm in a hurry and I didn't read that the poor poster has already mentioned the possible duplicates and explained why theirs is different
- do not post an answer with links to the original docs that have context, instead copy/paste 3 lines that don't explain enough here
I wonder how much traffic has been siphoned away from those copy-cats that mirror SO? Out of principle alone, I will immediately hit the back button and look for the genuine article, but I could believe many do not.
The other aspect that grinds my gears is the closing of duplicate questions. Fine in principle, except when the original was answered 10 years ago and all of the answers are jQuery.
With such a stark barrier to entry, why would I spend much time on there? Also, at the beginning of my career I got much more value from "explaining articles", and much less from ultra-specific answers.
Later on when I had more experience, I wanted to give back:
- try to answer -> "you need cred"
- okay, upvote -> "you need cred"
- try to comment -> "you need cred"
- Ask a question -> "Duplicate. You should RT(Free)M we built."
Where were these data sourced from? As far as I know, StackOverflow does not publicize internal analytics with such granularity. If these figures are real, are they leaked?
We can also see that lately there are more questions than answers, which shows that most experts are no longer that active, or that there are more beginners and fewer experts overall.
Stack Overflow used to release their data archives quarterly on BigQuery. Looking at the BQ datasets, they were last updated Nov 2022, which doesn't have the latest 2023 info in the submission.
Thanks for sharing, good to see alternative options popping up. My wish is that the Stack Exchange dataset could one day be provided as a streaming parquet or arrow table, as underfunded grads and post-grads could then more easily/selectively sample the datasets (similar to how Huggingface provides some of its datasets)[1][2].
The Huggingface repo unfortunately prefilters some of the tables/rows according to some criteria, making it less usable for general analytical queries that the BQ or SEDE datasets enable. If anyone knows of an 'XML-streaming' solution that directly samples from the Internet Archive's data dumps, I am all ears.
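As a rough illustration of the kind of selective sampling being wished for here, the Huggingface datasets library already supports streaming iteration; the dataset name below is a placeholder, not a claim that this exact dump exists:

    # Sketch of streaming/sampling a (hypothetical) Stack Exchange dump via the
    # Huggingface `datasets` library, so nothing has to be downloaded in full.
    # The dataset name is a placeholder; substitute whatever mirror you actually use.
    from datasets import load_dataset

    posts = load_dataset("some-org/stackexchange-posts",  # placeholder name
                         split="train",
                         streaming=True)                   # returns an IterableDataset

    # Take a small, shuffled sample without materializing the whole table.
    sample = posts.shuffle(seed=42, buffer_size=10_000).take(1_000)

    for row in sample:
        # Each row is a plain dict; field names depend on the dump's schema.
        print(row.keys())
        break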
The point of asking a question on stack exchange is to create a question answer pair as the whole goal of stack exchange is to create a repository of high quality question answer pairs. If it's not possible to select an answer to a question then it isn't a relevant question for stack exchange.
At least in software development, SO is/was very useful, as they provide a curated repository of Q/A where the answers follow the current state of tech.
Every time I started out learning a new framework, SO would be tremendously helpful, because in software development, you mostly have questions that have correct and incorrect answers.
I said SO but I should have said SE in my case - it just doesn't work for research fields because these especially are all about exploring unknowns, yet SE doesn't allow this...
Maybe software engineering is all solved and there's no more ambiguity left to be discussed.
I feel like the moderation culture of Stack Overflow became toxic and counterproductive years ago. It's like the worst parts of Wikipedia, but squared. Just endless arguments over what is or isn't a duplicate or is or isn't relevant, while every interesting question somehow ends up closed, good answers are downvoted, and spam drifts to the top.
I'm sure everyone involved actually wants the site to be useful and pleasant, but somehow the actual result is the exact opposite.
That being said, I don't think the culture or moderation has got worse recently, so I suspect the traffic decline is either a change in Google's algorithms or the impact of ChatGPT (or both).
I think moderation culture changed for a reason. About 8-10 years ago the quality of answers started to decline because HR started to source candidates from SO. This created an incentive for posting thousands of answers, moments after the question was posted, that were simply not right. More often than not, those wrong answers were upvoted anyway. And HR won't check the quality of answers; they are concerned only with your total score.
Once I was researching something and found a wrong answer on SO from some junior dev from India. The answer was wrong, and it was evident that the author hadn't even fully read the question. Out of curiosity I checked the author's profile and found out that it was a deliberate strategy: e.g. when there's a question about working with files, he'd post an answer with a link to the language's documentation page for "open".
I politely asked the author in the comments to stop posting wrong or useless answers. Instead, he wrote a very aggressive response, opened my profile, and downvoted as many of my answers as he could.
And there were many similar cases, probably still are.
Everything went downhill once people had real incentive (like actual money, not just internet cred) to game the system for reputation points. The mods had to come up with ever-evolving layers of rules on top of rules that lowered the answer quality and put up huge barriers for an average user to contribute.
Incentives really make or break this type of website. Often on the internet I find myself thinking "Thank God this [website specific metric] isn't worth anything so people have no reason to game it".
I think "Internet Karma" systems are inherently flawed. I know one technical website with karma, where one type of users eventually gained majority of the karma capital, downvoted all opposition into such deep negative karma, that they can't write anything anymore, and now there's strictly one point of view in all the articles and comments.
Which is aligned with the site owners' point of view, so they have no incentive to change anything. Though everyone notices that quality of articles and comments started to decline once the users with karma majority censored all others out.
HN implements some significant limitations to the model that make it work. Things like not being able to downvote replies, not being able to edit/delete your replies after certain criteria are met, limited upvoting until a certain amount of contribution, etc.
These are constraints that limit the types of people that are interested in remaining here. But HN is by no means "small".
I have a rather high-rep account, so I can revert mod decisions and sometimes do - e.g. reopen a marked-duplicate that isn't a duplicate on closer inspection.
I don't do it often (can't be bothered too much) but when I do, I sometimes almost feel the hate radiation over the web. Indeed, people on a rage-tantrum going through all my answers and questions and downvoting those. Lol.
I also have a similar account and you inspired me to vote to reopen maybe twenty questions in a row. All of them were perfectly good questions closed for stupid reasons.
Let's not forget that in India it is not considered wrong to "game" the system, even if it is considered unethical by Western standards. It is not a coincidence that all of these tech support scammers are based in India.
This brings up an interesting topic: how do we reconcile cultural differences on online platforms? Nobody can deny they exist, and they create clashes like the aforementioned one.
Do people not remember the "Hacktoberfest" fiasco? People were encouraged to make a PR on public repos in exchange for a free t-shirt, and _obviously_ this led to repos being overrun with spam PRs changing a single line comment.
with loads of people replying and thanking the author, excited for their t-shirt, not really knowing they're _creating unnecessary work_ for actual engineers with limited time.
Incentivized systems like this can work okay in small, self-selected communities, but as soon as it's opened up to the entire world, the implied courtesy of said system vanishes, and immediately we're pandering to the lowest common denominator.
I don't think you do. Your platform should have an explicit and expected code of conduct set up in advance, and people should be held to it. If they cannot, they aren't allowed to use the platform.
When it becomes apparent that there are cultural norms that make it difficult or impossible for people from those cultures to use your platform, you re-examine your code of conduct, of course, to make sure you're not unduly excluding people. If you are, you amend the code, you change the platform in ways necessary to support them, and everyone tends to end up better off as a result.
But some cultural norms will be antithetical to the point of the platform, like in this case: SO cannot function properly when people attempt to game the system. In those cases, the CoC remains, and people from that culture will either have to adapt to the platform or find (or make) a similar one that works for them.
The alternative, as we see here, is that your platform degrades until it's not useful to _anyone,_ this "problem culture" included.
Gaming the system is prevalent but not considered ethical in India. HR won't allow such candidates, for example. So there is no question of amending the CoC to make room.
There is a difference between something being prevalent and something being culturally accepted. The solution lies in naming and shaming. Making it clear that such people aren’t wanted on these platforms.
But I don't think anyone should chalk this up to "cultural norms". That's just being unfair to those of us who are just as fed up with this mindset.
I mean, neither do I, but I'm giving the poster above me the benefit of the doubt, and also kinda recognizing that -- even though I can't think of a concrete example right now -- there's almost certainly a situation where there IS a cultural norm that is otherwise acceptable, but antithetical to a particular platform, so the question is still valid, IMO.
This incentive was always there, from day one. It was a classic thing to put the most basic answer as possible in as quickly as possible just to get it out the door and then refine it later, because whoever posts the answer first would have theirs shown first presuming no upvotes.
This is exactly the reason I gave up on SO. It was so disappointing to see the "frist!" answer get votes, even when it was factually incorrect. By the time it took me to think and write a half decent phrase the original answer had been edited dozens of times. The site was also eager to interrupt me every time to tell me that an existing answer had been edited. They did everything possible to allow the system to be gamed and, well... they succeeded.
That matches my experience. I stopped using it in 2015 after trying to solve something with a new framework I was using and finding that all of the answers on SO were just wrong.
The exact opposite happened, contrary to this often repeated meme. Both SO and Wikipedia became what they were because of their strict moderation and RTFM-implied attitude. It's what drove professionals like me to the site. I'm just a regular user BTW, but it felt like SO people were my peers.
But SO (the company) wanted it to be more accessible, easier for newbies, "nicer", there was a huge uproar over them publicly blasting a moderator over a disagreement on a unilaterally imposed new code of conduct, and recently they even (again unilaterally) effectively reverted the ban on LLM-generated content. This has been going on for years, and moderators have less power than they ever had. Imho this whole thing started much earlier, I think it was 2017 when they tried the SO documentation project and let everyone keep their rep where I first thought they had jumped the shark.
The company has been on a bender for the last few years, and high-ish rep users like me just don't see the value in answering anymore. With recent blog posts it seems even more clear that they are on the direct path to enshittification, against their own actual users and moderators.
Recently it feels like all the actual professionals have left and what remains are "students" asking low effort questions in bad faith while an army of spambots tries to pounce on these.
While this means boosted engagement numbers short-term, it spells death of the site long-term. SO cannot survive on low quality spam, even if other sites (like Reddit) may be able to.
Agreed. If anything, the problem of SO is that it's not evolving strict measures fast enough to stay relevant. There's a constant stream of duplicate and low effort junk that gets only stronger over time, and obviously there are not enough people to moderate all of it. The original idea is great, but like any other place it needed to evolve barriers as userbase skyrockets and it mostly failed to do that so far.
> There's a constant stream of duplicate and low effort junk that gets only stronger over time,
There's a question on meta from a decade ago about what the Roomba (automatic deletion scripts) should be deleting. One of the answers had a bit of point in time information:
That makes it harder to find the original source, more difficult for google to chase, and clusters results.
> The original idea is great, but like any other place it needed to evolve barriers as userbase skyrockets and it mostly failed to do that so far.
Which upper management has been working against by encouraging a reduction of those barriers, disincentivizing people from moderating and curating, and trying to grow engagement from people asking questions.
Mistake, I think. Because more and more facts are already covered by existing answers, and more and more noobs (no judgement, just fact - we've all been there and I am still a noob in many regards) come into the industry as computing becomes more popular worldwide, it's unscalable unless the focus shifts towards getting people to find existing questions and answers instead of reducing SNR by asking and answering the same questions again and again in worse ways.
The scalability problem is indeed at the core of Stack Overflow's woes.
When it was small, it was easier to handle the questions of the day, guide new users, handle the "fun" questions (and answers) in a way that wasn't off-putting, and generally be a smaller community.
As SO grew, it lost control of the culture that had been established there before (much like Usenet (different thread)) and became a place for people to do hit and run questions - drop their question, come back later, get the answer and move on.
The majority of the users of the site had moved from "community of people sharing information - asking and answering" to "new users without any cultural attachment asking a question and not remaining."
The core group culture became more defensive of their ideals... and lots of friction between management wanting more engagement and new users just wanting people to answer their question ("if you don't like the question, just move on" being a frequent refrain).
From A Group is its own worst enemy:
> 2.) The second thing you have to accept: Members are different than users. A pattern will arise in which there is some group of users that cares more than average about the integrity and success of the group as a whole. And that becomes your core group, Art Kleiner's phrase for "the group within the group that matters most."
> The core group on Communitree was undifferentiated from the group of random users that came in. They were separate in their own minds, because they knew what they wanted to do, but they couldn't defend themselves against the other users. But in all successful online communities that I've looked at, a core group arises that cares about and gardens effectively. Gardens the environment, to keep it growing, to keep it healthy.
> Now, the software does not always allow the core group to express itself, which is why I say you have to accept this. Because if the software doesn't allow the core group to express itself, it will invent new ways of doing so.
As the software didn't provide sufficient and proper moderation and curation tooling, the core group expressed itself through more negative and ultimately toxic approaches. Snark and rudeness are the moderation tools of last resort.
A Group continues with:
> 3.) The third thing you need to accept: The core group has rights that trump individual rights in some situations. This pulls against the libertarian view that's quite common on the network, and it absolutely pulls against the one person/one vote notion. But you can see examples of how bad an idea voting is when citizenship is the same as ability to log in.
The current goals of upper management being advertising and engagement are not in alignment with the goals of the original founders (as idealistic as they were) and what remains of the core culture.
> It is by programmers, for programmers, with the ultimate intent of collectively increasing the sum total of good programming knowledge in the world. No matter what programming language you use, or what operating system you call home. Better programming is our goal.
Note that good is italicized in the above quote and is present in the original.
---
Getting people to be able to find existing questions and clean up the SNR of the content out there on SO would improve it... but that would likely make a lot of lines go down rather than up (deleting 10,000 duplicates of one question would show up).
Trying to get Stack Overflow back to a scalable model doesn't further the engagement and upper management goals directly.
Instead, they're focused on more engagement... with not unexpected responses.
The moderation culture is why I stopped engaging with stackoverflow and software.stackexchange. I noticed two problems.
The first is marking so many interesting questions and answers as off-topic or opinion-based, which vastly reduced the scope of interesting content, and discouraged daily visiting to learn instead of just using the site as a reference through google. Relying on google to feed you users is a fool's gambit. Websites need to create recurring visitors, and they do that by providing something that engages. Stackoverflow and software.SE's moderation policies made it far less engaging as they grew more heavy-handed. This left the sites open to changes in google's algorithm affecting their traffic.
The second major problem is duplicate policing and a lack of staleness policies. By not allowing the same question to be asked again, the content of the network has gradually grown more and more stale. Even my own old answers are now often wrong because they are simply outdated. Stackoverflow is filled with questions in the style "what is the right way to use technology X to do Y", and the right answer to that changes every 2 or 3 years, when technology X gets an update or when new insights into how to best do Y are formed. I tried updating a few here and there, when I saw people engaging with wrong answers, but overall the blame is on stackoverflow and its moderators for not working out a more effective mechanism to get rid of stale answers and to let new users answer old questions with answers that have a shot at rising to the top. This also means that as top-voted answers grow more and more stale there is a tipping point where google no longer sees the site as a useful resource and starts lowering it in the ranking, and this seems to be what has happened.
I wonder if it's just finally reached a tipping point of frustration with the moderation. I know I stopped going there years ago. At some point as all the good contributors leave you'd expect to get a snowball / downward spiral, even if the site coasted for years before that.
I was in the beta of stack overflow (my numerical id is <2000): it was always toxic and counterproductive, and I stopped seriously using it a year or so into its life (this may have been before it left beta, I don't remember).
It is still the best place to find a lot of answers (though their piece of the pie is rapidly shrinking), but participating in the system was never fun.
Maybe SO can have like super moderators / admin employees that are summonable by paying XXX score and they can bring the hammer down on problem answers and punish the toxic moderators etc.
I'd blame Google - at least for the initial drop. When you look at the "new visits" chart you can see that the drop happens in May, 6 months before ChatGPT.
My main issue, with the main Stack Overflow site (rather than some of the other stack sites), is that I don't get answers to my questions, possibly because no-one knows the answer. If they did, then it's probably been asked before, and since I always look first, I've found the answer to the question that was asked, and so don't even pose the question, though I might contribute to the answer I found in some way.
This is a large decline. It does not correlate well with the rise of LLMs, which have shifted my own habits away from Stack Overflow recently. Looking through the graphs, the most dramatic shift was the large drop in new visits in the first half of last year, by two thirds or more. That seems late to be a Covid effect. There were some political controversies, but it would surprise me if they had such a large effect.
IMHO people have tired of the smug dismissive assholes that dominate SO and its insipid gamification, and they have, gradually, found alternatives.
For some, it's more welcoming and knowledgeable communities in github issues+discussion.
For others it's special-purpose, interactive forums that provide more guidance and non-hostile support for users of certain platforms/tools (eg Posit Community).
For many, Copilot has been fulfilling that need. Does it always give "the correct" answer? No. Does it provide a sketch of a solution that gets you half-way there when dealing with tedious humdrum stuff that you just forgot because it's so boring? YES.
For yet others it's just plain old reddit where you can ask a question and whether it gets smacked down or not depends on how cool the community happens to be.
No, no one sane would. Just like no one sane would trust random SO authors with any authority either. You still have to verify the information somehow, but at least in the LLM case I instantly get replies 24/7, and it's been reading more papers than any of the authors on SO.
You don't necessarily need to trust them for them to be useful.
I can often evaluate a Copilot autocompletion (check that code looks right at first glance, check that it compiles, hover over method+type signatures to see their docs, run the code) in less time than it would take me to find+read a Stack Overflow answer.
ChatGPT has gotten quite a bit more accurate over the last few months. It does seem to have lost some creativity in the process, but it's much more true to the training data now.
I wonder if Google started de-ranking them in search results, or changed something similar that caused people to click through to Stack Overflow less frequently.
It is also possible there's a methodology issue. There's no source for where this data comes from.
I just googled a Microsoft tech question and first was a 3rd party provider plugin, then official MS Learn, then "people also asked...", and 4th was the SO block.
I scrolled down a lot and didn't see any scraper site links.
All, but all, my recent questions there were flagged and deleted immediately. My account there is 11 years old and has over 11k reputation - not much but not nothing. The site, for the past 2-3 years, seems moderated by nazis and bots. Dunno if my timeline matches these graphs but I noticed the decline quite early.
Imagine gatekeeping what and how you can ask questions because of some fear of redundancy. Who the fuck cares if the same question comes up each month; that means it's an important fundamental topic and can be handled differently than just shooting down the poster. The same for answers. It's a horrible website full of elitists. I never really used it often but I did roll my eyes every time I saw a shut-down thread that coincided with the question I wanted to find an answer for.
I would point directly at two culprits kind of under the same umbrella -- Google usurping StackOverflow answers in its search results (as in, displaying a Q&A format where google's 'answer' is just a portion of a StackOverflow accepted answer) and the general SEO poisoning of search results (ad networks / LLMs scraping and rehosting the content, Google failing to return relevant results etc.)
Until seeing this article, though, I hadn't thought about how much I had unconsciously changed my habits to work around the aforementioned issues -- I swapped to DuckDuckGo, started appending site:stackoverflow.com to searches, performing multiple searches around the same content (like grabbing a few keywords from a scraped article and then feeding that into a search of the official documents), and also more aggressively searching GitHub issues
With the caveat that I still use SO relatively frequently -- everything from 'I can't remember this specific syntax' to 'I have an intractable terraform/AWS/k8s issue and need to see if anyone else is experiencing it'
In the limited topics I took part in on SO, what frustrated me the most were the "users". I deliberately put the word users in quotes because I worked as a consultant and often saw guides on how to ask questions on SO (i.e. use an account with a female name and picture, deliberately misspell words/punctuation). There was this Magento company and I was a consultant for their (UK) client; one of the devs gave me a tour of how they use SO.
What was frustrating is that day in and day out, similar questions were asked - ones that are googleable.
One day I received a comment from such a "user", under a question I had answered and explained in detail. The "user" went on to discredit me, without arguments or links, and was pretty rude through all the misspellings he used to avoid some automatic triggers. I replied rudely as well, asking him to read other people's answers before discrediting them while providing no info.
The result was that I was sent to "cool off" for 2 weeks, and was given a speech by a mod as if I were a child. At that point SO changed their ToS and claimed that all content was theirs, including answers I provided.
And that's where I departed from SO, because in my experience I was a free debugging service for various agencies who went a little bit overboard with claims of their experience, and if someone was rude to me - I got no protection, but if I was rude to "users" then I got a slap on the wrist.
Working for free, being patronized, interacting with people who are too lazy to read - those were my reasons for abandoning SO.
Besides, when it started - all of the big questions about nearly every language were covered fast, and there was very little to do except write SQL for other people and get praise for it as a form of payment.
The Aviation Stack Exchange contains the only clear statement on the web about the power dissipation of jetliner engines.[1] Unfortunately, the value given is low by a factor of 2 due to a simple calculus mistake.[2] And even though my Stack Overflow account is more than a decade old, and I posted a slightly successful question on MathOverflow back in 2017, and signed up separately for Aviation, I was not able to comment on the answer or interact on the topic in any way.
It's pretty annoying: for almost a decade all JS questions have had jQuery solutions, and if you ask for one in React or, god forbid, vanilla JS, which has also gone through many API changes, it's still a duplicate.
Often it's the same for CSS, where grid and flexbox are now standard; the old questions with replies full of vendor prefixes and very hacky solutions should be deprecated or moved to an archive.
Btw, how do the blogs that copy SO questions monetize, and why doesn't Google shut them down? The pages don't even work; I thought Google would know better.
I still use SO as much as I ever did, but I rarely/never ask/answer/comment because it’s very often already answered. The answer to my question is often 10 years old. So my activity these days makes less traffic (I don’t need to visit to reply to a comment etc) but it’s still useful.
Another reason I rarely ask is of course as others have pointed out: the amount of pedantry is around 10x the amount needed to keep the quality of the site high, and often enough to make the quality worse. Often these days I ask a question, and get attacked despite it not being a dupe or off topic or low quality (I’m a top 1% account as I assume many of us are so I know my way around). Then a perfect answer comes and I can’t accept it because it’s already jumped on by mods. Somehow a useful exchange with a reasonable question and accepted answer has been completely drowned by mods. It’s infuriating.
I used to contribute a lot to Stack Overflow, but then I noticed that I’d get flamed for even asking simple questions; stuff that can really stand between actually getting somewhere with your code, and where a helpful comment could really make a difference.
At some point regular replies were replaced by administrators and mods who are seemingly far more interested in finding faults with your question rather than to actually answer it. The worst one was probably those times I’d ask a question, and then it got labelled “not a question.” Honestly, this unhelpful, bureaucratic, and down right nasty attitude has really disgusted me with the site.
Apparently new users are having even bigger problems, and they get flamed really hard for asking newbie questions. Often people are rudely asked to read the manual, when the entire cause of their problems is that the manual is so poorly written that it's impossible to make any sense of it—even if answering questions like that is the “raison d’être” of a place like Stack Overflow...
Now, with the prevalence of ChatGPT and services that will give you great answers to almost any code-related question, I honestly think that the fall of Stack Overflow is well deserved.
Same experience: a few months ago I wrote both the question and the answer (I used to do that a lot for tricky things I figured out after a lot of research).
My question got closed, with bullshit reasons. I kept asking to reopen it over a few months, just to see what would happen (because clearly it was a good question and a good answer). After 3 months it got voted to be reopened. Since then I've gotten badges for "popular question" and many upvotes there.
Pretty clearly it was closed by people who did not understand the question and probably had not even realized that I had answered it myself.
Another observation is that sometimes it feels like you get answers to your question in contempt. Someone writes an answer but at the same time assures you that your question is not good enough. Or they answer it but don't upvote the question. (If the question is not worthwhile, why answer it? Probably for rep.)
(I personally upvote questions because I had the same problem. I've never upvoted a question just because it was well written. It's more interesting to see "how many have this problem" than "is the question well researched / well written", I think. So I would still be happy, related to that, if I were you :-))
This specifically happened to me, the second time I asked a question. It may have been warranted, sure, but I put a lot of effort into that question and I got shut down hard. I never logged in again.
My first question turned into a well thumbed tumbleweed.
To claim that "all questions are already answered" is super arrogant. Then they could at least do the courtesy of linking to the already answered question. Granted, they do this sometimes, but not always.
Anyway, it's simply not true that "all questions are answered." Often when I've read the answer to a purportedly identical question, its nuances are nowhere near those of the original one, leaving your question never fully answered, which is extremely frustrating.
Often what you're looking for is simply a new way to explain an old concept, so that you better learn the inner workings of the problem. People are different, and so different people need different explanations. This is basic pedagogy. And that's also why RTFM does not work for all people. To think so is so arrogant that I'm at a loss for words.
For me, a few months ago, when I had a question, I used to google/bing it and try to find my solution between 2-3 different SO answers (even the questions themselves were often helpful).
Now, AI has kinda replaced that for me: it's Copilot first, then ChatGPT if Copilot is not enough, then google/bing if ChatGPT is not enough, and maybe finally SO.
A conservative figure from my side would be 15-25% less Stack Overflow on a regular day.
Not sure about the statistics, but an anecdote from me: I had a specific programming issue at work I needed to solve, and I tried googling for answers on Stack Overflow. There were some answers, but ultimately none of what I found could actually solve my issue. Then I asked ChatGPT to suggest a solution, and in about 5 seconds it gave me an answer. Not perfect, but with a little editing it was solved.
Now I'm sure that ChatGPT probably scraped StackOverflow, perhaps using the very same answers I got from StackOverflow, but combined with other answers from other sources, resulting in the instant answer. That does not necessarily mean AI will replace StackOverflow; it just means that people won't ask redundant questions on StackOverflow anymore, just questions that AI can't solve.
SO is weird. Or the people on it are weird. I once provided an answer but without code. Someone posted the same answer with code (effectively my answer); he got upvoted and since then I get downvoted. I get most of my reputation from a really, really old answer about what the right MIME type should be for JSON[0]. When I look at the question now there are more than a dozen answers added...and loads of edits by moderators.
I used to follow topics to try to answer questions. It's a great way to learn a topic in depth. But I find that the bulk of questions about {topic} have already been asked, and the stream of questions are like "How do I get {topic} to work with {other niche tool}?". The Venn diagram of people who know both topics is way smaller. If you want an answer, narrowing the question to a single topic really helps.
But I have asked a few questions, and the quality of answers has really declined, mainly because people are rushing to answer and not reading the question. They could address this by delaying voting on answers.
As an analysis, I think a lot of things could be improved.
I'm sceptical that having to choose between seeing upvotes or downvotes helps the reader draw conclusions. It would be nice to see both on the same chart.
It is hard to see how much voting and posting follow the overall traffic trend.
But, posting and accepted voting had a huge uptick in May 2022, seemingly without any change in traffic. That is interesting. What happened?
In the overall traffic chart, it would be interesting to mark some landmark events: the Covid shutdown in the US (assuming traffic is mostly US), the release of ChatGPT, etc.
I'm not sure what the histograms in the tables are supposed to tell us.
Comment: "You shouldn't do A, it's better to do B"
Closed, duplicate of "How to do C".
Yeah sure, in the beginning there were many more basic questions (how to increase a number by 1, how to get the division remainder, how to check if a file exists,...), and if you're coding in Perl, you can still find all the answers... if you're working in Python, you'll find answers for Python 2, some for Python 3, some specific to Python 2.6, etc... and if you ask again, it'll be closed as a duplicate.
I know it's anecdotal, but after a few bad experiences, people just decide not to use that specific site anymore.
My experience was that there was a long period where your complaint was quite true.
I believe that they have heard this complaint enough and have lightened up a little. There are still more and stricter rules than most sites, but the obsession with pruning duplicates seems to have cooled.
And my opinion as someone with high reputation is that letting in silly questions has ruined the site and made me stop using it. SO is not a support forum, it's not your teacher, it's a place for interesting knowledge sharing and if the knowledge is drowned in noise, well it's just noise then.
Stupid people don't realize they are stupid. When they can't understand a question they conclude there's something wrong with the question, not with them. And since they believe the question is silly, they happily close-vote the question.
This way they close questions which require specialized knowledge, experience, or deep understanding of the subject. Precisely the kind of questions I consider interesting to answer.
Sometimes I managed to undelete such questions, but SO made it easy to delete and hard to undelete.
The first question for example is what would be a low-quality question in SO, and could easily have been a homework assignment. It doesn't show any effort from your side, you just pose the question and expect other people to answer it fully, without explaining what you attempted, why it worked/didn't work, where you got stuck, etc.
> It doesn't show any effort from your side, you just pose the question
You probably assumed I asked that question? If so, the assumption was incorrect, I answered it.
> without explaining what you attempted, why it worked/didn't work, where you got stuck
I don't think any of that is possible to do.
The question, and my answer, are too simple to decompose into smaller parts. Not enough runway to start and get stuck.
> expect other people to answer it fully
My expectation was rather different. I wrote my answer on Jul 16, 2019, and expected it to stay there.
Instead, on the next day some people decided the question was bad and closed it. Then the next month, some other people deleted it, along with my answer.
To close it, 5 people clicked once/each. To undelete it, I spent quite a few hours. Sadly, that's not a rare exception: moderation on SO is horrible.
Ah sorry, if you answered it it's totally different, sorry for assuming the question was yours! Some of my best answers are also on "simple questions".
The question is asking about the best way of doing X, which has a clear substep of "any way of doing X". Not only that, it's asking about a code solution, not just maths/combinatorics. Not a single line of code in a question about finding the optimal (not just any) solution for doing some algebra in code is, IMHO, reason enough to close it.
Edit: for example, I'm not asking and so have no stakes in the question, but I can already try to think about brute-forcing it, which is already more effort than the author shows at attempting a solution.
To be fair, English is not my native language. But when I see a question “what’s the best way of doing X”, when it doesn’t have any criteria for what the best would be, and no other ways of doing X in the question, I consider the “what’s the best way” part a redundant figure of speech. I view such questions as equivalent to “what’s a good enough way of doing X”.
> it's asking about a code solution, not just maths/combinatorics
Please read this: https://stackoverflow.com/help/on-topic According to that article, questions about math which don’t imply a code solution are offtopic on stackoverflow.com. They should be closed, and possibly moved to other stackexchange sites. According to that article, the OP’s question is good. The question was about a specific programming problem, and is a practical, answerable problem unique to software development.
Again, they are asking how to solve a math problem, in code. That's a problem with two big steps, and no attempt to solve it on their own. Big problems:
- Does not show any attempt or willingness to try to solve the problem first on their own.
- Does not even give any indication of where the problem comes from, why it might be interesting, etc., it's just a "how to calculate X?", which could easily be a homework problem.
- It is about finding a (possibly) mathematical solution, and then implementing it in C++. Two very big and different problems, asking the audience to do them both. Again, no attempt to solve either of these on their own.
- I'll concede the optimal thing might be a language issue.
> I’m not sure that’s actually possible to do.
But that's not my point; my point is that I already showed more willingness to try to solve this problem than OP. And THAT is a big problem. It's not on topic on any of those points; in fact, if you remove the bit where OP is asking us to give them the full solution in C++, it could be a good question for the Mathematics SE!
> attempt or willingness to try to solve the problem first on their own… indication of where the problem comes from, why it might be interesting
None of that is required to ask questions on stackoverflow. For details, read “How do I ask a good question?” and “What types of questions should I avoid asking?” help articles. You’re inventing arbitrary restrictions.
Another thing is, “why it might be interesting” is subjective. Personally, I found the question interesting, that’s why I have answered it. You probably think otherwise, but note it only takes 3-5 votes to kill the question. Any question at all is guaranteed to have at least 3-5 people on that site who find it uninteresting, opinion-based, need more focus, duplicate, etc.
> Two very big and different problems, asking the audience to do them both.
Two big problems don’t have solutions which can both be explained in 3 short sentences. As you can see from my answer, the problem formulated in that question has such a solution.
> I already showed more willingness to try to solve this problem than OP
You have not. However, you have demonstrated willingness to delete interesting questions based on arbitrary and subjective criteria, despite the question being perfectly in line with the stackoverflow guidelines. Which BTW is very on-topic, because I think that’s the main reason for the fall of SO being discussed here.
> SO is not a support forum, it's not your teacher
This, to me, is the mistake that SO made. Developers helping developers is the engine that runs the site, that's why people come.
The body of interesting knowledge is an emergent property of the support forum/peer-to-peer teacher.
Eventually they tried to put the cart before the horse and traffic has dropped.
It's OK that you're done answering "how do I change font color with jQuery" for the thousandth time and are only interested in the occasional very interesting question, because there are people behind you who do want to answer that question. That will help them grow to get where you are.
If we don't allow new generations of users to go through that process we went through, then StackOverflow has an expiration date.
Perl is a good example because it's stagnant. Googling Python or Java? If the result is up to date, it's a low quality content farm. If it isn't from a content farm, it's a page from 2008 that is no longer relevant at all!
Echoes my experience perfectly. It's actually impossible to ask a question and get an answer, and also impossible to ask a clarifying question on an existing question/answer.
That makes it much, much less useful than it otherwise could be.
I found the people who took on Rust questions to be actively hostile on SO when I started learning Rust. The Reddit community was much better, so I’ve tended to hang out there for Rust Q&A. Most of my time in the Stack Overflow world these days is spent in the more specialized StackExchange sites. tex.se tends to be pretty good, as does latin.se. japanese.se seems dedicated to stomping out anything that is remotely a translation question.
Reminds me of being on IRC in '97 asking questions about assembly. When someone did finally respond it was "Did you read the manual yet?" It's ironic that SO became what it was replacing.
But did you even _try_ to read the manual? Did you lift a finger to figure out what the issue is? Some of the best answers are ones that directly quote the manual and then add clarity since manual language can be terse.
Yes I showed him the exact page that talked about my issue and he never responded. Still a silly response. He could have answered it in 30 seconds and saved me 10 hours.
I think ChatGPT is amazing for this, because you will rarely have to deal with anyone who has this type of attitude when you need help in the future.
Honestly still much better than asking a forum. The hostility from forum regulars always seems more intense--less masked as a "terse down to brass tacks" attitude and more-so blatant laziness, misplaced frustration, etc. Also, forum posters tend to waste much more time as every other post is likely to be a joke or a tangent idea, as if it were casual dialogue in a chatroom.
The same mechanisms that make SO kind of brutal have also helped revolutionize asynchronous online Q&A.
I have asked a lot of questions, got good and useful answers on some. I have also answered a lot. Some of my answers and questions have been edited afterwards; most of the edits were good.
My experience was not perfect. Some of my questions have been marked as duplicates when in fact they were not duplicates, but only superficially similar to other questions. Some of the edits people made to my answers were incorrect.
But overall, my experience with the site is still mostly positive. It saddens me to see that others are not enjoying the site as I did.
Stack Overflow was the
"what is wrong with the api?"
"What api to use?"
"How to improve api usage/syntax?"
place.
"Whats is wrong with the api?": moved to github, because its the source.
"What api to use?": stackoverflow is still the place to go (or chatgpt).
"How to improve api usage/syntax?" stackoverflow / chatpgt.
GitHub could really dominate the web with its own LLM, and Stack Overflow could regain some share with an integrated LLM.
PS: I don't like Github Copilot because of UX and code upload.
Sadly, I'm sure ChatGPT wouldn't be as good without SO but now SO is far outshone. The problem is that the community got mostly Eternal Septembered. So for things like OptaPlanner it's great because the guy uses it as his Q&A site (great idea) and he's obviously very knowledgeable and skilled.
But for everything else, it's mostly filled with folks who are not smart enough to comprehend the question and therefore often answer some other thing which they've pattern matched.
Interestingly, just a few weeks before ChatGPT launched, ZDNet published an interview [0] with StackOverflow’s CEO to discuss how the site became the world's most popular programming site. Unfortunately, that doesn’t seem to be the case anymore.
The absolute "best" answers are snarky that this problem is not applicable because they would be Facebook in scale or something if they had this problem. The person answering like that can do this because they've got lots of "Karma" and get tons of upvotes because other people want to feel superior to the one asking the questions. And the question or answer doesn't get removed, even when reported.
Stopped being active then. That was a few years ago.
The problem for me is not that SO became difficult as all popular/large systems inevitably become, it's the seeming lack of action to solve some of the issues involved.
The reputation thing is still nonsense. I can ask a noob question and end up with 10K reputation because of the likes it gets or I can answer a complicated question that takes all 25 years of experience to articulate well and get nothing. People can offer a bounty but it should also be possible to "reward" people who obviously know what they are doing and not reward people for asking questions.
The problem with duplicate questions is big, but again, the search isn't great at finding them, especially if you don't quite know how to ask the question.
They don't do much about people who just appear with 1 reputation asking something like "What does Null Reference Exception mean?". I ended up being allowed to moderate questions but the moderation queues get stuck constantly so I couldn't even help with that - after all, the crowd-sourcing is a great way to solve the volume of moderation needed and some of us are happy to help but as soon as you start seeing "Moderation queue full" after only moderating 2 posts, then you give up.
for FIVE YEARS, because of a rollback war on some of my own answers. some idiot was gaming the system and adding minor punctuation here and there, just to get review points. so yeah, good riddance.
I have since deleted my account on SO, which was actually a somewhat involved process. They did not however allow me to remove the answers that I provided.
thanks for the valuable comment. I realize now that my 1,800 answers mean nothing, and that the only important thing is my comment etiquette. I will try to do better, random stranger.
Could this be due not to SO’s demise, but to the fact that the tech has become more stable and boring, to the point that there are fewer new questions and answers?
I wonder if this is the canary in the coal mine for the internet as a whole.
Will the internet grow more and more quiet as folks chat with local AI bots?
But what will ChatGPT 10 be trained on, if no one is posting blog posts, answering questions on stack overflow, or updating Wikipedia? OpenAI will have to change the logo to a snake swallowing its own tail.
We're already losing a ton of indexable, searchable content from knowledgeable people due to the content being created and shared via private chatrooms, and now that chat bots can collect a handful of results and put them in a cute spoken-language format, I worry that this "information highway" is about to get seriously gate-kept and dwindle into a pay-to-play public library.
The culture of programming has been changed by much of these "quick answer" tools.
The first generation of SO people were folks who cut their teeth pre- (or early) internet, when it required quite a bit of effort to learn a technology. Mistakes without googlable solutions, patchy documentation, and bad or nonexistent internet forced people to work hard to figure things out, and it's in that effort where, in my experience, learning happens.
These people contributed to the early SO and genuinely enriched it. Made it a source of high quality topical information. Over the years though, it's become a source of cut/paste code and then perhaps cut/pasted code (driven by the gamification). Few people go there anymore to contribute well written deeply thought out answers. It's mostly fly bys I imagine.
I used to train freshers and over the years, I can distinctly see the decrease in quality and the increasing tendency to have SO, ChatGPT, whatever just "solve this problem for me" rather than "I want to learn this and get better at my job".
I think SO, for the most part, was a genuinely well-intentioned effort with really good outcomes.
Spolsky's earlier talks describing his philosophies about a good QA site, I thought, were insightful. To their credit, they did organically reach the top of Google rankings, and for a long time the quality of answers and insights on the site was really good. I learned a lot from reading answers to various questions by Alex Martelli and Raymond Hettinger (of Python fame).
I'm sure they made some mistakes along the way and those contributed to the current downfall. My larger point, however, is that this and other tools and sites which make things "easy" have contributed to a decline in the quality of (especially new) programmers and that has indirectly taken a toll on the site.
To extend your chef analogy, the rant would be "Here I am trying to create interesting dishes with modern ingredients, but people these days haven't tasted real food and can't digest anything other than a quick burger, canned soda and frozen fries."
I was promoted to one of those "mod" type people who reviews new answers, first posts, etc, a few years back.
I can tell you that I quit in the last 6 months. All new answers are from first-time posters answering questions that have good solid answers from 8+ years ago (mostly 10 years), and it is pretty obvious, they are just doing it to get points. The question/answer is not something that has changed in the last 8+ years; they are just doing it to get reputation or whatever.
Most good solid questions...I just do not see too many today. I am not saying this is the reason for the decline, but there is a deluge of posts answering very old questions with a slight modification of an old answer. I quit putting effort into it, as it was just non-productive.
I recall that about ten years ago there were reports that edits to Wikipedia were slowing precipitously. People wondered if this meant the imminent death of Wikipedia. With the benefit of hindsight, it seems more plausible that Wikipedia had merely shifted from its adolescent phase to its mature phase, and that all the low-hanging fruit had simply been harvested. If we look at the stats, Wikipedia's edit velocity maxed out in early 2007 (37 days per 10M edits), then fell to half that in late 2014 (73 days per 10M edits), and has actually increased a fair bit since then: https://en.wikipedia.org/wiki/Wikipedia:Time_Between_Edits
It's entirely possible that the same is happening to SO.
While I understand it's a lot of work, and some (most?) might be posting low quality answers, my main problem with Stack Overflow has been exactly that: over time, the top answer becomes stale.
The question is still relevant, but the best practice or library, platform, whatever, has changed... That is a problem the Stack Overflow model has a hard time adapting to, and it just gets worse over time.
From my purview, and again I am a tiny data point of one, the new answers to old questions are really:
Original accepted upvoted to +20 answer:
"Take Y and drive it into X, grep output for Z and place in AZ for processing with the -kombucha switch"
Answer posted yesterday by "NewUserXX123", a first post:
"Take A and drive it into B, grep output for C and place it in DD for processing with the -kombucha switch"
It gets pretty tiring responding with a custom comment and downvoting, only to have "NewUserXX123" then cuss you out in a DM about how you are a total @!#$%()&!@$%.
That's going to depend on your domain, because most frontend questions I come across end in "there is a jQuery plugin for that", which doesn't help modern sites without jQuery.
Why was there such a heavy handed approach to moderation in the first place though? Why not let the community ask the same question over and over again, and let the answers naturally shift with the time?
I'm just not sure what value there was in having a team of people volunteer their time to make sure there were absolutely zero duplicated questions/answers. What problem is that solving? If anything, it adds problems by not letting the content naturally evolve over time.
If people want to assess the quality of answers, it's better to have multiple data points, and SO could have invested those resources in algorithmically linking similar questions and making it easy to navigate by both answer reputation and time.
Along similar lines, it seems best to tackle the people trying to game the system algorithmically as well - if content is a word-for-word duplicate, that's a problem which can be solved by computers instead of people (detecting similar text is a solved problem; see the sketch below).
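For what it's worth, here's a toy sketch of the kind of near-duplicate check that comment means - nothing SO actually runs, just word shingles plus Jaccard similarity, with the shingle size and the cutoff threshold picked arbitrarily for illustration:

    package dupcheck

    import "strings"

    // shingles returns the set of k-word shingles in a text, lowercased,
    // so trivial rewording still produces overlapping shingles.
    func shingles(text string, k int) map[string]bool {
        words := strings.Fields(strings.ToLower(text))
        set := make(map[string]bool)
        for i := 0; i+k <= len(words); i++ {
            set[strings.Join(words[i:i+k], " ")] = true
        }
        return set
    }

    // jaccard computes intersection-over-union for two shingle sets;
    // 1.0 means word-for-word identical, values near 0 mean unrelated text.
    func jaccard(a, b map[string]bool) float64 {
        if len(a) == 0 && len(b) == 0 {
            return 1
        }
        inter := 0
        for s := range a {
            if b[s] {
                inter++
            }
        }
        return float64(inter) / float64(len(a)+len(b)-inter)
    }

    // NearDuplicate flags two posts whose 2-word shingle overlap exceeds
    // an arbitrary threshold; a real system would tune this and add hashing.
    func NearDuplicate(post1, post2 string) bool {
        return jaccard(shingles(post1, 2), shingles(post2, 2)) > 0.8
    }

Obviously at SO's scale you'd reach for minhashing or embeddings instead, but the point stands: flagging copy-paste duplicates is cheap for a machine and expensive for a volunteer.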
It seems the only use case for moderators on SO is for removing truly inappropriate content - it's wild to me that moderators were spending significant amounts of time actually removing technical questions and answers.
It reminds me a bit of reddit moderation, where reddit communities enforce these non-sensical rules and by extension require hugely heavy handed moderation to 'curate' their communities. Like, the headphones subreddit disallows pictures of headphones in boxes. Why? If the community is interested in headphones, what's the difference if it's a box or not? It's not like if you take it out of the box the headphones look different than any picture you can find on the internet.
Seems like many of the problems of moderation are artificial rules endlessly being enforced by real people which ends up just being pseudo 'make busy' work.
When it comes to software, it’s even less sensical to have one archival correct answer, given that software evolves over time, and what was an acceptable or necessary workaround 5-10 years ago could be a complete anti pattern today.
It would make more sense to establish lineage. Lock older questions from more answers after some point in time, and instead of closing the new one and linking to the original, do it the other way around. Keep new questions open and link to past variations. Then, in the old threads, indicate that newer guidance may exist and link forward.
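A rough sketch of what that lineage could look like as data - purely hypothetical, not anything SO exposes; the type and field names are made up:

    package lineage

    import "time"

    // Question models one variant of a recurring question. Old variants
    // get locked but stay linked, so readers can walk forward to the
    // newest phrasing instead of being bounced to a 2012 "duplicate".
    type Question struct {
        ID           int
        AskedAt      time.Time
        Locked       bool  // true once a newer variant supersedes it
        Supersedes   []int // IDs of older variants this one replaces
        SupersededBy int   // 0 if this is the current variant
    }

    // Newest follows SupersededBy links from any variant to the most
    // recent open one, which is what search results should surface.
    // (Assumes the supersession links form a simple chain, no cycles.)
    func Newest(byID map[int]Question, id int) int {
        for {
            q, ok := byID[id]
            if !ok || q.SupersededBy == 0 {
                return id
            }
            id = q.SupersededBy
        }
    }

The old threads would then only need a banner pointing at the newest variant, instead of the current one-way "closed as duplicate of" arrow that always points backwards in time.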
They talked about it in the Stack Exchange podcast years ago.
IIRC, the vision of the perfect question was one with a canonical answer. They didn’t want to be Quora. I think that concept made and makes a lot of sense, but doesn’t capture the “meta” issues surrounding how you litigate the form of the question, especially as questions get more nuanced.
> and it is pretty obvious, they are just doing it to get points
Gamifying reputation was one of the worst things we've done. I was part of this in previous positions and after a while it became apparent that it brought out the worst in people. There are better ways of incentivizing usage rather than stupid internet points.
Agreed, and I don't know this for sure, but the points lust, it seems to me, is not limited to questioners and answerers; some of the behaviour I've seen from mods seems to be driven by that too.
Just to repeat, I don't know that because I don't know what gets you points but the behaviour is hard to understand otherwise.
I had a question where I had finished with the one-word sentence "Thanks." A mod had removed that part of the question; when I looked at their record, they had an awful lot of edits which consisted of similar, tiny, non-substantive changes. It did make me wonder about their motivation.
> "I had a question where I had finished with a one word sentence "Thanks." . A mod had removed that part of the question, when I looked at their record they had an awful lot of edits which consisted of similar, tiny, non-substantive changes. It did make me wonder about their motivation."
"Do not use signature, taglines, greetings, thanks, or other chitchat.
"Every post you make is already “signed” with your standard user card, which links directly back to your user page. Your user page belongs to you, so fill it with information about your interests, links to stuff you’ve worked on, or whatever else you like!"
"Thanks and other statements of appreciation are unnecessary, and, like other chitchat, should not be included."
"If you use signatures, taglines, greetings, thanks, or other chitchat, they will be removed to reduce noise in the questions and answers."
----
Removing it doesn't get the editor any points; they are spending their time cleaning up your question to site standards for no reward at all.
What would you think if you clicked on a Wikipedia article about Mozart and instead of the current introduction which says:
> Wolfgang Amadeus Mozart[a][b] (27 January 1756 – 5 December 1791) was a prolific and influential composer of the Classical period. Despite his short life, his rapid pace of composition resulted in more than 800 works of virtually every genre of his time.
it opened with:
> Hello everyone, I studied classical music in college but have been out of the scene for a while and now I'm getting back in, I was wondering about the history of Mozart and I remember that his middle name was Allen or Almond or something? Can anyone help? Thanks for any information. || Hi, I learned about him in middle school and we all joked that his middle name was Armadillo but I think that's wrong, haha. || Greets all, it's in Olivier Hallengrunsch's Classical Composers as 'Amadeus', and that's well regarded. It also says he lived (27 January 1756 – 5 December 1791). I think we can all agree he was prolific, right? Thanks and regards, Jason [xxKiller; AMD Ryzen2 32GB RAM 2TB Western Digital SSD; BMW 330 2l aircooled] || Why's nobody mentioning how short that life was, smh || Good evening all and sundry, m'lady (tips hat), forsooth would anyone speak to how many works he is believed to have composed, all considered? Methinks such knowledge would be a most hearty addition to this esteemed gentleman's biography - Martin, [Fort Lauderdale TX Ren. Faire organiser 1997-1997] || etc.
StackOverflow isn't a forum, it's a collaborative reference work. Meta-chat would be edited out of a Wikipedia page and goes on a separate 'talk' page (equivalent: the meta stackexchanges or the stackexchange chat). What if you then went to the Wikipedia talk page and said "Is overzealous moderation of questions ruining Wikipedia? I want to be able to edit questions and greetings into pages but there are hordes of awful literal-minded jobsworths cruising the site just looking for the slightest reason to edit a spelling or grammar mistake or revert my changes. It just leaves me feeling unwelcome"?
Why would you expect to feel welcome when you're spoiling what others are trying to build up and insulting them for caring??
Stackoverflow has https://meta.stackoverflow.com/ and https://chat.stackoverflow.com/ (you see it when the comments on a question or answer go on too long, there's an automatic "comments are not for extended discussion, take it to chat" reply, or you can invite people to a chatroom about a question).
I wonder if this might be partly because they have not made an effort to make their software more intelligent about surfacing answers to questions, as well as evaluating answers. I would think that, at some point, it isn't necessarily the fault of the question authors or those responding.
One of the things I say with some frequency is that all failures are engineering problems. Blaming the users/customers, in the end, fails to take advantage of an opportunity to improve the product, learn from mistakes and improve process.
I hear your point, and thank you for your service as one. That said, how much do we think this is attributable to the existence of coding AIs like Copilot or ChatGPT? I hardly use Stack Overflow anymore thanks to OpenAI.
> answering questions that have good solid answers from 8+ years ago (mostly 10 years),
I hit upon those useless answers all the time. I have no need to know about solutions that might have worked a decade ago and are obsolete. Stack Overflow is useless now.
I couldn't figure out how to gain any points other than answering my own question, but I needed points to contribute in any way. It's like SO is set up to be difficult to use. I can't imagine why anyone keeps using it.
Well, there are a few issues. Some may be sour grapes on my part, but some were inevitable.
SO has been a great site for me. Not too long ago, it was always my first stop, when looking for correct solutions to difficult problems.
I'm a highly experienced engineer, and a first-class debugger. I always get my bug ... eventually.
What SO would give me, was a correct solution, very quickly. I would ask a question, and have two or three excellent answers, in a matter of minutes.
I could definitely have found the answer, myself, but I would have had to do stuff like set up playgrounds, or even full applications.
Nowadays, they require question askers to do exactly that. In The Days of Yore, I could ask a question, without having to have gone through an hour of debugging and prototyping, and have a great answer.
But, at its heart, it's another wiki documentation site, and wikis don't age well. The WordPress Codex is damn near worthless, and even the great PHP docs are showing their age.
I use stack overflow every other day and it's fine? It's imperfect advice but there are some gems on there and it's pretty easy for me anyways to spot when something is dated, not right, or whatever. Maybe it's less useful to me now because I understand the domains I'm in better?
If you select window size "7 days" and filter the Traffic section for "new visits", you see a massive and permanent two-thirds drop between end of April and mid May 2022. Apparently some change in the Google algorithm pushed down Stack Overflow heavily.
One of two things is happening here. I'm not sure if it's one or the other, or really both.
1. Copilot and LLMs trained on code are reducing usage of StackOverflow.
2. People are just coding less and searching less, with layoffs and a slowdown due to burnout.
But it's definitely connected to these points. The relative magnitude of each, though, is difficult to assess.
If it is mostly #1, though, that's the worst thing, isn't it? Because SO is weakened and then the LLMs gradually become less effective. We generate less curiosity and public discussion over time as people use AI and LLMs more. That's sad for software engineering as a whole. One of the best parts of this job is collaboration and problem solving - not just individually but as a community as well.
People have complained about bad moderation on StackOverflow since forever, so I don’t think it’s that.
What I see recently is that the answers are starting to be out of date, and Google sometimes bizarrely returns these weird copies of StackOverflow instead of the main site. Could it be that?
I've noticed I haven't used stackoverflow that much, I've always been a lurker, but the answers have gotten less and less highlighted. Now instead I am nearly always in the github issues of the project I am working with.
Also ChatGPT has really been a big help, as for most basic questions I can just ask it and get a fair result. This is more tricky to do when searching on google.
I.e.
chatgpt: How do I sort a slice of structs in golang with an inner field of "Name"
Usually somewhat correct the first time (see the sketch below for roughly the answer I mean).
google: golang sort slice struct (I hope to find a good answer, but I usually have to go through 1-3 pages)
This is more documentation based, but this is also what stackoverflow did in the past quite well, that is answers to common questions for a language.
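For comparison, the kind of answer that first query is fishing for - a minimal sketch using the standard library's sort.Slice; the Item struct is made up for illustration:

    package main

    import (
        "fmt"
        "sort"
    )

    // Item stands in for whatever struct the question is about;
    // only the Name field matters for the sort.
    type Item struct {
        Name string
        ID   int
    }

    func main() {
        items := []Item{{"zeta", 3}, {"alpha", 1}, {"mu", 2}}

        // sort.Slice sorts in place using the provided less function.
        sort.Slice(items, func(i, j int) bool {
            return items[i].Name < items[j].Name
        })

        fmt.Println(items) // [{alpha 1} {mu 2} {zeta 3}]
    }

Whether ChatGPT produces exactly this on the first try is hit or miss, but it's usually close enough to fix faster than digging the same snippet out of a ten-year-old SO answer.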
I guess I’ve always interacted with SO in a relatively passive way.
There’s just too much toxicity that’s plausibly deniable.
And with it, SO has lost much of its usefulness.
Every time I try to look for an answer against knowing better, I still get disappointed.
Either the question isn’t there or someone asked it but it was marked as duplicate with a link to a decade old (and for some reason never more recent) question that is vaguely of a similar topic if you squint hard enough.
It’s silly anyways because some of the frameworks I work with are extremely volatile in the sense that they fundamentally change from year to year, so the chances that an older question is stale are all but guaranteed.
I don’t even bother asking the question myself because I see the copious amounts of pedantry while I browse.
Never mind discussions about “the right way” of doing things, I’m talking about entire comment threads about the right terminology of the “well akschually” variety.
Think arguments about whether something should be called a method or a function or whether MVVM is possible with framework X because the author had intended X, Y or Z, when none of it is even remotely relevant nor helpful to the question at hand.
Other forms of pedantry are entirely rewriting the OP’s question because a variable name or function name was disliked, even though the original question as asked was not confusing in the slightest.
The single time I asked a question, it was a genuinely challenging and relatively low-level matter, or as close to it as you get in my field.
Crickets, aside from 1 extremely low quality answer that didn’t actually answer anything, that is.
I tried to focus on answering people’s questions instead, especially on a few relatively new frameworks that come with new conventions as those generate a lot of questions and I managed to become really proficient in them.
I figured it would be a good way to give back to the developer community as a whole since it was that very community that enabled me to become proficient in it in the first place.
But stopped doing that too because it’s useless. Either good questions get closed, mods fuck around editing things around or the high level clique just upvotes each other’s low effort drivel.
SO is all but completely useless to me, it has become my last resort and only if I’m really stuck, and every time I dread it like I dread pulling teeth because every time it proves to be an exercise in wasting my time.
I'm in the top 4% this year based on a Q&A I wrote in 2021, which gathers a few votes a month. The ranking is an indicator of how pointless SO has become if I'm globally in the top 4% this year having posted nothing.
For me, there's no incentive to answer questions. It takes too long to get an answer, and the reply is usually some snide comment because too many assume it's an XY problem. It's generally worse than asking ChatGPT even when the AI gets it horribly wrong.
As for answering questions, I'd rather do that on my blog. Posting to SO is a time investment with zero return.
See "new visits" in the first chart. You can reduce the window for extra clarity. Something very specific must have happened in that week, nothing that can be explained with a "grand theory of decay".
I was surprised to see big humps over early springtimes. As if springtime causes people to feel more willing to open up and communicate, similarly to how we associate springtime with heightened romantic drive.
Another cool thing I've discovered by reading this right after the Usenet thread[0] is that it is typical for large-scale social networks to stop being humanity's darlings after ~15 years of age. Which is also surprisingly similar to how people are supposed to become completely self-sufficient around 16, as if there's no reasonable expectation of substantial external care for them.
Usenet: born in 1979 [u1], stagnated in 1993, 14 years later.
XMPP: born in 1999 [x1], stagnated in 2013 [x2], 14 years later.
Stack Overflow: born in 2008 [s1], stagnation reported today here, 15 years later.
This is interesting. So, train LLMs on "everything", including SO, and then replace "everything", including SO. I think we'll see companies such as google, bing, etc... paying to scrape different websites.
But I stopped contributing seriously maybe 2 years ago. I just got so sick of SO behaving like complete jerks again and again, and lost any desire to keep enriching their shareholders.
This is 100% because of ChatGPT but the problem is that ChatGPT got all its information from StackOverflow... That means fewer people contributing to SO which means ChatGPT will deteriorate and people won't know why.
GPT-written code gets committed to GitHub (after human review, so it's working code only). GitHub data gets shared with OpenAI (Microsoft). So GPT continues to improve.
The recent deterioration is due to reduction in parameter count for GPT-4 to save money.
Stack Overflow ought to be Google's dream site since it deals directly with questions and answers in reasonably natural language.
But I find the moderation a little heavy-handed, and there's an unhealthy number of people using the platform to be a 'smart ar5e' while offering solutions based on their idea of best practice rather than directly answering the question.
Classic case is anything to do with a strict vanilla JavaScript requirement being solved by something like jQuery.
Or a PHP requirement being solved by yet another layer of abstraction.
I like SO but asking or answering questions there can be highly annoying. There's always some very pedantic gate-keeper who might be technically proficient but lacks any common social skills and nails you down for any slight errors. It's good people care but seriously, there's a difference between constructive criticism and just being an a-hole.
It's the hamburger feedback but without any buns around it. And I just wanted to contribute, on my own time often, to help somebody else with the same problem...
ChatGPT. Both because people use it instead and because some people try to game SO with answers generated by it, which has probably made moderating the site harder.
SEO is another factor. At least I don't get as many SO results as I used to. That may be for many reasons, but once a site's popularity starts to fall, Google Search might speed up the decline.
EDIT: I see now that the drop started long before ChatGPT became relevant. But it won't help SO I think. Github copilot and Google featured snippets may also be factors.
My half-baked opinion is that a majority of the site was effectively a JavaScript help board. With the advent of more advanced (and silly) web frontends, the ability to copy-paste a snippet is less viable for JavaScript developers, so a lot of the value of being a front-end snippet repository is lost: you now need a primer on the flavor of JavaScript, the build tools, and the package management, and it takes 200 LOC across three files to get your React bloat to work.
I don't understand why they killed their job board. We used to pay them a few hundred a month for a couple of job ads that we were quite happy with. But then they "decided to focus on enabling our community of developers and technologists to discover and learn about companies rather than just giving them a way to apply to an open role." Which meant they became a platform for recruiters who can spend $10K+ monthly. That killed Stack Overflow for me.
SO has been around so long a lot of the answers are no longer relevant. I don't place much trust that an answer from 2010 will solve my problem. You often see [2023 UPDATE], but that answer is usually buried or hidden within other answers.
I firmly believe that ChatGPT is the SO killer. You get your answer instantly, it's usually good enough, and you don't have to worry about a mod closing your question as low quality or a duplicate.
> SO has been around so long a lot of the answers are no longer relevant. I don't place much trust that an answer from 2010 will solve my problem.
I had this problem before too, and I just made my searches better by appending `after:<insert year here>` to my Google query to ensure I get the latest info.
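(For anyone unfamiliar: Google supports `before:` and `after:` date operators for restricting results to newer pages. A hypothetical example, with the year and keywords chosen purely for illustration, would be `site:stackoverflow.com after:2021 rust borrow checker error`.)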
Proactively asking on SO is a no-go for solving any problem. Better to dig for someone who has already answered your question, or to open an issue on the GitHub repo that hosts the library you use. SO used to be a place where people could ask anything they wanted, but now only difficult questions get answered; everything else is downvoted, marked as a duplicate, or answered by someone toxic. Definitely not a good beginner experience.
For me, I stopped contributing just before the downfall, when I was in the top <0.1%. Basically, the mods wrecked the whole thing for me: they obliterated a lot of really good answers from multiple people, and there is no way to challenge that at all. Not to mention the mods got into big, silly dramas with each other. So I just stopped. Now it feels like there is a lot of stale content on there.
The biggest reason beyond moderation is honestly that SO has an identity crisis.
Back in the day, SO was practically the ONLY place you would ask questions about programming or tech issues, but over time it has become more common to ask for help with specific frameworks or open-source products on those projects' GitHub or Discord.
And then you got LLMs being able to summarize an entire manual for you.
One of the bad side effects of this is the impact on "free" support channels like the GCP tags on SO. Questions rarely get answered, and you will often get an "answer" reminding you that this is a volunteer forum. This can be quite bad when the question is "what does error xyz mean and why is it crashing everything". It makes community-driven support a no-go.
SO by design gets its traffic from Google Search. If Google for whatever reason is sending less traffic, then SO will decline.
Yes, AI coding assistants will mean less frequent Googling during development, but also, when Googling is required it usually goes beyond SO and more towards primary sources, as SO's results strongly overlap with the AI assistants' capabilities.
There's still lots of good stuff on Stack Overflow. For example, I just ran into this answer while looking for something else. What a great answer to a beginner level Rust question:
As it ages, Stack Overflow is developing a problem where the accepted answers to questions are obsolete and no longer work.
APIs get added and removed, new languages change syntax, etc. This is particularly the case in mobile.
Stack Overflow has no good mechanism for ousting obsolete answers in favor of the current right answer.
Lots of people are discussing tech or quality reasons for the decline. I'd like to suggest something far worse: there is a huge decline in the number of junior professionals in our field due to global economic pressures. There are fewer people who need to ask questions, because the developers who remain already know the answers.
This downward trend makes me wonder where all that traffic is going. Surely developers are still searching for answers; where are they finding them? It can't only be ChatGPT, since the trend started in February 2022, but it surely is a really strong contender now.
For me at least, I often go directly to the official docs. They're much better in 2023 than they were in 2013, pretty much across the board. And if that doesn't work, it seems you're more likely to find your specific problem being asked in the issues of the official Github, Gitlab, etc. than you are on SO. Even Reddit and some language/technology-specific forums will have better results now.
The fun thing is... I think SO saw that coming. They made that huge push to be the central documentation hub for everything from huge projects to internal teams. It just wasn't well thought out and was expensive, iirc, and had an abysmal adoption rate as a result.
As a full-time dev, I'm on ChatGPT (paid) and use Copilot.
My team's first go-to is ChatGPT as well. SO is no longer 'that site' to go to (I've been on SO for 10+ years). I have zero love for that site. It was 'the resource' before; now it's just another site.
I mostly gave up a very long time ago. Craft an answer, update it to stay relevant and some mod turns it into a wiki. Another mod decides it’s time to move it to a different Stack site so it stops attracting cred on the site I posted it. Honestly I have better things to do.
I don't know whether the charts presented in the link are misleading, but I did notice a visible reduction in the quality of answers over a period of 5-6 years, as I can't find questions that relate to the issues I'm facing.
I can’t remember the last time I saw the accepted answer being up to date and relevant. Things change, and the things you needed to work around in Java or whatever ten years ago are now approached in an entirely different way etc, but SO is stuck in a time warp.
Do you think there may be secular trends here? We do see COVID-induced peaks in that data around 2020. I'm not suggesting that "fewer people are coding" - but just that lots of media-related sites are coming down from 2020/2021 peaks.
I haven't had an SO account in over a decade. In fact I had mine deleted per my request circa 2010 and it was Jeff Atwood himself who was handling that request. I still have that email interaction with him somewhere in my inbox :)
As long as OpenAI and others in the LLM space are for-profit, companies should start charging them heavy fees for their data. Surely there will be more novel knowledge produced in the world after 2023, and these companies will need it.
Asking questions on SO gets me only two outcomes: no answers, or the question being marked as a duplicate.
I haven't used `site:stackoverflow.com` in a long time, but I do use `site:reddit.com` to get more informative and generous answers.
The problem with Stack Overflow is simple: it eschews its core use case to be something unnecessarily autistic.
Instead of being a platform that encourages those with experience to mentor and lead, and those without it to seek experience without punishment, it takes the view that "dumb questions are the ones that have already been asked, and it's the responsibility of the newbie to know whether their question has already been asked." It seeks to be the training set for its replacement rather than be its replacement.
Then a new product comes around to support the core use case newbies want more directly:
Yeah, of course we fucking left. Because surprise, if you're constantly learning, you'll always be a newbie at another thing. If SO had been an actual community, it would have merged with an LLM rather than being eaten by one.
Let us all increase our questioning, answering, voting on Stack Overflow! With private chats like Discord and Slack, and chatbots like Bard and ChatGPT, a lot of the shared knowledge is not indexable. That's bad.
The technical qualifications of the responses are lacking.
Answerers usually seem to know only the magical incantations that get things to work, without any deep familiarity, because the really competent people aren't googling for answers like we are, so they never land on the page to give their 2¢.
There need to be different exclusions and barriers, but not through the wacky constitutional-sheriff system they are using.
Mailing lists are decent because you know you're going out on blast, so social norms kick in. If you mail, say, LKML, you imagine important kernel people taking their time to read your blabberings, and that likely stops those who can't improve on the silence.
SO, OTOH, seems too unbounded and freewheeling, a place where popularity matters more than responsibility. It's the wrong cadence between invitation and expectation.
StackOverflow lost 90% of the relevance to me the day they shut down their jobs section. It was brilliant and it worked really well, got me several jobs and possibly shaped my life for a few years.
Why can't we just Pair Program more? There is enough open source out there. Enough user groups meeting. Enough hack-a-thons. Just make them Pair Programming sessions. Please! Please, I beg of you!
I used to answer questions until they kept telling me that answers should not be short but long, with examples. I didn't want to waste my time, so I stopped answering altogether.
The comments I'm seeing here cite moderation practices and a scrutinizing culture as reasons for the downfall. But those aren't new. I think it's safe to say it's largely ChatGPT.
Also: ChatGPT can give you answers based on what's already on SO. In other words, based on the content that is a result of the very moderation practices being criticised.
I agree, but I'm afraid the downvotes are inevitable. SO for a long time had, by a very long shot, the best signal-to-noise ratio of any free-as-in-beer Q&A channel. Basically any live comms channel (IRC, Matrix, and so on) disincentivises clear, thoughtful responses in favour of streams of consciousness. And forums are where information goes to die, as every question turns into a huge back-and-forth until it is sufficiently refined to receive a useful answer, which is usually extremely specific ("run this command") rather than generally applicable ("you need to frobnicate the foon, for example by running this command").
Unfortunately anything popular is a target, and once they stopped trying to innovate to keep ahead of systems gaming they were going to lose eventually. It's amazing it took this long.
Really interesting, as this isn't like Digg or MySpace; it was built on information rather than interaction, and will probably fall like the Tower of Babel.
But I suppose it will still be a very valuable dataset for anyone looking to train a coding model. It's the single most valuable resource that could be archived for one.
I don't use SO sites anymore because of the incessant "accept our cookies" popups that aren't even relevant to me because I don't live where EU cookie laws or GDPR applies.
I used to rely on Stack Overflow to stay plugged in to the developer world. But now, with newsletters, developer influencers, and more forums with code-block formatting, it's just not necessary anymore. I think the status of having Stack Overflow points has been translated into Twitter followers.
ChatGPT offers an effective substitute for Stack Overflow, fully encompassing its role for simple questions.
For more complex problems, Stack Overflow was never a good solution; it's better suited to tackling simpler questions. But for simpler questions you now have ChatGPT.
One of the things that has changed is that Google is surfacing results from other social media platforms. I just noticed that my Medium articles about Flutter pop up in searches on Flutter topics.
"Every institution fails due to an excess of it's own first principle."
I have watched/experienced SO go from a super useful, helpful site with a balance that pitted people's desire to be recognized against their desire to be helped, to one slowly subsumed by those overly bent on organization and bureaucracy. And as it did so, it became less useful to me, as a questioner as well as an answerer.
Now I have to put up with stochastic parroting from GPTs to try and steer me in the right direction. Yay.
[0] https://i.imgur.com/qMj7Lge.png
[1] https://news.ycombinator.com/item?id=36856249