How could you even tell if you've personally invented something? You don't know what caused a thought to pop into your head. Heck I often forget things I've said, so I clearly forget things I've heard. I could be fully convinced an idea was my own and still be dead wrong.
Luckily you can build on the ideas of others without being a cargo cultist. Simply verify the sturdiness of the foundation before you go adding an extra wing to the house.
I would have just called it a regular cult ("give me all your money and you'll receive great riches in return").
It's true that adherents to crypto have a faulty understanding of traditional finance just as the islanders have a faulty understanding of airplanes and canned food. But rather than painstakingly recreating the minute details of traditional finance in hopes of capturing the same success, crypto cults go out of their way to avoid recreating those details or understanding why they're necessary, if burdensome.
Yes, Stack Overflow is also cargo cult programming.
Of course sometimes you have to use tools handed to you by others and sometimes you won't understand how they work (I don't know how CPUs work but it hasn't stopped me from programming).
This speech is largely about how you shouldn't willfully delude yourself. I think we've all had that situation where we're hunting down a bug and rearranging the code just so seems to resolve it, but we don't understand why. At a certain point the temptation is high to shrug your shoulders and move on even without fully understanding the mechanics of the fix. But if you do that, it will likely come back to bite you/someone one day. In the case of science it's even worse than engineering because the entire point of the endeavor should be to advance understanding, rather than to get certain results.
They're not saying lifestyle expenses are a tax in themselves, but rather that inflation in those expenses is a tax (i.e. "cost of living"), as is the need for certain expenses in the first place (e.g. a nanny, because both parents have to work long hours given the high cost of living and commute times).
Obviously no one would rather pay 4x as much for the same thing. In that sense no one chooses the price of their house as they'd clearly pay $1 if they could.
But yes, you choose to pay a lot to live near work (you value your time), in a place with great weather and public schools and natural beauty (since we seem to be talking Bay Area here). The $300k house in BFE Ohio does not have these properties.
At one point in my life I chose to live in a basement for about 10% of my monthly post tax income. Why don't you choose that? Turns out you do, in fact, have agency in choosing your living conditions.
We have agency in that we could choose to renounce careers in tech. That’s it.
There is a price/quality tradeoff in every housing market; all complaints about price can be interpreted as insufficient willingness to compromise on quality. A cardboard box under a bridge is free! “You’re actually obscenely wealthy because you don’t live in a basement and drive 6 hours to work” is not the argument you think it is. Tech workers are telling you how they feel about Bay Area weather, schools, and transportation every time they complain about RTO. We all want nothing more than to get the fuck away from here. Nothing more except, perhaps, to do the work we were meant to do.
Moving to New York or Los Angeles for affordable housing is ridiculous on its face. Austin, Denver, and I would add Miami were basically flash-in-the-pan situations: first movers got some great deals, but the housing markets have priced it in by now & the forward-looking job market outlooks are uncertain.
Seattle is interesting in that it's clearly a durable tech job center and is meaningfully cheaper than San Francisco. It's still twice as expensive as a normal place ($862k vs. $400k) and its street conditions reflect a housing crisis every bit as severe as San Francisco's, but it's true that you could keep your career while paying ~30% less there. So I guess the delta between SF and SEA could be interpreted as a lifestyle splurge.
Where is the right place to build? Drought seems no more problematic than the wildfires, flooding, etc other areas have to deal with. In fact it's something we can probably engineer our way out of (by conserving more and building dams and so forth) whereas hurricanes are totally unavoidable if you build in Florida or Houston.
Also it obviously wasn't one person who decided to build a massive city. Lots of people are moving there because they like whatever trade offs the area offers.
The water issue can be permanently solved with a desalination plant and a canal to the Sea of Cortez; similar projects have been done all over the Middle East. I think AZ actually has really bright prospects, both figuratively and literally.
So, nowhere. There are no longer plentiful resources anywhere. We have entered an age of scarcity. Increasing constrictions seem to be inevitable, which people cannot grasp having seen unmitigated expansion for countless decades.
Some of that scarcity is artificially induced, but that does not change the fact that land, water, and other resources are becoming exponentially more expensive to obtain.
Being on the water can help too. Dubai might be in a desert, but a desert with direct sea (and ocean) access. Arizona, PHX in particular, has no such resource.
Per 100g doesn't make it easy to compare a bag of chips to a can of soda.
Unless you carry around a food scale to measure out all your snacks, I fail to understand why 100g^-1 is useful.
I would however be interested in a pie chart that displays the calories from different macros. Then tic tac would be 100% sugar and you could compare to soda, also 100% sugar, or chips, which are maybe 65% fat and 5% protein and 30% carbs.
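For concreteness, here's a rough sketch of the arithmetic behind such a chart, using the standard 4/4/9 kcal-per-gram factors; the gram values for "chips" below are made up, not taken from any label.

```python
# Sketch only: converting a label's macro grams into the calorie shares a
# pie chart would show. Uses the standard 4/4/9 kcal-per-gram factors; the
# gram values are invented for illustration.
KCAL_PER_GRAM = {"carbs": 4, "protein": 4, "fat": 9}

def calorie_shares(grams):
    kcal = {m: g * KCAL_PER_GRAM[m] for m, g in grams.items()}
    total = sum(kcal.values())
    return {m: round(k / total, 2) for m, k in kcal.items()}

print(calorie_shares({"carbs": 50, "protein": 6, "fat": 35}))
# -> {'carbs': 0.37, 'protein': 0.04, 'fat': 0.58}: mostly fat calories,
#    even though carbs weigh more.
```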
Sure, but now we're back to the current US system, where you have to multiply or divide by the amount in the container (either servings per container, or grams per container) to make sense of the measurement.
Standardizing to 100g doesn't actually make it easier to compare foods, because I don't know how 100g of popcorn compares to 100g of chocolate or 100g of soup. It just doesn't mean anything to me.
From my experience, it does make it easier if you need to compare similar foods.
For example, I want to buy the yogurt with the highest amount of protein. Serving sizes may vary between 150 and 500 g. With standardized labels it is a very easy task.
In my country calories are listed both per 100 g and per container, and nutrients are always listed per 100 g and sometimes for the full container. I almost always use the per-100 g figures.
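To make the yogurt comparison concrete, the conversion you'd otherwise have to do by hand is just a rescale (brand names and numbers below are hypothetical):

```python
# Minimal sketch: normalizing per-serving protein to per-100g so products
# with different serving sizes can be compared. Values are invented.
def per_100g(amount_per_serving, serving_size_g):
    return amount_per_serving / serving_size_g * 100

print(per_100g(15, 150))  # hypothetical brand A: 10.0 g protein per 100 g
print(per_100g(40, 500))  # hypothetical brand B:  8.0 g protein per 100 g
```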
In the US, serving sizes are generally standardized per type of food, so two quart containers of yogurt will always have the same serving size in the nutrition facts.
There is quite a bit of water in soda. While there is a lot of sugar, it is about 10%, which is a bit less than 100%.
Specifying per 100g (or some other standard size) makes it quite easy to compare two different products. Of course you also have to take into consideration how much of the product you will consume.
Which is a pretty crucial distinction, no? No one would get upset if CNET announced they were deleting clickbait and blogspam.
With articles such as "The Best Home Deals from Urban Outfitters' Fall Forward Sale" currently gracing their front page, I'm wondering how long HN commenters expect to need access to this content.
I'll just assume you neglected to read TFA, because if you had, you would have discovered that it links to an official Google source that states CNET shouldn't be doing this.[1]
I could imagine CNET's SEO team got an average-rank goal instead of absolute SEO traffic. So by removing low-ranked old pages, the average position of their search results moves closer to the top even though total traffic sinks. I've seen stuff like this happen at my own company as well, where a team's KPIs are designed such that they'll ruin absolute numbers in order to achieve their relative KPI goals, like getting an increase in conversion rates by just cutting all low-conversion traffic.
In general, people often forget that if your target is a ratio, you can attack the numerator or the denominator. Often the latter is the easier to manipulate.
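A toy illustration with made-up numbers (the "cut all low-conversion traffic" move mentioned above): shrinking the denominator lifts the ratio while the absolute outcome gets worse.

```python
# Toy numbers, not from the thread: gaming a conversion-rate KPI by
# dropping the low-converting segment from the denominator.
high_intent = {"visits": 10_000, "conversions": 800}    # ~8% segment
low_intent  = {"visits": 90_000, "conversions": 1_200}  # ~1.3% segment

def rate(*segments):
    visits = sum(s["visits"] for s in segments)
    conversions = sum(s["conversions"] for s in segments)
    return conversions / visits, conversions

print(rate(high_intent, low_intent))  # (0.02, 2000): 2% rate, 2000 sales
print(rate(high_intent))              # (0.08, 800):  8% rate, only 800 sales
```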
Even if it's not a ratio. When any metric becomes a target it will be gamed.
My organization tracks how many tickets we have had open for 30 days or more. So my team started to close tickets after 30 days and let them reopen automatically.
Meanwhile that's not necessarily a bad outcome. In theory it makes the data better by focusing on deaths that might or might not have been preventable, rather than making every hospital look responsible for inevitable deaths.
Of course the actual behavior in the article is highly disturbing.
This is why KPIs or targets should NEVER be calculated values like averages or ratios. The team is then incentivized to do something hostile, such as promoting the content less once they've barely scraped past the impressions mark, so that the ratio stays higher.
When deciding KPIs, Goodhart's law should always be kept in mind: when a measure becomes a target, it ceases to be a good measure.
It's really hard to not create perverse incentives with KPIs. Targets like "% of tickets closed within 72 hours" can wreck service quality if the team is under enough pressure or unscrupulous.
Sure they can, e.g. on-time delivery (or even better shipments missing the promised delivery date) is a ratio. Or inventory turn rates, there you actually want people to attack the denominator.
Generally speaking, an easy solution is to attach another target to either the numerator or the denominator, one that requires people to move that value in a certain direction. That might even be a different team than the one with goals on the ratio.
> Sure they can, e.g. on-time delivery (or even better shipments missing the promised delivery date) is a ratio. Or inventory turn rates, there you actually want people to attack the denominator.
These are good in that they’re directly aligned with business outcomes but you still need sensible judgement in the loop. For example, say there’s an ice storm or heat wave which affects delivery times for a large region – you need someone smart enough to recognize that and not robotically punish people for failing to hit a now-unrealistic goal, or you’re going to see things like people marking orders as canceled or faking deliveries to avoid penalties or losing bonuses.
One example I saw at a large old school vendor was having performance measured directly by units delivered, which might seem reasonable since it’s totally aligned with the company’s interests, except that they were hit by a delay on new CPUs and so most of their customers were waiting for the latest product. Some sales people were penalized and left, and the cagier ones played games having their best clients order the old stuff, never unpack it, and return it on the first day of the next quarter - they got the max internal discount for their troubles so that circus cost way more money than doing nothing would have, but that number was law and none of the senior managers were willing to provide nuance.
Yeah, every part of this was a “don’t incentivize doing this”. I doubt anyone would ever be caught for that since there was nothing in writing but it was a complete farce of management. I heard those details over a beer with one of the people involved and he was basically wryly chuckling about how that vendor had good engineers and terrible management. They’re gone now so that caught up with them.
That only says that Google discourages such actions, not that such actions are not beneficial to SEO ranking (which is equal to the aforementioned economic incentive in this case).
So whose word do we have to go on that this is beneficial, besides anonymous "SEO experts" and CNET leadership (those paragons of journalistic savvy)?
Perhaps what CNET really means is that they're deleting old low quality content with high bounce rates. After all, the best SEO is actually having the thing users want.
In my experience SEO experts are the most superstitious tech people I ever met. One guy wanted me to reorder HTTP header fields to match another site. He wanted our minified HTML to include a linebreak just after a certain meta element just because some other site had it. I got requests to match variable names in minified JS just because Google's own minified JS used those names.
> In my experience SEO experts are the most superstitious tech people I ever met.
And some are the most data-driven people you'll ever meet. As with most people who claim to be experts, the trick is to determine whether the person you're evaluating is a legitimate professional or a cargo-culting wanna-be.
I’ve always felt there is a similarity to day traders or people who overanalyze stock fundamentals. There comes a time when data analysis becomes astrology…
> There comes a time when data analysis becomes astrology.
Excellent quote. It's counterintuitive but looking at what is most likely to happen according to the datasets presented can often miss the bigger picture.
This. It is often the scope and context that determine the logic. It is easy to build bubbles and stay comfy inside. Without revealing much, I asked a data scientist, whose job it is to figure out bids on keywords and essentially control how much $ is spent on advertising something in a specific region, about negative criteria. As in: are you sure you wouldn't get this benefit even if you stopped spending the $? His response was "look at all this evidence that our spend caused this x% increase in traffic and y% more conversions", and that was 2 years ago. My follow-up question was: okay, now that the thing you advertised is popular, wouldn't it be the more organic choice in the market, and can we stop spending the $ there?
His answer was - look at what happened when we stopped the advertising in this small region in Germany 1.5 years ago!
My common sense validation question still stands. I still believe he built a shiny good bubble 2 years ago, and refuses to reason with wider context, and second degree effects.
Leos are generally given the “heroic/action-y” tropes, so if you are, for example, trying to pick Major League Baseball players, astrology could help a bit.
Some of the most superstitious people I've ever met were also some of the most data-driven people I've ever met. Being data-driven doesn't exclude unconscious manipulation of the data selection or interpretation, so it doesn't automatically equate to "objective".
The data analysis I've seen most SEO experts do is similar to sitting at a highway, carefully timing the speed of each car, taking detailed notes of the cars appearance, returning to the car factory and saying that all cars need to be red because the data says red cars are faster.
One SEO expert who consulted for a bank I worked at wanted us to change our URLs from e.g. /products/savings-accounts/apply by reversing them to /apply/savings-accounts/products on the grounds that the most specific thing about the page must be as close to the domain name as possible, according to them. I actually went ahead and changed our CMS to implement this (because I was told to). I'm sure the SEO expert got paid a lot more than I did as a dev. A sad day in my career. I left the company not long after...
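For anyone curious what that change amounted to, it was roughly this (the paths are illustrative, not the bank's real routes):

```python
# Sketch of the requested URL scheme: reverse the path segments so the most
# specific one sits closest to the domain. Example paths are hypothetical.
def reverse_path(path):
    segments = [s for s in path.split("/") if s]
    return "/" + "/".join(reversed(segments))

print(reverse_path("/products/savings-accounts/apply"))
# -> /apply/savings-accounts/products
```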
Unfortunately though, this was likely good advice.
The yandex source code leak revealed that keyword proximity to root domain is a ranking factor. Of course, there’s nearly a thousand factors and “randomize result” is also a factor, but still.
SEO is unfortunately a zero sum game so it makes otherwise silly activities become positive ROI.
I think you're largely correct, but Google isn't one person, so there may be somewhat emergent patterns that work from an SEO standpoint that don't have a solid answer to "why". If I were an SEO customer I would ask for some proof, but that isn't the market they're targeting. There was an old saying in the tennis instruction business that there was a lot of "bend your knees, fifty please". So lots of snake-oil salesmen, but some salesmen sell stuff that works.
That's a bit out there, but Google has mentioned in several different ways that pages and sites have thousands of derived features and attributes they feed into their various ML pipelines.
I assume Google is turning all the site's pages, js, inbound/outbound links, traffic patterns, etc...into large numbers of sometimes obscure datapoints like "does it have a favicon", "is it a unique favicon?", "do people scroll past the initial viewport?", "does it have this known uncommon attribute?".
Maybe those aren't the right guesses, but if a page has thousands of derived features and attributes, maybe they are on the list.
So, some SEO's take the idea that they can identify sites that Google clearly showers with traffic, and try to recreate as close a list of those features/attributes as they can for the site they are being paid to boost.
I agree it's an odd approach, but I also can't prove it's wrong.
Is minified "code" still "source code"? I think I'd say the source is the original implementation pre-minification. I hate it too when working out how something is done on a site, but I'm wondering where we fall on that technicality. Is the output of a pre-processor still considered source code even if it's not machine code? These are not important questions but now I'm wondering.
Source code is what you write and read, but sometimes you write one thing and people can only read it after your pre processing. Why not enable pretty output?
Plus I suspect minifying HTML or JS is often cargo cult (for small sites who are frying the wrong fish) or compensating for page bloat
It doesn't compensate for bloat, but it reduces bytes sent over the wire, bytes cached in between, and bytes parsed in your browser, for _very_ little cost.
You can always open dev tools in your browser and have an interactive, nicely formatted HTML tree there with a ton of inspection and manipulation features.
In my experience the bigger difference is usually made by not making it bloated in the first place... As well as progressive enhancement, non-blocking load, serving from a nearby geolocation, etc. I see projects minify all the things by default when it should literally be the last measure, with the least impact on TTI.
It does stuff like tree shaking as well; it's quite good. If your page is bloated, it makes it better. If your page is not bloated, it makes it better.
I suppose the difference is that someone debugging at that level will be offered some sort of "dump" command or similar, whereas someone debugging in a browser is offered a "View Source" command. It's just a matter of convention and expectation.
If we wanted browsers to be fed code that for performance reasons isn't human-readable, web servers ought to serve something that's processed way more than just gzipped minification. It could be more like bytecode.
Let's be honest, a lot of non-minified JS code is barely legible either :)
For me I guess what I was getting at is that I consider source the stuff I'm working on - the minified output I won't touch, it's output. But it is input for someone else, and available as a View Source so that does muddy the waters, just like decompilers produce "source" that no sane human would want to work on.
I think semantically I would consider the original source code the "real" source if that makes sense. The source is wherever it all comes from. The rest is various types of output from further down the toolchain tree. I don't know if the official definition agrees with that though.
>If we wanted browsers to be fed code that for performance reasons isn't human-readable,
Worth keeping in mind that "performance" here refers to saving bandwidth costs as the host. Every single unnecessary whitespace or character is a byte that didn't need to be uploaded, hence minify and save on that bandwidth and thus $$$$.
The performance difference on the browser end between original and minified source code is negligible.
Last time I ran the numbers (which admittedly was quite a number of years ago now), the difference between minified and unminified code was negligible once you factored in compression because unminified code compresses better.
What really adds to the source code footprint is all of those trackers, adverts and, in a lot of cases, framework overhead.
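If anyone wants to rerun those numbers on their own bundle, a quick check looks something like this (the file names are placeholders for your own build outputs; results will vary by codebase, which is the point):

```python
# Rough measurement sketch: compare raw and gzipped sizes of unminified vs.
# minified JS. "app.js" / "app.min.js" are placeholder file names.
import gzip
from pathlib import Path

def sizes(path):
    raw = Path(path).read_bytes()
    return len(raw), len(gzip.compress(raw, compresslevel=9))

for name in ("app.js", "app.min.js"):
    raw, gz = sizes(name)
    print(f"{name}: {raw} bytes raw, {gz} bytes gzipped")
```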
The way I see it, if someone needs to minify their JavaShit (and HTML?! CSS?!) to improve user download times, that download time was horseshit to start with and they need to rebuild everything properly from the ground up.
Isn't this essentially what WebAssembly is doing? I'll admit I haven't looked into it much, as I'm crap with C/C++, though I'd like to try Rust. Having "near native" performance in a browser sounds nice; curious to see how far it's come.
Minifying HTML is basically just removing non-significant whitespace. Run it through a formatter and it will be readable.
If you dislike unreadable source code I would assume you would object to minifying JS, in which case you should ask people to include sourcemaps instead of objecting to minification.
I mean, isn't that precisely why open source advocates advocate for open source?
Not to mention, there is no need to "minify" HTML, CSS, or JavaShit for a browser to render a page unlike compiled code which is more or less a necessity for such things.
Minifying code for browsers greatly reduces the amount of bandwidth needed to serve web traffic. There's a good reason it's done.
By your logic, there's actually no reason to use compiled code at all, for almost anything above the kernel. We can just use Python to do everything, including run browsers, play video games, etc. Sure, it'll be dog-slow, but you seem to care more about reading the code than performance or any other consideration.
I already alluded[1] to the incentives for the host to minify their JavaShit, et al., and you would have a point if it wasn't for the fact that performance otherwise isn't significantly different between minified and full source code as far as the user would be concerned.
I'm not talking about the browser's performance, I'm talking about the network bandwidth. All that extra JS code in every HTTP GET adds up. For a large site serving countless users, it adds up to a lot of bandwidth.
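Back-of-envelope, with every number below an assumption rather than a measurement:

```python
# Hypothetical figures only: how per-request savings add up at scale.
saved_per_request = 50 * 1024        # assume minification trims ~50 KB
requests_per_day = 10_000_000        # assume a large site
gb_per_day = saved_per_request * requests_per_day / 1024**3
print(f"~{gb_per_day:.0f} GB/day saved")  # ~477 GB/day
```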
Somebody mentioned negligible/deleterious impacts on bandwidth for minified code in that thread, but they seemed to have low certainty. If you happen to have evidence otherwise, it might be informative for them.
>In computing, a compiler is a computer program that translates computer code written in one programming language (the source language) into another language (the target language).
According to the Open Source Definition of the OSI it's not:
> The program must include source code [...] The source code must be the preferred form in which a programmer would modify the program. Deliberately obfuscated source code is not allowed. Intermediate forms such as the output of a preprocessor [...] are not allowed.
The popular licenses for which this is a design concern are careful to define source code to mean "preferred form of the work for making modifications" or similar.
Google actually describes an entirely plausible mechanism of action here at [1]: old content slows down site crawling, which can cause new content to not be refreshed as often.
Sure, one page doesn’t matter, but thousands will.
>Removing it might mean if you have a massive site that we’re better able to crawl other content on the site. But it doesn’t mean we go “oh, now the whole site is so much better” because of what happens with an individual page.
Parsing this carefully, to me it sounds worded to give the impression removing old pages won’t help the ranking of other pages without explicitly saying so. In other words, if it turns out that deleting old pages helps your ranking (indirectly, by making Google crawl your new pages faster), this tweet is truthful on a technicality.
In the context of negative attention where some of the blame for old content being removed is directed toward Google, there is a clear motive for a PR strategy that deflects in this way.
The tweet is also clearly saying that deleting old content will increase the average page rank of your articles in the first N hours after it is published. (Because the time to first crawl will decrease, and the page rank is effectively zero before the first crawl).
CNet is big enough that I’d expect Google to ensure the crawler has fresh news articles from it, but that isn’t explicitly said anywhere.
And considering all the AI hype, one could have hoped that the leading search engine crawler would be able to "smartly" detect new content based on a URL containing a timestamp.
Apparently not if this SEO trick is really a thing...
EDIT: sorry, my bad, it's actually the opposite. One could expect that a site like CNET would include a timestamp and a unique ID in their URL in 2023. This seems to be the "unpermalink" of a recent CNET article.
I did the tweet. It is clearly not saying anything about the "average page rank" of your articles because those words don't appear in the tweet at all. And PageRank isn't the only factor we use in ranking pages. And it's not related to "gosh, we could crawl your page in X hours therefore you get more PageRank."
It's not from Google PR. It's from me. I'm the public liaison for Google Search. I work for our search quality team, not for our PR team.
It's not worded in any way intended to be parsed. I mean, I guess people can do that if they want. But there's no hidden meaning I put in there.
Indexing and ranking are two different things.
Indexing is about gathering content. The internet is big, so we don't index all the pages on it. We try, but there's a lot. If you have a huge site, similarly, we might not get all your pages. Potentially, if you remove some, we might get more to index. Or maybe not, because we also try to index pages as they seem to need to be indexed. If you have an old page that doesn't seem to change much, we probably aren't running back to it every hour in order to index it again.
Ranking is separate from indexing. It's how well a page performs after being indexed, based on a variety of different signals we look at.
People who believe in removing "old" content aren't generally thinking that's going to make the "new" pages get indexed faster. They might think that maybe it means more of their pages overall from a site could get indexed, but that can include "old" pages they're successful with, too.
The key thing is if you go to the CNET memo mentioned in Gizmodo article, it says this:
"it sends a signal to Google that says CNET is fresh, relevant and worthy of being placed higher than our competitors in search results."
Maybe CNET thinks getting rid of older content does this, but it's not. It's not a thing. We're not looking at a site, counting up all the older pages and then somehow declaring the site overall as "old" and therefore all content within it can't rank as well as if we thought it was somehow a "fresh" site.
That's also the context of my response. You can see from the memo that it's not about "and maybe we can get more pages indexed." It's about ranking.
Suppose CNET published an article about LK99 a week ago, then they published another article an hour ago. If Google hasn’t indexed the new article yet, won’t CNET rank lower on a search for “LK99” because the only matching page is a week old?
If by pruning old content, CNET can get its new articles in the results faster, it seems this would get CNET higher rankings and more traffic. Google doesn’t need to have a ranking system directly measuring the average age of content on the site for the net effect of Google’s systems to produce that effect. “Indexing and ranking are two different things” is an important implementation detail, but CNET cares about the outcome, which is whether they can show up at the top of the results page.
>If you have a huge site, similarly, we might not get all your pages. Potentially, if you remove some, we might get more to index. Or maybe not, because we also try to index pages as they seem to need to be indexed.
The answer is phrased like a denial, but it’s all caveated by the uncertainty communicated here. Which, like in the quote from CNET, could determine whether Google effectively considers the articles they are publishing “fresh, relevant and worthy of being placed higher than our competitors in search results”.
You're asking about freshness, not oldness. IE: we have systems that are designed to show fresh content, relatively speaking -- a matter of days. It's not the same as "this article is from 2005 so it's old, don't show it." And it's also not what is generally being discussed in getting rid of "old" content. And also, especially for sites publishing a lot of fresh content, we get that really fast already. It's an essential part of how we gather news links, for example. And and and -- even with freshness, it's not "newest article ranks first" because we have systems that try to show the original "fresh" content, or sometimes a slightly older piece is still more relevant. Here's a page that explains more about the ranking systems we have that deal with both original content and fresh content: https://developers.google.com/search/docs/appearance/ranking...
Ha, I actually totally agree with you, apparently my comment gave the wrong impression. I was just arguing with the GP's comment which was trying to (fruitlessly, as you point out) read tea leaves that aren't even there.
While CNET might not be the most reliable side, Google telling content owners to not play SEO games is also too biased to be taken at face value.
It reminds me of Apple's "don't run to the press" advice when hitting bugs or app review issues. While we'd assume Apple knows best, going against their advice totally works and is by far the most efficient action for anyone with enough reach.
Considering how much paid-for unimportant and unrelated drivel I now have to wade through every time I google to get what I am asking for, I doubt very much that whatever is optimal for search-engine ranking has anything to do with what users want.
Do the engineers at Google even know how the Google algorithm actually works? Better than SEO experts who spend their time meticulously tracking the way the algorithm behaves under different circumstances?
My bet is that they don't. My bet is that there is so much old code, weird data edge cases and opaque machine-learning models driving the search results, Google's engineers have lost the ability to predict what the search results would be or should be in the majority of cases.
SEO experts might not have insider knowledge, but they observe in detail how the algorithm behaves, in a wide variety of circumstances, over extended periods of time. And if they say that deleting old content improves search ranking, I'm inclined to believe them over Google.
Maybe the people at Google can tell us what they want their system to do. But does it do what they want it to do anymore? My sense is that they've lost control.
I invite someone from Google to put me in my place and tell me how wrong I am about this.
Once upon a time, Matt Cutts would come on HN and give a fairly knowledgeable and authoritative explanation of how Google worked. But those days are gone, and I'd say so are the days of standing behind any articulated principle.
I work for Google and do come into HN occasionally. See my profile and my comments here. I'd come more often if it were easier to know when there's something Google Search-related happening. There's no good "monitor HN for X terms" thing I've found. But I do try to check, and sometimes people ping me.
The engineers at Google do know how our algorithmic systems work because they write them. And the engineers I work with at Google looking at the article about this found it strange anyone believes this. It's not our advice. We don't somehow add up all the "old" pages on a site to decide a site is too "old" to rank. There's plenty of "old" content that ranks; plenty of sites that have "old" content that rank. If you or anyone wants our advice on what we do look for, this is a good starting page: https://developers.google.com/search/docs/fundamentals/creat...
There is. Which is why I specifically talked only about writing for algorithmic systems. Machine learning systems are different, and not everyone fully understands how they work, only that they do and can be influenced.
It's really hard to get a deep or solid understanding of something if you lack insider knowledge.
The search algorithm is not something most Googlers have access to, but I assume they observe what their algorithm does constantly, in a lot of detail, to measure what their changes are doing.
I think in this context, saying "it's not a thing that Google doesn't like old content" just means that Google doesn't penalize sites as a whole for including older pages, so deleting older pages won't help boost the site's ranking.
This is not the same as saying that it doesn't prioritize newer pages over older pages in the search results.
The way it's worded does sound like it could imply the latter thing, but that may have just been poor writing.
That Googler here. I do use Google! And yeah, I get sometimes people want older content and we show fresher content. We have systems designed to show fresher content when it seems warranted. You can imagine a lot of people searching about Maui today (sadly) aren't wanting old pages but fresh content about the destruction there.
But we do show older content, as well. I find often when people are frustrated they get newer content, it's because of that crossover where there's something fresh happening related to the query.
If you haven't tried, consider our before: and after: commands. I hope we'll finally get these out of beta status soon, but they work now. You can do something like before:2023 and we wouldn't show pages from before 2023 (to the best we can determine dates). They're explained more here:
https://twitter.com/searchliaison/status/1115706765088182272
Maybe not related to the age of the content but more content can definitely penalize you. I recently added a sitemap to my site, which increased the amount of indexed pages, but it caused a massive drop in search traffic (from 500 clicks/day to 10 clicks/day). I tried deleting the sitemap, but it didn't help unfortunately.
100K+. Mostly AI and user generated content. I guess the sudden increase in number of indexed pages prompted a human review or triggered an algorithm which flagged my site as AI generated? Not sure.
It seems incredibly short-sighted to assume that just because these actions might possibly give you a small bump in SEO right now, they won't have long-term consequences.
If CNET deletes all their old articles, they're making a situation where most links to CNET from other sites lead to error pages (or at least, pages with no relevant content on them), and even if that isn't currently a signal used by Google, it could become one.
Technically, you're supposed to do a 410 or a 404, but when some pages being deleted have those extremely valuable old high-reputation backlinks, it's just wasteful, so I'd say it's better to redirect to the "next best page", like maybe a category or something, or the homepage as the last resort. Why would it be problematic? Especially if you do a sweep and only redirect pages that have valuable backlinks.
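As a sketch of that policy (the URLs and targets are made up; a real setup would be driven by a backlink report rather than a hard-coded dict), something like this tiny WSGI app captures the idea:

```python
# Illustrative only: 301 deleted pages that still have valuable backlinks to
# the "next best page", 410 the rest. Paths below are invented.
REDIRECTS = {
    "/2009/best-netbooks": "/topics/laptops/",
    "/2011/old-review": "/reviews/",
}
GONE = {"/2008/forgotten-liveblog"}

def app(environ, start_response):
    path = environ["PATH_INFO"]
    if path in REDIRECTS:
        start_response("301 Moved Permanently", [("Location", REDIRECTS[path])])
        return [b""]
    if path in GONE:
        start_response("410 Gone", [("Content-Type", "text/plain")])
        return [b"Gone"]
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"OK"]
```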
I was only talking about mass redirecting 404s to the homepage, which I've heard is not great, I think what you're saying is fine -- but that sounds like more of a well thought out strategy.
It's not that we discourage it. It's not something we recommend at all. Not our guidance. Not something we've had a help page about saying "do this" or "don't do this" because it's just not something we've felt (until now) that people would somehow think they should do -- any more than "I'm going to delete all URLs with the letter Y in them because I think Google doesn't like the letter Y."
People are free to believe what they want, of course. But we really don't care if you have "old" pages on your site, and deleting content because you think it's "old" isn't likely to do anything for you.
Likely, this myth is fueled by people who update content on their site to make it more useful. For example, maybe you have a page about how to solve some common computer problem and a better solution comes along. Updating a page might make it more helpful and, in turn, it might perform better.
That's not the same as "delete because old" and "if you have a lot of old content on the site, the entire site is somehow seen as old and won't rank better."
Your recommendations are not magically a description of how your algorithm actually behaves. And when they contradict, people are going to follow the algorithm, not the recommendation.
Yeah, Google's statement seems obviously wrong. They say they don't tell people to delete old content, but then they say that old content does actually affect a site in terms of its average ranking and also what content gets indexed.
What the Google algorithm encourages/discourages and what the Google blog or documentation encourages/discourages are COMPLETELY different things. Most people here are complaining about the former, and you keep responding about the latter.
No one has demonstrated that simply removing content that's "old" means we think a site is "fresh" and therefore should do better. There are people who perhaps updated older content reasonably to keep it up-to-date and find that making it more helpful that way can, in turn, do better in search. That's reasonable. And perhaps that's gotten confused with "remove old, rank better" which is a different thing. Hopefully, people may better understand the difference from some of this discussion.
This is another problem of the entire SEO industry. Websites trust these SEO consultants and growth hackers more than they trust information from Google itself. Somehow, it becomes widely accepted that the best information on Google ranking is from those third parties but not Google.
I'm not sure it is so cut and dried. Who is more likely to give you accurate information on how to game Google's ranking: Google themselves, or an SEO firm? I suspect that Google has far less incentive to provide good information on this than an SEO firm would.
Google will give you advice on how to not be penalized by Google. They won’t give you advice on how to game the system in your favor.
The more Google helps you get ahead, the more you end up dominating the search results. The more you dominate the results, the more people will start thinking to come straight to you. The more people come straight to you, the more people never use Google. The less people use Google, the less revenue Google generates.
I would like to know what dollar amount Google makes on people typing things like “Amazon” into google search and then clicking the first paid result to Amazon.
It’s the same on YouTube - the majority of the people who work there seem to have no idea how “the algorithm” actually works - yet they still produce all sorts of “advice” on how to make better videos.
There’s an easy proof that those SEO consultants have a point: find a site that according to Google’s criteria will never rank, which has rocketed to the top of the search rankings in its niche within a couple months. That’s a regular thing and proves that there are ways to rank on Google that Google won’t advise.
It could be premature to place fault with the SEO industry. Think about the incentives: Google puts articles out, but an SEO specialist might have empirical knowledge from working for any number of web properties. It's not that I wouldn't trust Google's articles, but specialists might have discovered undocumented methods for giving a boost.
The good ones will share the data/trends/case studies that would support the effectiveness of their methods.
But the vast majority are morons, grifters, and cargo culters.
The Google guidance is generally good and mildly informative but there’s a lot of depth that typically isn’t covered that the SEO industry basically has to black box test to find out.
> Websites trust these SEO consultants and growth hackers more than they trust information from Google itself.
That's because websites' goals and Google's goals are not aligned.
Websites want people to engage with their website, view ads, buy products, or do something else (e.g. vote for a party). If old content does not serve those goals, or detracts from them, they and the SEO experts say it should go, because it's dragging the rest down.
Google wants all the information and for people to watch their ads. Google likes the long tail; Google doesn't care if articles from the 90's are outdated because people looking at it (assuming the page runs Google ads) or searching for it (assuming they use Google) means impressions and therefore money for them.
Google favors quantity over quality, websites the other way around. To oversimplify and probably be incorrect.
Google actively lies on an infinite number of subjects. And SEO is a completely adversarial subject where Google has an interest in lying to prevent some behaviors. While consultants and "growth hackers" are very often selling snake oil, that doesn't make Google an entity you can trust either.
Hey, don't do that. That's bad. But if you keep doing it, you'll get better SEO. No, we won't do anything to prevent this from being a way to game SEO.
"Google says you shouldn't do it" and "Google's search algorithm says that you should do it" can both be true at the same time. The official guidance telling you what to do doesn't track with what the search algorithm uses to decide search placement. Nobody's going to follow Google's written instructions if following the instructions results in a penalty and disobeying them results in a benefit.
They say "Google doesn't like "old" content? That's not a thing!"
But who knows, really? They run things to extract features nobody outside of Google knows that are proxies for "content quality". Then run them through pipelines of lots of different not-really-coordinated ML algorithms.
Maybe some of those features aren't great for older pages? (broken links, out-of-spec html/js, missing images, references to things that don't exist, practices once allowed now discouraged...like <meta keywords>, etc). And I wouldn't be surprised if some part of overall site "reputation" in their eyes is some ratio of bad:good pages, or something along those lines.
I have my doubts that Google knows exactly what their search engines likes and doesn't like. They surely know which ads to put next to those maybe flawed results, though.
I don’t know man, I read it but I’ve learned to judge big tech talk purely by their actions and I don’t think there’s a lot of incentive built into their system that supports this statement.
My understanding is that if you have a very large site, removing pages can sometimes help because:
- There is an indexing "budget" for your site. Removing pages might make reindexing of the rest of the pages faster.
- Removing pages that are cannibalising each other might help the main page for the keywords to rank higher.
- Google is not very fond of "thin wide" content. Removing low quality pages can be helpful, especially if you don't have a lot of links to your site.
- Trimming the content of a website could make it easier for people and Google to understand what the site is about and help them find what they are looking for.
Google search ranking involves lots of neural networks nowadays.
There is no way the PR team making that tweet can say for sure that deleting old content doesn't improve rank. Nobody can say that for sure. The neural net is a black box, and its behaviour is hard to predict without just trying it and seeing.
Speaking from experience as someone who is paid for SEO optimization there's a list a mile long of things Google says "doesn't work" or you "shouldn't do" but in fact work very well and everyone is doing it.
I remember these kinds of sources, right from the inside in the Matt Cutts era 15+ years ago, encouraging and advising so many things which were later proven not to be the case. I wouldn't accept this just because it was written by the official guide.
Google says so many things about SEO which are not true. There are some rules which are 100% true and some which they just hope their AI thinks they are true.
There's an awful lot of SEO people on Twitter that claim to be connected to Google, and the article he links on the Google domain as a reference doesn't say anything on the topic that I can find. I'm reluctant to call that an official source.
Journalist here. Danny Sullivan works for Google, but spent nearly 20 years working outside of Google as a fellow journalist in the SEO space before he was hired by the company.
1st paragraph is correct, 2nd not quite - Matt Cutts was a distinguished engineer (looking after web spam at Google) who took on the role of the search spokesperson - it's that role Danny took over as "search liaison".
No. But it's also complicated, as Matt did things beyond web spam. Matt worked within the search quality team, and he communicated a lot from search quality to the outside world about how Search works. After Matt left, someone else took over web spam. Meanwhile, I'd retired from journalism writing about search. Google approached me about starting what became a new role of "public liaison of search," which I've done for about six years now. I work within the search quality team, just as Matt did, and that type of two-way communication role he had, I do. In addition, we have an amazing Search Relations team that also works within search quality, and they focus specifically on providing guidance to site owners and creators (my remit is a bit broader than that, so I deal with more than just creator issues).
I'm the source. I officially work for Google. The account is verified by X. It's followed by the official Google account. It links to my personal account; my personal account links back to it. I'm quoted in the Gizmodo story that links to the tweet. I'm real! Though now perhaps I doubt my own existence....
He claims to work for Google on X, LinkedIn, and his own website. I am inclined to believe him because I think he would have received a cease and desist by now otherwise.
He claims to work for Google as search "liaison". He's a PR guy. His job is to make people think that Google's search system is designed to improve the internet, instead of it being designed to improve Google's accounting.
I actually work for our search quality team, and my job is to foster two-way communication between the search quality team and those outside Google. When issues come up outside Google, I try to explain what's happened to the best I can. I bring feedback into the search quality team and Google Search generally to help foster potential improvements we can make.
Yes. All this is saying is that you do not write any code for the search algorithms. Do you know how to code? Do you have access to those repos internally? Do you read them regularly? Or are you only aware of what people tell you in meetings about it?
Your job is not to disseminate accurate information about how the algorithm works but rather to disseminate information that google has decided it wants people to know. Those are two extremely different things in this context.
I work on these kind of vague "algorithm" style products in my job, and I know that unless you are knee deep in it day to day, you have zero understanding of what it ACTUALLY does, what it ACTUALLY rewards, what it ACTUALLY punishes, which can be very different from what you were hoping it would reward and punish when you build and train it. Machine learning still does not have the kind of explanatory power to do any better than that.
No. I don't code. I'm not an engineer. That doesn't mean I can't communicate how Google Search works. And our systems do not calculate how much "old" content is on a site to determine if it is "fresh" enough to rank better. The engineers I work with reading about all this today find it strange anyone thinks this.
Probably not; anyone can claim to work for these companies with no repercussions, because is it a crime? Maybe if they're pricks that lower these companies' public opinion (libel), but even that requires a civil suit.
But lying on the internet isn't a crime. I work for Google on quantum AI solutions in adtech btw.
He’s been lying a long time, considering that he’s kept the lie up that he’s an expert on SEO for nearly 30 years at this point, and I’ve been following his work most of that time.
But Texas is also the size of the four states to its east. Georgia is a coastal state. If you pull from Brunswick, GA and send it to just outside St. Louis, that distance is covered within the single state of Texas.
That said, we don't pull sea water to supply water for industry - desalination is too costly.