I have exactly this problem. Beej's Guide to Network Programming is indexed just fine. Beej's Guide to C won't index.
The automated tool says it's in violation of some unnamed rule, but I can't figure out which. There's zero SEO, tracking, or ads, and the content is educational and G-rated.
All the other guides index just fine.
I asked for a review and they came back with the same ambiguous message. Eventually I just gave up.
Recently I split the C guide in two. I'll have to check to see if that made any difference.
But it left a bad taste, and now I don't trust Bing or DDG to provide complete results. Google's overrun with spam, but at least my stuff actually shows up on Startpage.
Where "weird" involves words like "licking" and "party" (not saying it's not a bug, just that it's a statistics vs actual language understanding bug in a feature and not absence of that feature). I bet there's no way to compose all of the words "spatula", "serotonin", "pion" and "deconstruction" along with words like is/a/an/of/how/what/when that would turn safe search off, despite any query of this format would pretty weird.
Took me a while to figure it out, but yeah - seems "Beej" is pronounced as "(Bee)(j)", which matches the pronunciation of "BJ". But I don't think it's relevant to site indexing, unless search engines started to take homophones into account.
EDIT: but maybe they did, ever since voice assistants became a thing?
If I search "Beej" on Google without Safesearch enabled, I get 14.3M results, if I turn on the Safesearch filter, it still returns 14.3M results.
If I repeat the same experiment with "blowjob", it's 1.5B results vs 23M.
If I search for "Beej" on Bing with SafeSearch Off, I get 2,840,000 results, while with Safesearch on Strict, I get 2,800,000 results. I couldn't search for "blowjob" at all with Safesearch on Strict.
Wow it's beej. I owe you rather a lot of beers, Guide to Network Programming is directly responsible for my entire career. Sorry for the low value post! :}
Backlinks are links to your website used by algorithms like PageRank^1 to weight how important your site is. Roughly speaking, more links, especially from sites with their own high PageRank, means a higher rank.
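To make "more links, especially from high-PageRank sites" concrete, here's a toy sketch of the core PageRank iteration (heavily simplified; real engines handle dangling pages, spam signals, and much more):

```js
// Toy PageRank power iteration: links[i] lists the pages that page i links to.
// The core idea is just that a page's score is built from the scores of the
// pages linking to it.
function pageRank(links, iterations = 20, damping = 0.85) {
  const n = links.length;
  let rank = new Array(n).fill(1 / n);
  for (let it = 0; it < iterations; it++) {
    const next = new Array(n).fill((1 - damping) / n);
    for (let i = 0; i < n; i++) {
      for (const j of links[i]) {
        next[j] += damping * rank[i] / links[i].length; // i passes a share of its rank to j
      }
    }
    rank = next;
  }
  return rank;
}

// Pages 0 and 1 link to page 2; page 2 links back to 0. Page 2 ends up highest.
console.log(pageRank([[2], [2], [0]]));
```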
Some other website linking to your website. The general premise is that if other sites with some known reputation are linking to you, you have a bit more credibility than a completely isolated site that nobody else references.
I may have found the answer, and I've seen this before (it happened to me once). It's when a different spam site copies your content wholesale, and a search engine decides they're the "original" site, and you're the spammy copycat.
Because if you put the headlines (in quotes) from two of his recent articles into Bing, e.g. either "Megan Smith explaining the General Magic prototyping process" or "Denialists, Alarmists, and Doomists", both point as their first result to a URL starting with "https://www.scien.cx", which seems to be the spam site with a copy of each article. (The URL isn't loading right now, however, when I try to visit.)
How to fix it really depends on what techniques they're using to mirror your site, of which there are many.
It feels like the Internet is a more hostile place than ever for small-time websites. You get squeezed from below by wily criminals, and crushed from above by careless megacorps who want to filter out anything that doesn't make them money.
Nowadays, when I search for things, the results are often clearly pages that have come from a program scraping sites and then merging them into one page. You can tell because the pages are not really coherent and quickly start to repeat themselves. I assume they are getting money through ads on the pages, though I never actually see the ads because of my blockers. I wish there were a button in the browser that I could click to report the page as spam to all search engines.
That's precisely why I started to use and pay for Kagi - it fetches search results from Google and Bing, but lets you prioritize, deprioritize, pin, or block specific domains in your search results.
I'm still surprised no one else seems to be offering this feature.
Seems reasonable. If enough human-like users (with gmail accounts, yt activity, and other indicators) ban a particular site enough times that should offer some evidence that it's low quality too.
I don't think it reports to any other search engines, and am not sure it affects any but your own subsequent results, but Kagi has a feedback scale on each result: "block", "lower", "normal", "raise", and "pin".
> It feels like the Internet is a more hostile place than ever for small-time websites. You get squeezed from below by wily criminals, and crushed from above by careless megacorps who want to filter out anything that doesn't make them money.
The problem is that the two work hand-in-hand, thanks to the advertising driven search model, and the search engines owning the main advertising platforms.
It should be easy for search engines to identify an original site from the SEO spammer rip-offs - the original site is going to have no adverts (or certainly fewer) while the SEO spammer copies are going to be covered in adverts. The problem is that the search engines have no incentive to do so, in fact if anything they have the incentive to send people to the sites with more adverts.
And of course the whole problem has been created by the search engines in the first place - there would be no point in SEO spammers making advert-laden ripoff sites if it wasn't to rake in advertising revenue.
No more hostile than the real world; we are just finding out it is a reflection of our world. The difference, of course, is the global interconnectedness, which magnifies the celebrities but also the crooks.
> What should you do when another site copies your content like this then?
Have we gotten to the point where websites (and their content) need to be verified like Twitter, Instagram, Facebook, and TikTok do for personal accounts?
If so, will search engines be the ones verifying - using this as a new revenue scheme (with the dangers inherent in this, i.e. pay to be listed or ranked higher)?
Can confirm, authentication files don't help. Bing is seriously broken, and their support is anything but helpful.
This is an excerpt from one of their replies:
> "Thank you for your patience during our investigation. After further review, it appears that your site did not meet the standards set by Bing to remain indexed the last time it was crawled. To ensure that this was not a false flag, I also escalated the issue to our Product Team and they manually reviewed your site and confirmed that it is in violation of our Webmaster Guidelines detailed here:
We are not able to provide specifics for these types of issues but we recommend that you review our Webmaster Guidelines, especially the section Things to Avoid, and thoroughly check your site for any deliberately or accidentally employed SEO techniques that may have adversely affected your standing in Bing and Bing-powered search results."
Before snarking, please check that link and the long list of things - I did not find my website https://linmob.net to be violating their "things to avoid" list.
That was a reply to my first ticket requesting re-indexation, later tickets only got what I would call "non-replies".
No, we've gotten to the point where Bing and DDG need to be disrupted. The answer is that these companies are ruining things, not that DNS and simple search are wrong.
Various things, from addressing the problem directly if possible (block the IP address range they use to scrape your content, insert JavaScript that strips the content client-side depending on the domain it's being served from), to changing the search engine's behavior (canonical meta tags, contacting the search engine to let them know, building up links on the web so your site ranks higher).
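The canonical tag route is just a one-line link element in the page head pointing at the URL you consider authoritative (domain and path here are placeholders):

```html
<!-- Tells crawlers which URL is the authoritative copy of this article -->
<link rel="canonical" href="https://yourdomain.example/posts/my-article/">
```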
The more sophisticated and popular the copycat site is (scraping from a distributed network, stripping most HTML tags, etc.), the harder it becomes, and the only remaining option is to contact the search engine and hope they can manually mark your domain as the authoritative one. Your success may vary according to your popularity/importance.
You may have misunderstood, it doesn't have anything to do with doorway pages which are about presenting extra highly redundant content.
I'm talking about content on legitimatesite.com including JavaScript that detects whether it's being loaded on any other domain and, if so, erases the entire article's HTML from the DOM.
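Something along these lines (a rough sketch; the domain, and the assumption that the article lives in an article tag, are placeholders):

```js
// Rough sketch: if this page is being served from anywhere other than the
// real domain, blank out the article so naive mirrors show nothing.
if (window.location.hostname !== "legitimatesite.com") {
  const article = document.querySelector("article"); // assumes the article is in an <article> element
  if (article) article.innerHTML = "";
}
```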
Obviously this is easily defeated by stripping out JavaScript, so it's useful only for very primitive mirroring.
Hmmm, yeah "doorway page" probably wasn't the term I was looking for. However, that would almost definitely be regarded as some sort of spam or SEO tactic by crawlers and lead to further penalizations, which was my point.
In those cases, you can send one to DDG and Bing, which are in the US. It won’t affect the actual website or other search engines, but it’s better than nothing.
Indeed you can. I've had similar issues before (a site scraping content, though apparently from dev.to instead of my own domain and a YouTube channel making a 'video' with text and TTS from a post).
In the first case, I sent a DMCA to Google & Bing as well as to Cloudflare. Cloudflare responded by giving the name of the actual host, and I sent another DMCA to that host (they were US based, otherwise YMMV). The content was delisted from search engines and removed from the site (the content, not the whole site, even though the site was made up entirely of verbatim scraped content).
Bottom line is you can send a DMCA notice to search engines and it appears to be effective. Actually, in case search engines demote sites like this in some way, I would send the DMCA notice to search engines _first_, because if the content gets removed from the original site they may not be able to verify the duplicate content.
IANAL, but I think DDG could come back and claim that their service doesn't copy the content or that their snippet is fair use, so a DMCA wouldn't apply to them.
If that happens I'd use language about the exclusive "public display" right that you have over your work.
I'm not aware of this claim being used for a dmca, but I'd like to see how such a claim turned out.
IANAL either, but the DMCA has been used (by the RIAA, MPAA, and friends) to take down pirated content on Google. I'd assume the same arguments would apply here. It can't hurt to try.
Linking to a site that infringes copyrights may make the linker liable on the grounds that they are committing contributory copyright infringement. It's cheaper for search engines to delist sites than fight that battle--particularly if their corporate masters also rely on licensing media for distribution themselves.
When you first hear a term used, I think it's natural to just try to figure out what it means in context without looking up the official definition (I've caught myself doing this subconsciously before).
Imo a logical interpretation of "shadow ban" would be when you are banned but they didn't tell you they banned you, and regular "ban" is when they tell you you were banned. It makes enough sense that people don't think they need to look it up to confirm.
edit: funny enough, I did double-check the wikipedia page to make sure my understanding was correct, but upon reading further it does acknowledge the expanding of the definition: https://en.wikipedia.org/wiki/Shadow_banning
Shadow banning is when they are doing extra steps to prevent you from realizing that you are banned. Usually giving you the impression of a working service, while everybody else will not be able to see your contributions.
Banning is just stopping the service for you; whether they tell you actively or not depends on the service. No search service actively informs you about the usage of your data, and neither do they tell you when they have stopped servicing you.
What makes "invisible to everyone but you" a more "logical" interpretation of "shadow ban" than "banned but didn't tell you"? Are shadows invisible to everyone but you?
When a user is (normal) banned, they are normally told and their access/ability is restricted.
When a user is shadow banned they are normally not told they are banned and are still able to access and perform functions. I think this secrecy is where the word "shadow" comes in. The user is in the dark about the ban...
Elon wants to redefine the term as well but for the purpose of a coordinated witch hunt. People like using shadow ban because it sounds more malicious vs. content that is no longer actively promoted by a company. It's hard to claim you're a victim if the reason is you're just not that interesting or popular.
There is a difference between "content that is no longer actively promoted by a company" and "the company explicitly put your content on a no-show list". Sure, it's not shadow banning but "just" stealth banning (as you're still not informed that you got banned, or of the reason for it), but banning nonetheless.
Is it even a ban? Your followers can still see it. People that don't follow that go to your profile can still see it. Elon says no one has the right to "freedom of reach" but never describes this concept as a ban.
If you get delisted from search and don't show up on exact match searches then I'd say that they banned you in some way. That is what Twitter does and what Bing did here. We call what Twitter does "shadow ban" since they still show you to yourself, to people subscribing to you and people with a direct link, but nobody else can find you, it isn't a total shadow ban but they are doing something very similar to a shadow ban.
What shall we call that instead, "bubble ban", since it is like a shadow ban for specific bubbles and for everyone else it is as if that bubble never existed on the site?
Delisted would imply that you wouldn't see your listing yourself, or that when you try to post a new listing they would stop you. Shadow banning is when you post a listing and it says the listing was successfully posted, but it wasn't really posted.
Twitter does that; it doesn't tell you that what you post won't reach the people you are posting to. Most people don't have many followers, they just reply to tweets, and those replies will show up for the original tweeters. Twitter shadow banning you means that the items you post no longer show up as responses. Sure, the small subset that follows you can still see them, but 99.99999% of Twitter won't see it, so it is 99.99999% of a shadow ban.
If they told you that any of this happened anywhere it wouldn't be a shadow ban.
> Delisted would imply that you wouldn't see your listing yourself, or that when you try to post a new listing they would stop you.
It does not. Delisting = removed from list. Deindexed: removed from index.
Banning implies denying access. There's an important distinction there with Twitter (user authenticates and publishes through Twitter) vs Google Search (indexes public sites). If Google silently stops showing your Google Ads and doesn't inform you, that would be more appropriately described as "shadow banning".
There's no "user" or "account" in a web search engine so the term doesn't really fit.
There is no such thing as a total or less-than-total shadow ban. If you are visible to anyone with a direct link, you aren't shadow banned.
Bubble ban is also poor verbiage, because not appearing in anyone's feed or search results is the default condition, not inherently a punishment or redaction.
A search is inherently a selection process and it's perfectly valid to say some content isn't fit to appear anywhere in a listing.
It isn't the same at all, since Reddit tells you about it and it still shows up in search etc. Shadow quarantined, yeah that works. The word "shadow" comes from not telling the user that anything is different, and Twitter doesn't tell you when this happens, it just delists your posts from everywhere except your followers.
Anyway, it is extremely disingenuous to say that Twitter doesn't shadow ban.
Consider this scenario: Person A is shadow moderated by Twitter and has a friend B. A replies to B's tweet, but B doesn't follow him; they just talk. Now B will never see A's tweet, A has no idea that B can't see it, and neither does B; both think they can see each other. What is this if not "shadow banned"? Twitter makes these people post things thinking they will be seen, but they won't be, wasting their time and potentially hurting their mental health since nobody responds.
For all intents and purposes this is "shadow banning"; when you hurt people like this but say that you absolutely don't shadow ban, you are being so dishonest that I'd still call it a lie.
Well if you are silently placed on a hide list (by manual human intervention without you being able to know without a third party), then by all means that is a shadow ban.
In this case, a shadow ban would mean that if the owner of this website searched for his site on Bing/DDG, it would appear normally to him, but would be hidden from everyone else.
Nope. Still wrong. I understand there is some way it feels like that from your perspective but DDG and Bing don't own your feed. So they are hiding nothing from you. They are 100% ~up-front~ (edit) consistent about not choosing to show your site (ban it) and the fact that they don't control your feed doesn't make it any more accurate to apply the word "shadow" to their ban.
I have become what I hate most. Someone who didn't read the article. I've heard too many instances of the term being misapplied and jumped to a conclusion based on the discussion.
If you use Bing webmaster tools (a logged-in account for the use of the domain owner/content creator) and you can see indications that Bing indexed the content and no indications that errors preventing it from being eligible to show then it is certainly at least closer to a shadow ban than I originally thought.
Still, any reference to a "feed" entirely misses the point unless Bing is the one also serving that feed. I can't see any evidence that Bing displays a feed to the poster.
He’s kinda Trump-esque in his willingness to just redefine words (and basically pretend the original use never existed) to his own benefit and have his followers jump on board. Another great Elon example of this is Tesla’s “Autopilot.”
I really don't think the willingness to refine words willy-nilly is a behavior I would associate first with Trump or Elon. Using the wrong word and insisting it means the same thing, maybe, but certainly not redefining words.
Trump-esque? what words has trump even done that for? Changing the meaning of words to fit what you want has been happening long before the printing press even was invented by people in power, vying for power, reporting on things they have a biased interest in or just wanting a quick win on an argument. Not everything needs to be a trump analogy.
Yeah I guess you’re right here. It’s more like they both go beyond ignoring the real world and just say whatever nonsense they feel like. Then they have a bunch of simps who won’t even question it.
People forget: for a very, very brief moment during the 2016 US presidential campaign, the problem of actual "fake news" was in the spotlight. Literal fake news, where a pseudo-news site would put up stories that was entirely fictional, to promote a political POV, and present them as news. Trump swiftly neutralized the term by co-opting it and using it, as you say, to mean "real news I don't like". But for just a moment in time, it was originally used to describe actual fiction posing as news.
Is there also a term for being algorithmically suppressed on a social media platform, I wonder? I.e. a much more subtle, harder to detect mechanism whereby the algorithms ensure you get some exposure, but never as much as other, unsuppressed people would get based on similar activity. Or only exposure to a certain limited subset of the graph based on some metrics (e.g. just your 'friends', so no one points out you are effectively shadow banned).
Could call it a RoboBlock in popular language, with reference to a roadblock. An obstruction to engage enforced by our machine overlords. Or better maybe, a RoboGag.
That is what is happening: you think you are not showing up because of bad SEO or better results. You have to find out through experimentation that you are restricted. The moderator didn't let you know that they have taken punitive action against you.
Indeed. Some people seem to be trying to make it apply any time the top result of an algorithmic ranking isn't what they think it should be (eg if they have been deboosted rather than shadow banned).
I suspect OP used it in the way that “I believe all my posts are showing up in the places that they should, but unbeknownst to me, they are being suppressed.” In this instance, I could see a “search engine shadowban” being an appropriate moniker.
I think a good test of whether this application of the term makes any sense is: could any search engine ban ever not be a shadow ban? We already have a term: ban. Let's just use that one and stop conflating things and being unnecessarily imprecise and incendiary. It helps certain parties' (edit: plural possessive) agenda but does not help us clearly communicate.
The author is extending the concept to include an unannounced ban. Why would Bing warn him anyway, since there is no user account? Welcome to Cancelbannia.
Are you talking about normal sites, not search engines? If so, I don't really agree. I don't think you have to go through that much effort as part of implementing a shadow ban. If posts show up to the user in most places, then that's good enough to call it a shadow ban.
Only because this is a search engine do I say this solidly isn't a shadow ban.
Well, I don't understand the distinction you're making, as search doesn't seem special here to me. The key feature that makes something a shadow ban is that everything looks normal to the shadow-banned person, but the rest of us don't see their stuff or see it in some restricted way. That seems like it would apply equally to search, forum comments, etc.
This triggered me to DuckDuckGo my own site, and immediately I noticed the top result is someone rehosting my OSS on a page loaded with pages of crap SEO content.
I've written and freely shared a lot of OSS code and I plan to continue doing that. All MIT licensed.
But I keep thinking that this is a limitation of MIT. It's written with the purest of good intentions but without any way to prevent bad actors from exploiting that.
I wonder if there's a better kind of license that provides a close level of freedom but would prevent some of the most obvious exploitations (e.g. packaging and selling the code, or reposting on a site with advertising)?
There's copyleft licensing like LGPL but perhaps that goes too far in the other direction and besides, it seems to be very unpopular amongst web developers so I'm afraid if I release any LGPL code it won't get used much or attract contributors.
Every open source license will by definition allow packaging and selling the code or reposting it on a site with advertising; permitting that is one of the key freedoms. If you're forbidden from using the code for commercial purposes, then that's not an open source license. See https://opensource.org/osd
MIT will allow that, Apache/BSD/Mozilla licenses will allow that, GPL and LGPL will allow that, Creative commons CC-BY and CC-BY-SA will allow that - the only difference is extra conditions e.g. GPL will require the seller/redistributor of the code to keep the same license, CC-BY will require leaving attribution to the author, etc, but all of them will allow someone else to redistribute the code for commercial purposes.
>One “out there” reason I can think is that I use Amazon Affiliate links on my Bookshelf and my /Uses page and that triggers a shadow ban?
It's probably not the reason, but it's worth noting that the author is using Amazon affiliate links in violation of Amazon and FTC rules because they're not disclosing the fact that they profit from purchases through their links.
Per Amazon:
>Anytime you share an affiliate link, it's important to disclose that to your audience... you must (1) include a legally compliant disclosure with your links and (2) identify yourself on your Site as an Amazon Associate with the language required by the Operating Agreement.
>As for where to place a disclosure, the guiding principle is that it has to be clear and conspicuous... Consumers should be able to notice the disclosure easily. They shouldn’t have to hunt for it.
For some reason, Beej’s Guide to C Programming is also banned from Bing (and consequently DDG) [1], with the standard robotic non-explanations given when the author asked, even though the rest of the site is not.
He says in the link it’s specifically the C guide, the rest of the website is fine. Though... yeah, DDG queries like “beej c guide strlen” give reasonable results for me, if with an unjustifiably high-ranked position for the mirror at http://docs.hfbk.net/beej.us. Bing ones only include the mirror and the other guides (and a Scribd-hosted PDF copy, of all things, as the first result below a huge navigation card referring to https://beej.us/guides but without the C guide among the links).
Incidentally, what is the "legal status" of Scribd hosting a partial preview of works that they tell you are BY-NC (attribution/non-commercial use only) and telling you to become a member to be able to view the whole thing? Is that not a commercial use?
You've indicated that you've used Bing's tools to see if your website has been indexed but are silent as to whether you've actually manually submitted your site to be indexed by Bing using their url submission tool [0]. If you do submit the URL and then, after a decent interval, your site still doesn't show up then there might be something to your claim.
First, he sends it with a "content-type: application/xml" header, in contrast to most sites that send it with "content-type: application/atom+xml". This seems to have the nice effect that it renders in Firefox instead of opening the usual "What should Firefox do with this file?" popup.
Secondly, he provides this nice header text "Yahaha, you found me! This is my RSS feed.". It seems to be fetched via this part of the code:
Pretty nice. Are those best practices? Or will "content-type: application/xml" mess with users who have a native feed reader installed and expect the reader to kick in when they click on a feed url?
XSLT is a template language where you match snippets of XML using XPath expressions and output... anything else, in this case HTML using templates that can make use of the attributes, inner content, etc. of the captured XML portion.
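A minimal sketch of the idea, assuming the feed points at a stylesheet via an xml-stylesheet processing instruction (the file names and markup here are made up for illustration, not necessarily how beej.us wires it up):

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- in the feed: tell browsers to render it through rss.xsl -->
<?xml-stylesheet type="text/xsl" href="/rss.xsl"?>
<rss version="2.0">
  <channel>
    <title>Example Blog</title>
    <item><title>First post</title><link>https://example.com/first</link></item>
  </channel>
</rss>
```

and rss.xsl, which turns that into a simple HTML page with the greeting at the top:

```xml
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="html"/>
  <!-- match the channel element and emit an HTML page for it -->
  <xsl:template match="/rss/channel">
    <html>
      <body>
        <p>Yahaha, you found me! This is my RSS feed.</p>
        <ul>
          <xsl:for-each select="item">
            <li><a href="{link}"><xsl:value-of select="title"/></a></li>
          </xsl:for-each>
        </ul>
      </body>
    </html>
  </xsl:template>
</xsl:stylesheet>
```

If the feed is served with a content-type the browser treats as generic XML, the browser runs the transform and shows the HTML instead of the raw feed, which lines up with the application/xml observation above.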
> "An elegant weapon for a more... civilized age."
One of the ideas people had back then was having a product catalog in XML that your tool would give you, and that you could upload to your website and display as a nice website via XSLT.
It was such a fascinating technology when I learned about it, but I’m not sure if it ever saw any serious use? Maybe in enterprise?
Well, I'm having the ... pleasure ... of learning it now, as much of CheetahMail's templating system uses it (at least as initially configured by their professional services team; the company I work for just switched over and had them do the initial implementation). Just a data point.
It had some use. I'm pretty sure IE used to support client side transformations when rendering XML that linked an XSLT file. I know one SaaS used to do that before they rearchitected their front end.
IIRC all browsers supported it; I played around with it back then. I just wondered about in-the-wild use, though the other comments say there is/was some.
You can submit a ticket to Bing via their Webmaster Tools website. I've done it in the past and a real human did respond at the time. In my experience Bing will straight-up deindex full websites for unknown reasons, while Google will penalize your ranking but leave you searchable if their algo feels you deserve it.
Had the same problem due to negative SEO campaigns by naughty competitors. Wrote to Bing Webmaster Tools Support team (https://www.bing.com/webmasters/help/webmaster-support-24ab5...) and after a lengthy process got a response that ”the issue” had been addressed.
It's been a few months since, and my website is indeed back in the search results, so I advise whoever is having this problem to reach out to Bing.
It would be quite surprising if reverse records were the reason. The vast majority of sites certainly don't have them pointing back. Impossible to do with many providers and probably basically all CDNs.
Negative SEO scammers use intentional search-policy violations to push down the rank of perceived competitors for a few weeks.
While it is more likely the poster will get a few people to check on the situation and naively drive up page rank... a personal site is just a rounding error for traffic in a long-tail distribution known as the modern web.
Most search engines will correlate user-side telemetry traffic against crawler and web stats. I.e. if the bots tend to prefer your site for abnormal reasons, the ranking algorithm may blacklist a set of signatures, domains, and IPs for several weeks as punishment.
Note too, it is still common for a human employee to manually check a suddenly popular site that pops up out of obscurity. i.e. this catches the more sophisticated cheats, and may have legal repercussions in severe cases.
In summary, if you mess with modern search engines, then expect the ban hammer to fall eventually. ;)
At the bottom of the page there is a comment that says "Some results have been removed", but unlike Google you can't see them. Would be interesting to know if the domain is among the removed results or if the site has not just been indexed yet.
HN taught me first hand how horrible shadow banning is. You all tolerate this here, so it's mighty hypocritical of you to criticize Bing.
I'll say it again: It comes down to how you treat people. Treat others the way you want to be treated. No one wants to be shadowbanned and we can all agree it is a decidedly cowardly and cruel thing to do.
And you can't use "quality" or anything short of being coerced as an excuse. Techniques and technologies to moderate people without shadowmodding at scale are mot just there but very well established. A site for technologists has no excuse to shadowmod other than elitism amorality.
HN's "shadow bans" aren't hidden and allow you to keep posting. Isn't that more tolerant than most other forms of banning?
I have showdead turned on. I see fresh accounts that are automatically dead on each comment which are legit contributions, probably because they're using Tor or a widely abused VPN; that's the only common miscarriage of HN moderation I regularly see, and I vouch for those comments. Those accounts should be in the clear after a week or something like that.

I have a couple of comments I feel shouldn't be dead, but I can see how others would feel differently, and I have, I believe, 3 dead comments out of >3000 (many of which expressed views others vocally disagreed with, and I generally feel my views are not particularly popular on HN). But most of the dead comments I see are obviously harmful to discussion. The last time I saw hate speech from a banned HN account was earlier today. What is it I'm missing here?
It's all well and good to say, treat others as you'd like to be treated. But I don't want to be harassed either. So I forgo harassing people sure. But what's to be done about the people harassing me?
Are you perhaps unaware of a phenomenon called the paradox of tolerance where, if you extend universal tolerance to everyone, including those who use their speech to silence others (through threats, harassment, shouting over people, poisoning the well, etc), you still end up with a forum in which not everyone can share their ideas?
Dead comments are not shadowbanned, just banned. I wasn't referring to that. And the problem with shadow moderation is that it happens without attempting to tell the person to change their behavior, so people who would cooperate don't get a chance to do that. If someone does not cooperate and acts with bad or false intentions, you should ban them explicitly. I never suggested universal tolerance; I only suggested transparent moderation.
The only people I've seen get banned without warning are transparent trolls (usually people who make an account to make each comment, with that comment being trolling or hate speech, because they understand that that will be banned). People who aren't purely trolling do seem to get a public message from dang. Does your experience differ? How?
No mention of robots.txt on the post, nor here in the comments.
After a bit of poking around, it would appear Bing may need an allow block to crawl. I don't know what DDG does, but the author's site effectively has nothing in the robots.txt file other than a commented-out Disallow block. From doing this in the past, I suggest including something like the following:
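(A minimal sketch of what I mean; the sitemap URL is a placeholder and the rules are just a sensible default, not the author's actual file.)

```
# Explicitly allow Bing's crawler and everyone else, and point at the sitemap
User-agent: bingbot
Allow: /

User-agent: *
Allow: /

Sitemap: https://example.com/sitemap.xml
```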
Okay, so this prompted me to look at Bing Webmaster Tools. It looks like they recognise my sitemap file, but haven't bothered to index it in nearly a decade. "Last processed: 6/6/2014". I know I don't post new content that frequently, but... WTF?
The background image doesn't render on the homepage in Firefox. This makes the blog links appear as very light blue text on a white background. Makes me wonder how it renders for the crawler and whether it's getting flagged for invisible content.
I wonder if we need basic regulation, given that search is a public good, including best practices such as search engine transparency. If they index you, they have to say something about the parameters and the results, etc.
Tangential question to the site index: how does one get the TLD and ccTLD zone data? Apparently it's not that open. There are some ccTLDs which give this info, for example .ch.
"Southern gentleman" conjures up the image of a white-suited Kevin Spacey in "Midnight in the Garden of Good and Evil". Not sure it makes for a strong claim to respectability/morality, both fictionally and IRL ;-p
M$ is such a black box. At some point I got on their spam list for no reason (I'm still not 100% certain why), and that's the reason I stopped self-hosting email.