I have exactly this problem. Beej's Guide to Network Programming is indexed just fine. Beej's Guide to C won't index.
The automated tool says it's in violation of some unnamed rule, but I can't figure out which. There's zero SEO, tracking, or ads, and the content is educational and G-rated.
All the other guides index just fine.
I asked for a review and they came back with the same ambiguous message. Eventually I just gave up.
Recently I split the C guide in two. I'll have to check to see if that made any difference.
But it left a bad taste, and now I don't trust Bing or DDG to provide complete results. Google's overrun with spam, but at least my stuff actually shows up on Startpage.
Where "weird" involves words like "licking" and "party" (not saying it's not a bug, just that it's a statistics vs actual language understanding bug in a feature and not absence of that feature). I bet there's no way to compose all of the words "spatula", "serotonin", "pion" and "deconstruction" along with words like is/a/an/of/how/what/when that would turn safe search off, despite any query of this format would pretty weird.
Took me a while to figure it out, but yeah - seems "Beej" is pronounced as "(Bee)(j)", which matches the pronunciation of "BJ". But I don't think it's relevant to site indexing, unless search engines started to take homophones into account.
EDIT: but maybe they did, ever since voice assistants became a thing?
If I search "Beej" on Google without Safesearch enabled, I get 14.3M results, if I turn on the Safesearch filter, it still returns 14.3M results.
If I repeat the same experiment with "blowjob", it's 1.5B results vs 23M.
If I search for "Beej" on Bing with SafeSearch Off, I get 2,840,000 results, while with Safesearch on Strict, I get 2,800,000 results. I couldn't search for "blowjob" at all with Safesearch on Strict.
Wow it's beej. I owe you rather a lot of beers, Guide to Network Programming is directly responsible for my entire career. Sorry for the low value post! :}
Backlinks are links to your website used by algorithms like PageRank^1 to weight how important your site is. Roughly speaking, more links, especially from sites with their own high PageRank, means a higher rank.
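To make "more links, especially from high-PageRank sites" concrete, here's a toy sketch of the core PageRank iteration (heavily simplified; real engines handle dangling pages, spam signals, and much more):

```js
// Toy PageRank power iteration: links[i] lists the pages that page i links to.
// The core idea is just that a page's score is built from the scores of the
// pages linking to it.
function pageRank(links, iterations = 20, damping = 0.85) {
  const n = links.length;
  let rank = new Array(n).fill(1 / n);
  for (let it = 0; it < iterations; it++) {
    const next = new Array(n).fill((1 - damping) / n);
    for (let i = 0; i < n; i++) {
      for (const j of links[i]) {
        next[j] += damping * rank[i] / links[i].length; // i passes a share of its rank to j
      }
    }
    rank = next;
  }
  return rank;
}

// Pages 0 and 1 link to page 2; page 2 links back to 0. Page 2 ends up highest.
console.log(pageRank([[2], [2], [0]]));
```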
Some other website linking to your website. The general premise is that if other sites with some known reputation are linking to you, you have a bit more credibility than a completely isolated site that nobody else references.
I may have found the answer, and I've seen this before (it happened to me once). It's when a different spam site copies your content wholesale, and a search engine decides they're the "original" site, and you're the spammy copycat.
Because if you put the headlines (in quotes) from two of his recent articles into Bing, e.g. either "Megan Smith explaining the General Magic prototyping process" or "Denialists, Alarmists, and Doomists", both point as their first result to a URL starting with "https://www.scien.cx", which seems to be the spam site with a copy of each article. (The URL isn't loading right now, however, when I try to visit.)
How to fix it really depends on what techniques they're using to mirror your site, of which there are many.
It feels like the Internet is a more hostile place than ever for small-time websites. You get squeezed from below by wily criminals, and crushed from above by careless megacorps who want to filter out anything that doesn't make them money.
Nowadays, when I search for things, the results are often clearly pages that have come from a program scraping sites and then merging them into one page. You can tell because the pages are not really coherent and quickly start to repeat themselves. I assume they are getting money through ads on the pages, though I never actually see the ads because of my blockers. I wish there were a button in the browser that I could click to report the page as spam to all search engines.
That's precisely why I started to use and pay for Kagi - it fetches search results from Google and Bing, but lets you prioritize, deprioritize, pin, or block specific domains in your search results.
I'm still surprised no one else seems to be offering this feature.
Seems reasonable. If enough human-like users (with gmail accounts, yt activity, and other indicators) ban a particular site enough times that should offer some evidence that it's low quality too.
I don't think it reports to any other search engines, and am not sure it affects any but your own subsequent results, but Kagi has a feedback scale on each result: "block", "lower", "normal", "raise", and "pin".
> It feels like the Internet is a more hostile place than ever for small-time websites. You get squeezed from below by wily criminals, and crushed from above by careless megacorps who want to filter out anything that doesn't make them money.
The problem is that the two work hand-in-hand, thanks to the advertising driven search model, and the search engines owning the main advertising platforms.
It should be easy for search engines to identify an original site from the SEO spammer rip-offs - the original site is going to have no adverts (or certainly fewer) while the SEO spammer copies are going to be covered in adverts. The problem is that the search engines have no incentive to do so, in fact if anything they have the incentive to send people to the sites with more adverts.
And of course the whole problem has been created by the search engines in the first place - there would be no point in SEO spammers making advert-laden ripoff sites if it wasn't to rake in advertising revenue.
No more hostile than the real world; we are just finding out it is a reflection of our world. The difference, of course, is the global interconnectedness, which magnifies the celebrities but also the crooks.
> What should you do when another site copies your content like this then?
Have we gotten to the point where websites (and their content) need to be verified like Twitter, Instagram, Facebook, and TikTok do for personal accounts?
If so, will search engines be the ones verifying - using this as a new revenue scheme (with the dangers inherent in this, i.e. pay to be listed or ranked higher)?
Can confirm, authentication files don't help. Bing is seriously broken, and their support is anything but helpful.
This is an excerpt from one of their replies:
> "Thank you for your patience during our investigation. After further review, it appears that your site did not meet the standards set by Bing to remain indexed the last time it was crawled. To ensure that this was not a false flag, I also escalated the issue to our Product Team and they manually reviewed your site and confirmed that it is in violation of our Webmaster Guidelines detailed here:
We are not able to provide specifics for these types of issues but we recommend that you review our Webmaster Guidelines, especially the section Things to Avoid, and thoroughly check your site for any deliberately or accidentally employed SEO techniques that may have adversely affected your standing in Bing and Bing-powered search results."
Before snarking, please check that link and the long list of things - I did not find my website https://linmob.net to be violating their "things to avoid" list.
That was a reply to my first ticket requesting re-indexation, later tickets only got what I would call "non-replies".
No, we've gotten to the point where Bing and DDG need to be disrupted. The answer is that these companies are ruining things, not that DNS and simple search are wrong.
Various things, from addressing the problem directly if possible (block the IP address range they use to scrape your content, insert JavaScript that strips the content client-side depending on the domain it's being served from), to changing the search engine's behavior (canonical meta tags, contacting the search engine to let them know, building up links on the web so your site ranks higher).
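The canonical tag route is just a one-line link element in the page head pointing at the URL you consider authoritative (domain and path here are placeholders):

```html
<!-- Tells crawlers which URL is the authoritative copy of this article -->
<link rel="canonical" href="https://yourdomain.example/posts/my-article/">
```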
The more sophisticated and popular the copycat site is (scraping from a distributed network, stripping most HTML tags, etc.), the harder it becomes, and the only remaining option is to contact the search engine and hope they can manually mark your domain as the authoritative one. Your success may vary according to your popularity/importance.
You may have misunderstood, it doesn't have anything to do with doorway pages which are about presenting extra highly redundant content.
I'm talking about content on legitimatesite.com including JavaScript that detects whether it's being loaded on any other domain and, if so, erases the entire article's HTML from the DOM.
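Something along these lines (a rough sketch; the domain, and the assumption that the article lives in an article tag, are placeholders):

```js
// Rough sketch: if this page is being served from anywhere other than the
// real domain, blank out the article so naive mirrors show nothing.
if (window.location.hostname !== "legitimatesite.com") {
  const article = document.querySelector("article"); // assumes the article is in an <article> element
  if (article) article.innerHTML = "";
}
```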
Obviously this is easily defeated by stripping out JavaScript, so it's useful only for very primitive mirroring.
Hmmm, yeah "doorway page" probably wasn't the term I was looking for. However, that would almost definitely be regarded as some sort of spam or SEO tactic by crawlers and lead to further penalizations, which was my point.
In those cases, you can send one to DDG and Bing, which are in the US. It won’t affect the actual website or other search engines, but it’s better than nothing.
Indeed you can. I've had similar issues before (a site scraping content, though apparently from dev.to instead of my own domain and a YouTube channel making a 'video' with text and TTS from a post).
In the first case, I sent a DMCA to Google & Bing as well as to Cloudflare. Cloudflare responded by giving the name of the actual host, and I sent another DMCA to that host (they were US based, otherwise YMMV). The content was delisted from search engines and removed from the site (the content, not the whole site, even though the site was made up entirely of verbatim scraped content).
Bottom line is you can send a DMCA notice to search engines and it appears to be effective. Actually, in case search engines demote sites like this in some way, I would send the DMCA notice to search engines _first_, because if the content gets removed from the original site they may not be able to verify the duplicate content.
IANAL, but I think DDG could come back and claim that their service doesn't copy the content or that their snippet is fair use, so a DMCA wouldn't apply to them.
If that happens I'd use language about the exclusive "public display" right that you have over your work.
I'm not aware of this claim being used for a dmca, but I'd like to see how such a claim turned out.
IANAL either, but the DMCA has been used (by the RIAA, MPAA, and friends) to take down pirated content on Google. I'd assume the same arguments would apply here. It can't hurt to try.
Linking to a site that infringes copyrights may make the linker liable on the grounds that they are committing contributory copyright infringement. It's cheaper for search engines to delist sites than fight that battle--particularly if their corporate masters also rely on licensing media for distribution themselves.
When you first hear a term used, I think it's natural to just try to figure out what it means in context without looking up the official definition (I've caught myself doing this subconsciously before).
Imo a logical interpretation of "shadow ban" would be when you are banned but they didn't tell you they banned you, and regular "ban" is when they tell you you were banned. It makes enough sense that people don't think they need to look it up to confirm.
edit: funny enough, I did double-check the wikipedia page to make sure my understanding was correct, but upon reading further it does acknowledge the expanding of the definition: https://en.wikipedia.org/wiki/Shadow_banning
Shadow banning is when they are doing extra steps to prevent you from realizing that you are banned. Usually giving you the impression of a working service, while everybody else will not be able to see your contributions.
Banning is just stopping the service for you; whether they tell you actively or not depends on the service. No search service actively informs you about the usage of your data, and neither do they tell you when they have stopped servicing you.
What makes "invisible to everyone but you" a more "logical" interpretation of "shadow ban" than "banned but didn't tell you"? Are shadows invisible to everyone but you?
When a user is (normal) banned, they are normally told and their access/ability is restricted.
When a user is shadow banned they are normally not told they are banned and are still able to access and perform functions. I think this secrecy is where the word "shadow" comes in. The user is in the dark about the ban...
Elon wants to redefine the term as well but for the purpose of a coordinated witch hunt. People like using shadow ban because it sounds more malicious vs. content that is no longer actively promoted by a company. It's hard to claim you're a victim if the reason is you're just not that interesting or popular.
There is a difference between "content that is no longer actively promoted by a company" and "the company explicitly put your content on a no-show list". Sure, it's not shadow banning but "just" stealth banning (as you're still not informed that you got banned, or of the reason for it), but banning nonetheless.
Is it even a ban? Your followers can still see it. People that don't follow that go to your profile can still see it. Elon says no one has the right to "freedom of reach" but never describes this concept as a ban.
If you get delisted from search and don't show up on exact match searches then I'd say that they banned you in some way. That is what Twitter does and what Bing did here. We call what Twitter does "shadow ban" since they still show you to yourself, to people subscribing to you and people with a direct link, but nobody else can find you, it isn't a total shadow ban but they are doing something very similar to a shadow ban.
What shall we call that instead, "bubble ban", since it is like a shadow ban for specific bubbles and for everyone else it is as if that bubble never existed on the site?
Delisted would imply that you wouldn't see your listing yourself, or that when you try to post a new listing they would stop you. Shadow banning is when you post a listing and it says the listing was successfully posted, but it wasn't really posted.
Twitter does that; it doesn't tell you that what you post won't reach the people you are posting to. Most people don't have many followers, they just reply to tweets, and those replies will show up for the original tweeters. Twitter shadow banning you means that the items you post no longer show up as responses. Sure, the small subset that follows you can still see them, but 99.99999% of Twitter won't see it, so it is 99.99999% of a shadow ban.
If they told you that any of this happened anywhere it wouldn't be a shadow ban.
> Delisted would imply that you wouldn't see your listing yourself, or that when you try to post a new listing they would stop you.
It does not. Delisting = removed from list. Deindexed: removed from index.
Banning implies denying access. There's an important distinction there with Twitter (user authenticates and publishes through Twitter) vs Google Search (indexes public sites). If Google silently stops showing your Google Ads and doesn't inform you, that would be more appropriately described as "shadow banning".
There's no "user" or "account" in a web search engine so the term doesn't really fit.
There is no such thing as a total or less-than-total shadow ban. If you are visible to anyone with a direct link, you aren't shadow banned.
Bubble ban is also poor verbiage, because not appearing in anyone's feed or search results is the default condition, not inherently a punishment or redaction.
A search is inherently a selection process and it's perfectly valid to say some content isn't fit to appear anywhere in a listing.
It isn't the same at all, since Reddit tells you about it and it still shows up in search etc. Shadow quarantined, yeah that works. The word "shadow" comes from not telling the user that anything is different, and Twitter doesn't tell you when this happens, it just delists your posts from everywhere except your followers.
Anyway, it is extremely disingenuous to say that Twitter doesn't shadow ban.
Consider this scenario: Person A is shadow moderated by Twitter and has a friend B. A replies to B's tweet, but B doesn't follow him; they just talk. Now B will never see A's tweet, A has no idea that B can't see it, and neither does B; both think they can see each other. What is this if not "shadow banned"? Twitter makes these people post things thinking they will be seen, but they won't be, wasting their time and potentially hurting their mental health since nobody responds.
For all intents and purposes this is "shadow banning"; when you hurt people like this but say that you absolutely don't shadow ban, you are being so dishonest that I'd still call it a lie.
Well if you are silently placed on a hide list (by manual human intervention without you being able to know without a third party), then by all means that is a shadow ban.
In this case, a shadow ban would mean that if the owner of this website searched for his site on Bing/DDG, it would appear normally to him, but would be hidden from everyone else.
Nope. Still wrong. I understand there is some way it feels like that from your perspective but DDG and Bing don't own your feed. So they are hiding nothing from you. They are 100% ~up-front~ (edit) consistent about not choosing to show your site (ban it) and the fact that they don't control your feed doesn't make it any more accurate to apply the word "shadow" to their ban.
I have become what I hate most. Someone who didn't read the article. I've heard too many instances of the term being misapplied and jumped to a conclusion based on the discussion.
If you use Bing webmaster tools (a logged-in account for the use of the domain owner/content creator) and you can see indications that Bing indexed the content and no indications that errors preventing it from being eligible to show then it is certainly at least closer to a shadow ban than I originally thought.
Still, any reference to a "feed" entirely misses the point unless Bing is the one also serving that feed. I can't see any evidence that Bing displays a feed to the poster.
He’s kinda Trump-esque in his willingness to just redefine words (and basically pretend the original use never existed) to his own benefit and have his followers jump on board. Another great Elon example of this is Tesla’s “Autopilot.”
I really don't think the willingness to refine words willy-nilly is a behavior I would associate first with Trump or Elon. Using the wrong word and insisting it means the same thing, maybe, but certainly not redefining words.
Trump-esque? what words has trump even done that for? Changing the meaning of words to fit what you want has been happening long before the printing press even was invented by people in power, vying for power, reporting on things they have a biased interest in or just wanting a quick win on an argument. Not everything needs to be a trump analogy.
Yeah I guess you’re right here. It’s more like they both go beyond ignoring the real world and just say whatever nonsense they feel like. Then they have a bunch of simps who won’t even question it.
People forget: for a very, very brief moment during the 2016 US presidential campaign, the problem of actual "fake news" was in the spotlight. Literal fake news, where a pseudo-news site would put up stories that was entirely fictional, to promote a political POV, and present them as news. Trump swiftly neutralized the term by co-opting it and using it, as you say, to mean "real news I don't like". But for just a moment in time, it was originally used to describe actual fiction posing as news.
Is there also a term for being algorithmically suppressed on a social media platform, I wonder? I.e. a much more subtle, harder to detect mechanism whereby the algorithms ensure you get some exposure, but never as much as other, unsuppressed people would get based on similar activity. Or only exposure to a certain limited subset of the graph based on some metrics (e.g. just your 'friends', so no one points out you are effectively shadow banned).
Could call it a RoboBlock in popular language, with reference to a roadblock. An obstruction to engage enforced by our machine overlords. Or better maybe, a RoboGag.
That is what is happening: you think you are not showing up because of bad SEO or better results. You have to find out through experimentation that you are restricted. The moderator didn't let you know that they have taken punitive action against you.
Indeed. Some people seem to be trying to make it apply any time the top result of an algorithmic ranking isn't what they think it should be (eg if they have been deboosted rather than shadow banned).
I suspect OP used it in the way that “I believe all my posts are showing up in the places that they should, but unbeknownst to me, they are being suppressed.” In this instance, I could see a “search engine shadowban” being an appropriate moniker.
I think a good test of whether this application of the term makes any sense is: could any search engine ban ever not be a shadow ban? We already have a term: ban. Let's just use that one and stop conflating things and being unnecessarily imprecise and incendiary. It helps certain parties' (edit: plural possessive) agenda but does not help us clearly communicate.
The author is extending the concept to include an unannounced ban. Why would Bing warn him anyway, since there is no user account? Welcome to Cancelbannia.
Are you talking about normal sites, not search engines? If so, I don't really agree. I don't think you have to go through that much effort as part of implementing a shadow ban. If posts show up to the user in most places, then that's good enough to call it a shadow ban.
Only because this is a search engine do I say this solidly isn't a shadow ban.
Well, I don't understand the distinction you're making, as search doesn't seem special here to me. The key feature that makes something a shadow ban is that everything looks normal to the shadow-banned person, but the rest of us don't see their stuff or see it in some restricted way. That seems like it would apply equally to search, forum comments, etc.
This triggered me to DuckDuckGo my own site, and immediately I noticed the top result is someone rehosting my OSS on a page loaded with pages of crap SEO content.
I've written and freely shared a lot of OSS code and I plan to continue doing that. All MIT licensed.
But I keep thinking that this is a limitation of MIT. It's written with the purest of good intentions but without any way to prevent bad actors from exploiting that.
I wonder if there's a better kind of license that provides a close level of freedom but would prevent some of the most obvious exploitations (e.g. packaging and selling the code, or reposting on a site with advertising)?
There's copyleft licensing like LGPL but perhaps that goes too far in the other direction and besides, it seems to be very unpopular amongst web developers so I'm afraid if I release any LGPL code it won't get used much or attract contributors.
Every open source license will by definition allow packaging and selling the code or reposting it on a site with advertising; permitting that is one of the key freedoms. If you're forbidden from using the code for commercial purposes, then that's not an open source license. See https://opensource.org/osd
MIT will allow that, Apache/BSD/Mozilla licenses will allow that, GPL and LGPL will allow that, Creative commons CC-BY and CC-BY-SA will allow that - the only difference is extra conditions e.g. GPL will require the seller/redistributor of the code to keep the same license, CC-BY will require leaving attribution to the author, etc, but all of them will allow someone else to redistribute the code for commercial purposes.
>One “out there” reason I can think is that I use Amazon Affiliate links on my Bookshelf and my /Uses page and that triggers a shadow ban?
It's probably not the reason, but it's worth noting that the author is using Amazon affiliate links in violation of Amazon and FTC rules because they're not disclosing the fact that they profit from purchases through their links.
Per Amazon:
>Anytime you share an affiliate link, it's important to disclose that to your audience... you must (1) include a legally compliant disclosure with your links and (2) identify yourself on your Site as an Amazon Associate with the language required by the Operating Agreement.
>As for where to place a disclosure, the guiding principle is that it has to be clear and conspicuous... Consumers should be able to notice the disclosure easily. They shouldn’t have to hunt for it.
For some reason, Beej’s Guide to C Programming is also banned from Bing (and consequently DDG) [1], with the standard robotic non-explanations given when the author asked, even though the rest of the site is not.
He says in the link it’s specifically the C guide, the rest of the website is fine. Though... yeah, DDG queries like “beej c guide strlen” give reasonable results for me, if with an unjustifiably high-ranked position for the mirror at http://docs.hfbk.net/beej.us. Bing ones only include the mirror and the other guides (and a Scribd-hosted PDF copy, of all things, as the first result below a huge navigation card referring to https://beej.us/guides but without the C guide among the links).
Incidentally, what is the "legal status" of Scribd hosting a partial preview of works that they tell you are BY-NC (attribution/non-commercial use only) and telling you to become a member to be able to view the whole thing? Is that not a commercial use?
You've indicated that you've used Bing's tools to see if your website has been indexed but are silent as to whether you've actually manually submitted your site to be indexed by Bing using their url submission tool [0]. If you do submit the URL and then, after a decent interval, your site still doesn't show up then there might be something to your claim.
First, he sends it with a "content-type: application/xml" header, in contrast to most sites that send it with "content-type: application/atom+xml". This seems to have the nice effect that it renders in Firefox instead of opening the usual "What should Firefox do with this file?" popup.
Secondly, he provides this nice header text "Yahaha, you found me! This is my RSS feed.". It seems to be fetched via this part of the code:
Pretty nice. Are those best practices? Or will "content-type: application/xml" mess with users who have a native feed reader installed and expect the reader to kick in when they click on a feed url?
XSLT is a template language where you match snippets of XML using XPath expressions and output... anything else, in this case HTML using templates that can make use of the attributes, inner content, etc. of the captured XML portion.
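A minimal sketch of the idea, assuming the feed points at a stylesheet via an xml-stylesheet processing instruction (the file names and markup here are made up for illustration, not necessarily how beej.us wires it up):

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- in the feed: tell browsers to render it through rss.xsl -->
<?xml-stylesheet type="text/xsl" href="/rss.xsl"?>
<rss version="2.0">
  <channel>
    <title>Example Blog</title>
    <item><title>First post</title><link>https://example.com/first</link></item>
  </channel>
</rss>
```

and rss.xsl, which turns that into a simple HTML page with the greeting at the top:

```xml
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="html"/>
  <!-- match the channel element and emit an HTML page for it -->
  <xsl:template match="/rss/channel">
    <html>
      <body>
        <p>Yahaha, you found me! This is my RSS feed.</p>
        <ul>
          <xsl:for-each select="item">
            <li><a href="{link}"><xsl:value-of select="title"/></a></li>
          </xsl:for-each>
        </ul>
      </body>
    </html>
  </xsl:template>
</xsl:stylesheet>
```

If the feed is served with a content-type the browser treats as generic XML, the browser runs the transform and shows the HTML instead of the raw feed, which lines up with the application/xml observation above.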
> "An elegant weapon for a more... civilized age."
One of the ideas people had back then was having a product catalog in XML that your tool would give you, and that you could upload to your website and display as a nice website via XSLT.
It was such a fascinating technology when I learned about it, but I’m not sure if it ever saw any serious use? Maybe in enterprise?
Well, I'm having the ... pleasure ... of learning it now, as much of CheetahMail's templating system uses it (at least as initially configured by their professional services team; the company I work for just switched over and had them do the initial implementation). Just a data point.
It had some use. I'm pretty sure IE used to support client side transformations when rendering XML that linked an XSLT file. I know one SaaS used to do that before they rearchitected their front end.
IIRC all browsers supported it; I played around with it back then. I just wondered about in-the-wild use, though the other comments say there is/was some.
You can submit a ticket to Bing via their Webmaster Tools website. I've done it in the past and a real human did respond at the time. In my experience Bing will straight-up deindex full websites for unknown reasons, while Google will penalize your ranking but leave you searchable if their algo feels you deserve it.
Had the same problem due to negative SEO campaigns by naughty competitors. Wrote to Bing Webmaster Tools Support team (https://www.bing.com/webmasters/help/webmaster-support-24ab5...) and after a lengthy process got a response that ”the issue” had been addressed.
It's been a few months since, and my website is indeed back in the search results, so I advise whoever is having this problem to reach out to Bing.
It would be quite surprising if reverse records were the reason. The vast majority of sites certainly don't have them pointing back. Impossible to do with many providers and probably basically all CDNs.
Negative SEO scammers use intentional search-policy violations to push down the rank of perceived competitors for a few weeks.
While it is more likely the poster will get a few people to check on the situation and naively drive up page rank... a personal site is just a rounding error for traffic in a long-tail distribution known as the modern web.
Most search engines will correlate user-side telemetry traffic against crawler and web stats. I.e. if the bots tend to prefer your site for abnormal reasons, the ranking algorithm may blacklist a set of signatures, domains, and IPs for several weeks as punishment.
Note too, it is still common for a human employee to manually check a suddenly popular site that pops up out of obscurity. i.e. this catches the more sophisticated cheats, and may have legal repercussions in severe cases.
In summary, if you mess with modern search engines, then expect the ban hammer to fall eventually. ;)
At the bottom of the page there is a comment that says "Some results have been removed", but unlike Google you can't see them. Would be interesting to know if the domain is among the removed results or if the site has not just been indexed yet.
HN taught me first hand how horrible shadow banning is. You all tolerate this here, so it's mighty hypocritical of you to criticize Bing.
I'll say it again: It comes down to how you treat people. Treat others the way you want to be treated. No one wants to be shadowbanned and we can all agree it is a decidedly cowardly and cruel thing to do.
And you can't use "quality" or anything short of being coerced as an excuse. Techniques and technologies to moderate people without shadowmodding at scale are mot just there but very well established. A site for technologists has no excuse to shadowmod other than elitism amorality.
HN's "shadow bans" aren't hidden and allow you to keep posting. Isn't that more tolerant than most other forms of banning?
I have showdead turned on. I see fresh accounts that are automatically dead on each comment which are legit contributions, probably because they're using Tor or a widely abused VPN; that's the only common miscarriage of HN moderation I regularly see, and I vouch for those comments. Those accounts should be in the clear after a week or something like that.

I have a couple of comments I feel shouldn't be dead, but I can see how others would feel differently, and I have, I believe, 3 dead comments out of >3000 (many of which expressed views others vocally disagreed with, and I generally feel my views are not particularly popular on HN). But most of the dead comments I see are obviously harmful to discussion. The last time I saw hate speech from a banned HN account was earlier today. What is it I'm missing here?
It's all well and good to say, treat others as you'd like to be treated. But I don't want to be harassed either. So I forgo harassing people sure. But what's to be done about the people harassing me?
Are you perhaps unaware of a phenomenon called the paradox of tolerance where, if you extend universal tolerance to everyone, including those who use their speech to silence others (through threats, harassment, shouting over people, poisoning the well, etc), you still end up with a forum in which not everyone can share their ideas?
Dead comments are not shadowbanned, just banned. I wasn't referring to that. And the problem with shadow moderation is that it happens without attempting to tell the person to change their behavior, so people who would cooperate don't get a chance to do that. If someone does not cooperate and acts with bad or false intentions, you should ban them explicitly. I never suggested universal tolerance; I only suggested transparent moderation.
The only people I've seen get banned without warning are transparent trolls (usually people who make an account to make each comment, with that comment being trolling or hate speech, because they understand that that will be banned). People who aren't purely trolling do seem to get a public message from dang. Does your experience differ? How?
No mention of robots.txt on the post, nor here in the comments.
After a bit of poking around, it would appear Bing may need an allow block to crawl. I don't know what DDG does, but the author's site effectively has nothing in the robots.txt file other than a commented-out Disallow block. From doing this in the past, I suggest including something like the following:
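(A minimal sketch of what I mean; the sitemap URL is a placeholder and the rules are just a sensible default, not the author's actual file.)

```
# Explicitly allow Bing's crawler and everyone else, and point at the sitemap
User-agent: bingbot
Allow: /

User-agent: *
Allow: /

Sitemap: https://example.com/sitemap.xml
```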
Okay, so this prompted me to look at Bing Webmaster Tools. It looks like they recognise my sitemap file, but haven't bothered to index it in nearly a decade. "Last processed: 6/6/2014". I know I don't post new content that frequently, but... WTF?
The background image doesn't render on the homepage in Firefox. This makes the blog links appear as very light blue text on a white background. Makes me wonder how it renders for the crawler and whether it's getting flagged for invisible content.
I wonder if we need basic regulation, given that search is a public good, including best practices such as search engine transparency. If they index you, they have to say something about the parameters and the results, etc.
Tangential question to the site index: how does one get the TLD and ccTLD zone data? Apparently it's not that open. There are some ccTLDs which give this info, for example .ch.
"Southern gentleman" conjures up the image of a white-suited Kevin Spacey in "Midnight in the Garden of Good and Evil". Not sure it makes for a strong claim to respectability/morality, both fictionally and IRL ;-p
M$ is such a black box. At some point I got on their spam list for no reason (I'm still not 100% certain why), and that's the reason I stopped self-hosting email.