> the documentation is poorly written (all LLM vendors seem to have an internal ...

ComplexSystems · 2025-05-11T00:08:42 1746922122

DeepSeek's documentation has a different problem, which is that there are spelling errors and weird grammatical constructions everywhere:

"DeepSeek API does NOT constrain user's rate limit. We will try out best to serve every request. However, please note that when our servers are under high traffic pressure, your requests may take some time to receive a response from the server. During this period, your HTTP request will remain connected, and you may continuously receive contents in the following formats..."

The documentation is still mostly easy to read, so it doesn't *really* matter, but I always thought this was bizarre. I mean, I get the language barrier reading manuals from Chinese products off of Amazon or whatever, but this is a company that does nothing but work with language all day long, and even at one point had the world's leading English-speaking language model. Shouldn't they be able to produce professional-looking documentation without spelling and grammatical errors?

comradesmith · 2025-05-11T01:24:18 1746926658

What’s the problem? Can you point out a specific thing you would change from that quote?

lolinder · 2025-05-11T01:44:09 1746927849

Are you a native English speaker?

"does NOT constrain user's rate limit" should be "does NOT rate limit incoming requests" or similar.

"We will try out best" should be "our best".

"when our servers are under high traffic pressure" is at least grammatical, but it's awkward. Normally you'd say "when our servers are dealing with high load" or something similar.

"your requests may take some time to receive a response from the server" is again grammatical but also awkward. "Our response times may be slower" would be more natural.

The last sentence is also awkward but the whole thing would need to be restructured, which is too much for an HN comment.

Basically: everything about this screams English as a second language. Which does mean that it's unlikely to have been LLM generated, because from what I've seen DeepSeek itself does a pretty good job with English!

taylorius · 2025-05-11T07:52:17 1746949937

I'm a native English speaker, and I partially disagree with your claims of awkwardness.

"when our servers are under high traffic pressure" - this is a bit awkward I agree, but only the last three words.

If we rearrange it to "when our servers are under pressure from high traffic", I think it sounds good. It's using a metaphor, and I think that should be encouraged. It's interesting. And the phrase "high traffic" conveys some drama.

"your requests may take some time to receive a response from the server" - I think that's fine, to be honest. I like it.

I think you are conflating "awkwardness" with linguistic flair. Technical documentation English has become standardised to a large degree, which of course is useful, and efficient. But it is also a narrow usage of English, and breaking out of its straitjacket does not make language awkward.

ricardobeat · 2025-05-11T09:09:36 1746954576

That’s a very generous interpretation. I don’t know Mandarin, but these are likely a transfer of grammar constructs from the primary language to English, in the same way the Dutch will say “make a picture” or “the house of my parents”, which can be justly classified as awkward rather than as linguistic flair.

If someone was editing my writing, it would feel a bit patronizing if they said grammar mistakes (many of which come from my mother tongue Portuguese) are “adding flair”, as they are not a stylistic choice.

taylorius · 2025-05-11T10:46:40 1746960400

I'm not claiming it was intentional on their part. My point was solely one of language, so how the sentence came to be written that way is out of scope. And given the word swap I suggested, I don't think it is awkward at all (unlike your examples from Dutch, which definitely are).

As for it being patronising, why is telling a non-native speaker their sentence is interesting unacceptable, but telling them it's awkward is ok? (Assuming both are genuinely held opinions).

I'll reiterate my point that common English usage (non-awkward?) has narrowed enormously in the last 50 years. I think that this is a bad thing.

collingreen · 2025-05-11T16:35:15 1746981315

Your point about how the social norms for English have changed over the last 50 years could be interesting, but what does it have to do with the parent's point of "these docs seem human written and not spell checked which is very different from the other ai companies AND which is weird anyway for a megacompany with ai tools that write English well"?

ComplexSystems · 2025-05-11T16:31:59 1746981119

Which part is it that has linguistic flair? Is it "The prices listed below are in unites of per 1M tokens", or "The expense = number of tokens × price"? Or maybe "you may continuously receive contents in the following formats"?

reliabilityguy · 2025-05-11T03:00:20 1746932420

> "Our response times may be slower" would be more natural.

How can the time be slower? Response times may be longer, but not slower

lolinder · 2025-05-11T03:08:49 1746932929

In colloquial English my construction is just fine, but sure, you'd be welcome to pick longer too.

Some examples of my usage in the wild ("response times may be slower" is present verbatim on each page):

https://github.com/aquasecurity/trivy/discussions/8133

https://www.ameristarstaffingny.com/the-negative-effects-of-...

https://oci.wi.gov/Pages/Regulation/Bulletin20200320Regulato...

https://playrix.helpshift.com/hc/en/27-questbound/faq/13930-...

brabel · 2025-05-11T07:07:15 1746947235

This sentence is a good example where the native speaker's version is worse (in this case because it's just non-sense, as the parent commenter already pointed out).

lolinder · 2025-05-11T14:19:57 1746973197

Sounds like you're the kind of person who will insist to Spanish speakers that a double negative is logically incoherent. Good luck with that approach to language!

reliabilityguy · 2025-05-11T13:27:41 1746970061

> In colloquial English my construction is just fine,

Maybe. However, in my opinion, it’s better to write in such a way that leaves zero chance for misunderstanding.

lolinder · 2025-05-11T14:17:54 1746973074

No real human being would misunderstand because, as you note, time can't go slower. This is just an excuse for pedantry.

fnord123 · 2025-05-11T10:32:11 1746959531

When you remark on improvements, up is generally better and down is generally worse. So saying "response times will be higher" gives an immediate sentiment of improvement. But, obviously, a moment's thinking helps you re-orient and realize it's worse. This is why plots often have "lower is better" in the legend, to help readers understand.

I often use 'slower' and 'faster' as a native speaker to help reinforce the meaning of the direction.

reliabilityguy · 2025-05-11T13:25:59 1746969959

> "response times will be higher" gives an immediate sentiment of improvement.

Higher as opposed to lower? It makes no sense to me.

lolinder · 2025-05-11T14:24:35 1746973475

Exactly.

"Response times will be higher" sounds very confusing as a way of saying we'll take less time to respond, right? So why should "response times will be lower" mean we'll take more time if the opposite construct is confusing?

Far better to just use the comparative forms that we already have for time specifically to make it perfectly clear.

lolinder · 2025-05-11T14:21:30 1746973290

Yes, this is a good explanation for the phenomenon! Thanks.

numpad0 · 2025-05-11T18:24:50 1746987890

If your problem is that the texts you quoted were not written by someone with English as the first language, I tell you: English is not the framework of human civilization, it sometimes uses English for data quantization and message passing.

A lot of native English speakers assume that:

- any academic topic is universally discussed in English/Latin, so every highly educated person should speak good English,
- language is like a thin wrapper over a to-be-converted-to-YAML common intermediate language (Universal Grammar theory),
- anything should translate into fluid English with intent completely intact,
- but the WWW is >90% English anyway,
- etc.

None of these are true, and it's just not realistic for a well-educated East Asian (a common theme of East Asian languages is that they're all custom implementations with minimal sharing with neighbors, let alone English) to "just" pick up natural English. I suppose you're looking for something like the following:

"At DeepSeek, we strive to serve every request to our customers with best of our effort, and we do not impose a rate limit for our APIs. However, do note that due to finite nature of our computing resources, API responses might become delayed in cases when our backend is experiencing high load. Under such circumstances, the HTTP sessions will be kept alive, and response will be served in following formats..."

... Isn't this a $1m/yr skill on its own? Have you seen a great Far East engineer write like this - I mean, how often do you come across a Far Eastern translator that can casually do this?

ComplexSystems · 2025-05-11T19:45:24 1746992724

I don't really get the point of your post.

The goal is to pretend that DeepSeek doesn't have access to good English translators? Or good English translation capabilities?

Why don't we just not pretend this instead?

numpad0 · 2025-05-11T23:10:24 1747005024

Why do we not pretend like foreign language technical ghostwriting is a solved problem! You guys are asking for complete rewrites of all documentation by someone who is explicitly NOT a native Chinese speaker. At some point that's just an unreasonable ask.

A lot of HNers put blind trust in Universal Grammar theory and downplay languages as all but obsolete packing formats for human output, each differing by no more than its headers, and that is just wrong. Languages are at least CODECs. And if you go back to the original topic from there, I don't think it will sound so unreasonable that translating between different CODECs will induce losses and artifacts.

rrr_oh_man · 2025-05-11T02:13:37 1746929617

I'd just shorten it:

  DeepSeek API does NOT have rate limits. 
  However, when our servers are under high traffic, 
  your requests may take some time. During this period, 
  you will continuously receive the following responses:

albert_e · 2025-05-11T01:29:22 1746926962

Maybe the first sentence? I am guessing they meant "DeepSeek API does not enforce any rate limit on users." would be more appropriate.

_Constraining the rate 'limit'_ seems like incorrect usage - but it is an easy mistake to make in a first draft. Review should have caught it.

meindnoch · 2025-05-11T10:48:37 1746960517

That's just standard Chinglish.

fakedang · 2025-05-11T03:09:10 1746932950

I've seen documents that were applications by CCP-affiliated provincial government bodies, things like detailed studies for loan applications to international banks, etc., and trust me, the DeepSeek documentation is miles ahead of that. These are official government documents from one government agency to some international agency.

ComplexSystems · 2025-05-11T04:29:25 1746937765

This has fascinated me for years. I'll just re-link this comment of mine from a few years ago: https://news.ycombinator.com/item?id=37544019#37548278.

This was about Amazon products rather than government documentation, but the point is the same. I'll just quote the relevant part:

> The people who make these products have to spend millions and millions of dollars setting up factories, hiring people, putting things into production, etc. But somehow they don't have a budget for a bilingual college student intern to translate a bunch of copy to English better than "using this product will bring a great joy." Why?

> I will make a super strong claim: ChatGPT can now do nearly perfect mass translations of this stuff for free, in theory simultaneously increasing translation quality and reducing costs. Despite this, for whatever reason, I predict that the average translation quality on Amazon won't improve within the next few years.

My super strong claim has so far been correct. Just go on Amazon.com and click just about anything. For instance, here's a random blanket: https://www.amazon.com/dp/B07MR4FSPT

"OPTIMUM GIFT: All people can use this flannel fleece blanket in Coach、Office、Bed、Study, etc. Reversible softness offers all seasons warmth. INTIMATE SERVICE: If you have any questions, please contact us. it is our pleasure to serve you."

How does a human being in this situation somehow invent the phrase "OPTIMUM GIFT?" "Optimum" is a fairly advanced English word. Maybe you'd expect, I dunno, "GREAT GIFT" or "BEST GIFT"? And "INTIMATE SERVICE?"

And once again, we now have magic English-speaking computers that can do this all for us - for free - and China has unanimously decided "nah, screw that. We'd rather go with INTIMATE SERVICE."

gyomu · 2025-05-11T04:46:55 1746938815

I live in Japan, and when you read English texts here (it doesn’t really matter if it’s a restaurant menu, a pamphlet at a touristic area, a flyer for local government services…) the same English word will often be written differently within the same document (eg for a recent one I saw: “curbside” was spelled “crubside” and “carbside”).

I always wonder how that happens, because the documents themselves often smell strongly of machine translation - but if they’re machine translated, how would those mistakes get in? My best guess is that there’s a human manually typing out a machine translation output, which kind of boggles the mind.

I think us computer nerds who are used to using computers to do work efficiently have a hard time imagining all the weird ways in which non-computer nerds actually use computers.

numpad0 · 2025-05-11T16:47:15 1746982035

Sometimes they are machine translated by someone who doesn't understand the Ctrl+C shortcut if it's longer than 10 words, but equally often they're just hand kneaded. Japanese English education is effectively machine translation with human brains as machines - we're not actually taught English at all[0], just memorized technical rulesets that yield predictable garbage. A lot of weird "Engrish" text is likely the result of that.

0: That's supposed to be drastically changing, we'll see if it does. English skills are still a resume stuffer in Japan.

delian66 · 2025-05-11T05:17:57 1746940677

It may have been A/B tested, and people do prefer the "INTIMATE SERVICE" version more...

DonHopkins · 2025-05-11T09:20:11 1746955211

Because A/B testing always has a happy ending.

collingreen · 2025-05-11T16:39:22 1746981562

This is a top tier joke about "INTIMATE SERVICE" and I wanted it to know it was appreciated.

yard2010 · 2025-05-11T11:16:21 1746962181

One of my guilty pleasures is to read random Chinese product pages and try not to laugh.

fakedang · 2025-05-11T05:33:35 1746941615

I mean, it's not so easy getting a perfectly knowledgeable English speaker in China. Heck, if you see some of the interviews of the insanely viral Tony from LC Signs, his English outside of his skits is actually very Chinese-flavored. One of my friend's exes who was Chinese and studied at Oxbridge had a very similar strong Chinese twang, and often made grammatical mistakes while writing, even though the British would easily call her fluent.

lolinder · 2025-05-11T05:37:12 1746941832

DeepSeek is a perfectly knowledgeable English speaker in China.

numpad0 · 2025-05-11T18:45:24 1746989124

  [me]
  > Translate to palatable startup-style English:
  >> DeepSeek API 不限制用户的访问速率。我们会尽力满足每个请求。但是，请注意，当我们的服务器流量压力较大时，您的请求可能需要一些时间才能收到服务器的响应。在此期间，您的 HTTP 请求将保持连接状态，您可能会持续收到以下格式的内容……

  [LLM]
  > Here's a more palatable, startup-style translation of your message:
  > DeepSeek API has no hard rate limits. We strive to process every request as quickly as possible. However, during peak traffic, responses may take slightly longer. Rest assured, your HTTP connection will remain active, and you may continue to receive real-time updates in the following format…

I asked it to turn it "corpospeak like":

  > At DeepSeek, we prioritize accessibility and scalability—which is why we enforce no strict rate limits on API usage. Our systems are designed to handle all requests with high availability, though during peak operational loads, response times may experience nominal delays. Rest assured, your connection will remain active, and responses will continue streaming in real time with the following structure:

... This is Google Translate from GP -> DeepSeek Web. I don't think DeepSeek is a perfectly knowledgeable English speaker in China. "However, during peak traffic," is basically a word substitution on "但是，当服务器流量压力大时", if my Han Script reading is right. Parts of the corpo version like "response times may experience nominal delays." still show a Chinese accent, assuming that's the part you think must be thoroughly washed off.

What you're asking for needs an English-first bilingual human who can be trusted and has a tech background. That's quite a tall order.

k__ · 2025-05-11T10:46:21 1746960381

That text would be cut to at least half its length by an editor.

827a · 2025-05-11T18:12:52 1746987172

Similarly strange and incorrect grammatical constructions are found in the English translations for Game Science’s hit game Black Myth Wukong. My expectations for, say, the construction manual for a bookshelf are pretty different from those for a game or AI model & service costing tens of millions of dollars in development (or more).

Heck, they could literally pay any native English speaker to take their English-ish translations and regionalize them; you don’t even need to know Chinese to fix those paragraphs. Why is this such a common problem with China's English-language exports? Is it cultural? Are they so disconnected from the west that they don’t realize?

A great counter-example is NetEase’s Marvel Rivals; their English translations are fantastic, and even the dev interviews with their Chinese development team are fantastically regionalized. They make a real effort to appeal to English audiences.

ljm · 2025-05-10T23:59:38 1746921578

Sometimes I wonder if I have ADHD or if it's induced by the content, because I can spend hours soaking up interesting literature and putting my weird thoughts down onto paper but I can barely make it a few words through LLM-driven drivel.

It's crazy seeing bots posting AITA rage bait on Reddit that always follows the same pattern: some inter-personal conflict that escalates to a wider group: "I told my husband I wasn't into face-sitting and now all my colleagues are saying I should sit on his face to keep the peace."

That is one thing but using the same LLM to drive your tech specs, knowing it can say a whole lot of shit the 'author' isn't aware of, because they're illiterate and that is fucking normal... is worrying.

stuaxo · 2025-05-11T08:20:15 1746951615

Yeah it's unreadable for me.

There's been a trend to post LLM slop about tech subjects and they anger me - I don't know why someone wanted to waste people's time like that.

Even worse - I've come across an AI slop site that masquerades as dev information, with just plain wrong information.

DonHopkins · 2025-05-11T09:30:49 1746955849

I let the domain "micropolisonline.com" expire, which I was using for the old OpenLaszlo/Flash Python/SWIG/C++ client/server based version of open source SimCity, and somebody took it over and replaced it with AI generated claptrap, stealing a lot of my own and others' images without any credit. It even promises the source code, but doesn't actually link to it, just has promises and placeholders.

It totally misrepresents what Micropolis is, which was based on the original SimCity classic, and confuses it with all the subsequent versions of SimCity and other made-up stuff. And it never mentions the GPL-3 license, EA's license and restrictions on the use of their SimCity trademark, or Micropolis's license to use their trademark. I have no idea what the point of it is.

https://micropolisonline.com/

https://micropolisonline.com/source-code/

>How to Access the Source Code: For those eager to explore the Micropolis Online Source Code, it is available on our dedicated GitHub repository. Visit [Link] to access the repository, where you can browse the code, contribute to ongoing projects, or initiate your own.

The source code is actually not at [Link] but at:

https://github.com/SimHacker/MicropolisCore

Not even so much as a link to my demo!

https://www.youtube.com/watch?v=8snnqQSI0GE

They could be in some legal jeopardy since they didn't mention or link to the Micropolis GPL License or the Micropolis Public Name License, which they may be violating.

https://github.com/SimHacker/MicropolisCore/blob/main/Microp...

They have a "Meet the Team" page that mentions nobody, just hand waves about "we" and the community. They couldn't even bother to generate generic looking fake profiles of non-existent people. Suffice it to say I never heard back from anyone after using the "Contact Us" page.

They even have a cute little Terms and Conditions page with their very own license, which doesn't allow anyone to do to them what they did to me, and is not particularly GPL-v3 compatible:

https://micropolisonline.com/terms-conditions/

>License to Use Micropolis Online

>Unless otherwise stated, Micropolis Online and/or its licensors own the intellectual property rights for all material on Micropolis Online. All intellectual property rights are reserved. You may view and/or print pages from micropolisonline.com for your own personal use subject to restrictions set in these terms and conditions.

>You must not:

>Republish material from micropolisonline.com

>Sell, rent, or sub-license material from micropolisonline.com

>Reproduce, duplicate, or copy material from micropolisonline.com

They also claim all rights to all user created content:

>By displaying Your Content, you grant Micropolis Online a non-exclusive, worldwide irrevocable, sub-licensable license to use, reproduce, adapt, publish, translate, and distribute it in any and all media.

Kind of ironic for an LLM to go around stealing people's content, then telling them that not only can't anyone copy it back, but it owns the rights to everything anyone else may contribute in the future.

glimps · 2025-05-11T00:23:30 1746923010

I get the distinct feeling the spec was created by an LLM too. As with the docs, all the evidence hints at it.

Makes for a great IPO to tell investors that most of your products are already created by averaging out the most likely outcome.

clbrmbr · 2025-05-10T23:08:00 1746918480

Certainly a shame if true, there are some really sharp folks at Anthropic and this is an important building block in the emerging ecosystem.

jes5199 · 2025-05-11T02:57:34 1746932254

someone is going to write an MCP adaptor that lets Claude use OpenAPI and then we can forget that MCP was a thing

cruffle_duffle · 2025-05-11T04:54:04 1746939244

How would that even work?

DonHopkins · 2025-05-11T10:01:50 1746957710

South Park explored that question:

https://www.youtube.com/watch?v=sbCj0i8WQA0

jes5199 · 2025-05-11T16:22:35 1746980555

new MCP tool: make-curl-request, headers, payload
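
For what it's worth, a minimal sketch of what the handler behind such a tool might look like, assuming plain Python and only the standard library. The function name and its `headers`/`payload` parameters mirror the comment above; `url`, `method`, and the `send` helper are hypothetical additions for illustration. Nothing here is an actual MCP SDK API, and registering it as a tool is left out entirely.

```python
import json
import urllib.request


def make_curl_request(url, method="GET", headers=None, payload=None):
    """Build (but do not send) an HTTP request from tool-style arguments.

    Hypothetical handler: the argument names are illustrative, not part
    of any published MCP tool schema.
    """
    # Serialize the payload as JSON; no payload means no request body.
    data = json.dumps(payload).encode("utf-8") if payload is not None else None
    request = urllib.request.Request(url, data=data, method=method.upper())
    for name, value in (headers or {}).items():
        request.add_header(name, value)
    # Default the content type only when the caller did not set one.
    if data is not None and not request.has_header("Content-type"):
        request.add_header("Content-Type", "application/json")
    return request


def send(request, timeout=30):
    """Send a prepared request and return (status code, body text)."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status, response.read().decode("utf-8", errors="replace")
```

Splitting "build" from "send" is just a design choice for the sketch: it lets whoever hosts the tool inspect or veto the request before anything goes over the network.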

otabdeveloper4 · 2025-05-11T03:31:45 1746934305

This. Endure a couple months and this madness ends.

teaearlgraycold · 2025-05-10T23:58:22 1746921502

In my experience AI startups are AI maximalists. They use AI for everything they can. AI meeting summarizations, AI search (Perplexity), AI to write code and contracts, AI to perform SEO, AI to recruit candidates, etc. So I 100% believe they would use AI to write specs.

runlaszlorun · 2025-05-11T22:51:19 1747003879

Seems like many are dreading our near future. Not I, I can't wait to see how this all plays out...

whatever1 · 2025-05-11T00:18:32 1746922712

So many bullet points in the documentation!

meander_water · 2025-05-11T01:06:32 1746925592

I can't say whether the original spec was written with AI assistance, but having a cursory look through the commit history [0] it doesn't look like they're just blatantly auto-generating the docs. The git history indicates that they do think about the spec and manually update the docs as the spec changes.

[0] https://github.com/modelcontextprotocol/modelcontextprotocol...

never_inline · 2025-05-11T09:12:50 1746954770

I don't write perfect English. Far from it. But I'd prefer broken English any day over default LLM verbiage. It seems so unnatural and facetitious. I always have this in my prompts: "Be succinct and use simple English sentences".

benatkin · 2025-05-10T23:34:22 1746920062

The DeepSeek documentation seems to be better. It looks to be quickly thrown together but not bad. I’m not sure what that says about LLMs writing documentation.

jerf · 2025-05-11T00:03:07 1746921787

It had not occurred to me that the AI coding vendors are basically positively motivated to themselves produce code that is not documented. They want code that is comprehensible to AIs but actively not comprehensible to humans. Then you need their AIs to manipulate it.

AI code as the biggest "lock you in the box" in programming history. That takes rather a lot of the luster out of it....

They'd better be right that they can get to the point that they can fully replace programmers in about two years, otherwise following this siren song will, well, demonstrate why I chose "siren song" as my metaphor. If AI code produces big piles of code that are simply incomprehensible to humans, but then the AIs can't handle it either, they'll crash out their own market by the rather disgusting mechanism of killing all their customers, precisely because the customers consumed their service.

never_inline · 2025-05-11T09:13:58 1746954838

To be honest I don't think they have any plans either.

walterbell · 2025-05-11T01:33:25 1746927205

Self Alignment™