
(Disclaimer: IANAL - I am not a linguist)

The Japanese language itself has very low "information density" - it takes many more syllables to convey the same thing in Japanese than it does in most languages. This is the result of the language having much tighter phonotactic constraints; literally there are fewer "legal" phonemes (and combinations of them) in Japanese than in most other languages. [1] This isn't a criticism of the language of course - just an interesting tradeoff: A Japanese speaker has fewer sounds to learn, but must learn to speak them faster in order to convey their ideas.

As a result, Japanese speakers are some of the fastest speakers in terms of syllables/second, out of necessity.

Therefore it's no surprise that the typography is going to appear more "densely packed" on a Japanese website. With the exception of Kanji which is a pictographic (logographic? I'm not really a linguist) writing system consisting of many, many characters (and thus can be quite information-dense) - Japanese's other writing systems are purely phonetic - each character representing a single syllable. Take a look at the front page of a Japanese newspaper and show it to an English speaker, they will think it's "crowded", but I bet if you translated it to English you'd see something that resembles a typical English paper.

[1] https://en.wikipedia.org/wiki/Phonotactics



As a Japanese native who's fluent in English, I disagree with this. I regularly translate English (mostly technical) texts to Japanese, and the translation I get is roughly the same size as the original. Its density is certainly not "very low".

The reason you might think that Japanese is more "bloated" is probably that the form of the language you hear most, in TV news and speeches, is the "polite" form. In that case, speakers mean to be verbose. But in the casual form, the same thing can be said much more briefly.

Also, keep in mind that Japanese has a very different phonetic system. Japanese speakers' phonetic intuition is mostly based on the "mora", and many find the notion of a syllable confusing and inconsistent. I don't think the two are very comparable (at least mentally).

It's funny that you mentioned that a Japanese newspaper looks more "crowded" than an English one. Japanese people would think the exact opposite. The "information density" is really in the eye of the beholder.


Thanks for calling me out, there. I did some digging and I _think_ this is where I first came across the idea of "information density" as a function of info/syllables: https://www.researchgate.net/publication/235971274_A_cross-L...

The methods this paper employed may certainly be biased in a way that makes Japanese appear less information-dense than other languages.

As for the _written_ language's density, that's entirely different! You say that, as written, translations come out to be about the same in Japanese and English. I assume this corresponds to roughly the same number of characters as well? If so, and if we can believe the authors of the paper above that Japanese has less information per syllable, then perhaps Japanese's syllabic/ideographic writing system makes up for this, such that the information density of _written_ Japanese is about the same as that of written English.

For what it's worth, I think you were agreeing with me actually:

> Take a look at the front page of a Japanese newspaper and show it to an English speaker, they will think it's "crowded", but I bet if you translated it to English you'd see something that resembles a typical English paper.

What I meant here was that in terms of _written_ information density there are clearly significant gains made thanks to the writing system.


> I assume this corresponds to roughly the same number of characters as well?

Japanese usually has fewer characters, but a Japanese character is bigger than the average Latin character.


I am completely ignorant here--how common is the polite form on websites?


It's complicated, but perhaps a good comparison is newspaper headlines in the West: headings and phrases etc are compact without ever being slangy.

For example, a random ad on the Yahoo Japan homepage proclaims 予約来場いただくと進呈QUOカード5000円分, which means "get a 5000 yen gift card if you come visit", but uses literary words like 来場 (arrival/visit) and 進呈 (gift) that would rarely be used in spoken Japanese, and even the polite いただく (to humbly receive) is in the short "dictionary" form instead of being conjugated out the way it would be in speech.


Depends on the type of website. Most corporate websites are polite. On a message board like 2-chan, (unsurprisingly) not at all. Actually, some of the most vulgar Japanese, which most people would never speak in real life, can only be found on the Internet.


> I am completely ignorant here--how common is the polite form on websites?

Not op, but I can answer.

Not very common; it mostly shows up in reported speech, simulated speech, etc.


I agree. The length is roughly the same, if not actually shorter in many cases. Syllables per second isn't a good measure of information density.


In terms of character count, it takes fewer characters to convey an idea in Japanese than in English.

You see this in NES/SNES-era game translations. Both versions of the game had to use the same number of characters. So the level of detail in English is quite low compared to the original.

Recently, modders have hacked the English versions of the games to add variable-width fonts and achieve more faithful translations. Breath of Fire 2 is a good example.


Both are true, because each character is equivalent to an entire syllable in English, not a letter.


The visual density of Japanese text has more to do with the fact that white space between words isn't required, so you can line-break mid-word. The characters also don't change in size (no real upper or lower case), so everything is pretty much monospaced.

For example, look at the Wikipedia article for "mother" in English vs. Japanese:

https://ja.wikipedia.org/wiki/%E6%AF%8D%E8%A6%AA

https://en.wikipedia.org/wiki/Mother

In the English sections with pictures there's significantly more white space between the text and pictures than in the Japanese version.


While I do agree with you on the spoken side (there was a comparative study of spoken syllables per language floating around at some point [0]), is this really true for written Japanese? Each kanji is potentially multiple syllables long when spoken but still one character when written, conveying a meaning by itself. (I'm simplifying here, obviously: kanji are also combined to give a specific meaning, which can change the number of syllables when spoken, among other things.)

[0] : https://advances.sciencemag.org/content/5/9/eaaw2594


Kanji is efficient precisely because it is dense.


Yeah, what a weird comment. How can you argue that you need more length and characters when comparing 2000-3000 kanji vs 26 letters? (I am ignoring the "kanas" to simplify the argument.) It is like saying hexadecimal is more verbose than binary.
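
To put rough numbers on the hexadecimal-vs-binary analogy, here is a minimal Python sketch. It assumes every symbol in an alphabet is equally likely, which real text never is, so these are ceilings on bits per symbol and not measurements of either language. The function name is made up for illustration.

```python
import math

def max_bits_per_symbol(alphabet_size: int) -> float:
    """Upper bound on information per symbol, assuming a uniform distribution."""
    return math.log2(alphabet_size)

# Alphabet sizes taken from the comment above (kana ignored, as it says).
for label, size in [("binary", 2), ("hexadecimal", 16),
                    ("26 letters", 26), ("2000 kanji", 2000), ("3000 kanji", 3000)]:
    print(f"{label:12s} {max_bits_per_symbol(size):6.2f} bits/symbol at most")
```

Under that assumption a kanji can carry a bit more than twice what a Latin letter can, not hundreds of times more, because the bound grows with the logarithm of the alphabet size.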


That's not how Japanese works. Yes, Japanese uses Chinese characters, but that's only a subset of Japanese. Japanese sentences are typically a bit longer than English in terms of syllables and written length - their main reduction technique at the language level is omission, rarely efficiency improvement.

- Why's it so hot today? = 6 syllables

- Nande konna ni atsui no kyou? = 10 syllables (spoken colloquially)

What about written length? You can see the Japanese sentence is a little longer here (when the English uses a non-monospaced font). It's also enough to demonstrate why using kanji doesn't always provide the huge reduction/compression that you're expecting.

- Why's it so hot today?

- なんでこんなに暑いの今日?
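
The written-length comparison above can be roughly checked with a short Python sketch. It assumes the common terminal convention that East Asian wide characters take two columns and everything else takes one; that is only a stand-in for real font metrics, and `display_columns` is a made-up helper name.

```python
import unicodedata

def display_columns(text: str) -> int:
    """Approximate on-screen width: wide/fullwidth characters count as 2 columns."""
    return sum(2 if unicodedata.east_asian_width(ch) in ("W", "F") else 1
               for ch in text)

english = "Why's it so hot today?"
japanese = "なんでこんなに暑いの今日?"

for label, sentence in (("EN", english), ("JA", japanese)):
    print(label, "characters:", len(sentence), "columns:", display_columns(sentence))
# EN characters: 22 columns: 22
# JA characters: 13 columns: 25
```

So the Japanese sentence has fewer characters but comes out slightly wider under this approximation. With a proportional Latin font the English line would be narrower still.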


> Japanese sentences are typically a bit longer than English in terms of syllables and written length?

Do you understand that 1 kana represents 1 syllable? They are BY DEFINITION, more concise than Latin letters.


Wow, thanks! I didn't understand that despite being able to read and write Japanese! Sarcasm aside, you don't seem to have a very deep understanding, so maybe you should be a bit nicer when expressing your opinion.

> 1 kana represents 1 syllable

ちゅ <-- 2 kana, 1 syllable

> BY DEFINITION, more concise than Latin letters

smash <-- 1 syllable, 5 latin letters, ~3 kana in length

スマッシュ <-- 5 kana
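
The gap between kana count and sound units in these examples can be sketched in Python. This counts morae, not syllables: it assumes the input is pure kana and that small ya/yu/yo and small vowels merge with the preceding kana, while ッ, ん and ー each count as their own mora. Turning morae into syllables would need further merging that is not attempted here, and the helper name is hypothetical.

```python
# Small kana that attach to the preceding kana instead of forming their own mora.
SMALL_KANA = set("ゃゅょぁぃぅぇぉゎャュョァィゥェォヮ")

def count_morae(kana: str) -> int:
    """Count morae in a pure-kana string (small kana merge with the previous one)."""
    return sum(1 for ch in kana if ch not in SMALL_KANA)

for word in ("ちゅ", "スマッシュ"):
    print(word, "kana:", len(word), "morae:", count_morae(word))
# ちゅ kana: 2 morae: 1
# スマッシュ kana: 5 morae: 4
```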


Why’s it so hot today?

何でこんなに暑いの今日?

Where in the US are you from?

出身は米国の何処ですか?

Japanese sentences are usually shorter than the English translation.


For paid translations, English is usually charged by the word, whereas Japanese is charged by the character. The rule of thumb for conversion is 2 JP characters --> 1 English word, i.e. when translating a 1000-character JP document you'd expect about 500 EN words at the end.
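
As arithmetic, that rule of thumb is just a division. A tiny Python sketch, where the function name and the adjustable ratio parameter are made up for illustration and the 2:1 default is the rule of thumb quoted above, not a measured constant:

```python
def estimate_english_words(jp_characters: int, chars_per_word: float = 2.0) -> int:
    """Estimate the English word count from a Japanese character count."""
    return round(jp_characters / chars_per_word)

print(estimate_english_words(1000))  # 500
```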


I speak Mandarin Chinese. They are kind of similar.

今天怎麼這麼熱?

美國那來的?


> The Japanese language itself has very low "information density"

This is just wrong: written Japanese is much more concise than text in an alphabet because of its use of kanji and the like.


Japanese is fast for sure, but I'm not sure your theory explains why Spanish is a close second... (honestly I have no idea either)

https://ahtaitay.blogspot.com/2020/07/as-native-speakers-we-...



