This is probably to handle potential ambiguities and intraword emphasis: underscore is a common pseudo-space, so it doesn't support intraword use, but * does, e.g.
is_not_italic
this*is*italic.
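To make the difference concrete, here is a rough Python sketch of how I read the spec's flanking rules for a single delimiter run. The names (`classify_run`, `_is_space`, `_is_punct`) are made up for illustration, and the whitespace/punctuation checks are my approximation of the spec's Unicode definitions, not code taken from any real parser.

```python
import unicodedata


def _is_space(ch):
    # Start/end of line (None) counts as whitespace for flanking purposes.
    return ch is None or ch in "\t\n\f\r" or unicodedata.category(ch) == "Zs"


def _is_punct(ch):
    # Assumption: "punctuation" means Unicode general categories P* and S*.
    return ch is not None and unicodedata.category(ch)[0] in "PS"


def classify_run(text, start, end):
    """Return (can_open, can_close) for the delimiter run text[start:end]."""
    delim = text[start]
    before = text[start - 1] if start > 0 else None
    after = text[end] if end < len(text) else None

    left_flanking = not _is_space(after) and (
        not _is_punct(after) or _is_space(before) or _is_punct(before)
    )
    right_flanking = not _is_space(before) and (
        not _is_punct(before) or _is_space(after) or _is_punct(after)
    )

    if delim == "*":
        # '*' may open or close anywhere it flanks, including inside a word.
        return left_flanking, right_flanking

    # '_' is stricter: a run flanking on both sides (i.e. intraword) only
    # counts if punctuation sits on the relevant side.
    can_open = left_flanking and (not right_flanking or _is_punct(before))
    can_close = right_flanking and (not left_flanking or _is_punct(after))
    return can_open, can_close


print(classify_run("is_not_italic", 2, 3))    # (False, False)
print(classify_run("this*is*italic.", 4, 5))  # (True, True)
```

The intraword `_` can neither open nor close, so it stays literal, while the intraword `*` can do both.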
I recently implemented a CommonMark parser for emphasis. Holy shit it's painful. I regret doing it, but it became a battle I refused to surrender.
It's way harder than I expected because of the ambiguity of * and ** in multi-symbol runs, which support unlimited nesting, even of the same type of emphasis. A given delimiter run could resolve to many different combinations of plain-text `*`, `em` and `strong`, depending on which other delimiter runs open and close sections around it, alongside other context like punctuation, intraword-ness, flanking, and whether the summed lengths of the runs are a multiple of three!
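The "multiple of three" part is the least intuitive bit, so here is a minimal sketch of that one check as I understand it from the spec. The function name and parameters are hypothetical, and the lengths are the original lengths of the two runs, not what is left after earlier matches.

```python
def blocked_by_rule_of_three(opener_len, closer_len,
                             opener_can_close, closer_can_open):
    """Return True if this opener/closer pair must NOT be matched."""
    # The rule only applies when one of the two runs could play both roles.
    if not (opener_can_close or closer_can_open):
        return False
    # The pair is only suspect when the summed lengths are a multiple of 3 ...
    if (opener_len + closer_len) % 3 != 0:
        return False
    # ... and it is still allowed when both lengths are multiples of 3.
    return not (opener_len % 3 == 0 and closer_len % 3 == 0)


# In `*foo**bar*` the middle `**` can both open and close, so it cannot
# close the leading `*` (1 + 2 = 3) and ends up as literal text.
print(blocked_by_rule_of_three(1, 2, False, True))   # True
print(blocked_by_rule_of_three(1, 1, False, False))  # False
```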
I never expected "**" could be nested emphasis instead of bold, so interpretation requires multiple passes to break down delimiter runs and match them up, e.g.
***this* and that* -> *<em><em>this</em> and that</em>
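For anyone curious what the matching looks like, below is a heavily simplified sketch in the spirit of the spec's delimiter-stack procedure. It is not my actual parser and not a full implementation: it assumes `*` only (no `_`), no backslash escapes, no code spans or links, and no HTML escaping. All names (`Delim`, `tokenize`, `render_emphasis`) are invented for the example.

```python
import re
import unicodedata
from dataclasses import dataclass


@dataclass
class Delim:
    count: int       # delimiters in this run not yet consumed
    orig: int        # original run length, used by the rule of three
    can_open: bool
    can_close: bool


def _space(ch):
    return ch is None or ch in "\t\n\f\r" or unicodedata.category(ch) == "Zs"


def _punct(ch):
    return ch is not None and unicodedata.category(ch)[0] in "PS"


def tokenize(text):
    """Split text into plain strings and Delim objects for runs of '*'."""
    nodes, pos = [], 0
    for m in re.finditer(r"\*+", text):
        if m.start() > pos:
            nodes.append(text[pos:m.start()])
        before = text[m.start() - 1] if m.start() > 0 else None
        after = text[m.end()] if m.end() < len(text) else None
        left = not _space(after) and (
            not _punct(after) or _space(before) or _punct(before))
        right = not _space(before) and (
            not _punct(before) or _space(after) or _punct(after))
        n = m.end() - m.start()
        nodes.append(Delim(n, n, left, right))
        pos = m.end()
    if pos < len(text):
        nodes.append(text[pos:])
    return nodes


def _flat(node):
    # Unmatched delimiters fall back to literal asterisks.
    return node if isinstance(node, str) else "*" * node.count


def _blocked(o, c):
    both = o.can_close or c.can_open
    return (both and (o.orig + c.orig) % 3 == 0
            and not (o.orig % 3 == 0 and c.orig % 3 == 0))


def render_emphasis(text):
    nodes = tokenize(text)
    i = 0
    while i < len(nodes):
        c = nodes[i]
        if not (isinstance(c, Delim) and c.can_close and c.count):
            i += 1
            continue
        # Walk backwards looking for the nearest usable opener.
        j = i - 1
        while j >= 0:
            o = nodes[j]
            if (isinstance(o, Delim) and o.count and o.can_open
                    and not _blocked(o, c)):
                break
            j -= 1
        if j < 0:
            i += 1
            continue
        o = nodes[j]
        use = 2 if o.count >= 2 and c.count >= 2 else 1
        tag = "strong" if use == 2 else "em"
        inner = "".join(_flat(n) for n in nodes[j + 1:i])
        o.count -= use
        c.count -= use
        nodes[j + 1:i] = [f"<{tag}>{inner}</{tag}>"]
        i = j + 2  # revisit the same closer; it may have delimiters left
    return "".join(_flat(n) for n in nodes)


print(render_emphasis("***this* and that*"))
# *<em><em>this</em> and that</em>
```

The opening `***` gets consumed one asterisk at a time by the two single-asterisk closers, which is how the "**" ends up as two nested `em`s with one literal `*` left over.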
> This is probably to support potential ambiguities and intraword emphasis e.g. underscore is a common pseudo-space so doesn't support intraword use but * does e.g.
is_not_italic
this*is*italic.
That seems like a legacy spec mistake they had to adhere to. I'd expect
This is what I would have chosen too as it's natural for programmer sensibilities.
I can see it as a choice from the "plain text first" philosophy i.e. the things you typically write in plain text should not need escaping. My intuition pump is that you can copy-paste an email into .md without edits or surprising rendering.
As such, it's doomed to never satisfy everyone. Personally I never use intraword emphasis and I typically only have underscores in non-code names i.e. `this_is_normally_code`.
https://spec.commonmark.org/0.31.2/#emphasis-and-strong-emph...