"Cute" comes across as very dismissive. I'm not sure if you intended that. lens-regex-pcre is just a wrapper around PCRE, so anything that works in PCRE will work, for example, from your Mozilla reference:
I think in general that Haskellers would probably move to parser combinators in preference to regex when things get this complicated. I mean, who wants to read "\p{Sc}\s*[\d.,]+" in any case?
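To make the parser-combinator alternative concrete, here is a minimal sketch of roughly what "\p{Sc}\s*[\d.,]+" could look like with ReadP from base. The names `currencySymbol`, `amount` and `parseAmount` are my own, purely for illustration, and this is not a faithful translation of PCRE semantics: `isDigit` is ASCII-only, and the parser must consume the whole input rather than search inside it.

```haskell
import Data.Char (GeneralCategory (CurrencySymbol), generalCategory, isDigit, isSpace)
import Text.ParserCombinators.ReadP (ReadP, eof, munch, munch1, readP_to_S, satisfy)

-- | Any character in Unicode general category Sc (hypothetical helper name).
currencySymbol :: ReadP Char
currencySymbol = satisfy (\c -> generalCategory c == CurrencySymbol)

-- | Roughly \p{Sc}\s*[\d.,]+ : a currency symbol, optional whitespace,
-- then one or more ASCII digits, dots or commas.
amount :: ReadP (Char, String)
amount = do
  sym <- currencySymbol
  _ <- munch isSpace
  digits <- munch1 (\c -> isDigit c || c `elem` ".,")
  pure (sym, digits)

-- | Succeeds only if the entire input is a single amount.
parseAmount :: String -> Maybe (Char, String)
parseAmount s = case readP_to_S (amount <* eof) s of
  [(result, "")] -> Just result
  _              -> Nothing
```

In GHCi, `parseAmount "$ 1,234.56"` gives `Just ('$',"1,234.56")`. It is longer than the regex, but each piece has a name and a type.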
U+093b is still in the BMP. By the way, what text encodings for source files are supported by GHC? Escaping everything isn't fun.
And I am not sold on lens-regex-pcre documentation; "anything that works in PCRE will work" comes across as very dismissive.
What string-like types are supported? What version of PCRE or PCRE2 does it use?
I'm sorry, I don't know what that means. If you have a specific character you'd like me to try then please tell me what it is. My Unicode expertise is quite limited.
> I am not sold on lens-regex-pcre documentation
Nor me. It seems to leave a lot to be desired. In fact, I don't see the point of this lens approach to regex.
> "anything that works in PCRE will work" comes across as very dismissive
Noted, thanks, and apologies. That was not my intention. I was trying to make a statement of fact in response to your question.
> By the way, what text encodings for source files are supported by GHC?
UTF-8 I think. For example, pasting that character into GHC yields:
It uses pcre-light (https://hackage.haskell.org/package/pcre-light), which seems to link against the system PCRE library, so the version depends on what you have installed. With Nix, it will be part of your system expression, of course.
https://unicode.org/reports/tr18/#General_Category_Property
["\2363"]
(U+093b is a spacing combining mark, according to https://graphemica.com/categories/spacing-combining-mark)
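In case it helps anyone reading along, here is a small sketch of how to check this from Haskell itself with only Data.Char from base. `show` escapes non-ASCII characters using their decimal code point, which is why U+093b prints as "\2363" (0x093b is 2363 in decimal). The names `describe` and `sample` are made up for this example, and the category reported depends on the Unicode tables shipped with your version of base.

```haskell
import Data.Char (GeneralCategory (..), generalCategory, ord)
import Numeric (showHex)

-- | Code point in U+xxxx form plus the Unicode general category
-- (hypothetical helper, not part of any library).
describe :: Char -> (String, GeneralCategory)
describe c = ("U+" ++ pad (showHex (ord c) ""), generalCategory c)
  where
    -- Left-pad the hex digits with zeros to at least four characters.
    pad s = replicate (4 - length s) '0' ++ s

-- | The character under discussion, written as an escape so the example
-- does not depend on the encoding of the source file.
sample :: String
sample = "\x093b"
```

On a recent GHC I would expect `describe '\x093b'` to report `SpacingCombiningMark`, which matches the graphemica listing.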