Hacker News | nuc1e0n's comments

The thing is, the infrastructure being built now is owned by a small number of already-large companies. Is it really that likely that Microsoft, Amazon, Meta/Facebook, Alphabet/Google, Oracle, Nvidia and all the companies Elon Musk is involved with are going to go bankrupt?


That makes them venture capitalist fantasies then.


Yes, but not because EBCDIC is not ASCII based. All 8-bit character encodings are incompatible with GDPR, because they cannot represent everyone's names. There's an extended EBCDIC that supports the full Unicode range and is GDPR compliant. That said, UTF-8 is still a better choice now.
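To make the point concrete, here is a minimal Python sketch of my own (cp500 is Python's codec name for EBCDIC code page 500, and "Łukasz" is just an example name): the single-byte code pages simply have no slot for the letter, while UTF-8 stores it fine.

    # Hypothetical illustration: try to store a name in a few encodings.
    name = "Łukasz"  # Ł (U+0141) is outside both Latin-1 and EBCDIC code page 500
    for codec in ("cp500", "latin-1", "utf-8"):
        try:
            name.encode(codec)
            print(f"{codec}: can represent {name}")
        except UnicodeEncodeError:
            print(f"{codec}: cannot represent {name}")

Only the UTF-8 line succeeds; both single-byte code pages raise UnicodeEncodeError.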


> There's an extended EBCDIC that supports the full Unicode range and is GDPR compliant.

If anyone is interested, that's UTF-EBCDIC [1]. In reality even IBM itself didn't use that encoding though.

[1] https://www.unicode.org/reports/tr16/tr16-8.html


I think the parent is referring to IBM's DBCS (double-byte character set), which is more like a weird UTF-16.


AFAIK they are not a single unified character set (e.g. IBM Korean DBCS-PC is an extension to EUC-KR), so none of them can support the full Unicode range at all.


For the record, I was referring to UTF-EBCDIC rather than any double-byte encoding. I thought it was unambiguously googleable, but perhaps I was mistaken.


Here in the UK, there is no "legal name" AIUI - just whatever I am known as, which then ends up in my official government documents such as my passport and my driving licence. If I make up a symbol to represent myself, then where are we with GDPR compliance if I demand that symbol be on my official documents?

It seems reasonable to limit compliance to some specific character set. Which character set should it be? Just one that can accurately encode all "official" languages in the region?


There's a reason the court bothered in this case. The bank in question (ING, if I recall) has been promising to fix these issues for years because someone decided they could "just" migrate to a new system and all the legacy problems would be a thing of the past. Deadlines were missed, repeatedly, and the whole process has been a disaster outside of the name thing.

Furthermore, this happened in Belgium, a country with at least three official languages and with enough friction between different groups that there is a legal requirement for law enforcement to talk to you in your own language (i.e. Dutch in the Francophone area and French in the Dutch-speaking area).

Also, I think GDPRhub has the most apt take on the whole situation:

> A correctly functioning banking institution may be expected to have computing systems that meet current standards, including the right to correct spelling of people's names.

Honestly, it's ridiculous that a bank can even operate in a country without being able to store common names. The banking system isn't from the 70s either; it was deployed in the mid nineties, two years after UTF-8 came out and six years after UCS-2 came out.

If I start a bank in the UK and my system can't render the letter "f", I expect someone to speak up and declare how ridiculous that is. This is no different.


> If I start a bank in the UK and my system can't render the letter "f",

I wonder how many British banks can support a name like Llŷr, which is borne by several notable living people:

https://en.wikipedia.org/wiki/Ll%C5%B7r_(given_name)
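As a rough check of my own (a Python sketch, not anything a bank actually runs): ŷ is missing from plain Latin-1, so any system limited to that code page cannot hold the name, while the Celtic ISO 8859-14 code page and UTF-8 both can.

    # Which common encodings can hold the Welsh name "Llŷr"?
    name = "Ll\u0177r"  # ŷ = U+0177, LATIN SMALL LETTER Y WITH CIRCUMFLEX
    for codec in ("ascii", "latin-1", "iso8859-14", "utf-8"):
        try:
            name.encode(codec)
            print(f"{codec}: can store {name}")
        except UnicodeEncodeError:
            print(f"{codec}: cannot store {name}")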


What the 'reasonable' character set for names should be is more a philosophical question than a technical one. I think the Unicode set of characters fully contains it. But would someone declaring their name is a Unicode emoji be 'reasonable'? I think it would not be. Perhaps only certain script ranges within Unicode?


Plenty of countries have relatively strict laws about legal names.

https://en.wikipedia.org/wiki/Naming_law

Furthermore there is a standardized subset of Unicode codepoints which is intended to encompass all the legal names in Europe:

https://en.wikipedia.org/wiki/DIN_91379

> normative subset of Unicode Latin characters, sequences of base characters and diacritic signs, and special characters for use in names of persons, legal entities, products, addresses etc
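For a flavour of what restricting names to certain scripts looks like in code, here is a deliberately simplified sketch of my own (much looser than the actual DIN 91379 rules) that accepts Latin letters, combining marks and a few separators via Python's unicodedata module:

    import unicodedata

    # Hypothetical, simplified name check -- NOT the DIN 91379 definition.
    ALLOWED_EXTRA = set(" -'.")

    def looks_like_latin_name(name: str) -> bool:
        for ch in unicodedata.normalize("NFC", name):
            if ch in ALLOWED_EXTRA:
                continue
            try:
                label = unicodedata.name(ch)
            except ValueError:  # unassigned or unnamed code point
                return False
            if not label.startswith(("LATIN ", "COMBINING ")):
                return False
        return True

    print(looks_like_latin_name("Llŷr"))      # True
    print(looks_like_latin_name("François"))  # True
    print(looks_like_latin_name("😀"))        # False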


Once you start exhaustively enumerating "these are the Only Blessed Ranges," you'll be bitten by the usual brouhaha of "email addresses end with .[a-z]{2,3}". We all know how that went, and ".[a-z]{2,4}" didn't cut it either, not even in 2000.
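The classic failure mode, sketched in Python (the pattern is just the naive one alluded to above, not taken from any real validator):

    import re

    # The over-restrictive "a TLD is 2-3 lowercase letters" assumption.
    pattern = re.compile(r"^[^@\s]+@[^@\s]+\.[a-z]{2,3}$")

    for addr in ("alice@example.com", "bob@example.info", "carol@example.museum"):
        print(addr, "accepted" if pattern.match(addr) else "rejected")
    # example.com is accepted, but the perfectly valid .info and .museum
    # addresses are rejected; bumping the quantifier to {2,4} only moves
    # the goalposts.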


To add to the complexity, not all Chinese characters in use for names are representable in Unicode. Perhaps at some point legal institutions must simply define the list of characters that people can have as part of their name as listed on documentation. This reminds me of that 'falsehoods programmers believe about names' article from a while back.


If so, I think they would just need to be added to Unicode. Do you have an estimate how many are missing?


As an interested bystander, I estimate it on the order of 10⁵. Email Ken Lunde for better insights.

Note that GP claimed "not representable" (not "not represented"). Based on what I know, that claim feels quite wrong.


> not all Chinese characters in use for names are representable in unicode

Why? How do you come to this conclusion?


Han unification[1] prevents the representation of all Chinese characters. There are multiple languages that use Chinese characters, but they don't all use the same characters. Unicode decided to only use Han Chinese characters, so names using other sorts of Chinese characters can't be written with Unicode. The Han "equivalent" characters can be used, but that looks weird.

Think of it as though Unicode decided that the letter "m" wasn't needed to write English text, since you can just write "rn" and it'll be close enough. Someone named "James" might want to have their name spelled correctly instead of "Jarnes", but that wouldn't be possible. Han unification did essentially this.

[1] https://en.wikipedia.org/wiki/Han_unification


I feel it's unlikely that this is the explanation for what GGP had in mind. I postulate that name characters usually have no variants, and thus do not undergo unification, or where there are variants, they are already encoded as Z-variants, so the contention is also moot.

Prove me wrong with a counter-example.



𫟈 is U+2B7C8 "CJK Unified Ideograph-2B7C8". 𛁻 is U+1B07B "Hentaigana Letter To-5".

Both characters fall into the first category I mentioned: no variants.


Even then there are plenty of legitimate scripts that are useless to an individual in practice. How am I going to read a name out in a waiting room if I cannot read the script? For some scripts I probably couldn't even perform a string match; certainly not against handwriting.

I think "reasonable" must be limited to the subset that the population of that country can reasonably be expected to be literate against; otherwise transliteration is necessary.


The Artist Formerly Known As Prince has entered the chat.

(the name was a unique symbol)

Yes, all abstractions leak, there will always be edge cases. Doesn't mean "JUST USE ASCII DUH" (the lowercase extension is for the wimps); a whole spectrum exists between these extremes.


Right. Unicode is the current best effort good faith way to include everyone.

It isn't perfect, and there's always a way to subvert good faith if you want to make a point or just be an asshole. The Unicode Consortium is working on the first, and the second can be handled by the majestic indifference of bureaucracy.


The more America fails financially, the more the AI doomer narrative is pushed. Big tech needs value gains to justify itself. The American stock market's performance is mostly down to them now.


I wonder what non-BS networking equipment would actually improve an audio experience. Maybe high-bandwidth, low-latency routers or network-attached storage? Analogue-to-digital and digital-to-analogue converters with multi-source synchronisation and interleaving over Ethernet? A paid music performance livestreaming service catering to audiophiles?


I think NumPy is representative of the Python ecosystem as a whole. Powerful, but internally complex, bloated, poorly designed and poorly documented.


At one time organ transplants were considered an ethical grey area (perhaps they still are by some), but I think most people now would consider it better to save lives in such a manner when it is possible and only brings help to those who need it, compared to the alternative. Having the capability may mean that things like organ theft now exist, but the benefits around the world outweigh the nastiness that has always come as part of human nature.


I agree that organ transplants are a net positive, and in fact are far less susceptible to unintended consequences (there's a pretty low limit to the number of organs and operations involved, for one.)

I also think that gene repair is a net positive. I would just like us to, for once, look ahead and foresee some of the foreseeable consequences and act to mitigate them before the bulk of the damage is done.

I don't think it's necessary to slow the development; gene therapy is too desperately needed, and slowing it down so that we can prepare is not going to cause us to prepare.


And a spring together with an electromagnet can be made into a relay. They're big and slow of course, but they do the same thing as a transistor. If you can draw metal into a wire, you can make them. In the 1940s, computers were electromechanical.


It would be even better to 3D print biodegradable scaffoldings for stem cells to grow between. For facial reconstruction of bone, for example.


The article says that DNA is designed to keep working despite mutations occurring. What evidence does the author put forward to suppose it was designed rather than evolved? There's plenty of evidence to support that it evolved, BTW.


you might be reading a little too much into that word


And likely on purpose too

