
While it's hilarious, it's understandable IMHO - just basic statistics: "she" is more often used in front of "pretty"/"ugly". It's showing a glitch in our society more than a glitch in Google AI.
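To make the "basic statistics" point concrete, here's a minimal sketch - the corpus and counts are invented purely for illustration - of how a model that only tracks pronoun/adjective co-occurrence ends up preferring "she is pretty":

    from collections import Counter

    # Toy corpus standing in for the training data; the sentences are made up
    # purely to illustrate skewed pronoun/adjective co-occurrence.
    corpus = [
        "she is pretty", "she is pretty", "she is ugly",
        "he is strong", "he is smart", "she is smart",
    ]

    # Count how often each pronoun appears in front of each adjective.
    counts = Counter()
    for sentence in corpus:
        pronoun, _, adjective = sentence.split()
        counts[(adjective, pronoun)] += 1

    def most_likely_pronoun(adjective):
        # Resolve a gender-neutral source pronoun purely from the counts,
        # the way a statistics-only model effectively would.
        candidates = {p: counts[(adjective, p)] for p in ("he", "she")}
        return max(candidates, key=candidates.get)

    print(most_likely_pronoun("pretty"))  # -> 'she', because the corpus is skewed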

This, on the other hand, looks like a hand-crafted blacklist of words that they want to remove from the language; I have no idea how I would train an AI that would classify "motherboard" as inappropriate.
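For what it's worth, no training is needed to get that behaviour; a hand-maintained blocklist with over-broad matching is enough. A minimal sketch, assuming a hypothetical prefix rule (the entries and matching logic are made up):

    # Hypothetical hand-maintained blocklist; the entry and the matching rule
    # are assumptions, chosen only to show how such a filter can misfire.
    blocked_prefixes = ["mother"]  # intended to catch compounds of a slur

    def is_inappropriate(word):
        # Flag anything that merely starts with a blocked prefix - the classic
        # Scunthorpe-problem failure mode of hand-crafted word filters.
        return any(word.lower().startswith(p) for p in blocked_prefixes)

    print(is_inappropriate("motherboard"))  # -> True, collateral damage of prefix matching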



> While it's hilarious, it's understandable IMHO - just basic statistics: "she" is more often used in front of "pretty"/"ugly". It's showing a glitch in our society more than a glitch in Google AI.

Well, this doesn't necessarily show you facts about society because there's no particular reason to think the training set distribution is the same as anything in the real world.


You don't think it's reasonable to think the training set comes from the real world? Why not? Seems like a reasonable assumption to me.

The alternative is that they are carefully curating a training set (Google has historically been unwilling to do anything manually) or writing one themselves???


They said the training set *distribution*.

It's frankly impossible that they have a training set without any biases, though I'm sure they worked to eliminate the ones they could think of (which is itself a biased process).


You can make the distribution different by, e.g., duplicating some of the data a lot, which you might want to do to improve the model at its actual purpose (translation). Any other behaviour (having opinions on gender) is just a coincidence and isn't being optimized for or regression-tested.
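A minimal sketch of the "duplicate some data" point: oversampling one slice of the corpus changes the empirical distribution the model trains on, even though no new real-world data was collected (the corpus and duplication factor are invented for illustration):

    from collections import Counter

    # Invented base corpus; each sentence appears once "in the real world".
    base = ["she is pretty", "he is pretty", "he is strong", "she is strong"]

    # Suppose the training pipeline duplicates one slice heavily, e.g. to improve
    # translation quality on under-represented phrasing (the factor is made up).
    oversampled = base + ["he is pretty"] * 10

    def pronoun_share(corpus):
        # Fraction of training sentences led by each pronoun.
        counts = Counter(sentence.split()[0] for sentence in corpus)
        total = sum(counts.values())
        return {p: round(c / total, 2) for p, c in counts.items()}

    print(pronoun_share(base))         # balanced: {'she': 0.5, 'he': 0.5}
    print(pronoun_share(oversampled))  # skewed toward 'he' purely by duplication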


> I have no idea how would I train an AI which would classify "motherboard" as inappropriate.

I believe you just need to give it emotions.



