
I think you're reading this statement as more general than it's meant to be? I interpret it as meaning that there is not necessarily any tradeoff, as there wasn't in this case. "You can have data" reads as "there exists", not "for all".



> I interpret it as meaning that there is not necessarily any tradeoff, as there wasn't in this case.

They haven't shown that there is no tradeoff, either in general or in this case.


Is there anyone who thinks that the current level of racism is required for the current accuracy? I can't imagine people that racist being common in the data community.


> Is there anyone who thinks that the current level of racism is required for the current accuracy? I can't imagine people that racist being common in the data community.

It depends on two things: how you're defining racism, and whether the algorithm's predictions match reality. If the algorithm is predicting that 10% of white people and 30% of black people will do X, because that is what actually happens, some people will still call that racism, but there is no possible way to change it without reducing accuracy.

If the algorithm is predicting that 8% of white people and 35% of black people will do X even though the actual numbers are 10% and 30%, then the algorithm has a racial bias and it is possible to both reduce racism and increase accuracy. But it's also still possible to do the opposite.
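A minimal sketch of how one might check for that kind of calibration bias, in Python with made-up numbers matching the ones above (8%/35% predictions vs. 10%/30% true rates); the function name and toy data are hypothetical, not from any real system:

    import numpy as np

    def group_calibration(y_true, y_pred, groups):
        """Print predicted vs. observed base rates for each group."""
        for g in np.unique(groups):
            mask = groups == g
            print(f"group {g}: predicted {y_pred[mask].mean():.0%}, "
                  f"observed {y_true[mask].mean():.0%}")

    # Toy data: the model predicts 8% for group A (true rate 10%)
    # and 35% for group B (true rate 30%).
    rng = np.random.default_rng(0)
    groups = np.array(["A"] * 1000 + ["B"] * 1000)
    y_true = np.concatenate([rng.random(1000) < 0.10, rng.random(1000) < 0.30])
    y_pred = np.concatenate([np.full(1000, 0.08), np.full(1000, 0.35)])
    group_calibration(y_true, y_pred, groups)

If the per-group predicted and observed rates diverge, the model is miscalibrated for that group, independent of how accurate it is overall.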

One way to get the algorithm to predict closer to 10% and 30% is to get better data, e.g. take into account more factors that represent the actual cause of the disparity and just happen to correlate with race; accounting for them separates their effect from race, which reduces the bias and improves accuracy in general.

The other way is to anchor a pivot on race and push on it until you get the results you want, which will significantly harm accuracy in various subtle and not-so-subtle ways all over the spectrum, because what you're really doing is fudging the numbers.
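As a rough sketch of the first approach (synthetic data and sklearn's LogisticRegression; nothing here reflects any real dataset): when the outcome is driven by a cause that merely correlates with group membership, a model that sees the actual cause beats one that only sees group, so the better data improves accuracy while weakening the model's reliance on race:

    import numpy as np
    from sklearn.linear_model import LogisticRegression

    rng = np.random.default_rng(0)
    n = 5000
    group = rng.integers(0, 2, n)            # group label, 0 or 1
    cause = rng.normal(group * 1.0, 1.0)     # true cause; correlates with group
    p = 1 / (1 + np.exp(-(2 * cause - 1)))   # outcome depends only on the cause
    y = rng.random(n) < p

    for name, X in [("group only", group.reshape(-1, 1)),
                    ("group + cause", np.column_stack([group, cause]))]:
        model = LogisticRegression().fit(X, y)
        print(f"{name}: accuracy {model.score(X, y):.3f}")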


"If the algorithm is predicting that 10% of white people and 30% of black people will do X, because that is what actually happens, some people will still call that racism but there is no possible way to change it without reducing accuracy."

What is actually happening? Does it tell you whether they are doing X precisely because they are black or white? The racist part might not be the numbers per se, but the conclusion that the color of their skin has anything to do with their respective choices.



ML is spitting out correlations, not an explicit causal model. If, in reality, X is only indirectly and accidentally correlated with race, but I look at the ML result and conclude that skin color has something to do with X, then the only racist element in the whole system is me.


Agreed. That was the point I was trying to get at, though I might not have phrased it as clearly.



