We've all seen the Big Data books: the future is now! A/B testing forever! AlphaGo crushed it! OkCupid says you shouldn't have a shirtless fish pic, you adorably dull redneck!
But Big Data has a dark side, and O'Neil goes through each segment of our lives to show how these "models" can be used against us, to extract goods from us, and to keep us poor. Unfortunately, she gives up the argumentative power that nuance would bring, because she disregards nuance in order to make the book understandable to the layperson (in other words, I don't think she's very charitable to the layperson).
The first big issue she brings up, and has lots of evidence for, is transparency. Lots of the Big Data models that we use (and are fed into) on a daily basis are not transparent. They're opaque equations sitting on a server farm. A teacher being graded on a value-added model doesn't know where their score is coming from. A potential hire doesn't know why he failed his psych evaluation. A criminal standing before a judge doesn't know why the score said he'd be more likely to return to jail.
This is honestly one of my biggest takeaways, and it hits close to home. When you apply for a credit card, they tell you why you failed to get one. You can access your "data point" and know why you have the FICO score you do. Google and Facebook tell you, in their settings, the metrics they use to serve you ads. I know that Amazon is trying to get me to buy another wallet even though I just bought a wallet, because they showed me five hundred ads for another wallet. But what about those ads that Forbes tries to serve me? The cookies sitting on my computer watching me? I have no idea what they're doing and no readily known way to find out. What about the non-regulated credit systems that exist out there in Web 2.0 land? Bank of America controls for race and tries to stop redlining when they make a new policy, but will Peter Thiel try to do that when he invests in E-Corp? Probably not.
The second big problem with some Big Data systems is that they create feedback loops that increase inequality. Here, O'Neil is super weak except when she brings up the criminal justice example: we could be using big data to help keep people out of prison and make programs that lower recidivism, but instead we're using it as a way to keep white people out of prison... but am I really supposed to believe that ads help make poor people poorer? She brings up for-profit schools using targeted ads to lure immigrants and poor people into massive student debt to make a profit, and, while I admit that's super shady, it's not the targeted ads' fault, is it?
The third problem with these "Weapons of Math Destruction" is that they often have skewed data. This is the old line "garbage in, garbage out" except now it's "racist/sexist garbage in, racist/sexist garbage out." For example, if you make an employment system that filters out resumes, then train it on a bunch of older resumes, you're inputting the bias of those older resumes. So if the guy that was reading those resumes was racist, you might be teaching a racist model.
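To make the resume example concrete, here's a toy sketch in Python. Everything in it is made up for illustration (the two groups, the skill scale, the reviewer's thresholds, and the "model" itself, which is just a lookup of the lowest skill ever hired per group); it's not anything from the book or from a real hiring system. The point is only that a model trained on a biased reviewer's past decisions hands that bias right back.

```python
from collections import defaultdict


def biased_reviewer(skill, group):
    # Hypothetical past reviewer: holds group "B" to a higher bar than "A".
    return skill >= (5 if group == "A" else 8)


def make_history():
    # Synthetic "old resumes": every skill level 0..10 for both groups,
    # labeled with the biased reviewer's hire/no-hire decision.
    return [(skill, group, biased_reviewer(skill, group))
            for group in ("A", "B")
            for skill in range(11)]


def train(history):
    # Toy "model": remember the lowest skill that was ever hired per group.
    thresholds = defaultdict(lambda: float("inf"))
    for skill, group, hired in history:
        if hired:
            thresholds[group] = min(thresholds[group], skill)
    return dict(thresholds)


def predict(model, skill, group):
    # Recommend hiring only if the candidate clears the learned bar.
    return skill >= model.get(group, float("inf"))


model = train(make_history())
print(model)  # {'A': 5, 'B': 8}

# Two equally skilled candidates, different groups:
print(predict(model, 6, "A"))  # True
print(predict(model, 6, "B"))  # False
```

Nobody wrote "treat group B worse" into the model. It just learned what the old decisions looked like, which is the whole problem.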
Technically O'Neil has two other "bad points" about "WMDs" but they're just about scale.
Now, there are a lot of problems in this book and O'Neil kind of goes on tangents. For one thing, the WMDs she brings up are less "weapons" than they are symptoms of a bigger societal problem. Take "democracy": our current political system allows a few people (those that live in Orlando, Florida and Pennsylvania, basically) to choose who will be the President. This is messed up, but it means that the Democratic Party could build a powerful machine learning system that spends money as efficiently as possible in the locations where it helps change hearts and minds and win. She really dislikes this Big Data system, and says it's a threat to democracy...
... but the electoral system itself is a giant problem and a threat to democracy (see: election of 2000)! Big Data has nothing to do with it!
The book ought to have been longer, and it ought to have included more counterexamples of positive data models (I can recall only two, FICO and some housing model). I think that she should've, if not included handwritten equations or step-by-step instructions, at least given some background on actual data science. The way it is written makes it seem like she's a magician-mathematician who wandered down from the ivory tower, realized that bankers were using magic for evil, and now wants to raise hell.
But I guess if I wanted authors to stop writing popular non-fiction books that they A/B tested on their blogs and turned into TED talks, I should stop reading popular non-fiction.