Wednesday, August 26, 2020

Review: Invisible Women: Data Bias in a World Designed for Men


I work in data. It is my job to understand how to look at a few hundred columns and a few million rows and write a compelling story. If I misunderstand the data, or if I tell an incorrect story, then my chances of getting fired or passed over for promotion increase. It's therefore in my best, best interest to understand data bias and fight against it.

But, oh boy, the world is so goddamned biased. The world is so goddamned sexist.

Caroline Criado Perez lays out in a little more than 300 pages an entire catalog of sexist malpractice. It is honestly at times overwhelming. Sometimes the text reads like Perez is dumping her extremely well-organized OneNote or Evernote folders on us, where each note is the tale of a male-dominated, male-oriented system believing itself to be acting as an impartial decider. Spoiler alert: none of them are.

The biggest culprit -the problem of invisible women- is a lack of data. A related phenomenon would be “color blindness”. “I don’t see color” is usually what a naive racist person says when confronted with some facts about our racist society. “I don’t see sex/gender” is said by, well, nobody, but a lot of naive sexists are thinking it. We’ve allowed ourselves to build giant systems that essentially act that way. These “sex blind systems” -transportation, technology, medicine, etc- are holdovers from historical patriarchy. How do they continue to perpetuate themselves?

Perez describes two main ways they do this from what I can tell: aggregated or missing data, and missing women stakeholders. The entire world is designed- whether it be different products or systems- by decision-makers who are informed by the data that they have available. If the data doesn’t show different segmentations- in this case, sex or gender- then it can have no impact on the final outcome, no matter who is in charge. Given that it is men who are often in charge, this missing data is not considered.

The first problem -fucked up data- should be our first line of defense. It’s technical, and kind of breaks down into a few problems, which have their own solutions. The first is something most data-adjacent careers run into: GIGO. Garbage in, garbage out. The trick here is that most of the garbage is male. We see this in trained AI systems, especially in speech or image recognition systems. Most of the labeled examples are male, which means the trained AI will do better with males (let’s be honest, with white or Asian males) than it will with females. Here the solution is pretty clear, but given the dearth of women engineers it may be missed: add more women to the training data.
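To make the GIGO point concrete, here is a minimal sketch of the check that tends to get skipped: scoring a model per group instead of only overall. The numbers are made up for illustration (an evaluation set that is 80% male), and `accuracy_by_group` is a hypothetical helper of mine, not anything from the book.

```python
from collections import defaultdict

def accuracy_by_group(records):
    """Take (group, was_correct) pairs and return {group: accuracy}."""
    hits = defaultdict(int)
    totals = defaultdict(int)
    for group, was_correct in records:
        totals[group] += 1
        hits[group] += int(was_correct)
    return {group: hits[group] / totals[group] for group in totals}

# Invented evaluation results: 80 male examples, 20 female examples.
records = (
    [("male", True)] * 76 + [("male", False)] * 4
    + [("female", True)] * 13 + [("female", False)] * 7
)

# The single headline number looks fine...
overall = sum(was_correct for _, was_correct in records) / len(records)

# ...but splitting by group shows who the model actually works for.
per_group = accuracy_by_group(records)

print(f"overall: {overall:.2f}")
for group, acc in per_group.items():
    print(f"{group}: {acc:.2f}")
```

With these invented numbers the overall accuracy is 0.89, which hides 0.95 for men and 0.65 for women. The aggregate is dominated by whichever group has the most examples.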

A related problem is that, even if the data exists in 50-50 spreads, it might not be labeled. If it isn’t labeled, then the model won’t be able to take into account the differences between men’s and women’s behaviors. You can imagine looking at a network diagram of, say, bus passengers. Thick grey lines are the most used routes, thin grey lines are the least used. If you’re trying to prune the graph -i.e., cut service for the less used lines- then you would just snip the thinnest edges. But color each route by the gender of its users, and you can easily find yourself cutting off edges that hurt women disproportionately. The example Perez has is that women are more likely to move around the edges of a city rather than in-and-out of the city center.
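The bus example can be sketched in a few lines. Everything here is an assumption for illustration: the route names, the ridership counts, and the two helper functions are invented, with ring routes carrying mostly women as in Perez's example.

```python
# Invented ridership counts per route, labeled by gender.
routes = {
    "center_a": {"women": 400, "men": 600},
    "center_b": {"women": 350, "men": 550},
    "ring_c": {"women": 180, "men": 40},
    "ring_d": {"women": 150, "men": 30},
}

def prune_least_used(routes, n_cut):
    """Pick the n_cut routes with the lowest total ridership (gender ignored)."""
    ranked = sorted(routes, key=lambda name: sum(routes[name].values()))
    return ranked[:n_cut]

def share_of_trips_lost(routes, cut, group):
    """Fraction of one group's trips that were on the routes being cut."""
    total = sum(counts[group] for counts in routes.values())
    lost = sum(routes[name][group] for name in cut)
    return lost / total

# The "grey lines" decision: snip the two thinnest edges.
cut = prune_least_used(routes, 2)

# The same decision, seen with the labels turned back on.
women_lost = share_of_trips_lost(routes, cut, "women")
men_lost = share_of_trips_lost(routes, cut, "men")

print(cut)
print(f"women lose {women_lost:.0%} of trips, men lose {men_lost:.0%}")
```

In this toy network the gender-blind pruning cuts both ring routes, which costs women about 31% of their trips and men about 6%. Nothing in `prune_least_used` mentions gender, which is exactly the problem.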

Negative intent isn’t necessary- just shitty, unlabeled data.

A third type of problem is the “flaw of averages”*. A useful, non-sexed example from my external reading is when the U.S. Air Force was designing cockpits for the jet age. They based the design on the average measurements of 4000 pilots and found that the most perfectly average cockpit wasn’t a great cockpit at all. In fact, it was perfectly designed for nobody at all. Different traits -even just among white men- are correlated and anticorrelated in a vast array of ways, meaning the average represents absolutely no one.
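Here is a rough simulation of why the average cockpit fits nobody. The setup is my own simplification, not the Air Force's actual method: 4000 simulated pilots, 10 made-up body measurements drawn independently from a standard normal distribution, and "near average" defined as falling in roughly the middle 30% of each measurement. Real measurements are correlated, so treat this as a sketch of the effect rather than a reconstruction.

```python
import random
import statistics

random.seed(0)  # fixed seed so reruns give the same simulated pilots

N_PILOTS, N_TRAITS = 4000, 10

# Each pilot is a list of N_TRAITS simulated measurements (assumed independent).
pilots = [[random.gauss(0, 1) for _ in range(N_TRAITS)] for _ in range(N_PILOTS)]

def near_average_mask(values, band=0.385):
    """Flag values within `band` standard deviations of the mean.

    For a normal distribution, +/- 0.385 sd covers roughly the middle 30%.
    """
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return [abs(v - mean) <= band * sd for v in values]

# One mask per trait, each with one flag per pilot.
columns = list(zip(*pilots))
masks = [near_average_mask(column) for column in columns]

# Pilots who are near average on a single trait vs. on every trait at once.
average_on_first_trait = sum(masks[0])
average_on_all_traits = sum(all(flags) for flags in zip(*masks))

print(f"near average on one trait: {average_on_first_trait} of {N_PILOTS}")
print(f"near average on all {N_TRAITS} traits: {average_on_all_traits}")
```

Roughly 30% of pilots are near average on any one measurement, but under the independence assumption the chance of being near average on all ten is about 0.3 to the tenth power, or around 6 in a million. With 4000 pilots you should expect essentially nobody.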

Now imagine applying this same approach, but to a group that includes an entire second population with its own vast array of correlations and anticorrelations. It just gets worse.

A final problem, and one that is truly invisible, is simply missing data. Data on women, in many cases, is simply not collected. Sometimes these are known unknowns, but often they are unknown unknowns. A lot of the time it is economic data that we’re missing, but often we have it and are ignoring it. Women do kincare and unpaid labor at higher rates than men worldwide, but Perez points to ways that this work simply does not make its way into the policy decision-making process.

These problems- GIGO, unlabeled data, missing data, and the flaw of averages- are not exhaustive or exclusive categories. In fact, they’re not even sexist problems in and of themselves, but they happen in sexist ways because there is nobody to catch them.

Perez points to two ways that women end up not being consulted. The first is obvious- they’re simply not in the design and decision-making spaces. Parliaments, parties, and local governance boards are skewed male. Board rooms (and therefore the promotional ladders they control) are skewed male. I don’t need to go deeply into this- you’re reading this review, you know how the world works.

The second is more insidious because it can be committed by people who are actively pursuing women’s best interests: not getting women’s feedback. As a person interested in development, it’s pretty terrifying to watch how interventions in the third world simply fail or even backfire because women were not asked (or were asked in the presence of the local male authority) how an intervention would actually work for them in practice. Without actively seeking women’s input as customers, we fail to meet their needs.

All of these failures lead up to women being put at higher risk of discomfort, danger, or even death. And, again to stress the point, you do not even need to be an actively sexist, pro-trad MRA for these failures to occur. Like with racism and anti-racism, if you’re not actively attempting to be anti-sexist, these errors can and will perpetuate sexism.

This book is fantastic and brings as much of the data around sexism to light as it can in as many areas as possible. Definitely read it, definitely recommend it.


Tuesday, August 11, 2020

Review: The Great Warming: Climate Change and the Rise and Fall of Civilizations


At times, Fagan’s writing is annoying and seems like it was vying for a PBS Nova Documentary that it would never get. Most of the time, it is terrifying:

The Medieval Warm Period was a period where Europe was warmer, but most of the rest of the world was drier, and therefore more prone to drought, and therefore more deadly. We are rapidly approaching that world.

Fagan’s book is a section of world history, a kind of hypercut that Tim Urban calls “horizontal history”. We get to see every continent and a wide range of different societies, all in the same time period. The thing that connects them all is that the sky is changing, and often changing too slowly for them to notice until it's too late.

The thread that connects every society affected by the Warm Period isn’t warming itself- some societies did not find their geography warming at all. No, the thread is Water. The overabundance or lack of water, combined with regional climatic unpredictability, is what caused multiple societies to collapse. The Mayans, California Indigenous peoples, and multiple African peoples fell to endemic drought. South China and the Khmer Empire fell to devastating climatological switchbacks as droughts gave way to torrential flooding. Fagan’s list goes on.

12 years after the publication of The Great Warming, we’re seeing the droughts that humanity already faced 1000 years ago rear their heads again under climate change. Droughts clamp down on water in developing countries, and cities like Cape Town and Chennai have seen stunning reductions in their local water sources. We see floods threatening the lives of hundreds of millions of people in China. The once-in-a-millennium droughts and floods come on a regular basis. Countries fight wars over food.

The Greater Warming marches on.


Review: Group Chat Meme

tl;dr: To endorse the concept that European borders are to blame for developing world conflict is to endorse problematic concepts of nationa...