Two things have hit me profoundly this week, and I wanted to share them with you.

Ending Bias in Artificial Intelligence

The first is an article that I wish I could just copy in its entirety and paste here, and I thank Paula Jones for sending it to me! In his article Now Is the Time to Act to End Bias in AI, Will Byrne points out what I have been trying to say ever since I started this journey of learning about #machinelearningdeeplearning. As with any new technology, artificial intelligence reflects the bias of its creators. Societal bias—the attribution of distinct traits to individuals or groups without any data to back it up—is a stubborn problem that has stymied humans since the dawn of civilization. Our introduction of synthetic intelligence may be making it worse.

He goes on to pose the question: “If AI is going to be the interface between people and critical services they need, how is it going to be fair and inclusive? How is it going to engage and support the marginalized people and the most vulnerable in our society?”

This absolutely fits in with what I have been trying to articulate lately. Just as journalism is inherently biased—there is no way to say anything using language without revealing the bias of the speaker—so must AI, built by humans, be inherently biased. It’s impossible to have an infinitely sized training set, and as soon as you remove one image from that theoretically infinite set, you have introduced bias. So, with our fallible, limited training sets, the decisions about what to include and what to exclude introduce an inherent bias. That leads, for example, to natural language processing (NLP) incorrectly identifying African-American vernacular as Norwegian, or to visual AI identifying people of African descent as gorillas. Obviously these are extreme examples, but the question remains: how do we avoid this? “A first step is opening up the black box—creating transparency standards, open-sourcing code and making AI less inscrutable…”

Read Byrne’s article. It will make you think, I promise.

Weapons of Math Destruction

The second thing is a book I’m reading by Cathy O’Neil. She is a mathematician, data scientist, and author of the blog mathbabe.org, and I can’t recommend her work highly enough. In her book, Weapons of Math Destruction, she points out that algorithms embed existing bias into code—with potentially destructive outcomes. Everyone should question the fairness of these algorithms, not just computer scientists and coders. She says that algorithms are not what everyone thinks they are: objective, true, and scientific. That, she says, is a marketing tool. They don’t “make things fair.” Instead, they repeat our past practices and automate the status quo.

O’Neil defines a “weapon of math destruction” (WMD) as an algorithm that is opaque to the public, has the capacity to scale, and can inflict damage upon the people about whom it makes predictions. There are examples of this everywhere, from evaluating public school teachers to predicting recidivism to hiring practices to using facial recognition in an AI-infused security system. There are ways to address the problems of WMDs, however: carefully define what is being measured, and do your best to eliminate the factors that are irrelevant.
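Before getting to the example O’Neil uses, here is a minimal sketch of what “eliminating the irrelevant factors” can look like in code. Everything in it is hypothetical (the column names, the candidates, the scores are all made up for illustration); the point is simply that the columns which have nothing to do with the thing being measured get dropped before anything downstream ranks the candidates.

```python
# A minimal sketch of the "screen" idea, assuming pandas is available.
# All columns and values below are invented for illustration only.
import pandas as pd

applicants = pd.DataFrame({
    "name":             ["A. Ruiz", "B. Chen", "C. Smith"],
    "gender":           ["F", "M", "F"],
    "alma_mater":       ["State U", "Ivy U", "Community College"],
    "audition_score":   [94, 88, 91],   # the thing we actually care about
    "years_experience": [7, 5, 9],
})

# The "screen": drop every column that is irrelevant to the performance itself
# before any model (or human) gets to rank the candidates.
IRRELEVANT = ["name", "gender", "alma_mater"]
blind = applicants.drop(columns=IRRELEVANT)

# Rank purely on what was measured.
print(blind.sort_values("audition_score", ascending=False))
```

Dropping a column is not a cure, of course; proxies such as zip code or alma mater can smuggle the same information back in, which is why the auditing idea further down matters just as much.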
Using the example of blind auditions for symphony orchestras, O’Neil points out that the evaluators decided the most important thing was the music, and chose not to be distracted by extraneous things like what the performer looked like. So, in the 1970s, auditions started to take place behind a screen. “Connections and reputations suddenly counted for nothing. Nor did the musician’s race or alma mater. The music from behind the sheet spoke for itself.” The result? The rate at which women musicians were hired didn’t just tick upward; it went up by a factor of five.

In the case of hiring engineers, for example, this kind of audition may not be possible, but you can at least be aware of your inherent biases and make allowances for your known prejudices. She concludes that data scientists should “not be arbiters of truth. We should be translators of ethical discussions that happen in larger society.”

If I Had a Hammer… I’d Hammer Out Justice

I have felt vaguely uncomfortable with AI classification systems because of their capacity for misuse. If your classification algorithm is shown only dogs, how can it recognize a cat or a hamster? If the training set has more men than women, it will identify more people as male (not to mention what it might do with gender-ambiguous people). If an NLP system takes only the Queen’s English as its training set, it won’t recognize Jamaican English or even English with a southern accent. If the system is shown that non-white people are more likely to be intruders, then you can see where this leads. If the only tool it sees is a hammer, then every solution involves a nail—and a particular nail, at that.

How do you not introduce bias into your facial recognition feature, your NLP, or any other kind of classification system? I don’t think there is a complete answer to this question; the closest you can come is to use a huge training set, as close to infinite as you can possibly get. The system must learn from its mistakes as it collects more data, and its inherent biases must be accounted for. And when the system is revealed to have bias, the data scientists and neural network engineers must be vigilant in fixing the problem, using every means necessary. Much as we would like to think that the real world can be codified into a mathematical model, the reality is that there are nuances that simply can’t be captured in your system.

Maybe Infinity Isn’t Required

If deep learning is modeled on the neural networks in humans, and humans are fallible, where does that leave us? Parenting involves exposing your children to as many sets of training data as possible. Still, it takes a baby only a few examples of what a dog looks like to identify one, even if it looks different from the ones it has already seen. That baby’s mistakes in identification are easily corrected, until the baby grows up to be an adult who can easily classify eleventy-dozen different breeds of dog. That human hasn’t seen infinity-minus-one dogs, though. Humans have preferences—dare I say, prejudices—and only an advanced human can recognize that those biases are part of who they are. The best anyone can do is to actively work to dismantle those prejudices. Can we build into the system the ability to identify its own biases? Does that mean that neural networks should have an ethics processor? Not being a deep learning expert, I don’t have the answer.
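One modest step in that direction already exists, though: auditing a trained model’s behavior separately for each group it affects. Below is a minimal sketch of the idea, assuming NumPy and scikit-learn; the data is synthetic and deliberately lopsided (far more examples from one group than the other) so that the skew shows up in the numbers.

```python
# A minimal per-group bias audit on a deliberately imbalanced toy dataset.
# The data, group names, and thresholds are all invented for illustration.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

def make_group(n, group_id, threshold):
    """Synthetic examples for one group; the true decision threshold differs by group."""
    X = rng.normal(loc=threshold, scale=1.0, size=(n, 2))
    y = (X[:, 0] + 0.5 * X[:, 1] + rng.normal(scale=0.5, size=n) > threshold).astype(int)
    return X, y, np.full(n, group_id)

# 950 examples from group "A" and only 50 from group "B":
# whatever the model learns, it learns mostly from A.
Xa, ya, ga = make_group(950, "A", threshold=0.0)
Xb, yb, gb = make_group(50, "B", threshold=1.5)
X = np.vstack([Xa, Xb])
y = np.concatenate([ya, yb])
groups = np.concatenate([ga, gb])

# Note that the group label itself is NOT a feature; the skew comes
# purely from how much of each group is in the training data.
model = LogisticRegression().fit(X, y)
pred = model.predict(X)

# The audit: the same model and the same metrics, reported per group.
for g in ("A", "B"):
    mask = groups == g
    accuracy = (pred[mask] == y[mask]).mean()
    positive_rate = pred[mask].mean()
    print(f"group {g}: n={mask.sum():4d}  accuracy={accuracy:.2f}  "
          f"predicted-positive rate={positive_rate:.2f}")
```

An audit like this doesn’t remove bias by itself, but it makes the bias visible and measurable, which is the precondition for fixing it.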
I do know that work is being done to process more and more information in a smaller and smaller footprint, and that there are specific processors that lend themselves well to neural network processing (the Cadence Tensilica Vision family of DSPs, for example). The more processing power we can throw at these networks, the larger the training sets the classification algorithms can use, and the less biased the end results can be. Applying these advances in technology to the networks that create true artificial intelligence brings us closer to leaving the WMDs behind and looking forward to an unbiased future.

—Meera