This post was originally published on the 24×7 IT Connection blog on 5/24/2022 under the title “AI bias isn’t a technical discussion.” Almost a year later, a new version of ChatGPT has been released for crowd-sourced testing. This comes on the heels of Microsoft laying off their AI ethics team. I’m republishing this post to remind everyone: check your AI bias.
During the AI Field Day event a few weeks ago, Stephen Foskett hosted a (soon to be published) podcast about AI bias. A team of talented AI technologists participated in the discussion. The conversation kept drifting back to the technology (in this case, autonomous vehicles).
What happens when a conversation about technology isn’t a technical discussion?
What is AI Bias?
I really like this definition from the Itrex Group blog:
… a phenomenon that occurs when an AI algorithm produces results that are systemically prejudiced due to erroneous assumptions in the machine learning process.
The Seldon blog has a nice post pointing out the different types of AI bias. I recommend checking out the explanation of each of these.
- Historical bias: If datasets are made up of historical data, they may include data that is biased.
- Sample bias: This is when the training data doesn’t reflect how the model will actually be used in the real world.
- Label bias: Labelling images is grueling work. That’s why we are asked to identify traffic lights as a “security” test. Labelling isn’t always consistent. These variations can introduce AI bias.
- Aggregation bias: Many times we aggregate data to simplify it so we can tell a story. Often this sort of data is used to create images that express the meaning of the data. You’ve probably heard the quote “there are lies, damned lies, and statistics” (often attributed to Mark Twain). Simplifying data like this can introduce bias.
- Confirmation bias: Humans tend to trust information that confirms their existing beliefs.
- Evaluation bias: If you evaluate a model on a subset of a community, you can’t expect to get the same results with a larger, more varied sample.
Do you notice anything missing from this list? Technology.
Everyone’s a little bit biased
Maybe you’re thinking: I’m a good person. I’m not biased. But here’s the deal: we all have bias. It is human nature. In the liberal arts programs I majored in, we were taught to actively expose our own bias so we could also work to neutralize it.
If having bias is part of being human, we must acknowledge it as we research. When it comes to AI, how can you trust any decision if bias has influenced your data or even your theory?
AI bias has the potential to impact many people, so it’s critical to try to identify and neutralize bias at each stage of the process.
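For the technologists reading this, here is one small, hedged illustration of what “identifying bias” can look like at the evaluation stage: compare how often a model is right for each group of people, rather than looking at one overall number. This is only a sketch under my own assumptions. The function name `accuracy_by_group`, the group labels, and the records are all made up for illustration, and a gap like this is a prompt for human questions about the data, not a verdict.

```python
from collections import defaultdict


def accuracy_by_group(records):
    """Return the share of correct predictions for each group.

    `records` is an iterable of (group, predicted, actual) tuples.
    The structure is a hypothetical one chosen for this sketch.
    """
    totals = defaultdict(int)
    correct = defaultdict(int)
    for group, predicted, actual in records:
        totals[group] += 1
        if predicted == actual:
            correct[group] += 1
    return {group: correct[group] / totals[group] for group in totals}


# Made-up evaluation records: (group, model prediction, true label).
records = [
    ("group_a", 1, 1), ("group_a", 0, 0), ("group_a", 1, 1), ("group_a", 1, 1),
    ("group_b", 0, 1), ("group_b", 1, 1), ("group_b", 0, 1), ("group_b", 0, 0),
]

rates = accuracy_by_group(records)

# A large gap between the best- and worst-served groups is a signal
# to go back and examine the training data and labels.
gap = max(rates.values()) - min(rates.values())

for group, rate in sorted(rates.items()):
    print(f"{group}: {rate:.0%} correct")
print(f"gap between groups: {gap:.0%}")
```

A single overall accuracy for these made-up records would hide the fact that the model works far better for one group than the other, which is exactly the kind of thing the list of bias types above warns about.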
What is at stake with AI bias?
According to the Algorithmic Justice League, “automated systems discriminate in a daily basis”. Dr. Joy Buolamwini started the AJL after an experience in grad school. Her face wasn’t detected by AI during a project until she put on a mask.
[Video Embed: https://www.youtube.com/watch?v=jZl55PsfZJQ]
The AJL site explains how AI bias impacts real lives. For example, facial recognition systems are racially and gender biased, yet they are sold by many of the big IT companies. AI-based automated employment assessment tools screen out qualified candidates, sometimes breaking equal opportunity laws. And AI bias in tools built for the criminal justice system needlessly harasses those trying to rebuild their lives after incarceration.
AI bias is something all technologists need to learn about. If you are involved with creating and training algorithms, it’s important to understand how bias can be inadvertently introduced. In a perfect world, each company would be required to explain how their algorithm comes to conclusions. Furthermore, when the algorithm makes a mistake, it should be easier for humans to intervene.
Even though this feels like a brave new world, it’s really still just “lies, damned lies, and statistics”. We need to demand more from those who are bringing us the AI-driven world.