Safety
AI Bias
Definition
AI bias refers to systematic errors in AI systems that produce unfair outcomes, typically arising from biased training data, flawed model design, or problematic deployment contexts.
Why It Matters
AI systems can amplify and automate discrimination at scale. A biased hiring algorithm might reject qualified candidates from certain groups. A biased language model might reinforce stereotypes. Understanding bias sources helps you build fairer systems and identify problems before deployment.
Types of Bias
Data Bias:
- Training data underrepresents certain groups
- Historical data reflects past discrimination
- Labels contain human annotator biases
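A representation and label-balance check is often the first data audit. The sketch below uses pandas on a toy dataset; the "group" and "label" column names are placeholders for whatever demographic and target columns your data actually has.

```python
import pandas as pd

# Hypothetical training set: the 'group' and 'label' columns are stand-ins
# for your own demographic attribute and target label.
df = pd.DataFrame({
    "group": ["A", "A", "A", "A", "B", "B", "A", "A", "B", "A"],
    "label": [1, 0, 1, 1, 0, 0, 1, 0, 0, 1],
})

# Representation: share of examples per group.
representation = df["group"].value_counts(normalize=True)
print("Representation by group:\n", representation)

# Label balance: positive-label rate per group. Large gaps can indicate
# historical bias baked into the labels, not just sampling imbalance.
positive_rate = df.groupby("group")["label"].mean()
print("Positive-label rate by group:\n", positive_rate)
```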
Algorithmic Bias:
- Model architecture favors certain patterns over others
- Optimization objectives conflict with fairness goals
- Proxy features encode protected attributes
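One way to surface proxy features is to test how well the non-protected features predict the protected attribute. The sketch below uses scikit-learn on synthetic data; the "zip_region" proxy and its correlation strength are made up purely for illustration.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

# Synthetic illustration: 'zip_region' is a hypothetical feature that
# happens to correlate with a protected attribute.
rng = np.random.default_rng(0)
protected = rng.integers(0, 2, size=1000)              # binary group label
zip_region = protected * 3 + rng.integers(0, 3, 1000)  # correlated proxy
income = rng.normal(50, 10, 1000)                      # roughly independent feature

X = np.column_stack([zip_region, income])

# If non-protected features predict the protected attribute well above chance,
# they act as proxies: the model can rediscover the attribute even when it is
# dropped from the inputs.
score = cross_val_score(LogisticRegression(), X, protected, cv=5).mean()
print(f"Accuracy predicting protected attribute from features: {score:.2f}")
```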
Deployment Bias:
- System used in unintended contexts
- User interface choices affect who can access the system
- Feedback loops amplify initial biases
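A minimal, purely illustrative simulation of a selection feedback loop (the initial scores and success rate are made-up numbers): only selected people generate outcome data, so an initial score gap between two equally qualified groups locks in.

```python
# Two groups are equally qualified (true success rate 0.7), but the model
# starts with a small score gap. Only selected people produce outcome data,
# and the model is "retrained" on what it observes.
TRUE_SUCCESS_RATE = 0.7
score = {"A": 0.55, "B": 0.45}      # assumed initial model scores
observations = {"A": [], "B": []}

for round_ in range(5):
    for group in ("A", "B"):
        if score[group] > 0.5:                       # selection decision
            observations[group].append(TRUE_SUCCESS_RATE)
        if observations[group]:
            observed = sum(observations[group]) / len(observations[group])
            score[group] = 0.5 * score[group] + 0.5 * observed
    print(f"round {round_}: A={score['A']:.2f}  B={score['B']:.2f}")

# Group A's score climbs toward 0.7; group B is never selected, so no data
# about it is ever collected and its score never recovers.
```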
Mitigation Strategies
- Audit Training Data: Check for representation and label quality
- Test Across Groups: Measure performance for different demographics
- Define Fairness Metrics: Decide what “fair” means for your application (see the sketch after this list)
- Implement Guardrails: Block known biased outputs
- Monitor in Production: Track outcomes across groups over time
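As a sketch of testing across groups and defining fairness metrics, the snippet below computes per-group selection rates and true positive rates on a toy evaluation set, then reports demographic parity and equal opportunity gaps. The "group", "y_true", and "y_pred" names are placeholders for your sensitive attribute, ground-truth labels, and model decisions; the same function can be rerun on production logs for ongoing monitoring.

```python
import numpy as np
import pandas as pd

# Toy evaluation data; substitute your own columns.
eval_df = pd.DataFrame({
    "group":  ["A", "A", "A", "A", "B", "B", "B", "B"],
    "y_true": [1, 0, 1, 0, 1, 0, 1, 0],
    "y_pred": [1, 1, 1, 0, 1, 0, 0, 0],
})

def group_metrics(df):
    """Per-group selection rate and true positive rate."""
    out = {}
    for group, g in df.groupby("group"):
        selection_rate = g["y_pred"].mean()            # P(pred = 1 | group)
        positives = g[g["y_true"] == 1]
        tpr = positives["y_pred"].mean() if len(positives) else np.nan
        out[group] = {"selection_rate": selection_rate, "tpr": tpr}
    return pd.DataFrame(out).T

metrics = group_metrics(eval_df)
print(metrics)

# Demographic parity gap: difference in selection rates across groups.
print("Demographic parity gap:", metrics["selection_rate"].max() - metrics["selection_rate"].min())
# Equal opportunity gap: difference in true positive rates across groups.
print("Equal opportunity gap:", metrics["tpr"].max() - metrics["tpr"].min())
```

Which gap matters depends on the fairness definition you chose: demographic parity compares how often each group is selected, while equal opportunity compares how often qualified members of each group are selected.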