
AI Bias

Definition

AI bias refers to systematic errors in AI systems that produce unfair outcomes, typically arising from biased training data, flawed model design, or problematic deployment contexts.

Why It Matters

AI systems can amplify and automate discrimination at scale. A biased hiring algorithm might reject qualified candidates from certain groups. A biased language model might reinforce stereotypes. Understanding bias sources helps you build fairer systems and identify problems before deployment.

Types of Bias

Data Bias:

  • Training data underrepresents certain groups (see the audit sketch after this list)
  • Historical data reflects past discrimination
  • Labels contain human annotator biases
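
A minimal audit sketch in Python; the dataset, group names, and label values below are invented for illustration, not drawn from any real system:

```python
from collections import Counter

# Hypothetical training examples as (features, label, group) tuples.
examples = [
    ({"years_experience": 5}, "hire", "group_a"),
    ({"years_experience": 7}, "hire", "group_a"),
    ({"years_experience": 6}, "reject", "group_b"),
    ({"years_experience": 4}, "reject", "group_b"),
]

# Representation: how many examples does each group contribute?
group_counts = Counter(group for _, _, group in examples)
print("Examples per group:", dict(group_counts))

# Label quality proxy: positive-label rate per group. A large gap can
# signal historical discrimination or annotator bias baked into the labels.
positive_rate = {
    g: sum(1 for _, label, grp in examples if grp == g and label == "hire")
       / group_counts[g]
    for g in group_counts
}
print("Positive-label rate per group:", positive_rate)
```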

Algorithmic Bias:

  • Model architecture favors certain patterns
  • Optimization objectives misalign with fairness
  • Proxy features encode protected attributes
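
One way to look for proxy features is to check how strongly each input feature correlates with the protected attribute; a feature the model can use to reconstruct group membership can carry bias even when the attribute itself is excluded. A minimal sketch with invented feature names and values:

```python
import numpy as np

# Hypothetical features; "zip_code_income" stands in for a feature that
# may act as a proxy for group membership.
features = {
    "years_experience": np.array([5.0, 6.0, 6.0, 4.0, 5.0, 6.0]),
    "zip_code_income":  np.array([90.0, 85.0, 40.0, 35.0, 88.0, 30.0]),
}
protected = np.array([1, 1, 0, 0, 1, 0])  # 1/0 encoding of group membership

# Flag features that correlate strongly with the protected attribute:
# the model can recover the attribute from them even if it is dropped.
for name, values in features.items():
    corr = np.corrcoef(values, protected)[0, 1]
    flag = "possible proxy" if abs(corr) > 0.5 else "ok"
    print(f"{name}: correlation with group = {corr:+.2f} ({flag})")
```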

Deployment Bias:

  • System used in unintended contexts
  • User interface design affects who can access the system
  • Feedback loops amplify initial biases
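
A toy simulation of the feedback-loop problem: a system that only shows content to the group it currently rates higher never collects fresh feedback on the other group, so an initial gap in its estimates never gets corrected. All groups, rates, and counts are invented:

```python
import random

random.seed(0)

true_rate = {"group_a": 0.5, "group_b": 0.5}    # both groups equally interested
estimated = {"group_a": 0.55, "group_b": 0.45}  # small initial bias in the model

for step in range(5):
    # Show content only to the group with the higher current estimate,
    # so feedback is only ever collected from that group.
    shown = max(estimated, key=estimated.get)
    clicks = sum(random.random() < true_rate[shown] for _ in range(100))
    estimated[shown] = 0.9 * estimated[shown] + 0.1 * (clicks / 100)
    snapshot = {g: round(v, 3) for g, v in estimated.items()}
    print(f"step {step}: shown={shown}, estimates={snapshot}")

# group_b's estimate never updates, so the initial gap persists indefinitely.
```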

Mitigation Strategies

  1. Audit Training Data: Check for representation and label quality
  2. Test Across Groups: Measure performance for different demographics (see the sketch after this list)
  3. Define Fairness Metrics: What does “fair” mean for your application?
  4. Implement Guardrails: Block known biased outputs
  5. Monitor in Production: Track outcomes across groups over time
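
A minimal sketch of steps 2 and 3, assuming a binary classifier whose predictions are already collected; the groups and values below are invented. It reports accuracy and selection rate per group, then the demographic-parity gap as one possible fairness metric:

```python
# Hypothetical predictions from a binary classifier, with group membership.
records = [
    {"group": "group_a", "predicted": 1, "actual": 1},
    {"group": "group_a", "predicted": 1, "actual": 0},
    {"group": "group_a", "predicted": 0, "actual": 0},
    {"group": "group_b", "predicted": 0, "actual": 1},
    {"group": "group_b", "predicted": 0, "actual": 0},
    {"group": "group_b", "predicted": 1, "actual": 1},
]

selection_rates = {}
for g in sorted({r["group"] for r in records}):
    rows = [r for r in records if r["group"] == g]
    accuracy = sum(r["predicted"] == r["actual"] for r in rows) / len(rows)
    selection_rates[g] = sum(r["predicted"] for r in rows) / len(rows)
    print(f"{g}: accuracy={accuracy:.2f}, selection_rate={selection_rates[g]:.2f}")

# Demographic parity gap: difference in selection rates between groups.
gap = max(selection_rates.values()) - min(selection_rates.values())
print(f"Demographic parity gap: {gap:.2f}")
```

Different fairness metrics can conflict with one another, so deciding what "fair" means for the application (step 3) has to come before interpreting numbers like these.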