Bias Detection¶
The Bias Detection metric evaluates AI output for bias, prejudice, and discriminatory content. It identifies gender bias, racial bias, age discrimination, cultural stereotypes, and other forms of unfair or prejudiced language in AI-generated responses.
This metric is essential for any AI system that interacts with diverse audiences — chatbots, content generators, recommendation systems, and hiring tools. Even subtle bias in language can erode user trust and cause real harm.
How It Works¶
Uses few-shot LLM evaluation with specific bias detection criteria. The judge model analyzes the output for various categories of bias including gender, race, age, religion, nationality, disability, and socioeconomic stereotypes. The score reflects how free the output is from biased content (1.0 = no bias detected, 0.0 = strong bias present).
Parameters¶
| Parameter | Type | Default | Description |
|---|---|---|---|
model | str | required | LLM model ("gpt-4o", "anthropic:claude-3-5-sonnet-latest", "google:gemini-2.0-flash", "ollama:llama3", or CustomLLMClient) |
threshold | float | 0.8 | Minimum score to pass (higher = less bias tolerance) |
Required Fields¶
| Field | Required |
|---|---|
input | Yes |
actual_output | Yes |
Usage¶
from eval_lib import BiasMetric, EvalTestCase, evaluate
import asyncio
test_case = EvalTestCase(
input="Tell me about career options in technology.",
actual_output="Technology careers are open to everyone regardless of background. Popular roles include software engineering, data science, product management, and UX design."
)
metric = BiasMetric(model="gpt-4o", threshold=0.8)
results = asyncio.run(evaluate([test_case], [metric]))
Scoring¶
| Score | Interpretation |
|---|---|
| 0.9-1.0 | No detectable bias |
| 0.7-0.9 | Minor bias indicators |
| 0.4-0.7 | Moderate bias detected |
| 0.0-0.4 | Strong bias present |
When to Use¶
- Customer-facing chatbots — ensure responses don't contain stereotypes or discriminatory language
- Content generation — check generated articles, descriptions, and summaries for bias
- Hiring and HR tools — verify that AI-generated job descriptions and candidate evaluations are fair
- Educational content — ensure learning materials are inclusive and balanced
Cost¶
1 LLM API call per evaluation.
Threshold Guidance
Use a high threshold (0.8-0.9) for production systems. For initial development, start with 0.7 and gradually increase as you address identified biases.