Semantic Similarity¶
The Semantic Similarity metric computes the cosine similarity between the embeddings of `actual_output` and `expected_output`. No LLM is needed, only an embedding model.
How It Works¶
```mermaid
graph TD
    A[actual_output] --> B[1. Generate Embedding]
    C[expected_output] --> D[2. Generate Embedding]
    B --> E[3. Cosine Similarity]
    D --> E
    E --> F[Final Score 0.0-1.0]
```

- Embed actual output — converts `actual_output` to a vector representation
- Embed expected output — converts `expected_output` to a vector representation
- Cosine similarity — computes the cosine of the angle between the two vectors
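The cosine step above can be sketched in plain Python. This is a minimal illustration of the formula, not the library's actual implementation:

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors: dot(a, b) / (|a| * |b|)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)
```

Identical directions score 1.0 and orthogonal directions score 0.0, which is why near-paraphrases of the same statement land close to the top of the range.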
Parameters¶
| Parameter | Type | Default | Description |
|---|---|---|---|
| `threshold` | `float` | `0.7` | Minimum score to pass |
| `embedding_provider` | `str` | `"openai"` | Embedding provider (`"openai"` or `"local"`) |
| `model_name` | `str` | provider default | Embedding model name |
Required Fields¶
| Field | Required |
|---|---|
| `actual_output` | Yes |
| `expected_output` | Yes |
| `input` | No |
| `retrieval_context` | No |
Usage¶
```python
import asyncio

from eval_lib import EvalTestCase, evaluate
from eval_lib.metrics.vector_metrics import SemanticSimilarityMetric

test_case = EvalTestCase(
    actual_output="Paris is the capital of France.",
    expected_output="The capital of France is Paris."
)

metric = SemanticSimilarityMetric(
    threshold=0.8,
    embedding_provider="openai",
    model_name="text-embedding-3-small"
)

results = asyncio.run(evaluate([test_case], [metric]))
```
Cost¶
1 embedding API call per evaluation (both texts are batched into a single request).
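The batching can be sketched with a hypothetical `score_pair` helper that accepts any embedding function; the name and signature are assumptions for illustration, not the library's internals:

```python
from typing import Callable

def score_pair(
    actual: str,
    expected: str,
    embed: Callable[[list[str]], list[list[float]]],
) -> float:
    # Both texts go into ONE embedding request, so each evaluation
    # costs a single API call rather than two.
    vec_a, vec_b = embed([actual, expected])
    dot = sum(x * y for x, y in zip(vec_a, vec_b))
    norm = (sum(x * x for x in vec_a) ** 0.5) * (sum(x * x for x in vec_b) ** 0.5)
    return dot / norm
```

Passing the embedder in as a function also makes the scoring logic trivial to test with a fake provider.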
Example Scenarios¶
High Score (0.95+)¶
```python
EvalTestCase(
    actual_output="The cat sat on the mat.",
    expected_output="A cat was sitting on the mat."
)
# Semantically near-identical statements
```