Semantic Similarity

The Semantic Similarity metric computes the cosine similarity between the embeddings of actual_output and expected_output. Because it relies on embeddings alone, no LLM calls are required.

How It Works

graph TD
    A[actual_output] --> B[1. Generate Embedding]
    C[expected_output] --> D[2. Generate Embedding]
    B --> E[3. Cosine Similarity]
    D --> E
    E --> F[Final Score 0.0-1.0]
1. Embed actual output — converts actual_output to a vector representation
2. Embed expected output — converts expected_output to a vector representation
3. Cosine similarity — computes the cosine of the angle between the two vectors
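The final step above is plain vector math, not a model call. A minimal sketch of cosine similarity (the library's actual scoring code may differ, e.g. in how it handles zero vectors or clamps the result):

```python
import math

def cosine_similarity(a, b):
    # cos(theta) = (a . b) / (||a|| * ||b||)
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

print(cosine_similarity([1.0, 0.0], [1.0, 0.0]))  # identical direction -> 1.0
```

Two embeddings pointing in the same direction score 1.0; orthogonal embeddings score 0.0.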

Parameters

Parameter Type Default Description
threshold float 0.7 Minimum score to pass
embedding_provider str "openai" Embedding provider ("openai" or "local")
model_name str provider default Embedding model name

Required Fields

Field Required
actual_output Yes
expected_output Yes
input No
retrieval_context No

Usage

from eval_lib.metrics.vector_metrics import SemanticSimilarityMetric
from eval_lib import EvalTestCase, evaluate
import asyncio

test_case = EvalTestCase(
    actual_output="Paris is the capital of France.",
    expected_output="The capital of France is Paris."
)

metric = SemanticSimilarityMetric(
    threshold=0.8,
    embedding_provider="openai",
    model_name="text-embedding-3-small"
)

results = asyncio.run(evaluate([test_case], [metric]))

Cost

1 embedding API call per evaluation (both texts are batched into a single request).
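The batching works because embedding endpoints accept a list of inputs and return one vector per input, so both texts can share a single request. A sketch using a hypothetical embed_batch stand-in (the name and the toy vectors are illustrative, not the library's API):

```python
def embed_batch(texts):
    # Hypothetical stand-in for one embeddings API request.
    # Real providers accept a list of inputs and return one
    # vector per input, in the same order; here we fabricate
    # tiny toy vectors instead of calling a provider.
    return [[float(len(t)), 1.0] for t in texts]

# Both texts travel in a single request, then get unpacked:
actual_vec, expected_vec = embed_batch([
    "Paris is the capital of France.",
    "The capital of France is Paris.",
])
```

One request instead of two halves the API round-trips per evaluation.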

Example Scenarios

High Score (0.95+)

EvalTestCase(
    actual_output="The cat sat on the mat.",
    expected_output="A cat was sitting on the mat."
)
# Semantically near-identical statements

Low Score (< 0.5)

EvalTestCase(
    actual_output="Python is a programming language.",
    expected_output="The snake slithered through the grass."
)
# Superficially related topics (Python the language vs. a snake), completely different meaning
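In both scenarios, pass/fail is just the score compared against the configured threshold (default 0.7). A minimal sketch of that decision:

```python
def passes(score, threshold=0.7):
    # A test case passes when the similarity score
    # meets or exceeds the threshold.
    return score >= threshold

print(passes(0.95))  # high-score scenario -> True
print(passes(0.45))  # low-score scenario -> False
```

Raise the threshold (e.g. 0.8 in the Usage example) to demand closer paraphrases; lower it to tolerate looser rewordings.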