LLM Hallucination Detector (Stage-1)

A Python Apify Actor that performs first-layer hallucination risk detection on LLM outputs. It flags overconfident language and claims not supported by the provided context, then emits a normalized risk score.

This Actor is designed to be used as an early signal in RAG pipelines, agent workflows, and LLM output QA — not as a fact checker.


What this Actor does

  • Detects overconfident language (e.g. definitely, guaranteed, always).
  • Flags unsupported claims by comparing output sentences against supplied context.
  • Produces a hallucination risk score (0.0–1.0) and a human-readable risk level.
  • Returns structured JSON suitable for automated pipelines.

What this Actor does NOT do

  • ❌ No web search or crawling
  • ❌ No external fact verification
  • ❌ No citation generation
  • ❌ No semantic or embedding-based validation

Think of this as a Stage-1 signal detector, not a truth engine.


How it works

  1. Scans the LLM output for predefined overconfident terms.

  2. Splits the output into sentences.

  3. Marks any sentence not found verbatim in the provided context as unsupported.

  4. Each detected issue adds 0.25 to the hallucination_score (capped at 1.0).

  5. Derives risk_level:

    • low: score ≤ 0.3
    • medium: 0.3 < score ≤ 0.6
    • high: score > 0.6
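
A minimal Python sketch of this heuristic is shown below. CONFIDENCE_WORDS and score_output are illustrative names; the Actor's real word list and scoring details live in src/main.py and may differ.

import re

# Illustrative subset of the Actor's CONFIDENCE_WORDS list.
CONFIDENCE_WORDS = ["definitely", "guaranteed", "always", "absolutely", "certainly"]

def score_output(model_output: str, context: str = "") -> dict:
    """Stage-1 heuristic: each detected issue adds 0.25, capped at 1.0."""
    issues = []

    # Step 1: overconfident language.
    for word in CONFIDENCE_WORDS:
        if re.search(rf"\b{re.escape(word)}\b", model_output, re.IGNORECASE):
            issues.append({"type": "overconfident_language", "value": word})

    # Steps 2-3: sentences not found verbatim in the context count as unsupported.
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", model_output) if s.strip()]
    unsupported = [s for s in sentences if s not in context]
    if unsupported:
        issues.append({"type": "unsupported_claims", "count": len(unsupported)})

    # Steps 4-5: aggregate score and risk level.
    score = min(0.25 * len(issues), 1.0)
    if score <= 0.3:
        risk_level = "low"
    elif score <= 0.6:
        risk_level = "medium"
    else:
        risk_level = "high"

    return {"hallucination_score": score, "risk_level": risk_level, "issues": issues}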

Inputs

Defined in .actor/input_schema.json:

  • model_output (string, required): the LLM response to analyze
  • context (string, optional): reference text used to validate claims
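
For reference, a plausible shape for that file, following Apify's input schema format, looks like this (the titles and editor choices below are assumptions, not a copy of the actual file):

{
  "title": "LLM Hallucination Detector input",
  "type": "object",
  "schemaVersion": 1,
  "properties": {
    "model_output": {
      "title": "Model output",
      "type": "string",
      "description": "LLM response to analyze",
      "editor": "textarea"
    },
    "context": {
      "title": "Context",
      "type": "string",
      "description": "Reference text used to validate claims",
      "editor": "textarea"
    }
  },
  "required": ["model_output"]
}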

Outputs

One record per run is pushed to the default dataset:

{
  "hallucination_score": 0.75,
  "risk_level": "high",
  "issues": [
    { "type": "overconfident_language", "value": "guaranteed" },
    { "type": "unsupported_claims", "count": 1 }
  ]
}

Output fields

  • hallucination_score — float (0.0–1.0)
  • risk_level — low | medium | high
  • issues — array of detected hallucination signals

Dataset and output views are defined in:

  • .actor/dataset_schema.json
  • .actor/output_schema.json

Quick start

Run locally with UI

apify run

Run locally with direct input

apify run --input '{
  "model_output": "This is absolutely guaranteed to work.",
  "context": "The product improves efficiency."
}'

Results appear in:

apify_storage/datasets/default/000000001.json

Project structure

src/
└── main.py # Detection logic
.actor/
├── input_schema.json # Input validation & UI
├── dataset_schema.json # Dataset columns
└── output_schema.json # Output links
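
As a rough orientation, the Apify wiring inside src/main.py could look like the sketch below (Apify Python SDK; the detection step is a simplified stand-in, not the Actor's actual implementation):

import asyncio

from apify import Actor

CONFIDENCE_WORDS = ["definitely", "guaranteed", "always"]  # illustrative subset

async def main() -> None:
    async with Actor:
        actor_input = await Actor.get_input() or {}
        text = actor_input.get("model_output", "")

        # Simplified stand-in for the detection logic described above.
        hits = [w for w in CONFIDENCE_WORDS if w in text.lower()]
        score = min(0.25 * len(hits), 1.0)
        risk = "low" if score <= 0.3 else "medium" if score <= 0.6 else "high"

        await Actor.push_data({
            "hallucination_score": score,
            "risk_level": risk,
            "issues": [{"type": "overconfident_language", "value": w} for w in hits],
        })

if __name__ == "__main__":
    asyncio.run(main())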

Configuration

  • Update confidence terms in CONFIDENCE_WORDS
  • Adjust scoring logic or thresholds in src/main.py
  • Extend issue types as needed

Deployment

apify login
apify push
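
Once pushed, the Actor can be called programmatically, for example with the Apify Python client (the token and Actor ID below are placeholders):

from apify_client import ApifyClient

# Placeholders: substitute your own API token and the Actor ID shown after `apify push`.
client = ApifyClient("<YOUR_APIFY_TOKEN>")
run = client.actor("<your-username>/llm-hallucination-detector").call(
    run_input={
        "model_output": "This is absolutely guaranteed to work.",
        "context": "The product improves efficiency.",
    }
)
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(item)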

Roadmap (planned)

  • Semantic claim matching (embeddings)
  • Batch input support
  • Claim-level scoring
  • External source validation (Stage-2)
  • Citation-based confidence scoring

Intended usage

✔ RAG guardrails ✔ Agent output QA ✔ Prompt regression testing ✔ LLM risk monitoring
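
For example, a downstream pipeline might gate on the Actor's output record before showing an answer to users (the function and messages below are hypothetical):

def apply_guardrail(answer: str, detection: dict) -> str:
    """Decide what to do with an LLM answer based on the detector's output record."""
    level = detection.get("risk_level", "low")
    if level == "high":
        # e.g. regenerate with more retrieval context, or route to human review
        return "This answer could not be verified against the retrieved sources."
    if level == "medium":
        return answer + "\n\nNote: some claims may not be fully supported by the sources."
    return answer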