LLM Hallucination Detector (Stage-1)

A Python Apify Actor that performs first-layer hallucination risk detection on LLM outputs. It flags overconfident language and claims not supported by the provided context, then emits a normalized risk score.

This Actor is designed to be used as an early signal in RAG pipelines, agent workflows, and LLM output QA — not as a fact checker.


What this Actor does

  • Detects overconfident language (e.g. definitely, guaranteed, always).
  • Flags unsupported claims by comparing output sentences against supplied context.
  • Produces a hallucination risk score (0.0–1.0) and a human-readable risk level.
  • Returns structured JSON suitable for automated pipelines.

What this Actor does NOT do

  • ❌ No web search or crawling
  • ❌ No external fact verification
  • ❌ No citation generation
  • ❌ No semantic or embedding-based validation

Think of this as a Stage-1 signal detector, not a truth engine.


How it works

  1. Scans the LLM output for predefined overconfident terms.

  2. Splits the output into sentences.

  3. Marks any sentence not found verbatim in the provided context as unsupported.

  4. Each detected issue adds 0.25 to the hallucination_score (capped at 1.0).

  5. Derives risk_level:

    • low: score ≤ 0.3
    • medium: 0.3 < score ≤ 0.6
    • high: score > 0.6
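
A minimal Python sketch of this heuristic is shown below. CONFIDENCE_WORDS and score_output are illustrative names; the Actor's real word list and scoring details live in src/main.py and may differ.

import re

# Illustrative subset of the Actor's CONFIDENCE_WORDS list.
CONFIDENCE_WORDS = ["definitely", "guaranteed", "always", "absolutely", "certainly"]

def score_output(model_output: str, context: str = "") -> dict:
    """Stage-1 heuristic: each detected issue adds 0.25, capped at 1.0."""
    issues = []

    # Step 1: overconfident language.
    for word in CONFIDENCE_WORDS:
        if re.search(rf"\b{re.escape(word)}\b", model_output, re.IGNORECASE):
            issues.append({"type": "overconfident_language", "value": word})

    # Steps 2-3: sentences not found verbatim in the context count as unsupported.
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", model_output) if s.strip()]
    unsupported = [s for s in sentences if s not in context]
    if unsupported:
        issues.append({"type": "unsupported_claims", "count": len(unsupported)})

    # Steps 4-5: aggregate score and risk level.
    score = min(0.25 * len(issues), 1.0)
    if score <= 0.3:
        risk_level = "low"
    elif score <= 0.6:
        risk_level = "medium"
    else:
        risk_level = "high"

    return {"hallucination_score": score, "risk_level": risk_level, "issues": issues}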

Inputs

Defined in .actor/input_schema.json:

  • model_output (string, required): the LLM response to analyze
  • context (string, optional): reference text used to validate claims
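
For reference, a plausible shape for that file, following Apify's input schema format, looks like this (the titles and editor choices below are assumptions, not a copy of the actual file):

{
  "title": "LLM Hallucination Detector input",
  "type": "object",
  "schemaVersion": 1,
  "properties": {
    "model_output": {
      "title": "Model output",
      "type": "string",
      "description": "LLM response to analyze",
      "editor": "textarea"
    },
    "context": {
      "title": "Context",
      "type": "string",
      "description": "Reference text used to validate claims",
      "editor": "textarea"
    }
  },
  "required": ["model_output"]
}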

Outputs

One record per run is pushed to the default dataset:

{
  "hallucination_score": 0.75,
  "risk_level": "high",
  "issues": [
    { "type": "overconfident_language", "value": "guaranteed" },
    { "type": "unsupported_claims", "count": 1 }
  ]
}

Output fields

  • hallucination_score — float (0.0–1.0)
  • risk_level — low | medium | high
  • issues — array of detected hallucination signals

Dataset and output views are defined in:

  • .actor/dataset_schema.json
  • .actor/output_schema.json

Quick start

Run locally with UI

apify run

Run locally with direct input

apify run --input '{
  "model_output": "This is absolutely guaranteed to work.",
  "context": "The product improves efficiency."
}'

Results appear in:

apify_storage/datasets/default/000000001.json

Project structure

src/
└── main.py # Detection logic
.actor/
├── input_schema.json # Input validation & UI
├── dataset_schema.json # Dataset columns
└── output_schema.json # Output links
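
As a rough orientation, the Apify wiring inside src/main.py could look like the sketch below (Apify Python SDK; the detection step is a simplified stand-in, not the Actor's actual implementation):

import asyncio

from apify import Actor

CONFIDENCE_WORDS = ["definitely", "guaranteed", "always"]  # illustrative subset

async def main() -> None:
    async with Actor:
        actor_input = await Actor.get_input() or {}
        text = actor_input.get("model_output", "")

        # Simplified stand-in for the detection logic described above.
        hits = [w for w in CONFIDENCE_WORDS if w in text.lower()]
        score = min(0.25 * len(hits), 1.0)
        risk = "low" if score <= 0.3 else "medium" if score <= 0.6 else "high"

        await Actor.push_data({
            "hallucination_score": score,
            "risk_level": risk,
            "issues": [{"type": "overconfident_language", "value": w} for w in hits],
        })

if __name__ == "__main__":
    asyncio.run(main())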

Configuration

  • Update confidence terms in CONFIDENCE_WORDS
  • Adjust scoring logic or thresholds in src/main.py
  • Extend issue types as needed

Deployment

apify login
apify push
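
Once pushed, the Actor can be called programmatically, for example with the Apify Python client (the token and Actor ID below are placeholders):

from apify_client import ApifyClient

# Placeholders: substitute your own API token and the Actor ID shown after `apify push`.
client = ApifyClient("<YOUR_APIFY_TOKEN>")
run = client.actor("<your-username>/llm-hallucination-detector").call(
    run_input={
        "model_output": "This is absolutely guaranteed to work.",
        "context": "The product improves efficiency.",
    }
)
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(item)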

Roadmap (planned)

  • Semantic claim matching (embeddings)
  • Batch input support
  • Claim-level scoring
  • External source validation (Stage-2)
  • Citation-based confidence scoring

Intended usage

✔ RAG guardrails ✔ Agent output QA ✔ Prompt regression testing ✔ LLM risk monitoring
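
For example, a downstream pipeline might gate on the Actor's output record before showing an answer to users (the function and messages below are hypothetical):

def apply_guardrail(answer: str, detection: dict) -> str:
    """Decide what to do with an LLM answer based on the detector's output record."""
    level = detection.get("risk_level", "low")
    if level == "high":
        # e.g. regenerate with more retrieval context, or route to human review
        return "This answer could not be verified against the retrieved sources."
    if level == "medium":
        return answer + "\n\nNote: some claims may not be fully supported by the sources."
    return answer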