PII Masking

Identify, mark and replace PII information.

Developer

Canadesk Support

Last modified: 3 months ago

PII Redaction (PPE)

This Apify Actor redacts PII/PHI/PCI from text using a remote redaction API.

What it does

- Takes either:
  - `INPUT.text` (a string or array of strings), or
  - `INPUT.datasetId` (reads items from an existing dataset and redacts selected fields)
- Sends text to a redaction API for entity detection and redaction (marker or mask)
- Makes one API call per input text
- Writes multiple output records:
  - one record per sentence (`type: "sentence"`)
  - one record per entity (`type: "entity"`)
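
The two input shapes above could be built roughly as follows. This is only a sketch: the helper names are hypothetical, and it assumes that dotted parameter names such as `hideit.outputMode` map to nested objects in the run input, which this README does not confirm.

```python
import json


def build_text_input(texts, output_mode="MARKER"):
    """Build a run input that redacts inline text (INPUT.text).

    `texts` may be a single string or a list of strings.
    Assumption: `hideit.outputMode` is a nested key under `hideit`.
    """
    if output_mode not in ("MARKER", "MASK"):
        raise ValueError(f"unsupported output mode: {output_mode}")
    return {"text": texts, "hideit": {"outputMode": output_mode}}


def build_dataset_input(dataset_id, field_paths=("text",)):
    """Build a run input that redacts fields of an existing dataset."""
    return {"datasetId": dataset_id, "fieldPaths": list(field_paths)}


if __name__ == "__main__":
    print(json.dumps(build_text_input(["Call Jane at 555-0100."]), indent=2))
    print(json.dumps(build_dataset_input("<your-dataset-id>"), indent=2))
```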

Pricing (PPE)

The Actor uses Pay-Per-Event.

- Event: `REDACTION_REQUEST` (configurable via `chargeEventName`)
- Charged per output entry: one PPE event per dataset record pushed.
  - each `type: "sentence"` record charges once
  - each `type: "entity"` record charges once

This matches Apify's PPE model, in which events correspond to output units.
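
As a rough illustration of the charging rule, the number of events for a run can be estimated by counting the records it pushed. The function name and the sample records are hypothetical. No per-event price is assumed here, since this README does not state one.

```python
from collections import Counter


def count_charge_events(records):
    """Return (total_events, events_by_type) for a list of output records.

    One PPE event is counted per record, whatever its type.
    """
    by_type = Counter(record["type"] for record in records)
    return sum(by_type.values()), dict(by_type)


if __name__ == "__main__":
    sample = [
        {"type": "sentence"},
        {"type": "sentence"},
        {"type": "entity"},
        {"type": "entity"},
        {"type": "entity"},
    ]
    total, breakdown = count_charge_events(sample)
    print(total, breakdown)
```

A text that yields two sentences and three entities would therefore produce five charged events under this rule.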

Input

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| `apiKey` | String | (optional) | API key (sent as `x-api-key`). Use an Apify Secret in production. |
| `text` | String \| String[] | | Text(s) to redact. |
| `datasetId` | String | | Dataset to read items from. |
| `fieldPaths` | String[] | `["text"]` | Field paths (dot/bracket notation) to redact inside each dataset item, e.g. `"text"`, `"customer.message"`, `"items[0].note"`. |
| `split.enabled` | Boolean | `true` | Whether to split the redacted output into sentences for dataset records. |
| `hideit.endpoint` | String | (default set in code) | Override the redaction endpoint URL. |
| `hideit.entityTypes` | String[] | (default list set in code) | Entity types to enable. |
| `hideit.outputMode` | String | `MARKER` | Redaction output mode: `MARKER` (replace with typed markers like `[NAME_1]`) or `MASK` (replace with a fixed token). |
| `hideit.maskToken` | String | `[REDACTED]` | Only used when `hideit.outputMode="MASK"`. |
| `hideit.markerPattern` | String | `[UNIQUE_NUMBERED_ENTITY_TYPE]` | Marker pattern for redaction output. |
| `hideit.markerLanguage` | String | `en` | Marker language. |
| `hideit.coreferenceResolution` | String | `heuristics` | Coreference setting supported by the endpoint. |
| `proxy` | Object | | Optional Apify proxy configuration. |
| `chargeEventName` | String | `REDACTION_REQUEST` | PPE event name to charge. |
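
To make the `fieldPaths` notation concrete, here is a minimal sketch of how dot/bracket paths such as `"items[0].note"` could be resolved against a dataset item. It is not the Actor's actual code. The helpers `parse_path`, `get_path` and `set_path` are hypothetical names, and returning `None` for a missing path is an assumption.

```python
import re

# Matches either a plain key (no dots or brackets) or a bracketed list index.
_TOKEN = re.compile(r"([^.\[\]]+)|\[(\d+)\]")


def parse_path(path):
    """Split 'items[0].note' into ['items', 0, 'note']."""
    tokens = []
    for name, index in _TOKEN.findall(path):
        tokens.append(int(index) if index else name)
    return tokens


def get_path(item, path):
    """Return the value at `path`, or None if any step is missing."""
    current = item
    for token in parse_path(path):
        try:
            current = current[token]
        except (KeyError, IndexError, TypeError):
            return None
    return current


def set_path(item, path, value):
    """Overwrite the value at an existing `path` in place."""
    tokens = parse_path(path)
    parent = item
    for token in tokens[:-1]:
        parent = parent[token]
    parent[tokens[-1]] = value


if __name__ == "__main__":
    item = {"text": "a", "customer": {"message": "b"}, "items": [{"note": "c"}]}
    for field_path in ["text", "customer.message", "items[0].note"]:
        print(field_path, "->", get_path(item, field_path))
```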

Output modes

MARKER: returns structured placeholders like [NAME_1], [PHONE_NUMBER_1].
MASK: returns a uniform token like [REDACTED].
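
The difference between the two modes can be illustrated locally. The sketch below does not call the redaction API. It applies already-known entity spans to a string, and the entity shape (`type`, `start`, `end`) is a hypothetical one chosen for the example. It also assumes that a repeated value reuses its marker, which the endpoint may or may not do.

```python
def apply_redaction(text, entities, mode="MARKER", mask_token="[REDACTED]"):
    """Replace non-overlapping entity spans with typed markers or a mask token."""
    counters = {}  # next marker number per entity type
    markers = {}   # (type, value) -> marker, so repeated values share a marker
    parts = []
    cursor = 0
    for entity in sorted(entities, key=lambda e: e["start"]):
        parts.append(text[cursor:entity["start"]])
        if mode == "MASK":
            parts.append(mask_token)
        else:
            value = text[entity["start"]:entity["end"]]
            key = (entity["type"], value)
            if key not in markers:
                counters[entity["type"]] = counters.get(entity["type"], 0) + 1
                markers[key] = f"[{entity['type']}_{counters[entity['type']]}]"
            parts.append(markers[key])
        cursor = entity["end"]
    parts.append(text[cursor:])
    return "".join(parts)


def span(text, value, entity_type):
    """Helper for the example: locate `value` in `text` and build a span."""
    start = text.index(value)
    return {"type": entity_type, "start": start, "end": start + len(value)}


if __name__ == "__main__":
    text = "Jane called 555-0100."
    found = [span(text, "Jane", "NAME"), span(text, "555-0100", "PHONE_NUMBER")]
    print(apply_redaction(text, found, mode="MARKER"))
    print(apply_redaction(text, found, mode="MASK"))
```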

Output dataset schema

Two kinds of output records are written:

Sentence records (`type: "sentence"`)

- `type`: `"sentence"`
- `source`: `"input.text"` or `"dataset"`
- `sourceMeta`: metadata like index, or `{ datasetId, itemIndex, fieldPath }`
- `sentenceIndex`, `sentencesTotal`
- `text` (one redacted sentence)
- hideit request metadata (endpoint + mode)
- `createdAt`

Entity records (`type: "entity"`)

- `type`: `"entity"`
- `source`, `sourceMeta`
- `entityIndex`, `entitiesTotal`
- `entity` (as returned by the endpoint)
- `createdAt`
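
Because sentences and entities arrive as separate records, a consumer usually needs to regroup them. The following sketch rebuilds each redacted text from its sentence records. The function name and sample records are hypothetical, the exact shape of `sourceMeta` is assumed from the description above, and joining sentences with a single space is a simplification.

```python
import json
from collections import defaultdict


def rebuild_texts(records):
    """Join sentence records back into one redacted text per source."""
    grouped = defaultdict(list)
    for record in records:
        if record.get("type") != "sentence":
            continue  # entity records are ignored here
        # Serialize sourceMeta so it can be used as a dictionary key.
        key = (record.get("source"), json.dumps(record.get("sourceMeta"), sort_keys=True))
        grouped[key].append(record)
    return {
        key: " ".join(r["text"] for r in sorted(group, key=lambda r: r["sentenceIndex"]))
        for key, group in grouped.items()
    }


if __name__ == "__main__":
    sample = [
        {"type": "sentence", "source": "input.text", "sourceMeta": {"index": 0},
         "sentenceIndex": 1, "sentencesTotal": 2, "text": "Call [PHONE_NUMBER_1]."},
        {"type": "entity", "source": "input.text", "sourceMeta": {"index": 0},
         "entityIndex": 0, "entitiesTotal": 1, "entity": {}},
        {"type": "sentence", "source": "input.text", "sourceMeta": {"index": 0},
         "sentenceIndex": 0, "sentencesTotal": 2, "text": "Hi [NAME_1]."},
    ]
    for key, text in rebuild_texts(sample).items():
        print(key, "->", text)
```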

Notes

Keep your API keys out of git. Prefer Apify Secrets.
For very large datasets, you may want to implement pagination and concurrency. The current implementation reads only the first page of the dataset.
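
One way the pagination mentioned above might look is sketched below (concurrency is left out). `fetch_page` is a hypothetical callable that takes `offset` and `limit` and returns a list of items. In practice it would wrap whatever dataset client you use. The default page size is an arbitrary choice for the example.

```python
def iter_dataset_items(fetch_page, page_size=1000):
    """Yield every item of a dataset by requesting successive pages."""
    offset = 0
    while True:
        items = fetch_page(offset=offset, limit=page_size)
        if not items:
            break
        yield from items
        if len(items) < page_size:
            break  # a short page means the dataset is exhausted
        offset += len(items)


if __name__ == "__main__":
    data = [{"text": f"item {i}"} for i in range(25)]

    def fake_fetch_page(offset, limit):
        # Stand-in for a real dataset client call.
        return data[offset:offset + limit]

    print(sum(1 for _ in iter_dataset_items(fake_fetch_page, page_size=10)))
```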

Length limits

`limits.maxInputChars` (default: `10000`): if an input text exceeds this length, the Actor logs that the maximum was reached, truncates the text, and continues processing.
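
The behaviour described above can be approximated as follows. This is a sketch, not the Actor's code: the function name, the logger name and the log wording are hypothetical, and it assumes truncation keeps the first `maxInputChars` characters.

```python
import logging

logger = logging.getLogger("pii-redaction-sketch")


def enforce_max_chars(text, max_input_chars=10000):
    """Return (text, truncated) with `text` cut to at most `max_input_chars`."""
    if len(text) <= max_input_chars:
        return text, False
    logger.warning(
        "Input length %d exceeds maxInputChars=%d; truncating.",
        len(text),
        max_input_chars,
    )
    return text[:max_input_chars], True


if __name__ == "__main__":
    logging.basicConfig(level=logging.WARNING)
    shortened, was_truncated = enforce_max_chars("x" * 12000)
    print(len(shortened), was_truncated)
```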
