# Unicode Text Inspector (`automation-lab/unicode-text-inspector`) Actor

Scan text for hidden Unicode characters: zero-width spaces, RTL override attacks, homoglyphs, and control characters. Get risk level + full codepoint details per character.

- **URL**: https://apify.com/automation-lab/unicode-text-inspector.md
- **Developed by:** [Stas Persiianenko](https://apify.com/automation-lab) (community)
- **Categories:** Developer tools
- **Stats:** 2 total users, 0 monthly users, 100.0% runs succeeded
- **User rating**: No ratings yet

## Pricing

Pay per event

This Actor is paid per event. You are not charged for Apify platform usage; you pay only a fixed price for specific events.
Since this Actor supports Apify Store discounts, the higher your subscription plan, the lower the price.

Learn more: https://docs.apify.com/platform/actors/running/actors-in-store#pay-per-event

## What's an Apify Actor?

Actors are software tools running on the Apify platform, for all kinds of web data extraction and automation use cases.
In Batch mode, an Actor accepts a well-defined JSON input, performs an action which can take anything from a few seconds to a few hours,
and optionally produces a well-defined JSON output, datasets with results, or files in a key-value store.
In Standby mode, an Actor provides a web server which can be used as a website, API, or an MCP server.
Actors are written with capital "A".

## How to integrate an Actor?

If asked about integration, you help developers integrate Actors into their projects.
You adapt to their stack and deliver integrations that are safe, well-documented, and production-ready.
The best way to integrate Actors is as follows.

In JavaScript/TypeScript projects, use the official [JavaScript/TypeScript client](https://docs.apify.com/api/client/js.md):

```bash
npm install apify-client
```

In Python projects, use the official [Python client library](https://docs.apify.com/api/client/python.md):

```bash
pip install apify-client
```

In shell scripts, use the [Apify CLI](https://docs.apify.com/cli/docs.md):

```bash
# macOS / Linux
curl -fsSL https://apify.com/install-cli.sh | bash
# Windows
irm https://apify.com/install-cli.ps1 | iex
```

In AI frameworks, you might use the [Apify MCP server](https://docs.apify.com/platform/integrations/mcp.md).

If your project is in a different language, use the [REST API](https://docs.apify.com/api/v2.md).

For usage examples, see the [API](#api) section below.

For more details, see Apify documentation as [Markdown index](https://docs.apify.com/llms.txt) and [Markdown full-text](https://docs.apify.com/llms-full.txt).


# README

## Unicode Text Inspector

🔍 **Detect hidden Unicode characters, homoglyphs, invisible markers, and security threats in any text.** Analyze strings for zero-width spaces, RTL override attacks, Cyrillic/Greek look-alikes, control characters, and get a full Unicode category breakdown — all without any external dependencies.

### What does Unicode Text Inspector do?

Unicode Text Inspector scans text strings for characters that are **invisible, look deceptively similar to ASCII, or can manipulate text rendering**. It covers:

- **Zero-width characters** (U+200B, U+200C, U+200D, U+FEFF) — invisible spaces used for fingerprinting, SEO manipulation, and bypassing keyword filters
- **Bidirectional control characters** (U+202A–U+202E, U+2066–U+2069) — the building blocks of the **Trojan Source attack**, where displayed text looks different from actual logical content
- **ASCII and C1 control characters** (U+0000–U+001F, U+007F, U+0080–U+009F) — null bytes, escape sequences, and C1 controls that signal data corruption or injection attempts
- **Homoglyphs** — Cyrillic `а` (U+0430) vs Latin `a`, Greek `Η` (U+0397) vs ASCII `H`, fullwidth Latin characters (U+FF21–U+FF5A), and typographic quotes masquerading as ASCII
- **Unicode category breakdown** — count of letters, numbers, symbols, marks, separators, format characters, and control characters per text

Each text string produces one output record listing every flagged character with its **position, codepoint (e.g. U+200B), Unicode name, category, and a plain-English description** of the risk.
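
To make the checks above concrete, here is a minimal Python sketch of how zero-width, bidi, and control characters can be found with the standard `unicodedata` module. It is an illustration built from the codepoint ranges listed above, not the Actor's actual implementation; the helper names `classify` and `scan` are hypothetical.

```python
import unicodedata
from typing import Dict, List, Optional

# Codepoint sets copied from the ranges listed above. The issue-type labels
# mirror the Actor's output values, but the logic is only an illustration.
ZERO_WIDTH = {0x200B, 0x200C, 0x200D, 0xFEFF}
BIDI = set(range(0x202A, 0x202F)) | set(range(0x2066, 0x206A))


def classify(ch: str) -> Optional[str]:
    """Return an issue-type label for one character, or None if it is not flagged."""
    cp = ord(ch)
    if cp in ZERO_WIDTH:
        return "zero-width"
    if cp in BIDI:
        return "bidi-control"
    # General category Cc covers U+0000-U+001F and U+007F-U+009F.
    if unicodedata.category(ch) == "Cc":
        return "control-character"
    return None


def scan(text: str) -> List[Dict[str, object]]:
    """List flagged characters with 0-based position, codepoint, and Unicode name."""
    issues = []
    for position, ch in enumerate(text):
        issue_type = classify(ch)
        if issue_type is not None:
            issues.append({
                "position": position,
                "codepoint": f"U+{ord(ch):04X}",
                # Control characters have no formal name, so fall back to a placeholder.
                "name": unicodedata.name(ch, "<unnamed>"),
                "issueType": issue_type,
            })
    return issues
```

For `"Hello\u200b World"`, `scan` reports a single `zero-width` issue at position 5 with codepoint `U+200B`.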

### Who is Unicode Text Inspector for?

**🔐 Security engineers and threat analysts**
- Detect homoglyph phishing domains in email headers (e.g., `pаypal.com` with a Cyrillic `а`)
- Catch Trojan Source bidi attacks in code review pipelines
- Identify null-byte injection attempts in web form inputs

**🗄️ Data quality and ETL engineers**
- Scrub invisible characters from user-generated content before indexing in Elasticsearch or Solr
- Validate imported datasets for hidden formatting characters that break string matching
- Clean CRM records that silently contain zero-width spaces from copy-paste operations

**🛡️ Content moderation teams**
- Detect attempts to bypass keyword filters using look-alike characters
- Identify text fingerprinting (watermarking with zero-width patterns)
- Find suspicious Unicode in usernames, product titles, and forum posts

**🔎 SEO and marketing professionals**
- Check scraped competitor content for invisible characters that could cause duplicate-content issues
- Validate structured data fields before submission to Google Search Console
- Ensure brand names and product titles are free of invisible markers

### Why use Unicode Text Inspector?

- ✅ **No external dependencies** — pure Unicode detection using built-in string operations. No rate limits, no API keys, no external services.
- ✅ **Covers all major threat vectors** — zero-width, bidi, homoglyphs, and control characters in one pass
- ✅ **Security-grade detection** — includes Trojan Source bidi patterns (CVE-2021-42574), not just simple invisible character checks
- ✅ **Rich output** — character position, codepoint, Unicode name, category, issue type, and description for every flagged character
- ✅ **Configurable detection** — enable/disable each detection type independently; turn off homoglyphs for performance-critical pipelines
- ✅ **Risk level scoring** — each text gets a `none/low/medium/high/critical` risk level for easy filtering
- ✅ **Batch processing** — analyze hundreds of strings in a single run; output is one dataset record per input text
- ✅ **Schedule and monitor** — run on a schedule to continuously audit new content in your database

### What data can you extract?

| Field | Description |
|-------|-------------|
| `textIndex` | Position in the input array (1-based) |
| `label` | Optional source tag you provide |
| `textPreview` | First 100 chars with invisible chars stripped |
| `totalCharacters` | Full Unicode codepoint count |
| `issueCount` | Total number of flagged characters |
| `hasSuspiciousContent` | Boolean quick-filter |
| `riskLevel` | `none` / `low` / `medium` / `high` / `critical` |
| `issues[].position` | 0-based index of the flagged character |
| `issues[].codepoint` | Unicode codepoint string (e.g. `U+200B`) |
| `issues[].codepointDecimal` | Integer codepoint value |
| `issues[].character` | The actual character (may be invisible) |
| `issues[].name` | Full Unicode character name |
| `issues[].category` | Unicode general category abbreviation |
| `issues[].categoryName` | Human-readable category |
| `issues[].issueType` | `zero-width` / `bidi-control` / `control-character` / `homoglyph` / `format-character` |
| `issues[].description` | Plain-English risk explanation |
| `categoryBreakdown.letters` | Total letter count (all scripts) |
| `categoryBreakdown.uppercaseLetters` | Uppercase letter count |
| `categoryBreakdown.lowercaseLetters` | Lowercase letter count |
| `categoryBreakdown.numbers` | Decimal digit count |
| `categoryBreakdown.punctuation` | Punctuation character count |
| `categoryBreakdown.symbols` | Symbol character count |
| `categoryBreakdown.separators` | Space separator count |
| `categoryBreakdown.marks` | Combining mark count |
| `categoryBreakdown.controlChars` | Control character count |
| `categoryBreakdown.formatChars` | Format/invisible character count |
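
The `categoryBreakdown` counts can be approximated locally with Python's `unicodedata.category`. The mapping from Unicode general categories to field names below is an assumption based on the field descriptions in the table; the Actor may group categories differently.

```python
import unicodedata
from collections import Counter


def category_breakdown(text: str) -> dict:
    """Count characters per group, keyed like the categoryBreakdown fields above."""
    cats = Counter(unicodedata.category(ch) for ch in text)

    def total(prefix: str) -> int:
        # Sum every general category starting with the prefix, e.g. "L" -> Lu, Ll, Lo.
        return sum(n for cat, n in cats.items() if cat.startswith(prefix))

    return {
        "letters": total("L"),
        "uppercaseLetters": cats["Lu"],
        "lowercaseLetters": cats["Ll"],
        "numbers": cats["Nd"],       # decimal digits only (assumed)
        "punctuation": total("P"),
        "symbols": total("S"),
        "separators": cats["Zs"],    # space separators only (assumed)
        "marks": total("M"),
        "controlChars": cats["Cc"],
        "formatChars": cats["Cf"],
    }
```

Python iterates strings by codepoint, so the counts line up with a codepoint-based `totalCharacters` rather than a byte or UTF-16 length.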

### How much does it cost to analyze Unicode text?

Unicode Text Inspector uses **pay-per-event pricing** — you only pay for what you use:

| Event | FREE / BRONZE | SILVER | GOLD | PLATINUM | DIAMOND |
|-------|--------------|--------|------|----------|---------|
| Run started (one-time) | $0.001 | $0.001 | $0.001 | $0.001 | $0.001 |
| Per text analyzed | $0.00069 | $0.000552 | $0.000449 | $0.000345 | $0.000276 |

**Real-world cost examples:**

- 100 texts analyzed: ~$0.070 (FREE tier)
- 1,000 texts: ~$0.691
- 10,000 texts: ~$6.90
- 100 texts (DIAMOND tier): ~$0.029

**Free plan estimate:** Apify's free $5 credit gives you approximately **7,200 texts** at FREE tier pricing — more than enough for most one-off audits.

> Tip: DIAMOND tier users get a 60% discount on per-text charges. Upgrade your Apify plan to reduce costs at scale.
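
The cost examples above follow from one formula: the run-start fee plus the per-text price multiplied by the number of texts. A small sketch using the prices from the table (the function name `estimate_cost` is hypothetical):

```python
# Prices in USD, copied from the pricing table above.
RUN_START_USD = 0.001
PER_TEXT_USD = {
    "FREE": 0.00069,
    "BRONZE": 0.00069,
    "SILVER": 0.000552,
    "GOLD": 0.000449,
    "PLATINUM": 0.000345,
    "DIAMOND": 0.000276,
}


def estimate_cost(text_count: int, tier: str = "FREE") -> float:
    """Estimated cost of one run: start fee plus per-text charges for the given tier."""
    return RUN_START_USD + text_count * PER_TEXT_USD[tier]


print(f"{estimate_cost(1_000):.3f}")           # 0.691
print(f"{estimate_cost(100, 'DIAMOND'):.3f}")  # 0.029
```

Each separate run pays the start fee again, so batching texts into fewer runs is slightly cheaper than many small runs.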

### How to inspect text for Unicode issues

1. Go to [Unicode Text Inspector on Apify Store](https://apify.com/automation-lab/unicode-text-inspector)
2. Click **Try for free**
3. Paste your text strings into the **Texts to inspect** field (one per line, or as a JSON array)
4. Configure detection options — all four detectors are enabled by default
5. Click **Start** and wait for the run to complete (typically 2–10 seconds)
6. Download results as JSON, CSV, or Excel from the **Dataset** tab

#### Input JSON example — basic inspection

```json
{
    "texts": [
        "Hello\u200b World",
        "paypal.com (p\u0430ypal with Cyrillic a)",
        "Normal safe text"
    ],
    "detectHomoglyphs": true,
    "detectInvisible": true
}
````

#### Input JSON example — security audit of email subjects

```json
{
    "texts": [
        "Your account has been suspended",
        "Urgent: verify your p\u0430ssword",
        "Click here to reset access"
    ],
    "label": "email_subjects_2024_01",
    "detectHomoglyphs": true,
    "detectBidi": true,
    "detectControl": true,
    "detectInvisible": true,
    "includeCategoryBreakdown": false
}
```

#### Input JSON example — data quality audit (performance mode)

```json
{
    "texts": ["item 1", "item 2", "item 3"],
    "detectHomoglyphs": false,
    "detectInvisible": true,
    "detectControl": true,
    "detectBidi": true,
    "includeCategoryBreakdown": false
}
```

### Input parameters

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `texts` | array | **required** | Array of text strings to analyze. Each becomes one output record. |
| `detectHomoglyphs` | boolean | `true` | Flag characters that look like ASCII but are different Unicode codepoints (Cyrillic, Greek, fullwidth Latin) |
| `detectInvisible` | boolean | `true` | Flag zero-width spaces, zero-width joiners, BOM, and other invisible/format characters |
| `detectControl` | boolean | `true` | Flag ASCII and C1 control characters (null bytes, escape, etc.) |
| `detectBidi` | boolean | `true` | Flag bidirectional control characters (Trojan Source attack vectors) |
| `includeCategoryBreakdown` | boolean | `true` | Include Unicode category counts per text (letters, numbers, symbols, etc.) |
| `label` | string | `null` | Optional tag to attach to all output records (e.g., `"email_subjects"`, `"user_input"`) |

### Output examples

**Text with zero-width space:**

```json
{
    "textIndex": 1,
    "label": "demo",
    "textPreview": "Hello World",
    "totalCharacters": 12,
    "issueCount": 1,
    "hasSuspiciousContent": true,
    "riskLevel": "low",
    "issues": [
        {
            "position": 5,
            "codepoint": "U+200B",
            "codepointDecimal": 8203,
            "character": "​",
            "name": "ZERO WIDTH SPACE",
            "category": "Cf",
            "categoryName": "Format",
            "issueType": "zero-width",
            "description": "Invisible zero-width character that can hide text, break search, or be used for text fingerprinting."
        }
    ]
}
```

**Text with bidi override (critical / Trojan Source):**

```json
{
    "textIndex": 2,
    "riskLevel": "critical",
    "issueCount": 2,
    "issues": [
        {
            "position": 6,
            "codepoint": "U+202E",
            "name": "RIGHT-TO-LEFT OVERRIDE",
            "issueType": "bidi-control",
            "description": "Bidirectional control character that can reorder displayed text (Trojan Source attack vector)."
        }
    ]
}
```

**Clean text:**

```json
{
    "textIndex": 3,
    "textPreview": "Normal clean text",
    "issueCount": 0,
    "hasSuspiciousContent": false,
    "riskLevel": "none",
    "issues": []
}
```

### Tips for best results

- 🚀 **Start small** — test with 5–10 strings first to verify the detection settings match your needs before running large batches
- 🏷️ **Use the `label` field** — tag batches with a source identifier (e.g., `"product_titles_jan"`) to track which dataset was audited
- ⚡ **Disable `includeCategoryBreakdown`** for large batches where you only care about security issues — it reduces output size
- 🔇 **Disable `detectHomoglyphs`** if your content legitimately contains Cyrillic or Greek text (e.g., multilingual apps)
- 🎯 **Filter by `riskLevel`** in downstream processing: `critical` and `high` need human review; `low` may be benign copy-paste artifacts
- 📅 **Schedule regular audits** — run on a daily or weekly schedule against new user-generated content, imported data, or crawled text
- 🔗 **Combine with a webhook** — trigger automated alerts when runs find `critical` or `high` risk texts
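
A downstream filter on `riskLevel` can be as small as the sketch below. It assumes the levels are ordered by severity in the order this README lists them (`none`, `low`, `medium`, `high`, `critical`) and that `items` are dataset records you have already fetched; `at_least` is a hypothetical helper.

```python
# Severity order assumed from the way the levels are listed in this README.
RISK_ORDER = ["none", "low", "medium", "high", "critical"]


def at_least(items: list, threshold: str) -> list:
    """Keep records whose riskLevel is at or above the given threshold."""
    floor = RISK_ORDER.index(threshold)
    return [item for item in items if RISK_ORDER.index(item["riskLevel"]) >= floor]


# Example records shaped like the Actor's output (values are made up).
sample = [
    {"textIndex": 1, "riskLevel": "low"},
    {"textIndex": 2, "riskLevel": "critical"},
    {"textIndex": 3, "riskLevel": "none"},
]
needs_review = at_least(sample, "high")  # keeps only textIndex 2
```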

### Integrations

**Unicode Text Inspector → Google Sheets (content moderation audit)**
Use the [Apify → Google Sheets integration](https://apify.com/integrations) to automatically append flagged texts to a review spreadsheet. Filter rows where `riskLevel = "critical"` for priority review.

**Unicode Text Inspector → Slack (security alerts)**
Connect via Make or Zapier: when a run dataset contains any record with `riskLevel = "critical"`, post an alert to your #security Slack channel with the text preview and codepoints found.

**Unicode Text Inspector → Elasticsearch (data quality pipeline)**
Use the JSON output as a pre-indexing filter. Strip or reject texts where `hasSuspiciousContent = true` before feeding them to your search index to prevent invisible character search poisoning.
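
If you choose to strip rather than reject, a local clean-up step might look like the sketch below. It removes every character in the Unicode `Cf` (format) and `Cc` (control) categories except tab, newline, and carriage return. This is an assumption about what "strip" should mean in your pipeline, not the Actor's behaviour, and it leaves homoglyphs untouched.

```python
import unicodedata

# Whitespace controls that ordinary text legitimately contains.
KEEP = {"\t", "\n", "\r"}


def strip_invisible(text: str) -> str:
    """Drop format (Cf) and control (Cc) characters, keeping common whitespace controls."""
    return "".join(
        ch for ch in text
        if ch in KEEP or unicodedata.category(ch) not in ("Cf", "Cc")
    )
```

Zero-width joiners (U+200D) are legitimate inside emoji sequences and in some scripts, so blanket stripping can change how such text renders. Rejecting and reviewing may be the safer choice for multilingual content.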

**Scheduled runs (continuous monitoring)**
Schedule this Actor to run nightly against your CRM contact names, product catalog titles, or user profile fields. Export results to your data warehouse to track zero-width character prevalence over time.

**Unicode Text Inspector → Make/Zapier (form validation)**
Trigger a run on new form submissions via webhook. If any field returns `riskLevel != "none"`, flag the submission for review or reject it automatically.

### Using the Apify API

#### Node.js

```javascript
import { ApifyClient } from 'apify-client';

const client = new ApifyClient({ token: 'YOUR_APIFY_TOKEN' });

const run = await client.actor('automation-lab/unicode-text-inspector').call({
    texts: [
        'Hello\u200b World',
        'p\u0430ypal.com (phishing domain candidate)',
    ],
    detectHomoglyphs: true,
    detectBidi: true,
});

const { items } = await client.dataset(run.defaultDatasetId).listItems();
for (const item of items) {
    console.log(`Text ${item.textIndex}: risk=${item.riskLevel}, issues=${item.issueCount}`);
}
```

#### Python

```python
from apify_client import ApifyClient

client = ApifyClient("YOUR_APIFY_TOKEN")

run = client.actor("automation-lab/unicode-text-inspector").call(run_input={
    "texts": [
        "Hello\u200b World",
        "p\u0430ypal.com (Cyrillic a)",
    ],
    "detectHomoglyphs": True,
    "detectBidi": True,
})

for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(f"Text {item['textIndex']}: risk={item['riskLevel']}, issues={item['issueCount']}")
```

#### cURL

```bash
curl -X POST "https://api.apify.com/v2/acts/automation-lab~unicode-text-inspector/runs?token=YOUR_APIFY_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "texts": ["Hello\u200b World", "Normal text"],
    "detectHomoglyphs": true,
    "detectBidi": true
  }'
```
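
The `/runs` call above starts a run and returns information about it, not the results. The OpenAPI specification for this Actor also lists a `run-sync-get-dataset-items` endpoint that waits for completion and returns the dataset items in the response. A standard-library Python sketch of that request follows; it assumes the endpoint's default JSON response format, and `build_sync_request` and `inspect_texts` are hypothetical helper names.

```python
import json
import urllib.parse
import urllib.request

API_BASE = "https://api.apify.com/v2"
ACTOR_PATH = "acts/automation-lab~unicode-text-inspector"


def build_sync_request(token: str, texts: list, **options) -> urllib.request.Request:
    """Build (but do not send) a POST to the run-sync-get-dataset-items endpoint."""
    query = urllib.parse.urlencode({"token": token})
    url = f"{API_BASE}/{ACTOR_PATH}/run-sync-get-dataset-items?{query}"
    # Extra keyword arguments become input fields, e.g. detectBidi=True.
    body = json.dumps({"texts": texts, **options}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={"Content-Type": "application/json"},
    )


def inspect_texts(token: str, texts: list, **options) -> list:
    """Send the request and return the parsed dataset items (performs a network call)."""
    with urllib.request.urlopen(build_sync_request(token, texts, **options)) as resp:
        return json.load(resp)
```

Splitting the request builder from the network call keeps the payload easy to inspect or unit-test without spending credits on a run.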

### Use with AI agents via MCP

Unicode Text Inspector is available as a tool for AI assistants that support the [Model Context Protocol (MCP)](https://docs.apify.com/platform/integrations/mcp).

Add the Apify MCP server to your AI client. This gives you access to all Apify Actors, including this one:

#### Setup for Claude Code

```bash
claude mcp add --transport http apify "https://mcp.apify.com?tools=automation-lab/unicode-text-inspector"
```

#### Setup for Claude Desktop, Cursor, or VS Code

Add this to your MCP config file:

```json
{
    "mcpServers": {
        "apify": {
            "type": "http",
            "url": "https://mcp.apify.com?tools=automation-lab/unicode-text-inspector",
            "headers": { "Authorization": "Bearer YOUR_APIFY_TOKEN" }
        }
    }
}
```

#### Example prompts for AI agents

- *"Check this text for hidden Unicode characters: 'Hello​ World'"*
- *"Scan these domain names for homoglyph spoofing: paypal.com, pаypal.com, amazon.com"*
- *"Analyze this CSV column of user-submitted names and flag any with bidirectional Unicode or zero-width spaces"*

### Is it legal to analyze text with this tool?

Unicode Text Inspector is a **text analysis utility**: it performs no web scraping, makes no external requests, and does not interact with any website. You provide text strings; the Actor processes them locally on Apify infrastructure.

There are no legal concerns with Unicode character detection on text you own or have rights to process. Always ensure your data handling complies with applicable privacy regulations (GDPR, CCPA) when processing user-generated content.

### FAQ

**How fast is Unicode Text Inspector?**
Very fast. Pure in-memory string processing with no I/O or network calls. A batch of 1,000 strings typically completes in under 5 seconds. The per-run timeout of 300 seconds can handle hundreds of thousands of texts.

**How much does it cost to analyze 10,000 texts?**
At FREE/BRONZE tier pricing: $0.001 (start) + 10,000 × $0.00069 = approximately $6.90. At DIAMOND tier: approximately $2.76.

**Does it detect all Unicode homoglyphs?**
The current detector covers the most common script-based confusables: Cyrillic, Greek, and fullwidth Latin characters. This covers the vast majority of real-world phishing and spoofing cases. Rare confusables from other scripts (Armenian, Georgian, etc.) are not yet included. The detector is pattern-based, not a comprehensive Unicode confusables database — it prioritizes precision over recall.
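
To illustrate what pattern-based detection means here, the sketch below maps a few well-known Cyrillic and Greek look-alikes, plus the fullwidth range U+FF21–U+FF5A, to their ASCII counterparts. The table is a tiny hand-picked sample for illustration, not the Actor's detection table, and the function names are hypothetical.

```python
from typing import List, Optional, Tuple

# Hand-picked sample of confusables; a real table would be much larger.
CONFUSABLES = {
    "\u0430": "a",  # CYRILLIC SMALL LETTER A
    "\u0435": "e",  # CYRILLIC SMALL LETTER IE
    "\u043e": "o",  # CYRILLIC SMALL LETTER O
    "\u0440": "p",  # CYRILLIC SMALL LETTER ER
    "\u0441": "c",  # CYRILLIC SMALL LETTER ES
    "\u0397": "H",  # GREEK CAPITAL LETTER ETA
    "\u039f": "O",  # GREEK CAPITAL LETTER OMICRON
}


def ascii_lookalike(ch: str) -> Optional[str]:
    """Return the ASCII character that ch imitates, or None if it is not in the table."""
    if ch in CONFUSABLES:
        return CONFUSABLES[ch]
    cp = ord(ch)
    # Fullwidth forms sit at a fixed offset of 0xFEE0 from their ASCII originals.
    if 0xFF21 <= cp <= 0xFF5A:
        return chr(cp - 0xFEE0)
    return None


def find_homoglyphs(text: str) -> List[Tuple[int, str, str]]:
    """Return (position, character, ASCII look-alike) for every match in text."""
    hits = []
    for position, ch in enumerate(text):
        lookalike = ascii_lookalike(ch)
        if lookalike is not None:
            hits.append((position, ch, lookalike))
    return hits
```

`find_homoglyphs("p\u0430ypal.com")` returns one hit at position 1, the Cyrillic `а` standing in for Latin `a`. A table lookup like this never flags plain ASCII, but it misses any confusable that is not listed.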

**Why are some results marked `risk=low` even though there's a zero-width space?**
Zero-width spaces are sometimes inserted legitimately by word processors, CMS platforms, and copy-paste operations (especially from web pages). `low` risk means an issue was found but it may not be malicious. Only bidirectional override characters are marked `critical` because they have no legitimate use in most text contexts.

**Why is a text showing 0 issues when I can see something strange in it?**
Most likely the character is not in the current detection tables. Try enabling all detection options. If the character is in a script that isn't covered (e.g., Armenian lookalikes), it won't be detected. You can check the character manually by pasting it into a Unicode inspector like [unicode.org](https://unicode.org/cldr/utility/character.jsp).

**Can I analyze very long texts?**
Yes. The Actor processes texts of any length. Very long texts (millions of characters) may take a few seconds each. The 300-second timeout is sufficient for typical use cases. If you need to analyze extremely long documents, split them into chunks before passing them to the Actor.
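
One way to chunk a long document is sketched below. It slices by codepoint and keeps each chunk's starting offset, so a `position` reported for a chunk can be mapped back to the original text. The chunk size of 100,000 characters is an arbitrary assumption, and `chunk_text` is a hypothetical helper.

```python
from typing import Iterator, Tuple


def chunk_text(text: str, size: int = 100_000) -> Iterator[Tuple[int, str]]:
    """Yield (offset, chunk) pairs; offset is the chunk's 0-based start in the original text."""
    if size <= 0:
        raise ValueError("size must be a positive integer")
    for offset in range(0, len(text), size):
        yield offset, text[offset:offset + size]


# Usage: send the chunks as the `texts` input and remember the offsets.
document = "abcdefg"
pairs = list(chunk_text(document, size=3))
texts = [chunk for _, chunk in pairs]
offsets = [offset for offset, _ in pairs]
# Original position of an issue = offsets[chunk_number] + issue["position"]
```

Each chunk is billed as one analyzed text, so larger chunks cost less per document.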

### Other text and data quality tools

Looking for related utilities? Check these automation-lab Actors:

- 🎨 [Color Contrast Checker](https://apify.com/automation-lab/color-contrast-checker) — WCAG 2.1 AA/AAA contrast ratio validation for UI design
- 🔬 [Accessibility Checker](https://apify.com/automation-lab/accessibility-checker) — WCAG accessibility audit for web pages
- 📋 [JSON Schema Generator](https://apify.com/automation-lab/json-schema-generator) — Generate JSON Schema from example JSON data
- 🔗 [Ads.txt Checker](https://apify.com/automation-lab/ads-txt-checker) — Validate ads.txt files for publisher compliance
- 📊 [Base64 Converter](https://apify.com/automation-lab/base64-converter) — Encode and decode Base64 strings

# Actor input Schema

## `texts` (type: `array`):

List of text strings to analyze. Each string will produce one output record with all detected Unicode issues.

## `detectHomoglyphs` (type: `boolean`):

Flag characters that look like ASCII letters but are different Unicode codepoints (e.g., Cyrillic 'а' vs Latin 'a'). Useful for catching phishing domains, spoofed text, and confusable characters.

## `detectInvisible` (type: `boolean`):

Flag zero-width spaces (U+200B), zero-width non-joiners (U+200C), zero-width joiners (U+200D), byte order marks (U+FEFF), and other invisible/formatting characters.

## `detectControl` (type: `boolean`):

Flag ASCII control characters (U+0000–U+001F, U+007F) and C1 control characters (U+0080–U+009F) that should not appear in normal text.

## `detectBidi` (type: `boolean`):

Flag RTL/LTR override and embedding characters (U+200E, U+200F, U+202A–U+202E, U+2066–U+2069) that can be used to disguise malicious text (Trojan Source attacks).

## `includeCategoryBreakdown` (type: `boolean`):

Include a count breakdown of Unicode categories (letters, numbers, symbols, marks, separators, etc.) for each text.

## `label` (type: `string`):

Optional label to attach to all results (e.g., 'user\_input', 'email\_subject', 'filename'). Useful for batch processing from multiple sources.

## Actor input object example

```json
{
  "texts": [
    "Hello​ World",
    "Ηello (homoglyph H)",
    "Normal clean text"
  ],
  "detectHomoglyphs": true,
  "detectInvisible": true,
  "detectControl": true,
  "detectBidi": true,
  "includeCategoryBreakdown": true
}
```

# Actor output Schema

## `results` (type: `string`):

No description

# API

You can run this Actor programmatically using our API. Below are code examples in JavaScript, Python, and CLI, as well as the OpenAPI specification and MCP server setup.

## JavaScript example

```javascript
import { ApifyClient } from 'apify-client';

// Initialize the ApifyClient with your Apify API token
// Replace the '<YOUR_API_TOKEN>' with your token
const client = new ApifyClient({
    token: '<YOUR_API_TOKEN>',
});

// Prepare Actor input
const input = {
    "texts": [
        "Hello​ World",
        "Ηello (homoglyph H)",
        "Normal clean text"
    ]
};

// Run the Actor and wait for it to finish
const run = await client.actor("automation-lab/unicode-text-inspector").call(input);

// Fetch and print Actor results from the run's dataset (if any)
console.log('Results from dataset');
console.log(`💾 Check your data here: https://console.apify.com/storage/datasets/${run.defaultDatasetId}`);
const { items } = await client.dataset(run.defaultDatasetId).listItems();
items.forEach((item) => {
    console.dir(item);
});

// 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/js/docs

```

## Python example

```python
from apify_client import ApifyClient

# Initialize the ApifyClient with your Apify API token
# Replace '<YOUR_API_TOKEN>' with your token.
client = ApifyClient("<YOUR_API_TOKEN>")

# Prepare the Actor input
run_input = { "texts": [
        "Hello​ World",
        "Ηello (homoglyph H)",
        "Normal clean text",
    ] }

# Run the Actor and wait for it to finish
run = client.actor("automation-lab/unicode-text-inspector").call(run_input=run_input)

# Fetch and print Actor results from the run's dataset (if there are any)
print("💾 Check your data here: https://console.apify.com/storage/datasets/" + run["defaultDatasetId"])
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(item)

# 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/python/docs/quick-start

```

## CLI example

```bash
echo '{
  "texts": [
    "Hello​ World",
    "Ηello (homoglyph H)",
    "Normal clean text"
  ]
}' |
apify call automation-lab/unicode-text-inspector --silent --output-dataset

```

## MCP server setup

```json
{
    "mcpServers": {
        "apify": {
            "command": "npx",
            "args": [
                "mcp-remote",
                "https://mcp.apify.com/?tools=automation-lab/unicode-text-inspector",
                "--header",
                "Authorization: Bearer <YOUR_API_TOKEN>"
            ]
        }
    }
}

```

## OpenAPI specification

```json
{
    "openapi": "3.0.1",
    "info": {
        "title": "Unicode Text Inspector",
        "description": "Scan text for hidden Unicode characters: zero-width spaces, RTL override attacks, homoglyphs, and control characters. Get risk level + full codepoint details per character.",
        "version": "0.1",
        "x-build-id": "WsAdherzYcz7enXcd"
    },
    "servers": [
        {
            "url": "https://api.apify.com/v2"
        }
    ],
    "paths": {
        "/acts/automation-lab~unicode-text-inspector/run-sync-get-dataset-items": {
            "post": {
                "operationId": "run-sync-get-dataset-items-automation-lab-unicode-text-inspector",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for its completion, and returns Actor's dataset items in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        },
        "/acts/automation-lab~unicode-text-inspector/runs": {
            "post": {
                "operationId": "runs-sync-automation-lab-unicode-text-inspector",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor and returns information about the initiated run in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK",
                        "content": {
                            "application/json": {
                                "schema": {
                                    "$ref": "#/components/schemas/runsResponseSchema"
                                }
                            }
                        }
                    }
                }
            }
        },
        "/acts/automation-lab~unicode-text-inspector/run-sync": {
            "post": {
                "operationId": "run-sync-automation-lab-unicode-text-inspector",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for completion, and returns the OUTPUT from Key-value store in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        }
    },
    "components": {
        "schemas": {
            "inputSchema": {
                "type": "object",
                "required": [
                    "texts"
                ],
                "properties": {
                    "texts": {
                        "title": "🔍 Texts to inspect",
                        "type": "array",
                        "description": "List of text strings to analyze. Each string will produce one output record with all detected Unicode issues.",
                        "items": {
                            "type": "string"
                        }
                    },
                    "detectHomoglyphs": {
                        "title": "🔤 Detect homoglyphs",
                        "type": "boolean",
                        "description": "Flag characters that look like ASCII letters but are different Unicode codepoints (e.g., Cyrillic 'а' vs Latin 'a'). Useful for catching phishing domains, spoofed text, and confusable characters.",
                        "default": true
                    },
                    "detectInvisible": {
                        "title": "👻 Detect invisible characters",
                        "type": "boolean",
                        "description": "Flag zero-width spaces (U+200B), zero-width non-joiners (U+200C), zero-width joiners (U+200D), byte order marks (U+FEFF), and other invisible/formatting characters.",
                        "default": true
                    },
                    "detectControl": {
                        "title": "🚫 Detect control characters",
                        "type": "boolean",
                        "description": "Flag ASCII control characters (U+0000–U+001F, U+007F) and C1 control characters (U+0080–U+009F) that should not appear in normal text.",
                        "default": true
                    },
                    "detectBidi": {
                        "title": "↔️ Detect bidirectional markers",
                        "type": "boolean",
                        "description": "Flag RTL/LTR override and embedding characters (U+200E, U+200F, U+202A–U+202E, U+2066–U+2069) that can be used to disguise malicious text (Trojan Source attacks).",
                        "default": true
                    },
                    "includeCategoryBreakdown": {
                        "title": "📊 Include Unicode category breakdown",
                        "type": "boolean",
                        "description": "Include a count breakdown of Unicode categories (letters, numbers, symbols, marks, separators, etc.) for each text.",
                        "default": true
                    },
                    "label": {
                        "title": "🏷️ Label / source tag",
                        "type": "string",
                        "description": "Optional label to attach to all results (e.g., 'user_input', 'email_subject', 'filename'). Useful for batch processing from multiple sources."
                    }
                }
            },
            "runsResponseSchema": {
                "type": "object",
                "properties": {
                    "data": {
                        "type": "object",
                        "properties": {
                            "id": {
                                "type": "string"
                            },
                            "actId": {
                                "type": "string"
                            },
                            "userId": {
                                "type": "string"
                            },
                            "startedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "finishedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "status": {
                                "type": "string",
                                "example": "READY"
                            },
                            "meta": {
                                "type": "object",
                                "properties": {
                                    "origin": {
                                        "type": "string",
                                        "example": "API"
                                    },
                                    "userAgent": {
                                        "type": "string"
                                    }
                                }
                            },
                            "stats": {
                                "type": "object",
                                "properties": {
                                    "inputBodyLen": {
                                        "type": "integer",
                                        "example": 2000
                                    },
                                    "rebootCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "restartCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "resurrectCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "computeUnits": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "options": {
                                "type": "object",
                                "properties": {
                                    "build": {
                                        "type": "string",
                                        "example": "latest"
                                    },
                                    "timeoutSecs": {
                                        "type": "integer",
                                        "example": 300
                                    },
                                    "memoryMbytes": {
                                        "type": "integer",
                                        "example": 1024
                                    },
                                    "diskMbytes": {
                                        "type": "integer",
                                        "example": 2048
                                    }
                                }
                            },
                            "buildId": {
                                "type": "string"
                            },
                            "defaultKeyValueStoreId": {
                                "type": "string"
                            },
                            "defaultDatasetId": {
                                "type": "string"
                            },
                            "defaultRequestQueueId": {
                                "type": "string"
                            },
                            "buildNumber": {
                                "type": "string",
                                "example": "1.0.0"
                            },
                            "containerUrl": {
                                "type": "string"
                            },
                            "usage": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "integer",
                                        "example": 1
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "usageTotalUsd": {
                                "type": "number",
                                "example": 0.00005
                            },
                            "usageUsd": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "number",
                                        "example": 0.00005
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            }
                        }
                    }
                }
            }
        }
    }
}
```
