Entity Extractor — emails, URLs, phones, dates (regex, no LLM)

Pricing

from $3.00 / 1,000 results

Extract structured entities from free text: email addresses, URLs, phone numbers (incl. Japanese formats and full-width digits), dates (ISO, slash, Japanese 年月日) and IP addresses. Deterministic regex extraction with per-kind counts — fast, cheap, no LLM.

Developer

Shinobu Otani

Maintained by Community

Entity Extractor

Extract structured entities from free text with deterministic regexes — fast, cheap, no LLM.

What it does

  • Extracts emails, URLs (http/https, with trailing punctuation stripped), phone numbers (international +XX… and Japanese 0X-XXXX-XXXX formats), dates (YYYY-MM-DD, YYYY/M/D, YYYY年M月D日), and IPv4 addresses.
  • Input is NFKC-normalized first, so full-width Japanese digits and symbols (for example ０９０－１２３４－５６７８, which normalizes to 090-1234-5678) extract cleanly.
  • Each entity kind can be toggled; values are deduplicated by default (first occurrence kept, order preserved).
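
The normalize, match, and deduplicate steps listed above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the actor's actual code: the patterns `PHONE_JP` and `DATE` and the helper names `normalize`, `dedupe`, and `extract` are hypothetical, and only two of the entity kinds are covered.

```python
import re
import unicodedata

# Hypothetical patterns for illustration only; the actor's real regexes may differ.
# Lookarounds on digits are used instead of \b so that matches still work when
# Japanese text sits directly next to the number.
PHONE_JP = re.compile(r"(?<!\d)0\d{1,4}-\d{1,4}-\d{4}(?!\d)")
DATE = re.compile(
    r"(?<!\d)(?:\d{4}-\d{2}-\d{2}|\d{4}/\d{1,2}/\d{1,2}|\d{4}年\d{1,2}月\d{1,2}日)(?!\d)"
)


def normalize(text: str) -> str:
    """Apply NFKC so full-width digits and symbols fold to their ASCII forms."""
    return unicodedata.normalize("NFKC", text)


def dedupe(values: list) -> list:
    """Keep the first occurrence of each value and preserve the original order."""
    return list(dict.fromkeys(values))


def extract(text: str, unique: bool = True) -> dict:
    """Normalize the text, run each pattern, and optionally deduplicate the hits."""
    text = normalize(text)
    phones = PHONE_JP.findall(text)
    dates = DATE.findall(text)
    if unique:
        phones, dates = dedupe(phones), dedupe(dates)
    return {"phone_numbers": phones, "dates": dates}


if __name__ == "__main__":
    print(extract("電話０９０－１２３４－５６７８ / 090-1234-5678、2026年6月13日"))
```

Normalizing before matching means each pattern only has to handle ASCII digits and hyphens, and `dict.fromkeys` gives order-preserving deduplication without a separate seen-set.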

Input

{
  "texts": ["Contact info@example.com or 03-1234-5678, see https://example.com on 2026-06-13."],
  "emails": true,
  "urls": true,
  "phone_numbers": true,
  "dates": true,
  "ip_addresses": true,
  "unique": true
}

Output (one dataset item per text)

{
  "emails": ["info@example.com"],
  "urls": ["https://example.com"],
  "phone_numbers": ["03-1234-5678"],
  "dates": ["2026-06-13"],
  "ip_addresses": [],
  "counts": {"emails": 1, "urls": 1, "phone_numbers": 1, "dates": 1, "ip_addresses": 0},
  "total": 4,
  "index": 0
}
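
The relationship between the per-kind lists, `counts`, and `total` in the item above can be reproduced with a short sketch. The helper `build_item` is hypothetical, and it assumes that `index` is the zero-based position of the text in the `texts` input, which the listing does not state explicitly.

```python
def build_item(index: int, found: dict) -> dict:
    """Attach per-kind counts, a grand total, and the input position to the lists."""
    # One count per entity kind, in the same key order as the extracted lists.
    counts = {kind: len(values) for kind, values in found.items()}
    return {**found, "counts": counts, "total": sum(counts.values()), "index": index}


if __name__ == "__main__":
    found = {
        "emails": ["info@example.com"],
        "urls": ["https://example.com"],
        "phone_numbers": ["03-1234-5678"],
        "dates": ["2026-06-13"],
        "ip_addresses": [],
    }
    print(build_item(0, found))
```

Deriving `counts` and `total` from the lists keeps the three fields consistent by construction, so a consumer can read `total` to skip texts with no entities without inspecting each list.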

Usage

Point it at scraped pages, support tickets, or listings to pull out contact details and dates for CRM enrichment, lead lists, or monitoring pipelines.