Japan Contact Scraper

Extract emails, Japanese phone numbers (03-, 090-, 0120- formats), and social media links from Japanese company websites. Optimized regex patterns ensure high accuracy with minimal false positives.

Pricing

Pay per event

Rating

5.0

(1)

Developer

kyo kou

Last modified

4 months ago

Email & Phone Scraper for Japanese Websites: Bulk Contact Extraction for Japanese Companies

Still looking up contact details on Japanese company websites by hand, one site at a time?

This Actor crawls Japanese company websites and extracts email addresses, phone numbers (landline, mobile, and toll-free), social media profiles, and contact form URLs, all in a single batch run. It is built specifically for Japanese B2B lead generation and sales list building.

Quick Start

Run on Apify Console or via API:

apify call your-username/japan-contact-scraper \
  --input='{"urls": ["https://example.co.jp"], "maxPagesPerSite": 10}'

Or call the Apify API directly:

curl "https://api.apify.com/v2/acts/your-username~japan-contact-scraper/runs" \
  -X POST \
  -H "Authorization: Bearer YOUR_API_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"urls": ["https://example.co.jp"], "maxPagesPerSite": 10}'
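The same request can also be prepared from Python using only the standard library. This is a minimal sketch, not an official client: `build_run_request` is a hypothetical helper, and the Actor ID and token are the same placeholders used in the commands above. The request is only built here; nothing is sent until you call `urlopen` yourself.

```python
import json
import urllib.request

API_BASE = "https://api.apify.com/v2"


def build_run_request(actor_id: str, token: str, actor_input: dict) -> urllib.request.Request:
    """Build (but do not send) a POST request that starts an Actor run."""
    body = json.dumps(actor_input).encode("utf-8")
    return urllib.request.Request(
        url=f"{API_BASE}/acts/{actor_id}/runs",
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            # The JSON body is passed to the Actor as its input.
            "Content-Type": "application/json",
        },
    )


request = build_run_request(
    "your-username~japan-contact-scraper",  # placeholder Actor ID
    "YOUR_API_TOKEN",  # placeholder token
    {"urls": ["https://example.co.jp"], "maxPagesPerSite": 10},
)

# To actually start the run (requires a real token and Actor ID):
# with urllib.request.urlopen(request) as response:
#     print(json.load(response))
```

Keeping the request construction separate from the network call makes it easy to inspect the URL, headers, and body before spending any credits.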

Features

Batch URL processing: scrape hundreds of Japanese websites in a single run
Email extraction with DNS validation: verifies that the mail domain actually exists and automatically excludes dummy addresses (example.com, test.com, etc.)
Japanese phone number parsing: landline, mobile (070/080/090), IP phone (050), and toll-free (0120/0800) numbers, parsed with the phonenumbers-jp library
Social media profile detection: YouTube, Instagram, Facebook, GitHub, and Reddit (LINE and X/Twitter are not currently supported)
Contact form URL detection (optional): finds Japanese "お問い合わせ" (inquiry) pages by scoring four signals (URL path, link text, page title, and form element presence)
Per-site page limits: cap the number of pages crawled per site to stay within budget
Blacklist filtering: regex and domain-based rules to reduce false positives
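To make the email step more concrete, here is a minimal sketch of regex extraction followed by dummy-domain filtering and a domain check. It is an illustration under stated assumptions, not the Actor's actual code: the regex, the `DUMMY_DOMAINS` set, and the function names are hypothetical, and `socket.getaddrinfo` only approximates "the mail domain exists" (it looks up address records, not MX records).

```python
import re
import socket
from typing import Callable

EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
DUMMY_DOMAINS = {"example.com", "test.com"}  # assumed subset of the exclusion list


def domain_resolves(domain: str) -> bool:
    """Approximate DNS check: True if the domain has any address record."""
    try:
        socket.getaddrinfo(domain, None)
        return True
    except socket.gaierror:
        return False


def extract_emails(text: str, resolver: Callable[[str], bool] = domain_resolves) -> list[str]:
    """Return unique, lowercased emails whose domain is not a dummy and resolves."""
    seen: dict[str, None] = {}  # dict keeps first-seen order
    for match in EMAIL_RE.findall(text):
        email = match.lower()
        domain = email.rsplit("@", 1)[1]
        if domain in DUMMY_DOMAINS or email in seen:
            continue
        if resolver(domain):
            seen[email] = None
    return list(seen)
```

The resolver is passed in as a parameter so the filtering logic can be exercised offline with a stub, for example `extract_emails(html, resolver=lambda d: d.endswith(".co.jp"))`.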

How It Works

This Actor uses Crawlee with the BeautifulSoup crawler (HTTP-based, no browser). It follows same-domain links up to your configured page limit and extracts contact information from each page's HTML.
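The crawl strategy described above (stay on the same domain, stop at a page limit) can be sketched with only the Python standard library. This is not Crawlee and not the Actor's code: `LinkCollector`, `crawl_site`, and the `fetch` callable are hypothetical names, and `fetch` is injected so the example runs without network access.

```python
from collections import deque
from html.parser import HTMLParser
from typing import Callable, Optional
from urllib.parse import urldefrag, urljoin, urlparse


class LinkCollector(HTMLParser):
    """Collect href values from <a> tags in static HTML."""

    def __init__(self) -> None:
        super().__init__()
        self.hrefs: list[str] = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.hrefs.append(href)


def crawl_site(start_url: str, fetch: Callable[[str], Optional[str]], max_pages: int = 10) -> list[str]:
    """Breadth-first crawl that stays on the start URL's host and stops at max_pages."""
    host = urlparse(start_url).netloc
    queue = deque([start_url])
    seen = {start_url}
    visited: list[str] = []
    while queue and len(visited) < max_pages:
        url = queue.popleft()
        html = fetch(url)
        if html is None:  # fetch failed; skip without counting the page
            continue
        visited.append(url)
        parser = LinkCollector()
        parser.feed(html)
        for href in parser.hrefs:
            # Resolve relative links and drop #fragments before comparing.
            link = urldefrag(urljoin(url, href)).url
            if urlparse(link).netloc == host and link not in seen:
                seen.add(link)
                queue.append(link)
    return visited
```

Because the parser only sees the HTML returned by `fetch`, links or contact details that a page injects with JavaScript never appear in `parser.hrefs`, which is the same constraint an HTTP-only crawler has.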

What this means in practice:

✅ Fast and lightweight — no headless browser overhead
✅ Respects per-site page limits to avoid excessive crawling
⚠️ JavaScript-rendered content (SPAs, React sites) is not visible to this crawler. If the contact info is loaded dynamically via JS, it won't be extracted.
⚠️ No built-in robots.txt enforcement — please check each site's robots.txt manually if compliance is important for your use case.
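Since robots.txt is not enforced by the Actor, one option is to pre-filter your URL list yourself. The sketch below uses Python's standard `urllib.robotparser`; `allowed_by_robots` is a hypothetical helper, and the robots.txt content is a made-up sample rather than any real site's rules.

```python
from urllib.robotparser import RobotFileParser


def allowed_by_robots(robots_txt: str, url: str, user_agent: str = "*") -> bool:
    """Return True if the given robots.txt text permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Made-up sample rules for illustration.
sample_robots = """\
User-agent: *
Disallow: /admin/
"""

print(allowed_by_robots(sample_robots, "https://example.co.jp/contact/"))
print(allowed_by_robots(sample_robots, "https://example.co.jp/admin/login"))
```

To check a live site instead of a string, `RobotFileParser` can download the file itself: call `set_url("https://example.co.jp/robots.txt")` followed by `read()` before `can_fetch`.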

Input

Field	Type	Description
`urls`	Array (required)	List of target website URLs to scrape
`maxPagesPerSite`	Integer	Max pages to crawl per site (default: 10)
`enableContactForm`	Boolean	Enable contact form URL detection (default: false)

Example Input

{
    "urls": [
        "https://example.co.jp",
        "https://another-company.co.jp"
    ],
    "maxPagesPerSite": 20,
    "enableContactForm": true
}

Output

One result per URL in the dataset:

{
    "url": "https://example.co.jp",
    "emails": ["info@example.co.jp", "sales@example.co.jp"],
    "phones": [
        {
            "number": "0312345678",
            "formatted": "03-1234-5678",
            "type": "固定電話"
        },
        {
            "number": "09012345678",
            "formatted": "090-1234-5678",
            "type": "携帯"
        }
    ],
    "socials": {
        "youtube": ["https://www.youtube.com/@example"],
        "facebook": ["https://www.facebook.com/example.japan"]
    },
    "contact_url": {
        "url": "https://example.co.jp/contact/",
        "score": 95,
        "has_form": true,
        "error": null
    },
    "error": null
}

Note: The type field in phone results uses Japanese labels as returned by the phonenumbers-jp library: 固定電話 (landline), 携帯 (mobile), IP電話 (IP phone), and フリーダイヤル (toll-free).
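For CRM import or spreadsheet work, the nested result shown above usually needs flattening. The following sketch turns dataset items into CSV text with the standard library. `flatten_result` and `to_csv` are hypothetical helpers, the column choice is arbitrary, and the code assumes `contact_url` may be missing or null when contact form detection is disabled.

```python
import csv
import io

COLUMNS = ["url", "emails", "phones", "socials", "contact_url", "error"]


def flatten_result(item: dict) -> dict:
    """Turn one dataset item into a flat row with ';'-joined multi-value fields."""
    contact = item.get("contact_url") or {}
    socials = item.get("socials") or {}
    return {
        "url": item.get("url", ""),
        "emails": ";".join(item.get("emails") or []),
        "phones": ";".join(p["formatted"] for p in item.get("phones") or []),
        "socials": ";".join(link for links in socials.values() for link in links),
        "contact_url": contact.get("url") or "",
        "error": item.get("error") or "",
    }


def to_csv(items: list[dict]) -> str:
    """Serialize dataset items to CSV text, one row per scraped URL."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=COLUMNS)
    writer.writeheader()
    writer.writerows(flatten_result(item) for item in items)
    return buffer.getvalue()
```

Joining multi-value fields with `;` keeps one row per site; if your CRM expects one row per contact, iterate over `emails` and `phones` and emit a row for each instead.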

Supported Japanese Phone Formats

Type	Example
Landline / 固定電話	03-1234-5678, 06-1234-5678
Mobile / 携帯電話	090-1234-5678, 080-1234-5678, 070-1234-5678
IP Phone / IP電話	050-1234-5678
Toll-free / フリーダイヤル	0120-123-456, 0800-123-4567
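The prefix rules in the table can be illustrated with a small classifier. This is a simplified sketch, not the phonenumbers-jp library: `classify_jp_phone` and both lookup tables are hypothetical, international `+81` notation is not handled, and any other 10-digit number starting with 0 is treated as a landline, which is cruder than real area-code validation.

```python
import re
import unicodedata
from typing import Optional

# Checked in order, so 0800 (toll-free) is tested before 080 (mobile).
PREFIX_TYPES = [
    (("0120", "0800"), "フリーダイヤル"),
    (("050",), "IP電話"),
    (("070", "080", "090"), "携帯"),
]
# Total digit count expected for each prefix, taken from the examples in the table.
EXPECTED_LENGTH = {"0120": 10, "0800": 11, "050": 11, "070": 11, "080": 11, "090": 11}


def classify_jp_phone(raw: str) -> Optional[dict]:
    """Classify a Japanese domestic number by prefix; None if it does not fit."""
    # NFKC folds full-width digits to ASCII; then drop every non-digit character.
    digits = re.sub(r"[^0-9]", "", unicodedata.normalize("NFKC", raw))
    for prefixes, label in PREFIX_TYPES:
        for prefix in prefixes:
            if digits.startswith(prefix):
                if len(digits) != EXPECTED_LENGTH[prefix]:
                    return None
                return {"number": digits, "type": label}
    # Assumption: any other 10-digit number starting with 0 is a landline.
    if len(digits) == 10 and digits.startswith("0"):
        return {"number": digits, "type": "固定電話"}
    return None
```

The ordering of `PREFIX_TYPES` matters: `0800-123-4567` also starts with `080`, so testing the toll-free prefixes first keeps it from being labeled as a mobile number.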

Contact Form Detection — How Scoring Works

When enableContactForm is set to true, each crawled page is scored on four signals:

Signal	Example match	Points
URL path contains `toiawase` / `contact`	`/otoiawase/`, `/contact/`	+15 to +25
Link text matches Japanese contact terms	「お問い合わせ」 (inquiry), 「ご相談」 (consultation)	+25 to +45
Page title contains contact keywords	`<title>お問い合わせ</title>`	+30
Page has a real `<form>` with inputs	`<form>` + `<input>` + submit button	+35

The highest-scoring page is returned as the contact form URL. Pages matching negative patterns (/blog, /faq, /product, etc.) receive a -15 penalty.
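The scoring scheme can be sketched as a small function. Everything beyond the table is an assumption made for illustration: how points are split within the +15 to +25 and +25 to +45 ranges, the title keyword list, and the crude substring check for a form are all hypothetical, so scores from this sketch need not match the Actor's.

```python
from urllib.parse import urlparse

NEGATIVE_PATTERNS = ("/blog", "/faq", "/product")  # examples named in the text above
URL_TERMS = {"contact": 25, "toiawase": 15}  # assumed split of the +15 to +25 range
LINK_TERMS = {"お問い合わせ": 45, "ご相談": 25}  # assumed split of the +25 to +45 range
TITLE_TERMS = ("お問い合わせ", "contact")  # assumed keyword list


def score_page(url: str, link_text: str, title: str, html: str) -> int:
    """Score one page on the four signals; higher means more likely a contact form."""
    path = urlparse(url).path.lower()
    lowered = html.lower()
    score = 0
    score += max((pts for term, pts in URL_TERMS.items() if term in path), default=0)
    score += max((pts for term, pts in LINK_TERMS.items() if term in link_text), default=0)
    if any(term in title.lower() for term in TITLE_TERMS):
        score += 30
    # Crude "real form" check: a form, an input, and something that submits.
    has_submit = 'type="submit"' in lowered or "<button" in lowered
    if "<form" in lowered and "<input" in lowered and has_submit:
        score += 35
    if any(pattern in path for pattern in NEGATIVE_PATTERNS):
        score -= 15
    return score


def best_contact_page(pages: list[dict]) -> dict:
    """Return the highest-scoring page from a list of page records."""
    return max(
        pages,
        key=lambda p: score_page(p["url"], p["link_text"], p["title"], p["html"]),
    )
```

With these assumed weights, a page such as `/blog/contact-tips` still earns URL points for `contact` but loses 15 for the `/blog` pattern, so a genuine `/contact/` page with a form outranks it.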

Use Cases

B2B lead generation: build targeted outreach lists of Japanese companies
Sales prospecting: collect contact details for cold outreach campaigns
Partner / supplier research: bulk-collect contact info for potential business partners
Market research & competitive analysis: gather structured contact data across an industry
CRM enrichment: import the extracted Japanese contact info into your CRM

Limitations

No JavaScript rendering — this crawler uses HTTP requests + BeautifulSoup, not a headless browser. Content rendered by JavaScript (React, Vue, Angular SPAs) will not be scraped.
No LINE or X (Twitter) detection — these platforms are not currently supported for social profile extraction.
Anti-scraping protections — sites with CAPTCHAs, Cloudflare, or aggressive rate limiting may return incomplete results.
No timeout configuration — crawl timeouts use Crawlee's defaults.

Legal Considerations

This tool only extracts publicly available information from website HTML. It does not bypass authentication, access restricted pages, or collect non-public data.

Users are responsible for complying with:

Each website's terms of service and robots.txt
Japan's Act on the Protection of Personal Information（個人情報保護法）
Japan's Act on Prohibition of Unauthorized Computer Access（不正アクセス禁止法）
All other applicable laws in your jurisdiction
