TokyoDev Scraper - Japan Tech Job Listings & Companies

Scrape job listings and company profiles from TokyoDev.com. Extract job titles, companies, locations, remote policies, salary ranges, Japanese language requirements, visa sponsorship, and technology tags.

Pricing: Pay per usage
Developer: BowTiedRaccoon
Last modified: 10 days ago
TokyoDev Job & Company Scraper
Scrapes tech job listings and company profiles from TokyoDev.com, the primary English-language job board for developers targeting Japan's tech industry. Returns jobs with titles, salaries, remote policies, Japanese language requirements, visa sponsorship signals, and technology tags — plus company profiles with descriptions and tech stacks — across ~182 job listings and ~232 company pages.
TokyoDev Scraper Features
- Scrapes job listings, company profiles, or both via a single `scrapeMode` selector
- Extracts the Japanese language requirement per listing — true/false, not buried in description text
- Captures remote policy per job: fully-remote, partially-remote, or no-remote
- Returns apply-from-abroad eligibility where disclosed — useful for candidates outside Japan
- Collects technology and skill tags per listing (Ruby, Python, React, etc.)
- Filters by remote policy, seniority level, or Japanese language requirement before saving
- Accepts specific TokyoDev URLs directly — skip sitemap discovery for targeted runs
- Uses residential proxy to bypass Cloudflare protection on all non-sitemap pages
Who Uses TokyoDev Data?
- Recruiters — Pull structured Japan tech listings with remote and language filters already applied, not raw HTML to parse
- Job aggregators — Ingest English-language Japan tech jobs with consistent field structure across listings
- Market researchers — Analyze salary trends, remote policy distribution, and Japanese language demand across the Japan tech sector
- HR analytics teams — Build datasets tracking which companies are hiring, what seniority levels are in demand, and what tech stacks are common
- Candidate matching platforms — Filter by `japanese_required` and `apply_from_abroad` to surface realistic options for international applicants
How TokyoDev Scraper Works
- Fetches `/sitemap.xml` — accessible without a Cloudflare challenge — and classifies URLs into job listings and company profile pages
- Applies the mode filter (`jobs`, `companies`, or `both`) and optional filters for remote policy, seniority, and Japanese language requirement
- Loads each target page using a Playwright browser with residential proxy and anti-detection fingerprinting to bypass Cloudflare
- Extracts data from both JSON-LD structured markup and rendered HTML, with HTML as a fallback for fields not in the schema
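The URL classification step above can be sketched as follows. This is an illustrative approximation, not the actor's actual code; the URL patterns are inferred from the example URLs later in this document (job pages under `/companies/<company>/jobs/<slug>`, company profiles at `/companies/<company>`).

```python
import re

# Patterns inferred from TokyoDev URLs shown in the output examples below.
JOB_RE = re.compile(r"^https://www\.tokyodev\.com/companies/[^/]+/jobs/[^/]+$")
COMPANY_RE = re.compile(r"^https://www\.tokyodev\.com/companies/[^/]+$")

def classify(urls):
    """Split sitemap URLs into job-listing URLs and company-profile URLs."""
    jobs = [u for u in urls if JOB_RE.match(u)]
    companies = [u for u in urls if COMPANY_RE.match(u)]
    return jobs, companies

sitemap_urls = [
    "https://www.tokyodev.com/companies/tablecheck/jobs/senior-rails-engineer",
    "https://www.tokyodev.com/companies/mercari",
    "https://www.tokyodev.com/insights/some-article",
]
jobs, companies = classify(sitemap_urls)
# The insights URL matches neither pattern and is ignored.
```

Because only matching URLs proceed to the browser phase, non-listing pages never consume residential proxy budget.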
Input
{"scrapeMode": "jobs","remotePolicy": "fully-remote","japaneseRequired": "no-japanese-required","maxItems": 50,"proxyConfiguration": {"useApifyProxy": true,"apifyProxyGroups": ["RESIDENTIAL"]}}
| Field | Type | Default | Description |
|---|---|---|---|
| scrapeMode | string | "both" | What to scrape: "jobs", "companies", or "both" |
| searchUrls | array | — | Optional: specific TokyoDev URLs to scrape. Skips sitemap discovery. |
| remotePolicy | string | "" | Filter by remote policy: "fully-remote", "partially-remote", "no-remote", or empty for all |
| seniority | string | "" | Filter by seniority: "intern", "junior", "intermediate", "senior", or empty for all |
| japaneseRequired | string | "" | Filter by Japanese language: "japanese-required", "no-japanese-required", or empty for all |
| maxItems | integer | 50 | Maximum number of results to return |
| proxyConfiguration | object | RESIDENTIAL | Proxy settings — residential proxy required for Cloudflare bypass |
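A small helper can assemble a run input and catch typos in the enum-valued filters before a run is launched. The helper itself is illustrative; the field names and allowed values come from the table above.

```python
# Allowed values per the input table; empty string means "no filter".
ALLOWED = {
    "scrapeMode": {"jobs", "companies", "both"},
    "remotePolicy": {"", "fully-remote", "partially-remote", "no-remote"},
    "seniority": {"", "intern", "junior", "intermediate", "senior"},
    "japaneseRequired": {"", "japanese-required", "no-japanese-required"},
}

def build_run_input(scrape_mode="both", remote_policy="", seniority="",
                    japanese_required="", max_items=50):
    """Build a run input dict, rejecting undocumented filter values."""
    values = {
        "scrapeMode": scrape_mode,
        "remotePolicy": remote_policy,
        "seniority": seniority,
        "japaneseRequired": japanese_required,
    }
    for field, value in values.items():
        if value not in ALLOWED[field]:
            raise ValueError(f"{field}: unsupported value {value!r}")
    values["maxItems"] = max_items
    values["proxyConfiguration"] = {
        "useApifyProxy": True,
        "apifyProxyGroups": ["RESIDENTIAL"],  # required for Cloudflare bypass
    }
    return values

run_input = build_run_input(scrape_mode="jobs",
                            remote_policy="fully-remote",
                            japanese_required="no-japanese-required")
```

The resulting dict matches the JSON example above and can be passed as the run input when starting the actor.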
TokyoDev Scraper Output Fields
Job Listings
{"job_title": "Senior Rails Engineer","company_name": "TableCheck","company_url": "https://www.tablecheck.com","location": "Tokyo","job_type": "full-time","seniority": "senior","remote_policy": "partially-remote","japanese_required": false,"apply_from_abroad": true,"salary_range": "8000000-14000000 JPY","description": "TableCheck is looking for a senior Rails engineer...","requirements": ["5+ years Rails experience", "Experience with PostgreSQL"],"tags": ["Ruby", "Rails", "PostgreSQL", "React"],"apply_url": "https://www.tablecheck.com/jobs/apply/rails-engineer","posted_date": "2025-03-20","job_url": "https://www.tokyodev.com/companies/tablecheck/jobs/senior-rails-engineer"}
| Field | Type | Description |
|---|---|---|
| job_title | string | Job title |
| company_name | string | Hiring company name |
| company_url | string | Company website URL |
| location | string | Job location (e.g. Tokyo, Remote, Osaka) |
| job_type | string | Employment type: full-time, contract, intern |
| seniority | string | Seniority level: junior, intermediate, senior |
| remote_policy | string | Remote work policy: fully-remote, partially-remote, no-remote |
| japanese_required | boolean | Whether Japanese language proficiency is required |
| apply_from_abroad | boolean | Whether candidates can apply from outside Japan |
| salary_range | string | Salary range if disclosed |
| description | string | Full job description text |
| requirements | array | Job requirements and qualifications |
| tags | array | Technology and skill tags (e.g. Ruby, Python, React) |
| apply_url | string | Direct URL to apply for the position |
| posted_date | string | Date the job was posted |
| job_url | string | Full TokyoDev job listing URL |
Company Profiles
When `scrapeMode` is `"companies"` or `"both"`, company records are included in the same dataset. Company records populate `company_name`, `company_url`, `description`, `location`, `tags`, and `job_url` (set to the company profile URL). Job-specific fields are null.
{"company_name": "Mercari","company_url": "https://www.mercari.com","location": "Tokyo","description": "Mercari is Japan's largest marketplace app...","tags": ["Go", "Kotlin", "Swift", "React", "Kubernetes"],"job_url": "https://www.tokyodev.com/companies/mercari"}
🔍 FAQ
How do I scrape TokyoDev.com?
TokyoDev Scraper handles sitemap discovery automatically. Set scrapeMode to "jobs", "companies", or "both", apply any filters you need, configure the residential proxy, and run it. For targeted runs, paste specific TokyoDev URLs into searchUrls to skip the sitemap phase entirely.
Does TokyoDev Scraper need proxies?
It does. TokyoDev uses Cloudflare managed challenge on all page routes. The scraper uses a Playwright browser with residential proxy and anti-detection fingerprinting to get through. The sitemap at /sitemap.xml is accessible without challenge — the scraper uses that for URL discovery without consuming proxy budget.
What data can I get from TokyoDev.com?
TokyoDev Scraper returns job titles, companies, locations, employment types, seniority levels, remote policies, Japanese language requirements, apply-from-abroad flags, salary ranges, descriptions, requirements lists, technology tags, apply URLs, and posting dates. Company profiles include the company description, location, and tech stack tags.
Can I filter for jobs that don't require Japanese?
Set japaneseRequired to "no-japanese-required". TokyoDev Scraper applies the filter before saving records, so only matching results land in the dataset — you don't have to filter downstream.
How much does TokyoDev Scraper cost to run?
TokyoDev Scraper uses pay-per-event pricing. Because it requires a browser with residential proxy for each page, cost per record is higher than plain HTTP scrapers. Running the full board (~182 jobs + ~232 companies) costs roughly a few dollars depending on proxy consumption.
Need More Features?
Need scheduled runs, webhook delivery, or fields not currently extracted? File an issue or get in touch.
Why Use TokyoDev Scraper?
- Structured language and remote data — `japanese_required` and `remote_policy` are extracted as typed fields, not buried in description text, so your filters work without NLP preprocessing
- Dual-mode output — Jobs and company profiles in a single run with a shared schema, so you can join them by `company_name` without running two separate scrapers
- CF-resilient by design — Residential proxy with browser fingerprinting handles Cloudflare without manual intervention; the sitemap bypass keeps URL discovery cheap