TokyoDev Scraper - Japan Tech Job Listings & Companies

Scrape job listings and company profiles from TokyoDev.com. Extract job titles, companies, locations, remote policies, salary ranges, Japanese language requirements, visa sponsorship, and technology tags.

Pricing: Pay per usage
Developer: BowTiedRaccoon (Maintained by Community)

TokyoDev Job & Company Scraper

Scrapes tech job listings and company profiles from TokyoDev.com, the primary English-language job board for developers targeting Japan's tech industry. Returns jobs with titles, salaries, remote policies, Japanese language requirements, visa sponsorship signals, and technology tags — plus company profiles with descriptions and tech stacks — across ~182 job listings and ~232 company pages.


TokyoDev Scraper Features

  • Scrapes job listings, company profiles, or both via a single scrapeMode selector
  • Extracts Japanese language requirement per listing — true/false, not buried in description text
  • Captures remote policy per job: fully-remote, partially-remote, or no-remote
  • Returns apply-from-abroad eligibility where disclosed — useful for candidates outside Japan
  • Collects technology and skill tags per listing (Ruby, Python, React, etc.)
  • Filters by remote policy, seniority level, or Japanese language requirement before saving
  • Accepts specific TokyoDev URLs directly — skip sitemap discovery for targeted runs
  • Uses a residential proxy to bypass Cloudflare protection on all non-sitemap pages

Who Uses TokyoDev Data?

  • Recruiters — Pull structured Japan tech listings with remote and language filters already applied, not raw HTML to parse
  • Job aggregators — Ingest English-language Japan tech jobs with consistent field structure across listings
  • Market researchers — Analyze salary trends, remote policy distribution, and Japanese language demand across the Japan tech sector
  • HR analytics teams — Build datasets tracking which companies are hiring, what seniority levels are in demand, and what tech stacks are common
  • Candidate matching platforms — Filter by japanese_required and apply_from_abroad to surface realistic options for international applicants

How TokyoDev Scraper Works

  1. Fetches /sitemap.xml — accessible without Cloudflare challenge — and classifies URLs into job listings and company profile pages
  2. Applies mode filter (jobs, companies, or both) and optional filters for remote policy, seniority, and Japanese language requirement
  3. Loads each target page using a Playwright browser with residential proxy and anti-detection fingerprinting to bypass Cloudflare
  4. Extracts data from both JSON-LD structured markup and rendered HTML, with HTML as fallback for fields not in the schema

Input

{
  "scrapeMode": "jobs",
  "remotePolicy": "fully-remote",
  "japaneseRequired": "no-japanese-required",
  "maxItems": 50,
  "proxyConfiguration": {
    "useApifyProxy": true,
    "apifyProxyGroups": ["RESIDENTIAL"]
  }
}
Field              | Type    | Default     | Description
scrapeMode         | string  | "both"      | What to scrape: "jobs", "companies", or "both"
searchUrls         | array   |             | Optional: specific TokyoDev URLs to scrape. Skips sitemap discovery.
remotePolicy       | string  | ""          | Filter by remote policy: "fully-remote", "partially-remote", "no-remote", or empty for all
seniority          | string  | ""          | Filter by seniority: "intern", "junior", "intermediate", "senior", or empty for all
japaneseRequired   | string  | ""          | Filter by Japanese language: "japanese-required", "no-japanese-required", or empty for all
maxItems           | integer | 50          | Maximum number of results to return
proxyConfiguration | object  | RESIDENTIAL | Proxy settings; residential proxy required for Cloudflare bypass
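
This input can be submitted from Python with the official apify-client package. A minimal sketch: the actor ID below is a placeholder, so substitute the ID shown on this actor's Apify Store page, and supply your own API token.

```python
# Input mirroring the JSON example above.
RUN_INPUT = {
    "scrapeMode": "jobs",
    "remotePolicy": "fully-remote",
    "japaneseRequired": "no-japanese-required",
    "maxItems": 50,
    "proxyConfiguration": {
        "useApifyProxy": True,
        "apifyProxyGroups": ["RESIDENTIAL"],
    },
}

def fetch_jobs(token: str, actor_id: str = "username/tokyodev-scraper") -> list:
    """Start an actor run and collect its dataset items.

    actor_id is a placeholder; use the real ID from the Store page.
    """
    from apify_client import ApifyClient  # pip install apify-client

    client = ApifyClient(token)
    run = client.actor(actor_id).call(run_input=RUN_INPUT)
    return list(client.dataset(run["defaultDatasetId"]).iterate_items())
```

call() blocks until the run finishes, so the returned list holds the full filtered dataset.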

TokyoDev Scraper Output Fields

Job Listings

{
  "job_title": "Senior Rails Engineer",
  "company_name": "TableCheck",
  "company_url": "https://www.tablecheck.com",
  "location": "Tokyo",
  "job_type": "full-time",
  "seniority": "senior",
  "remote_policy": "partially-remote",
  "japanese_required": false,
  "apply_from_abroad": true,
  "salary_range": "8000000-14000000 JPY",
  "description": "TableCheck is looking for a senior Rails engineer...",
  "requirements": ["5+ years Rails experience", "Experience with PostgreSQL"],
  "tags": ["Ruby", "Rails", "PostgreSQL", "React"],
  "apply_url": "https://www.tablecheck.com/jobs/apply/rails-engineer",
  "posted_date": "2025-03-20",
  "job_url": "https://www.tokyodev.com/companies/tablecheck/jobs/senior-rails-engineer"
}
Field             | Type    | Description
job_title         | string  | Job title
company_name      | string  | Hiring company name
company_url       | string  | Company website URL
location          | string  | Job location (e.g. Tokyo, Remote, Osaka)
job_type          | string  | Employment type: full-time, contract, intern
seniority         | string  | Seniority level: junior, intermediate, senior
remote_policy     | string  | Remote work policy: fully-remote, partially-remote, no-remote
japanese_required | boolean | Whether Japanese language proficiency is required
apply_from_abroad | boolean | Whether candidates can apply from outside Japan
salary_range      | string  | Salary range if disclosed
description       | string  | Full job description text
requirements      | array   | Job requirements and qualifications
tags              | array   | Technology and skill tags (e.g. Ruby, Python, React)
apply_url         | string  | Direct URL to apply for the position
posted_date       | string  | Date the job was posted
job_url           | string  | Full TokyoDev job listing URL

Company Profiles

When scrapeMode is "companies" or "both", company records are included in the same dataset. Company records populate company_name, company_url, description, location, tags, and job_url (set to the company profile URL). Job-specific fields are null.

{
  "company_name": "Mercari",
  "company_url": "https://www.mercari.com",
  "location": "Tokyo",
  "description": "Mercari is Japan's largest marketplace app...",
  "tags": ["Go", "Kotlin", "Swift", "React", "Kubernetes"],
  "job_url": "https://www.tokyodev.com/companies/mercari"
}
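
Because "both" mode mixes the two record types in one dataset, downstream code usually splits them first. A minimal sketch, assuming (per the note above) that company records leave job-specific fields such as job_title null:

```python
def split_records(items: list) -> tuple:
    """Separate job listings from company profiles in a mixed dataset.

    Assumes job records carry a non-null job_title while company
    records leave job-specific fields null.
    """
    jobs = [r for r in items if r.get("job_title")]
    companies = [r for r in items if not r.get("job_title")]
    return jobs, companies
```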

🔍 FAQ

How do I scrape TokyoDev.com?

TokyoDev Scraper handles sitemap discovery automatically. Set scrapeMode to "jobs", "companies", or "both", apply any filters you need, configure the residential proxy, and run it. For targeted runs, paste specific TokyoDev URLs into searchUrls to skip the sitemap phase entirely.

Does TokyoDev Scraper need proxies?

It does. TokyoDev serves a Cloudflare managed challenge on all page routes, so the scraper uses a Playwright browser with a residential proxy and anti-detection fingerprinting to get through. The sitemap at /sitemap.xml is served without a challenge, so the scraper uses it for URL discovery without consuming proxy budget.

What data can I get from TokyoDev.com?

TokyoDev Scraper returns job titles, companies, locations, employment types, seniority levels, remote policies, Japanese language requirements, apply-from-abroad flags, salary ranges, descriptions, requirements lists, technology tags, apply URLs, and posting dates. Company profiles include the company description, location, and tech stack tags.

Can I filter for jobs that don't require Japanese?

Set japaneseRequired to "no-japanese-required". TokyoDev Scraper applies the filter before saving records, so only matching results land in the dataset — you don't have to filter downstream.

How much does TokyoDev Scraper cost to run?

TokyoDev Scraper uses pay-per-event pricing. Because every page load requires a browser with a residential proxy, cost per record is higher than for plain HTTP scrapers. A full run of the board (~182 jobs + ~232 companies) typically costs a few dollars, driven mostly by residential proxy data transfer.


Need More Features?

Need scheduled runs, webhook delivery, or fields not currently extracted? File an issue or get in touch.

Why Use TokyoDev Scraper?

  • Structured language and remote data — japanese_required and remote_policy are extracted as typed fields, not buried in description text, so your filters work without NLP preprocessing
  • Dual-mode output — Jobs and company profiles in a single run with a shared schema, so you can join them by company_name without running two separate scrapers
  • CF-resilient by design — Residential proxy with browser fingerprinting handles Cloudflare without manual intervention; the sitemap bypass keeps URL discovery cheap
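
The company_name join mentioned above can be done in memory with a simple lookup table. A minimal sketch, assuming company_name values match exactly between job and company records:

```python
def join_jobs_to_companies(jobs: list, companies: list) -> list:
    """Attach each job's company profile record by company_name.

    Jobs without a matching profile get company_profile = None;
    assumes exact company_name matches across the two record types.
    """
    by_name = {c["company_name"]: c for c in companies}
    return [
        {**job, "company_profile": by_name.get(job["company_name"])}
        for job in jobs
    ]
```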