Japanese Web Scraper - Yahoo News, Rakuten, Suumo, Tabelog

Pricing

Pay per usage

Scrape major Japanese websites: Yahoo! Japan News, Rakuten, Suumo, Tabelog. Full Shift_JIS/EUC-JP encoding support, cookie wall bypass, and JP pagination handling. Structured JSON output with optional romaji transliteration for non-Japanese data consumers.

Rating: 0.0 (0)

Developer: BBB & Company (maintained by Community)

Actor stats

  • Bookmarked: 0
  • Total users: 3
  • Monthly active users: 0
  • Last modified: 8 days ago

Japanese Website Scraper Pack 🇯🇵

Specialized web scraper for major Japanese websites. Handles Japanese-specific challenges: character encoding (Shift_JIS, EUC-JP), cookie consent walls, and locale-specific pagination patterns.

Supported Websites

| Source | Data Type | Example Use |
|---|---|---|
| Yahoo! Japan News | News articles | Media monitoring, sentiment analysis |
| Rakuten | Product listings | Price monitoring, market research |
| Suumo | Real estate | Property data, rental market analysis |
| Tabelog | Restaurant reviews | F&B research, location intelligence |
| Hot Pepper Gourmet | Restaurants | Dining data, regional analysis |
| Custom | Any Japanese site | General-purpose scraping |

Why This Actor?

Most scrapers on Apify Store are built for English websites. Japanese sites have unique challenges:

  1. Character encoding: Many older Japanese sites use Shift_JIS or EUC-JP instead of UTF-8
  2. Consent walls: Japanese privacy regulations require specific cookie handling
  3. Pagination patterns: Japanese sites often use non-standard pagination (e.g., ?PG=2, ?pn=2)
  4. Mobile-first layouts: Many Japanese sites serve different HTML to mobile vs desktop
  5. Anti-bot measures: Major sites like Tabelog and Rakuten have sophisticated bot detection

This Actor handles all of these out of the box.
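
Two of the challenges above, legacy encodings and query-parameter pagination, can be illustrated with a minimal Python sketch. This is not the Actor's own code: the function names `decode_japanese_html` and `page_url`, the fallback order, and the example URLs are assumptions made for illustration only.

```python
import re
from urllib.parse import parse_qsl, urlencode, urlparse, urlunparse

# Encodings tried in order when no charset is declared.
# This order is an assumption for the sketch, not the Actor's actual list.
FALLBACK_ENCODINGS = ("utf-8", "shift_jis", "euc_jp")

_HEADER_CHARSET = re.compile(r"charset\s*=\s*[\"']?([\w\-]+)", re.IGNORECASE)
_META_CHARSET = re.compile(rb"charset\s*=\s*[\"']?([A-Za-z0-9_\-]+)", re.IGNORECASE)


def decode_japanese_html(body: bytes, content_type: str = "") -> str:
    """Decode raw HTML bytes, preferring a declared charset over guessing."""
    declared = None
    header = _HEADER_CHARSET.search(content_type)
    if header:
        declared = header.group(1)
    else:
        # Fall back to a <meta charset=...> declaration near the top of the page.
        meta = _META_CHARSET.search(body[:2048])
        if meta:
            declared = meta.group(1).decode("ascii")

    candidates = ([declared] if declared else []) + list(FALLBACK_ENCODINGS)
    for encoding in candidates:
        try:
            return body.decode(encoding)
        except (LookupError, UnicodeDecodeError):
            continue  # unknown codec name or wrong guess: try the next one
    # Last resort: keep the text readable instead of failing the whole page.
    return body.decode("utf-8", errors="replace")


def page_url(base_url: str, page: int, param: str = "PG") -> str:
    """Return base_url with the page-number query parameter set to `page`.

    Assumes existing query values are UTF-8 percent-encoded.
    """
    parts = urlparse(base_url)
    query = [(k, v) for k, v in parse_qsl(parts.query, keep_blank_values=True) if k != param]
    query.append((param, str(page)))
    return urlunparse(parts._replace(query=urlencode(query)))
```

Guessing between Shift_JIS and EUC-JP without a declared charset is unreliable, which is why the sketch checks the `Content-Type` header and the `<meta>` tag before falling back to trial decoding.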

Input Example

{
  "source": "rakuten-search",
  "searchQuery": "ワイヤレスイヤホン",
  "maxItems": 100,
  "maxPages": 5
}

Output Example

{
  "source": "rakuten-search",
  "title": "Apple AirPods Pro (第2世代) USB-C",
  "url": "https://item.rakuten.co.jp/...",
  "price": "36,800円",
  "rating": "4.65",
  "reviewCount": 1234,
  "imageUrl": "https://...",
  "extractedAt": "2026-05-01T12:00:00Z"
}
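
In the example above, `price` and `rating` are strings, so downstream code that compares prices or averages ratings may want numeric values. The following is a minimal Python sketch of one way to do that on the consumer side. The helper names and the added `priceYen` and `ratingValue` keys are hypothetical and are not part of the Actor's output.

```python
import re
from typing import Optional


def parse_yen(price: str) -> Optional[int]:
    """Convert a price string such as "36,800円" to an integer number of yen."""
    # Take the first run of digits (with optional thousands separators) only,
    # so a range like "1,000円〜2,000円" yields its first price.
    match = re.search(r"\d[\d,，]*", price or "")
    if not match:
        return None
    return int(match.group(0).replace(",", "").replace("，", ""))


def parse_rating(rating: Optional[str]) -> Optional[float]:
    """Convert a rating string such as "4.65" to a float, or None if absent."""
    try:
        return float(rating)
    except (TypeError, ValueError):
        return None


def normalize_item(item: dict) -> dict:
    """Return a copy of a dataset item with numeric price and rating added."""
    normalized = dict(item)
    normalized["priceYen"] = parse_yen(item.get("price", ""))        # hypothetical key
    normalized["ratingValue"] = parse_rating(item.get("rating"))     # hypothetical key
    return normalized


item = {"price": "36,800円", "rating": "4.65", "reviewCount": 1234}
print(normalize_item(item)["priceYen"])  # 36800
```

The original string fields are left in place, so nothing from the scraped record is lost.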

Tips

  • For Tabelog and Hot Pepper, use Japanese area names (e.g., "渋谷", "新宿") for best results
  • Yahoo News categories: domestic, world, business, entertainment, sports, it-science
  • Rakuten supports category codes for more precise filtering
  • Use proxy configuration with Japan-based IPs for sites that geo-restrict content
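
As a rough illustration of the last tip, the Python sketch below builds a run input that requests Japan-based proxy IPs. The `useApifyProxy` and `apifyProxyCountry` keys follow Apify's standard proxy input object, but the top-level field name `proxyConfiguration` is an assumption; check the Actor's input schema for the actual name. The helper `build_input` is hypothetical.

```python
import json


def build_input(source: str, search_query: str, max_items: int = 100,
                max_pages: int = 5, proxy_country: str = "JP") -> dict:
    """Assemble a run input with a country-restricted Apify Proxy setting."""
    return {
        "source": source,
        "searchQuery": search_query,
        "maxItems": max_items,
        "maxPages": max_pages,
        # Assumed field name; the nested keys are Apify's standard proxy options.
        "proxyConfiguration": {
            "useApifyProxy": True,
            "apifyProxyCountry": proxy_country,
        },
    }


run_input = build_input("rakuten-search", "ワイヤレスイヤホン")
# ensure_ascii=False keeps the Japanese query readable in the printed JSON.
print(json.dumps(run_input, ensure_ascii=False, indent=2))
```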

Cost

The Actor uses the lightweight CheerioCrawler, so no browser is needed for most sources. A typical run costs under $0.10 for 100 items.

Store submission packet

Primary category: Web scraping

Secondary category: Ecommerce / News / Real estate / Food and restaurants

Store description: Scrape structured Japanese web data from Yahoo! Japan News, Rakuten search, Suumo listings, Tabelog restaurants, Hot Pepper Gourmet, or custom Japanese URLs. Handles Japanese search terms, local pagination patterns, common Japanese encodings, and normalized JSON output for monitoring, market research, and AI pipelines.

Recommended monetization: Pay per event (PPE). Keep the automatic apify-actor-start event at Apify's default $0.00005. Enable the automatic apify-default-dataset-item event at $0.006 per saved result ($6 per 1,000 items). Do not pass platform usage costs to users after the first pricing validation run unless proxy costs make margins negative.
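
The event prices above translate into a per-run charge that can be estimated with a short Python sketch. It assumes one `apify-actor-start` event per run and one `apify-default-dataset-item` event per saved result, and it ignores any platform usage costs. The function name `estimate_charge` is hypothetical.

```python
from decimal import Decimal

START_EVENT_USD = Decimal("0.00005")   # apify-actor-start, Apify's default price
DATASET_ITEM_USD = Decimal("0.006")    # apify-default-dataset-item, per saved result


def estimate_charge(items_saved: int, runs: int = 1) -> Decimal:
    """Estimate the pay-per-event charge in USD for a number of saved items."""
    if items_saved < 0 or runs < 1:
        raise ValueError("items_saved must be >= 0 and runs must be >= 1")
    # Assumption: each run triggers exactly one start event.
    return START_EVENT_USD * runs + DATASET_ITEM_USD * items_saved


print(estimate_charge(1000))  # 6.00005
```

Decimal arithmetic is used so that very small per-event prices do not accumulate floating-point rounding error.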

Chargeable output mapping: One default dataset item is written for each listing, article, product, restaurant, property, custom page, or full-content detail record saved by the Actor.

User-visible limits: Site layouts and access controls can change. Users should start with conservative maxItems and maxPages, and use Apify Proxy with Japan-capable IPs where the target site requires it.