Japanese Web Scraper - Yahoo News, Rakuten, Suumo, Tabelog
Pricing
Pay per usage
Scrape major Japanese websites: Yahoo! Japan News, Rakuten, Suumo, Tabelog. Full Shift_JIS/EUC-JP encoding support, cookie wall bypass, and JP pagination handling. Structured JSON output with optional romaji transliteration for non-Japanese data consumers.
Developer
BBB & Company
Maintained by Community
8 days ago
Last modified
Japanese Website Scraper Pack 🇯🇵
Specialized web scraper for major Japanese websites. Handles Japanese-specific challenges: character encoding (Shift_JIS, EUC-JP), cookie consent walls, and locale-specific pagination patterns.
Supported Websites
| Source | Data Type | Example Use |
|---|---|---|
| Yahoo! Japan News | News articles | Media monitoring, sentiment analysis |
| Rakuten | Product listings | Price monitoring, market research |
| Suumo | Real estate | Property data, rental market analysis |
| Tabelog | Restaurant reviews | F&B research, location intelligence |
| Hot Pepper Gourmet | Restaurants | Dining data, regional analysis |
| Custom | Any Japanese site | General-purpose scraping |
Why This Actor?
Most scrapers on the Apify Store are built for English-language websites. Japanese sites pose additional challenges:
- Character encoding: Many older Japanese sites use Shift_JIS or EUC-JP instead of UTF-8
- Consent walls: Japanese privacy regulations require specific cookie handling
- Pagination patterns: Japanese sites often use non-standard pagination parameters (e.g., `?PG=2`, `?pn=2`)
- Mobile-first layouts: Many Japanese sites serve different HTML to mobile and desktop clients
- Anti-bot measures: Major sites like Tabelog and Rakuten have sophisticated bot detection
This Actor handles all of these out of the box.
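The character-encoding point can be illustrated with a minimal Python sketch. This is not the Actor's actual implementation; the function name `decode_japanese_html` and the fallback order are assumptions made for illustration. It prefers a declared charset (from an HTTP header or a `<meta>` tag) and only guesses when none is given.

```python
import re
from typing import Optional

# ASSUMPTION: fallback order used only when no charset is declared.
# Shift_JIS and EUC-JP byte ranges overlap, so guessing is a heuristic.
FALLBACK_ENCODINGS = ("utf-8", "shift_jis", "euc_jp")

# Matches <meta charset="..."> and <meta ... content="text/html; charset=...">.
META_CHARSET = re.compile(rb"<meta[^>]+charset=[\"']?([\w-]+)", re.IGNORECASE)


def decode_japanese_html(body: bytes, declared: Optional[str] = None) -> str:
    """Decode raw HTML bytes, trying declared charsets before fallbacks."""
    candidates = []
    if declared:
        candidates.append(declared)
    match = META_CHARSET.search(body[:2048])
    if match:
        candidates.append(match.group(1).decode("ascii"))
    candidates.extend(FALLBACK_ENCODINGS)

    for name in candidates:
        try:
            return body.decode(name)
        except (UnicodeDecodeError, LookupError):
            # Wrong or unknown encoding name: try the next candidate.
            continue
    # Last resort: keep the page usable, with replacement characters.
    return body.decode("utf-8", errors="replace")
```

Because the two legacy encodings can both decode the same bytes without error, a declared charset should be trusted over the fallback list whenever one is available.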
Input Example
    {
        "source": "rakuten-search",
        "searchQuery": "ワイヤレスイヤホン",
        "maxItems": 100,
        "maxPages": 5
    }
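As a rough sketch, the same input can be assembled and serialized in Python. The helper `build_run_input` is hypothetical, and its defaults of 100 items and 5 pages are copied from the example rather than taken from documented Actor defaults.

```python
import json


def build_run_input(source: str, search_query: str,
                    max_items: int = 100, max_pages: int = 5) -> dict:
    """Assemble an input object with the four fields shown in the example."""
    if not search_query.strip():
        raise ValueError("searchQuery must not be empty")
    if max_items < 1 or max_pages < 1:
        raise ValueError("maxItems and maxPages must be positive")
    return {
        "source": source,
        "searchQuery": search_query,
        "maxItems": max_items,
        "maxPages": max_pages,
    }


run_input = build_run_input("rakuten-search", "ワイヤレスイヤホン")
# ensure_ascii=False keeps the Japanese query readable instead of \uXXXX escapes.
payload = json.dumps(run_input, ensure_ascii=False)
```

Serializing with `ensure_ascii=False` is optional, since both forms are valid JSON, but it makes logged inputs easier to read.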
Output Example
    {
        "source": "rakuten-search",
        "title": "Apple AirPods Pro (第2世代) USB-C",
        "url": "https://item.rakuten.co.jp/...",
        "price": "36,800円",
        "rating": "4.65",
        "reviewCount": 1234,
        "imageUrl": "https://...",
        "extractedAt": "2026-05-01T12:00:00Z"
    }
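In the example, `price` and `rating` are strings, so downstream code will probably want numeric values. The following sketch shows one way to convert them. The added field names `priceYen`, `ratingValue`, and `extractedAtUtc` are illustrative assumptions and are not produced by the Actor.

```python
import re
from datetime import datetime


def parse_yen(price: str) -> int:
    """Convert a price string such as '36,800円' to an integer number of yen."""
    digits = re.sub(r"[^\d]", "", price)
    if not digits:
        raise ValueError(f"no digits in price: {price!r}")
    return int(digits)


def normalize_item(item: dict) -> dict:
    """Return a copy of an output item with typed price, rating, and timestamp."""
    out = dict(item)
    out["priceYen"] = parse_yen(item["price"])
    out["ratingValue"] = float(item["rating"])
    # Replace the trailing 'Z' so fromisoformat also works on older Python versions.
    out["extractedAtUtc"] = datetime.fromisoformat(
        item["extractedAt"].replace("Z", "+00:00")
    )
    return out


example = {
    "price": "36,800円",
    "rating": "4.65",
    "extractedAt": "2026-05-01T12:00:00Z",
}
normalized = normalize_item(example)
```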
Tips
- For Tabelog and Hot Pepper, use Japanese area names (e.g., "渋谷", "新宿") for best results
- Yahoo News categories: `domestic`, `world`, `business`, `entertainment`, `sports`, `it-science`
- Rakuten supports category codes for more precise filtering
- Use proxy configuration with Japan-based IPs for sites that geo-restrict content
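Two of these tips can be sketched in Python. The category set is taken from the list above. The `proxyConfiguration` object follows the shape commonly used for Apify Proxy input, but whether this Actor reads a field with that name is an assumption, and both helper names are hypothetical.

```python
YAHOO_NEWS_CATEGORIES = frozenset(
    {"domestic", "world", "business", "entertainment", "sports", "it-science"}
)


def check_yahoo_category(category: str) -> str:
    """Return the category in lowercase, or raise if it is not a listed one."""
    normalized = category.strip().lower()
    if normalized not in YAHOO_NEWS_CATEGORIES:
        raise ValueError(f"unknown Yahoo News category: {category!r}")
    return normalized


def with_japan_proxy(run_input: dict) -> dict:
    """Return a copy of run_input with a Japan-located proxy setting added."""
    merged = dict(run_input)
    # ASSUMPTION: the Actor accepts a `proxyConfiguration` field in this shape.
    merged["proxyConfiguration"] = {
        "useApifyProxy": True,
        "apifyProxyCountryCode": "JP",
    }
    return merged
```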
Cost
The Actor uses the lightweight CheerioCrawler, so most sources need no browser. A typical run costs under $0.10 for 100 items.
Store submission packet
Primary category: Web scraping
Secondary category: Ecommerce / News / Real estate / Food and restaurants
Store description: Scrape structured Japanese web data from Yahoo! Japan News, Rakuten search, Suumo listings, Tabelog restaurants, Hot Pepper Gourmet, or custom Japanese URLs. Handles Japanese search terms, local pagination patterns, common Japanese encodings, and normalized JSON output for monitoring, market research, and AI pipelines.
Recommended monetization: Pay per event (PPE). Keep the automatic apify-actor-start event at Apify's default $0.00005. Enable the automatic apify-default-dataset-item event at $0.006 per saved result ($6 per 1,000 items). Do not pass platform usage costs to users after the first pricing validation run unless proxy costs make margins negative.
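The arithmetic behind these event prices can be checked with a short Python sketch. It assumes one start event per run and covers only the two events named above; platform usage and proxy costs are not included. The function name `estimated_charge` is hypothetical.

```python
from decimal import Decimal

START_EVENT_USD = Decimal("0.00005")  # apify-actor-start
ITEM_EVENT_USD = Decimal("0.006")     # apify-default-dataset-item


def estimated_charge(items_saved: int, runs: int = 1) -> Decimal:
    """Estimate the pay-per-event charge in USD for saved dataset items."""
    if items_saved < 0 or runs < 1:
        raise ValueError("items_saved must be >= 0 and runs must be >= 1")
    # ASSUMPTION: the start event is charged once per run.
    return START_EVENT_USD * runs + ITEM_EVENT_USD * items_saved


print(estimated_charge(1000))  # 6.00005
```

`Decimal` is used instead of `float` so that very small per-event prices add up without rounding drift.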
Chargeable output mapping: One default dataset item is written for each listing, article, product, restaurant, property, custom page, or full-content detail record saved by the Actor.
User-visible limits: Site layouts and access controls can change. Users should start with conservative maxItems and maxPages, and use Apify Proxy with Japan-capable IPs where the target site requires it.