PeachParser (Beta)
Pricing
from $0.10 / 1,000 results
Crawls arbitrary websites, checks which are alive, and extracts emails and social links. Filters out common telemetry and template junk.
PeachParser
PeachParser is an Apify Actor developed by SLASH for crawling websites to extract:
- Emails (from visible content and `mailto:` links)
- Social profiles (Facebook, Instagram, LinkedIn, X, YouTube, TikTok, Pinterest)
- Optional listing items from directory-like pages (for example, lists of choirs, restaurants, or organizations)
It is optimized for small to medium websites where you want both contact details and, optionally, all items listed on a directory page (such as https://www.sverigeskorforbund.se/korer).
Key Features
- Site-level contact extraction
  - Extracts emails from:
    - `mailto:` links
    - Visible page text (up to a configurable limit)
  - Filters:
    - Removes tracking / telemetry addresses
    - Blocks placeholder / bogus domains (e.g. `mysite.com`, `*.wixpress.com`)
    - Accepts emails that match the website’s domain or come from generic providers (Gmail, Outlook, etc.)
- Social profile detection
  - Detects social links from:
    - JSON-LD (`sameAs` arrays)
    - Regular anchor tags (`<a href="...">`)
  - Supports:
    - Facebook
    - Instagram
    - LinkedIn
    - X (Twitter)
    - YouTube
    - TikTok
    - Pinterest
  - Applies a brand token (derived from the hostname) to avoid irrelevant social links from third-party widgets when possible.
- Optional listing extraction
  - When `extract_listings` is enabled, PeachParser tries to identify listing items on pages:
    - Looks for same-domain `<a>` links with meaningful text
    - Skips obviously generic link text such as “read more”, “les mer”, “more info”, etc.
    - Avoids file downloads and non-HTML resources
  - Each listing item is stored as a separate dataset record with:
    - `record_type = "listing_item"`
    - `item_name`
    - `item_url`
    - `item_source_page`
- Smart crawling
  - Restricts crawling to a single domain (supports `www.` and bare domain equivalence)
  - Skips non-HTML responses and resources with unwanted file extensions (`.pdf`, images, archives, etc.)
  - Prioritizes URLs with contact-related keywords (`kontakt`, `contact`, `om-oss`, `about`, `personvern`, etc.)
  - Respects `max_pages_per_site` to control workload
- Robots.txt (optional)
  - When `respect_robots_txt` is enabled, PeachParser:
    - Fetches and parses `robots.txt` (with a short timeout and size limit)
    - Uses it to decide whether a URL may be crawled
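The email rules listed under Site-level contact extraction can be illustrated with a short Python sketch. This is not PeachParser's source code: the provider list, the blocked domains beyond the two examples named above, the `sentry` telemetry marker, the `text_limit` default, and all function names are assumptions made for illustration.

```python
import re
from urllib.parse import urlparse

# Illustrative values only; the Actor's real lists are not documented here.
GENERIC_PROVIDERS = {"gmail.com", "outlook.com", "hotmail.com", "yahoo.com"}
BLOCKED_DOMAINS = {"mysite.com"}
BLOCKED_SUFFIXES = (".wixpress.com",)
TELEMETRY_MARKERS = ("sentry",)  # hypothetical marker for tracking addresses

EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
MAILTO_RE = re.compile(r"""href=["']mailto:([^"'?]+)""", re.IGNORECASE)


def bare_host(url):
    """Lower-cased hostname with a leading 'www.' removed."""
    host = (urlparse(url).hostname or "").lower()
    return host[4:] if host.startswith("www.") else host


def is_accepted(email, site_url):
    """Apply the block rules first, then the accept rules."""
    domain = email.rsplit("@", 1)[1]
    if domain in BLOCKED_DOMAINS or domain.endswith(BLOCKED_SUFFIXES):
        return False
    if any(marker in domain for marker in TELEMETRY_MARKERS):
        return False
    site = bare_host(site_url)
    on_site = domain == site or domain.endswith("." + site)
    return on_site or domain in GENERIC_PROVIDERS


def extract_emails(html, visible_text, site_url, text_limit=20000):
    """Collect emails from mailto: links and from a capped slice of page text."""
    candidates = MAILTO_RE.findall(html) + EMAIL_RE.findall(visible_text[:text_limit])
    accepted = []
    for candidate in candidates:
        email = candidate.strip().lower()
        if EMAIL_RE.fullmatch(email) and is_accepted(email, site_url) and email not in accepted:
            accepted.append(email)
    return accepted
```

In this sketch the block rules run before the accept rules, so an address on a generic provider is kept only if it did not already match a blocked or telemetry pattern.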
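Social profile detection from JSON-LD `sameAs` arrays and anchor tags might look roughly like the following. The host-to-network mapping, the way the brand token is derived (first hostname label after dropping `www.`), and the choice to trust JSON-LD entries while requiring the token in anchor URLs are all assumptions of this sketch, not confirmed behavior of the Actor.

```python
import json
from html.parser import HTMLParser
from urllib.parse import urlparse

# Assumed mapping from social hosts to network names.
SOCIAL_HOSTS = {
    "facebook.com": "facebook", "instagram.com": "instagram",
    "linkedin.com": "linkedin", "x.com": "x", "twitter.com": "x",
    "youtube.com": "youtube", "tiktok.com": "tiktok", "pinterest.com": "pinterest",
}


class _Collector(HTMLParser):
    """Collects anchor hrefs and the raw text of JSON-LD script blocks."""

    def __init__(self):
        super().__init__()
        self.hrefs, self.jsonld, self._in_jsonld = [], [], False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "a" and attrs.get("href"):
            self.hrefs.append(attrs["href"])
        elif tag == "script" and (attrs.get("type") or "").lower() == "application/ld+json":
            self._in_jsonld = True
            self.jsonld.append("")

    def handle_endtag(self, tag):
        if tag == "script":
            self._in_jsonld = False

    def handle_data(self, data):
        if self._in_jsonld:
            self.jsonld[-1] += data


def _same_as(node):
    """Yield every string found under a sameAs key, at any nesting depth."""
    if isinstance(node, dict):
        value = node.get("sameAs")
        if isinstance(value, str):
            yield value
        elif isinstance(value, list):
            yield from (v for v in value if isinstance(v, str))
        for child in node.values():
            yield from _same_as(child)
    elif isinstance(node, list):
        for child in node:
            yield from _same_as(child)


def network_of(url):
    host = (urlparse(url).hostname or "").lower()
    for domain, name in SOCIAL_HOSTS.items():
        if host == domain or host.endswith("." + domain):
            return name
    return None


def brand_token(site_url):
    host = (urlparse(site_url).hostname or "").lower()
    host = host[4:] if host.startswith("www.") else host
    return host.split(".")[0]


def detect_social(html, site_url):
    """Return {network: url}; JSON-LD links are trusted, anchors need the brand token."""
    parser = _Collector()
    parser.feed(html)
    parser.close()
    declared = []
    for raw in parser.jsonld:
        try:
            declared.extend(_same_as(json.loads(raw)))
        except json.JSONDecodeError:
            continue  # ignore malformed JSON-LD blocks
    token = brand_token(site_url)
    found = {}
    for url, trusted in [(u, True) for u in declared] + [(u, False) for u in parser.hrefs]:
        network = network_of(url)
        if network and network not in found and (trusted or token in urlparse(url).path.lower()):
            found[network] = url
    return found
```

The token check is what drops share-widget links such as a generic Twitter intent URL, at the cost of missing profiles whose handle does not resemble the hostname.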
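The listing extraction rules can be sketched in the same way. The record fields match the ones listed above; the generic-text set, the extension list, the `min_text_len` threshold for "meaningful text", and the function names are hypothetical choices for this example.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse

GENERIC_TEXT = {"read more", "les mer", "more info"}  # examples from the feature list
SKIPPED_EXTENSIONS = (".pdf", ".jpg", ".jpeg", ".png", ".gif", ".zip", ".doc", ".docx")


class _AnchorParser(HTMLParser):
    """Collects (href, link text) pairs for every <a> element."""

    def __init__(self):
        super().__init__()
        self.links, self._href, self._text = [], None, []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            text = " ".join("".join(self._text).split())  # collapse whitespace
            self.links.append((self._href, text))
            self._href = None


def _bare_host(url):
    host = (urlparse(url).hostname or "").lower()
    return host[4:] if host.startswith("www.") else host


def extract_listings(html, page_url, min_text_len=3):
    """Return one listing_item record per unique same-domain link with usable text."""
    parser = _AnchorParser()
    parser.feed(html)
    parser.close()
    records, seen = [], {page_url}
    for href, text in parser.links:
        url = urljoin(page_url, href)
        parts = urlparse(url)
        if parts.scheme not in ("http", "https"):
            continue  # skips mailto:, tel:, javascript: links
        if _bare_host(url) != _bare_host(page_url):
            continue  # other domains are not listing items
        if parts.path.lower().endswith(SKIPPED_EXTENSIONS):
            continue  # file downloads and other non-HTML resources
        if len(text) < min_text_len or text.lower() in GENERIC_TEXT:
            continue  # no meaningful link text
        if url in seen:
            continue
        seen.add(url)
        records.append({
            "record_type": "listing_item",
            "item_name": text,
            "item_url": url,
            "item_source_page": page_url,
        })
    return records
```

A filter this simple also keeps navigation links such as a "Kontakt" menu entry, so a real implementation would likely need further heuristics to separate listing items from site chrome.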
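For the optional robots.txt check, a minimal sketch using Python's standard `urllib.robotparser` is shown below. The timeout, the size cap, the `PeachParser` user-agent string, and the decision to allow everything when the file cannot be fetched are assumptions; the Actor's actual values and fallback behavior are not documented here.

```python
import urllib.request
from urllib.parse import urljoin
from urllib.robotparser import RobotFileParser


def fetch_robots_text(site_url, timeout=5.0, max_bytes=512_000):
    """Download robots.txt with a short timeout and a hard size cap."""
    robots_url = urljoin(site_url, "/robots.txt")
    try:
        with urllib.request.urlopen(robots_url, timeout=timeout) as response:
            return response.read(max_bytes).decode("utf-8", errors="replace")
    except OSError:
        # Network errors, timeouts, and HTTP errors all land here.
        # Assumption of this sketch: treat a missing file as "no rules".
        return ""


def build_rules(robots_text):
    """Parse robots.txt text into a RobotFileParser without any network access."""
    rules = RobotFileParser()
    rules.parse(robots_text.splitlines())
    return rules


def may_crawl(rules, url, user_agent="PeachParser"):
    """Return True if the parsed rules allow this user agent to fetch the URL."""
    return rules.can_fetch(user_agent, url)
```

Splitting the download from the parsing keeps the size and timeout limits in one place and lets the rule check be exercised on plain text:

```python
rules = build_rules("User-agent: *\nDisallow: /private/\n")
may_crawl(rules, "https://example.org/kontakt")       # allowed
may_crawl(rules, "https://example.org/private/page")  # disallowed
```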
Notes
- Keep `max_pages_per_site` modest for reliability and to avoid hitting rate limits.
- Results depend on site structure and the presence of contact information on public pages.
- Respect terms of service and local laws.
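To show how `max_pages_per_site` interacts with the Smart crawling rules, here is a simplified Python sketch that orders a batch of discovered URLs. It is a one-shot ordering rather than a real incremental crawler, and the extension list, the default limit of 10, and the function names are assumptions for illustration.

```python
from urllib.parse import urlparse

CONTACT_KEYWORDS = ("kontakt", "contact", "om-oss", "about", "personvern")
SKIPPED_EXTENSIONS = (".pdf", ".jpg", ".jpeg", ".png", ".gif", ".zip", ".rar")


def bare_host(url):
    """Hostname with a leading 'www.' removed, so both forms compare equal."""
    host = (urlparse(url).hostname or "").lower()
    return host[4:] if host.startswith("www.") else host


def is_crawlable(url, start_url):
    """Same site (www-insensitive), http(s) only, no unwanted file extensions."""
    parts = urlparse(url)
    return (
        parts.scheme in ("http", "https")
        and bare_host(url) == bare_host(start_url)
        and not parts.path.lower().endswith(SKIPPED_EXTENSIONS)
    )


def priority(url):
    """0 for URLs whose path contains a contact-related keyword, 1 otherwise."""
    path = urlparse(url).path.lower()
    return 0 if any(keyword in path for keyword in CONTACT_KEYWORDS) else 1


def plan_crawl(start_url, discovered, max_pages_per_site=10):
    """Start page first, then contact-like pages, then the rest, capped at the limit."""
    seen = {start_url}
    candidates = []
    for url in discovered:
        if url not in seen and is_crawlable(url, start_url):
            seen.add(url)
            candidates.append(url)
    candidates.sort(key=priority)  # stable sort keeps discovery order within a tier
    return ([start_url] + candidates)[:max_pages_per_site]
```

Because contact-like pages are sorted to the front, even a small limit tends to reach the pages most likely to carry emails before the budget runs out.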
Supported & planned regions
| Region | Status | Details | Link |
|---|---|---|---|
| Nordics | Optimized | Last optimized: 2025-11-11 (NO/SE/DK/FI/IS) | — |
| Western EU | Planned | — | — |
| Eastern EU | Planned | — | — |
| North America | Not started | — | — |
| South America | Not started | — | — |
| East/SE Asia | Not started | — | — |
| Middle East | Not started | — | — |
| Africa | Not started | — | — |
| Oceania | Not started | — | — |
Create an issue if you’d like your country prioritized.
Disclaimer & License
This Apify Actor is provided “as is”, without warranty of any kind, express or implied, including but not limited to the warranties of merchantability, fitness for a particular purpose, and non-infringement. Please follow local laws and do not use it for malicious purposes.
ToS & legality (Reminder): Great scraping comes with great responsibility. Follow local laws and do not use my code to spam.
© 2025 SLSH. All rights reserved. Copying or modifying the source code is prohibited.