Hitta.se Lead Scraper (Beta)
Pricing
Pay per event
Retrieve leads from hitta.se the easy way. This actor retrieves each business's name, address, email addresses, phone numbers, and social links.
Last modified
6 days ago
Hitta.se Scraper
Please note: This actor is currently in beta. Minimum cost is enabled until out of beta. You’ll receive an email update if you’re subscribed and the price is adjusted.
(apify-actor-start: $0.00001 / apify-default-dataset-item: $0.00001 until out of beta)
Apify Actor: Hitta.se scraper (listing → detail pages → tiny same-domain contact crawl)
Updates in this version
- If there is no website or it is banned/unavailable, keep all valid detail-page emails (not only generic).
- Email slug fallback now unescapes HTML and matches both:
  - `<slug>-button-email-<email>`
  - `button-email-<email>` (broader pattern)
- Unescape attribute values before extracting emails to handle entity-encoded addresses.
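
The slug fallback described above could be sketched roughly as follows. This is a minimal illustration, not the actor's actual code: the regular expression, the function name `emails_from_slugs`, and the sample attribute values are assumptions made for the example.

```python
import html
import re

# Hypothetical pattern: an optional "<slug>-" prefix, the literal
# "button-email-", then something shaped like an email address.
SLUG_EMAIL_RE = re.compile(
    r"(?:[\w-]+-)?button-email-([A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,})"
)


def emails_from_slugs(raw_html: str) -> list[str]:
    """Return de-duplicated, lower-cased emails found in slug-style values."""
    # Decode entities such as &#64; first so encoded addresses can match.
    text = html.unescape(raw_html)
    found: list[str] = []
    for match in SLUG_EMAIL_RE.finditer(text):
        email = match.group(1).lower()
        if email not in found:
            found.append(email)
    return found
```

Unescaping before matching is what lets an entity-encoded value such as `info&#64;exempel.se` be recognized; both the prefixed and the bare `button-email-` forms are handled by the one pattern.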
Kept (still working as intended)
- `max_results` with “Nästa” pagination (25 per page).
- Website picked from canonical/og:url/JSON-LD first, then scored anchors; bans junk.
- `website_details`: one of `ok` | `404` | `unavailable` | `banned` | `n/a`.
- Banned: `hitta.dixa.help`, `dixa.help`, `biluppgifter.se`, specific DNB URL.
- Detail-page email extraction from Hitta UI + mailto + strict regex + slug fallback.
- Same-domain mini-crawl for emails + social links when website is OK.
- Swedish address heuristics; phone extraction; categories.
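
As a rough sketch of how the `website_details` values listed above could be derived, the function below maps a website URL and an HTTP status code to one of the five statuses. The function names, the subdomain matching, and the choice to treat any 2xx/3xx response as `ok` are assumptions for illustration; the specific DNB URL is not spelled out in this README, so it is left out.

```python
from typing import Optional
from urllib.parse import urlsplit

# Domains named in this README. The specific DNB URL is not given, so it is omitted.
BANNED_HOSTS = {"hitta.dixa.help", "dixa.help", "biluppgifter.se"}


def is_banned(url: str) -> bool:
    """True if the URL's host is a banned domain or a subdomain of one (assumed rule)."""
    host = (urlsplit(url).hostname or "").lower()
    return any(host == banned or host.endswith("." + banned) for banned in BANNED_HOSTS)


def classify_website(url: Optional[str], status: Optional[int]) -> str:
    """Map a website URL and HTTP status to a website_details value.

    status is None when the request failed (timeout, DNS error, and so on).
    """
    if not url:
        return "n/a"
    if is_banned(url):
        return "banned"
    if status == 404:
        return "404"
    # Assumption: any 2xx or 3xx response counts as reachable.
    if status is not None and 200 <= status < 400:
        return "ok"
    return "unavailable"
```

Keeping the classification separate from the HTTP request makes it easy to test without network access.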
Features
- Crawls listing pages on Hitta.se, discovers company detail pages, and pushes normalized contact data.
- Extracts from detail pages:
  - `name`
  - `categories` (best-effort)
  - `phone`
  - `address` (Swedish heuristics and JSON-LD)
  - `website` (structured first, then scored anchors with bans)
  - `email1..N` (strict parsing, de-duplicated)
  - `website_details` (status as above)
  - Socials if available: `social_facebook`, `social_instagram`, `social_linkedin`, `social_x`, `social_youtube`, `social_tiktok`, `social_pinterest`
- If website is OK, performs a tiny same-domain crawl (configurable) to discover more emails and socials.
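
The `email1..N` output fields can be illustrated with a small helper that turns a list of extracted addresses into numbered keys. This is a sketch under assumptions: the name `email_fields` and the lower-casing step are hypothetical, while the `"email1": "n/a"` fallback for an empty result follows what this README states.

```python
def email_fields(emails: list[str]) -> dict[str, str]:
    """Build email1..N fields from a list of addresses, de-duplicated in order."""
    unique: list[str] = []
    seen: set[str] = set()
    for email in emails:
        key = email.strip().lower()  # assumption: compare case-insensitively
        if key and key not in seen:
            seen.add(key)
            unique.append(key)
    if not unique:
        # Documented fallback when no valid emails are found.
        return {"email1": "n/a"}
    return {f"email{i}": address for i, address in enumerate(unique, start=1)}
```

Because the keys are generated in order, consumers of the dataset can rely on `email1` always being present.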
Input Configuration
Example input JSON:

    {
      "start_urls": [{ "url": "https://www.hitta.se/nacka/företag/2" }],
      "max_depth": 3,
      "headers": {
        "User-Agent": "Mozilla/5.0 ...",
        "Accept-Language": "sv-SE,sv;q=0.9,en-US;q=0.8,en;q=0.7"
      },
      "timeout_seconds": 30,
      "site_email_max_pages": 3,
      "max_results": 0
    }

| Field | Type | Default | Description |
|---|---|---|---|
| `start_urls` | array | `[{ "url": "https://www.hitta.se/nacka/företag/2" }]` | One or more Hitta listing URLs. |
| `max_depth` | integer | 3 | Crawl depth from the start URLs. Listing depth is reused for pagination. |
| `headers` | object | See code defaults | HTTP headers for requests. |
| `timeout_seconds` | integer | 30 | Read timeout for HTTP requests. |
| `site_email_max_pages` | integer | 3 | Max pages to crawl on the same domain as the extracted website for extra contacts. |
| `max_results` | integer | 0 (no cap) | Limit how many detail results to push. Pagination respects 25/page with “Nästa.” |
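
To make the defaults in the table concrete, here is a minimal sketch of merging user input over those defaults and checking the `max_results` cap. The helper names `resolve_input` and `reached_cap` and the validation rules are assumptions, not the actor's real code; `headers` is left out because its defaults are only given as "See code defaults".

```python
from typing import Any, Optional

# Defaults copied from the table above (headers omitted: defaults not listed here).
DEFAULT_INPUT: dict[str, Any] = {
    "start_urls": [{"url": "https://www.hitta.se/nacka/företag/2"}],
    "max_depth": 3,
    "timeout_seconds": 30,
    "site_email_max_pages": 3,
    "max_results": 0,  # 0 means no cap
}


def resolve_input(raw: Optional[dict[str, Any]]) -> dict[str, Any]:
    """Overlay user-supplied values on the defaults and validate the integers."""
    cfg = {**DEFAULT_INPUT, **(raw or {})}
    for key in ("max_depth", "timeout_seconds", "site_email_max_pages", "max_results"):
        value = cfg[key]
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, int) or value < 0:
            raise ValueError(f"{key} must be a non-negative integer, got {value!r}")
    return cfg


def reached_cap(pushed: int, max_results: int) -> bool:
    """True once the number of pushed results hits a non-zero cap."""
    return max_results > 0 and pushed >= max_results
```

With `max_results` left at 0, `reached_cap` never returns true, which matches the "no cap" default.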
How It Works
- Listing pages: extracts detail links using `/verksamhet/` anchors. Follows “Nästa” to paginate.
- Detail pages: extracts:
  - Website from canonical, `og:url`, and JSON-LD; otherwise scores external anchors and bans junk domains.
  - Address from JSON-LD, microdata, and Swedish heuristics (street tokens + postcode check).
  - Phone from `tel:` links or strict regex for Swedish formats.
  - Emails from:
    - Hitta UI attributes (unescaped)
    - `mailto:` links
    - Strict regex on visible text
    - Slug-pattern fallback: `<slug>-button-email-<email>` and `button-email-<email>`
- Website status: HEAD/GET to classify `website_details` as `ok`, `404`, `unavailable`, `banned`, or `n/a`.
- Same-domain mini-crawl (if website OK): fetch up to `site_email_max_pages` pages for more emails and socials.
- Email policy:
  - If no website or website is banned/unavailable, keep all valid detail-page emails.
  - If website is OK, keep emails that match the website base-domain or are generic providers (Gmail, Outlook, etc.).
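
The email policy above can be sketched as a single filter. Everything named here is illustrative: `apply_email_policy`, the naive two-label `base_domain` helper, and the short `GENERIC_PROVIDERS` set are assumptions. The README also does not say what happens for a `404` website, so this sketch treats every status other than `ok` as "keep all".

```python
from typing import Optional
from urllib.parse import urlsplit

# Illustrative subset only; the actor's real provider list is not documented here.
GENERIC_PROVIDERS = {"gmail.com", "outlook.com", "hotmail.com"}


def base_domain(host: str) -> str:
    """Naive base domain: the last two labels (fine for exempel.se, wrong for co.uk)."""
    parts = host.lower().strip(".").split(".")
    return ".".join(parts[-2:])


def apply_email_policy(
    emails: list[str], website: Optional[str], website_details: str
) -> list[str]:
    """Filter detail-page emails according to the website status."""
    # No usable website: keep every valid detail-page email.
    if website_details != "ok" or not website:
        return list(emails)
    site = base_domain(urlsplit(website).hostname or "")
    kept = []
    for email in emails:
        domain = email.rsplit("@", 1)[-1].lower()
        if base_domain(domain) == site or domain in GENERIC_PROVIDERS:
            kept.append(email)
    return kept
```

For example, with the website `https://www.exempel.se` and status `ok`, an address at `exempel.se` or `gmail.com` is kept while one at an unrelated domain is dropped.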
Example Output

    {
      "source_url": "https://www.hitta.se/foeretag/exempel-ab/123456",
      "name": "Exempel AB",
      "categories": "Bygg, Renovering",
      "phone": "08 123 45 67",
      "address": "Exempelgatan 10, 123 45 Stockholm",
      "website": "https://www.exempel.se",
      "email1": "info@exempel.se",
      "email2": "support@exempel.se",
      "website_details": "ok",
      "social_facebook": "https://www.facebook.com/exempel-ab",
      "social_instagram": "https://www.instagram.com/exempel-ab",
      "social_linkedin": "https://www.linkedin.com/company/exempel-ab",
      "social_x": "https://www.x.com/exempel-ab",
      "social_youtube": "https://www.youtube.com/exempel-ab",
      "social_tiktok": "https://www.tiktok.com/exempel-ab",
      "social_pinterest": "https://www.pinterest.com/exempel-ab"
    }

If no valid emails are found, the actor emits `"email1": "n/a"`.
Notes
- Pagination: respects Hitta’s “Nästa” flow, ~25 results per page.
- Bans: `hitta.dixa.help`, `dixa.help`, `biluppgifter.se`, and a specific DNB marketing URL are excluded.
- Email hygiene: strict regex, HTML-unescape, de-duplication, and tracking-pattern filtering.
- Address quality: prefers JSON-LD `PostalAddress`, then microdata, then heuristics requiring Swedish postcode + street token.
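
The last-resort address heuristic (Swedish postcode plus street token) might look something like the sketch below. The postcode pattern, the token list, and the function name `looks_like_swedish_address` are assumptions chosen for illustration; the actor's real rules may be stricter.

```python
import re

# Swedish postcodes have five digits, commonly written as "123 45".
POSTCODE_RE = re.compile(r"\b\d{3}\s?\d{2}\b")

# Hypothetical, non-exhaustive list of common street-name endings.
STREET_TOKENS = ("gatan", "vägen", "gränd", "torg", "allén", "stigen", "backen")


def looks_like_swedish_address(text: str) -> bool:
    """Require both a postcode-shaped number and a street token."""
    lowered = text.lower()
    has_postcode = POSTCODE_RE.search(text) is not None
    has_street = any(token in lowered for token in STREET_TOKENS)
    return has_postcode and has_street
```

Requiring both signals matters because a phone number such as `08 123 45 67` contains a postcode-shaped run of digits but no street token, so it is rejected.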
Disclaimer & License
This Apify Actor is provided "as is", without warranty of any kind — express or implied — including but not limited to the warranties of merchantability, fitness for a particular purpose, and non-infringement. Use it, modify it, break it, or improve it — but you do so at your own risk.
© 2025 SLSH. All rights reserved. Copying or modifying the source code is prohibited.

