Hitta.se Lead Scraper (Beta)
Pricing: from $1.00 / 1,000 results
Retrieve leads from hitta.se the easy way. This actor retrieves each business's name, address, email addresses, phone numbers, and social links.
Developer: SLASH
Last modified: 17 days ago
Hitta.se Scraper
Please note: This actor is currently in beta. Minimum cost is enabled until out of beta. You’ll receive an email update if you’re subscribed and the price is adjusted.
(apify-actor-start: $0.00001 / apify-default-dataset-item: $0.00001 until out of beta)
Apify Actor: Hitta.se scraper (listing → detail pages → tiny same-domain contact crawl)
Updates in this version
- If there is no website or it is banned/unavailable, keep all valid detail-page emails (not only generic).
- Email slug fallback now unescapes HTML and matches both:
  - <slug>-button-email-<email>
  - button-email-<email> (broader pattern)
- Unescape attribute values before extracting emails to handle entity-encoded addresses.
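The slug fallback described above could look roughly like the following. This is a minimal sketch, not the actor's code: the function name `emails_from_slug`, the regex, and the lowercasing are assumptions made for illustration. It only shows the order of operations (unescape first, then match).

```python
import html
import re

# Illustrative pattern (an assumption): searching for "button-email-<email>"
# also covers "<slug>-button-email-<email>", since the slug is just a prefix.
SLUG_EMAIL_RE = re.compile(
    r"button-email-([A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,})"
)

def emails_from_slug(attribute_value: str) -> list[str]:
    """Unescape HTML entities, then pull emails out of slug-style values."""
    text = html.unescape(attribute_value)  # e.g. "&#64;" becomes "@"
    return [match.lower() for match in SLUG_EMAIL_RE.findall(text)]
```

For example, `emails_from_slug("exempel-ab-button-email-info&#64;exempel.se")` returns `["info@exempel.se"]`; without the unescape step the entity-encoded address would not match.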
Kept (still working as intended)
- max_results with “Nästa” pagination (25 per page).
- Website picked from canonical/og:url/JSON-LD first, then scored anchors; bans junk.
- website_details: one of ok|404|unavailable|banned|n/a.
- Banned: hitta.dixa.help, dixa.help, biluppgifter.se, specific DNB URL.
- Detail-page email extraction from Hitta UI + mailto + strict regex + slug fallback.
- Same-domain mini-crawl for emails + social links when website is OK.
- Swedish address heuristics; phone extraction; categories.
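A ban check along these lines would cover the hosts listed above. This is a sketch under assumptions: `is_banned` is a hypothetical helper, treating subdomains of a banned host as banned is a guess, and the specific DNB URL is left out because this page does not spell it out.

```python
from urllib.parse import urlsplit

# Hosts from the ban list above. The specific DNB URL is not given in this
# README, so it is deliberately omitted from the sketch.
BANNED_HOSTS = {"hitta.dixa.help", "dixa.help", "biluppgifter.se"}

def is_banned(url: str) -> bool:
    """Return True if the URL's host is a banned host or a subdomain of one."""
    host = (urlsplit(url).hostname or "").lower()
    return any(host == banned or host.endswith("." + banned) for banned in BANNED_HOSTS)
```

A website candidate that fails this check would presumably be reported with website_details set to banned.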
Features
- Crawls listing pages on Hitta.se, discovers company detail pages, and pushes normalized contact data.
- Extracts from detail pages:
  - name
  - categories (best-effort)
  - phone
  - address (Swedish heuristics and JSON-LD)
  - website (structured first, then scored anchors with bans)
  - email1..N (strict parsing, de-duplicated)
  - website_details (status as above)
  - Socials if available: social_facebook, social_instagram, social_linkedin, social_x, social_youtube, social_tiktok, social_pinterest
- If website is OK, performs a tiny same-domain crawl (configurable) to discover more emails and socials.
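To illustrate how discovered links might be sorted into the social_* fields listed above, here is a small sketch. The host-to-field mapping, the helper names, and the choice to map twitter.com to social_x are assumptions; the actor's real matching rules are not documented on this page.

```python
from typing import Optional
from urllib.parse import urlsplit

# Hypothetical mapping from host to output field.
SOCIAL_HOSTS = {
    "facebook.com": "social_facebook",
    "instagram.com": "social_instagram",
    "linkedin.com": "social_linkedin",
    "x.com": "social_x",
    "twitter.com": "social_x",  # assumption: old Twitter links count as X
    "youtube.com": "social_youtube",
    "tiktok.com": "social_tiktok",
    "pinterest.com": "social_pinterest",
}

def classify_social(url: str) -> Optional[str]:
    """Return the social_* field name for a URL, or None if it is not a known network."""
    host = (urlsplit(url).hostname or "").lower()
    if host.startswith("www."):
        host = host[4:]
    return SOCIAL_HOSTS.get(host)

def collect_socials(urls: list[str]) -> dict[str, str]:
    """Keep the first URL seen for each social field."""
    found: dict[str, str] = {}
    for url in urls:
        field = classify_social(url)
        if field and field not in found:
            found[field] = url
    return found
```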
Input Configuration
Example input JSON:
{
  "start_urls": [{ "url": "https://www.hitta.se/nacka/företag/2" }],
  "max_depth": 3,
  "headers": {
    "User-Agent": "Mozilla/5.0 ...",
    "Accept-Language": "sv-SE,sv;q=0.9,en-US;q=0.8,en;q=0.7"
  },
  "timeout_seconds": 30,
  "site_email_max_pages": 3,
  "max_results": 0
}
| Field | Type | Default | Description |
|---|---|---|---|
| start_urls | array | [{ "url": "https://www.hitta.se/nacka/företag/2"}] | One or more Hitta listing URLs. |
| max_depth | integer | 3 | Crawl depth from the start URLs. Listing depth is reused for pagination. |
| headers | object | See code defaults | HTTP headers for requests. |
| timeout_seconds | integer | 30 | Read timeout for HTTP requests. |
| site_email_max_pages | integer | 3 | Max pages to crawl on the same domain as the extracted website for extra contacts. |
| max_results | integer | 0 (no cap) | Limit how many detail results to push. Pagination respects 25/page with “Nästa.” |
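The defaults in the table can be read as a simple merge step. The sketch below is illustrative only: `normalize_input` and `reached_cap` are hypothetical helpers, and `headers` is left out because its defaults live in the actor's code rather than on this page.

```python
# Defaults copied from the table above (headers omitted: "See code defaults").
DEFAULTS = {
    "start_urls": [{"url": "https://www.hitta.se/nacka/företag/2"}],
    "max_depth": 3,
    "timeout_seconds": 30,
    "site_email_max_pages": 3,
    "max_results": 0,
}

def normalize_input(raw):
    """Merge user input over the defaults and coerce the integer fields."""
    cfg = {**DEFAULTS, **(raw or {})}
    for key in ("max_depth", "timeout_seconds", "site_email_max_pages", "max_results"):
        cfg[key] = max(0, int(cfg[key]))
    # Flatten [{"url": ...}] into a plain list of URL strings.
    cfg["start_urls"] = [item["url"] for item in cfg["start_urls"] if item.get("url")]
    return cfg

def reached_cap(pushed: int, max_results: int) -> bool:
    """max_results == 0 means no cap, so the cap is never reached."""
    return max_results > 0 and pushed >= max_results
```

With `max_results` set to 50 and 25 results per page, a crawl following this logic would stop after the second listing page.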
How It Works
- Listing pages: extracts detail links using /verksamhet/ anchors. Follows “Nästa” to paginate.
- Detail pages: extracts:
  - Website from canonical, og:url, and JSON-LD; otherwise scores external anchors and bans junk domains.
  - Address from JSON-LD, microdata, and Swedish heuristics (street tokens + postcode check).
  - Phone from tel: links or strict regex for Swedish formats.
  - Emails from:
    - Hitta UI attributes (unescaped)
    - mailto: links
    - Strict regex on visible text
    - Slug-pattern fallback: <slug>-button-email-<email> and button-email-<email>
- Website status: HEAD/GET to classify website_details as ok, 404, unavailable, banned, or n/a.
- Same-domain mini-crawl (if website OK): fetch up to site_email_max_pages pages for more emails and socials.
- Email policy:
  - If no website or website is banned/unavailable, keep all valid detail-page emails.
  - If website is OK, keep emails that match the website base-domain or are generic providers (Gmail, Outlook, etc.).
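The email policy can be sketched as a small filter. Everything named here is an assumption for illustration: the `GENERIC_PROVIDERS` set, the naive `base_domain` helper, and the choice to treat every status other than ok (including 404, which the policy above does not mention) as "keep everything".

```python
from urllib.parse import urlsplit

# Assumed list of generic providers; the actor's actual list is not documented here.
GENERIC_PROVIDERS = {"gmail.com", "outlook.com", "hotmail.com", "yahoo.com"}

def base_domain(host: str) -> str:
    """Naive base domain: the last two labels. Fine for .se/.com, wrong for e.g. .co.uk."""
    labels = host.lower().strip(".").split(".")
    return ".".join(labels[-2:])

def filter_emails(emails, website, website_details):
    """Apply the policy: keep all emails unless the website is OK, then filter by domain."""
    if website_details != "ok" or not website:
        # Assumption: any non-ok status keeps all valid detail-page emails.
        return list(emails)
    site = base_domain(urlsplit(website).hostname or "")
    kept = []
    for email in emails:
        domain = email.rsplit("@", 1)[-1].lower()
        if base_domain(domain) == site or domain in GENERIC_PROVIDERS:
            kept.append(email)
    return kept
```

With a website of https://www.exempel.se and status ok, an address at exempel.se or gmail.com is kept, while one at an unrelated domain is dropped.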
Example Output
{
  "source_url": "https://www.hitta.se/foeretag/exempel-ab/123456",
  "name": "Exempel AB",
  "categories": "Bygg, Renovering",
  "phone": "08 123 45 67",
  "address": "Exempelgatan 10, 123 45 Stockholm",
  "website": "https://www.exempel.se",
  "email1": "info@exempel.se",
  "email2": "support@exempel.se",
  "website_details": "ok",
  "social_facebook": "https://www.facebook.com/exempel-ab",
  "social_instagram": "https://www.instagram.com/exempel-ab",
  "social_linkedin": "https://www.linkedin.com/company/exempel-ab",
  "social_x": "https://www.x.com/exempel-ab",
  "social_youtube": "https://www.youtube.com/exempel-ab",
  "social_tiktok": "https://www.tiktok.com/exempel-ab",
  "social_pinterest": "https://www.pinterest.com/exempel-ab"
}
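The numbered email fields in the example output could be produced roughly as follows. This is a sketch, not the actor's implementation: `email_fields` is a hypothetical helper, and lowercasing before de-duplication is an assumption. It also shows the "n/a" placeholder used when no valid email is found.

```python
def email_fields(emails):
    """De-duplicate emails (case-insensitively) and number them email1..N."""
    seen = set()
    unique = []
    for email in emails:
        key = email.strip().lower()
        if key and key not in seen:
            seen.add(key)
            unique.append(key)
    if not unique:
        # Placeholder when nothing valid was found.
        return {"email1": "n/a"}
    return {f"email{i}": value for i, value in enumerate(unique, start=1)}
```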
If no valid emails are found, the actor emits "email1": "n/a".
Notes
- Pagination: respects Hitta’s “Nästa” flow, ~25 results per page.
- Bans: hitta.dixa.help, dixa.help, biluppgifter.se, and a specific DNB marketing URL are excluded.
- Email hygiene: strict regex, HTML-unescape, de-duplication, and tracking-pattern filtering.
- Address quality: prefers JSON-LD PostalAddress, then microdata, then heuristics requiring Swedish postcode + street token.
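The last-resort address heuristic (Swedish postcode plus street token) might be approximated like this. The token list and the function name are illustrative assumptions; only the "postcode + street token" requirement comes from the notes above.

```python
import re

# Swedish postcodes are five digits, usually written as "123 45".
POSTCODE_RE = re.compile(r"\b\d{3}\s?\d{2}\b")

# Illustrative street tokens (an assumption, not the actor's real list).
STREET_TOKENS = ("gatan", "vägen", "gränd", "torget", "allén", "stigen")

def looks_like_swedish_address(text: str) -> bool:
    """Require both a postcode-shaped number and a street-like token."""
    lowered = text.lower()
    has_postcode = POSTCODE_RE.search(text) is not None
    has_street = any(token in lowered for token in STREET_TOKENS)
    return has_postcode and has_street
```

Requiring both signals matters because a phone number such as 08 123 45 67 contains a postcode-shaped run of digits; the street token is what rules it out.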
Disclaimer & License
This Apify Actor is provided "as is", without warranty of any kind — express or implied — including but not limited to the warranties of merchantability, fitness for a particular purpose, and non-infringement. Use it, modify it, break it, or improve it — but you do so at your own risk.
© 2025 SLSH. All rights reserved. Copying or modifying the source code is prohibited.