Superclean URLs
Pricing
from $0.35 / 1,000 results
Clean messy URLs from lead exports. Remove 60+ tracking parameters (utm_*, fbclid, gclid), normalize format, extract domains, and optionally verify URLs are reachable. Perfect for cold email personalization and CRM data hygiene.
Developer

Superlative
Clean messy URLs from lead exports. Remove tracking parameters, normalize format, and extract domains.
What does Superclean URLs do?
Superclean URLs normalizes URLs from lead lists, CRM exports, and web scraping using rule-based parsing (no AI/LLM required).
- Removes tracking parameters — Strip UTM, fbclid, gclid, and 50+ other tracking params
- Normalizes format — Consistent protocol, lowercase domains, clean paths
- Extracts domains — Pull clean domain names from full URLs
- Validates URLs — Identify and flag invalid URL formats
- Fixes missing protocols — Adds https:// to bare domains
- Instant API mode — Sub-second single-URL cleaning via Standby HTTP server
Use with AI Agents
Available as an MCP tool via the Apify MCP Server.
| Property | Value |
|---|---|
| Actor ID | superlativetech/superclean-urls |
| Standby URL | https://superlativetech--superclean-urls.apify.actor |
| Input | items (string[]) or item (string) |
| Output | {id, input, output, domain, protocol, path, valid, confidence} per item |
| Pricing | $0.50 per 1,000 items |
| Idempotent | Yes — same input always produces same output |
Output schema
{ "id": 1, "input": "https://example.com/page?utm_source=google", "output": "https://example.com/page", "domain": "example.com", "valid": true, "confidence": 0.9 }
Pipeline composability
This actor works in data cleaning pipelines:
1. Scrape → 2. Clean (this actor) → 3. Enrich (DNS/WHOIS) → 4. Score (ICP Scorer)
Standby (instant API)
GET https://superlativetech--superclean-urls.apify.actor?token=TOKEN&input=https://example.com%3Futm_source%3Dgoogle
What else can Superclean do?
If you're cleaning lead data, you might also need:
- Superclean Company Names — Clean messy company names for cold emails and CRM
- Superclean Person Names — Clean person names for cold email personalization
- Superclean Job Titles — Normalize job titles for lead scoring and personalization
- Superclean Product Names — Clean product names from e-commerce data
- Superclean Places — Normalize location data from lead exports
- Superlead ICP Scorer — Score leads against your Ideal Customer Profile with AI
Why clean URLs?
Your lead data comes with messy, tracking-laden URLs:
- "https://example.com/?utm_source=linkedin&fbclid=abc123&gclid=xyz"
- "http://ACME.COM/about/"
- "example.com/contact"
- "www.company.com/?ref=email&mc_eid=12345"
Clean data means better:
- Cold email personalization — Clean company URLs for email templates
- Lead enrichment — Normalize URLs from scraped or imported lead lists
- Data hygiene — Remove tracking params before storing in CRM
- Domain extraction — Pull domains for company research or deduplication
How to use Superclean URLs
1. Paste your URLs into the input field (one per line)
2. Select your output style (Full or Domain)
3. Click Start and download your cleaned results
Output styles
| Style | Best for | Example Input | Example Output |
|---|---|---|---|
| Full | Cleaned URLs | https://example.com/?utm_source=x | https://example.com |
| Domain | Domain extraction | https://www.example.com/about | example.com |
Full (default)
Complete cleaned URL with tracking removed, protocol normalized, and format standardized.
Domain
Just the registrable domain (e.g., example.com). Useful for deduplication or company matching.
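As a rough illustration of what the Domain style returns, here is a minimal Python sketch. It is not the Actor's implementation: the `naive_domain` helper is hypothetical, and it only lowercases the host and strips a leading `www.`. Finding the true registrable domain for hosts such as `blog.example.co.uk` would need the Public Suffix List, which this sketch does not consult.

```python
from urllib.parse import urlsplit


def naive_domain(url: str) -> str:
    """Return the lowercased host with a leading 'www.' stripped (hypothetical helper)."""
    # Bare domains such as "example.com/contact" have no scheme, so add one
    # before parsing; otherwise urlsplit treats the host as part of the path.
    if "://" not in url:
        url = "https://" + url
    host = (urlsplit(url).hostname or "").lower()
    return host[4:] if host.startswith("www.") else host


def dedupe_by_domain(urls: list[str]) -> list[str]:
    """Keep the first URL seen for each domain, preserving input order."""
    seen: set[str] = set()
    unique: list[str] = []
    for url in urls:
        domain = naive_domain(url)
        if domain not in seen:
            seen.add(domain)
            unique.append(url)
    return unique


print(naive_domain("https://www.example.com/about"))  # example.com
```

Keying deduplication on the domain rather than the full URL means `http://ACME.COM/about/` and `acme.com/contact` collapse into one company record.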
Standby mode (instant API)
Standby mode keeps a warm container running so you get instant URL cleaning without cold-start delays. Instead of starting a full Actor run, you make a simple HTTP GET request and get results in milliseconds.
This is ideal for:
- Clay enrichment steps — single-URL cleaning inline
- Make / n8n HTTP modules — real-time URL normalization in workflows
- MCP agents — AI tools that need instant URL cleaning
Standby URL
https://superlativetech--superclean-urls.apify.actor?token=YOUR_API_TOKEN
Or use a Bearer token in the Authorization header instead of the token query parameter.
Clean a URL
$ curl "https://superlativetech--superclean-urls.apify.actor?token=YOUR_API_TOKEN&input=https://example.com/%3Futm_source%3Dlinkedin%26fbclid%3Dabc123"
Extract domain only
$ curl "https://superlativetech--superclean-urls.apify.actor?token=YOUR_API_TOKEN&input=https://www.example.com/about&style=domain"
Query parameters
| Parameter | Required | Description |
|---|---|---|
| input | Yes | URL to clean |
| style | No | Output format: full (default) or domain |
| forceHttps | No | Convert http to https (default: true) |
| removeTracking | No | Remove tracking parameters (default: true) |
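The URL passed as `input` usually contains its own `?` and `&`, so it should be percent-encoded before it goes into the Standby query string. The sketch below shows one way to build such a request in Python using only the standard library. The `build_standby_request` helper is hypothetical; it assumes the Bearer-token option described above and the `input` and `style` parameters from the table.

```python
from urllib.parse import urlencode
from urllib.request import Request

STANDBY_URL = "https://superlativetech--superclean-urls.apify.actor"


def build_standby_request(token: str, url: str, style: str = "full") -> Request:
    """Build (but do not send) a GET request for the Standby endpoint."""
    # urlencode percent-encodes '?', '&', '=' and '/' inside the input URL,
    # so its tracking parameters are not mistaken for Standby parameters.
    query = urlencode({"input": url, "style": style})
    return Request(
        f"{STANDBY_URL}?{query}",
        headers={"Authorization": f"Bearer {token}"},
        method="GET",
    )


req = build_standby_request(
    "YOUR_API_TOKEN",
    "https://example.com/?utm_source=linkedin&fbclid=abc123",
)
print(req.full_url)
```

To send it, pass the request to `urllib.request.urlopen(req)` and decode the JSON body; that step is left out here so the sketch runs without network access or a real token.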
Response format
{
  "id": 1,
  "input": "https://example.com/?utm_source=linkedin",
  "output": "https://example.com",
  "domain": "example.com",
  "protocol": "https",
  "path": "",
  "query": "",
  "hash": "",
  "valid": true,
  "confidence": 0.9
}
Error responses
| Code | Cause |
|---|---|
| 400 | Missing input parameter or invalid style |
| 405 | Non-GET request |
| 500 | Unexpected server error |
What gets cleaned?
Tracking parameters removed
The Actor removes 60+ tracking parameters including:
- UTM — utm_source, utm_medium, utm_campaign, utm_term, utm_content
- Facebook — fbclid, fb_action_ids, fb_source
- Google — gclid, gclsrc, dclid, gbraid, wbraid
- Microsoft — msclkid
- LinkedIn — li_fat_id, li_tc
- Email marketing — mc_eid, mc_cid, _hsenc, _hsmi, mkt_tok
- Analytics — _ga, _gl, ref, spm, clickid
Normalization applied
- HTTP upgraded to HTTPS (configurable)
- Domains lowercased
- Trailing slashes removed
- Empty query strings removed
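The rules above can be approximated in a few lines of Python. This is a minimal sketch for intuition, not the Actor's code: `clean_url` is a hypothetical helper, `TRACKING_PARAMS` holds only the parameters named in this section rather than the full 60+ list, and no validity checking or confidence scoring is attempted.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Illustrative subset taken from the lists above; utm_* is matched by prefix.
TRACKING_PARAMS = {
    "fbclid", "fb_action_ids", "fb_source",
    "gclid", "gclsrc", "dclid", "gbraid", "wbraid",
    "msclkid", "li_fat_id", "li_tc",
    "mc_eid", "mc_cid", "_hsenc", "_hsmi", "mkt_tok",
    "_ga", "_gl", "ref", "spm", "clickid",
}


def clean_url(url: str, force_https: bool = True, remove_tracking: bool = True) -> str:
    """Apply the normalization rules listed above to a single URL."""
    url = url.strip()
    if "://" not in url:
        url = "https://" + url  # fix a missing protocol on bare domains
    parts = urlsplit(url)
    scheme = parts.scheme.lower()
    if force_https and scheme == "http":
        scheme = "https"
    pairs = parse_qsl(parts.query, keep_blank_values=True)
    if remove_tracking:
        pairs = [
            (key, value) for key, value in pairs
            if key.lower() not in TRACKING_PARAMS
            and not key.lower().startswith("utm_")
        ]
    path = parts.path.rstrip("/")  # drop trailing slashes
    # An empty query string is omitted automatically by urlunsplit.
    return urlunsplit((scheme, parts.netloc.lower(), path, urlencode(pairs), parts.fragment))


print(clean_url("http://ACME.COM/about/"))  # https://acme.com/about
```

Non-tracking parameters survive the filter, so `https://example.com/search?q=shoes&utm_source=x` keeps `q=shoes` while losing `utm_source`.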
How many URLs can you clean?
There's no limit. Process as many URLs as you need — from a handful to hundreds of thousands. The Actor scales automatically.
For best performance, batch your requests. Processing 1,000 URLs at once is more efficient than 10 separate runs of 100 URLs each.
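If your URLs arrive as one long list, a small helper can group them into large batches before each run. The sketch below is one possible approach; the `batches` function and the default batch size of 1,000 are assumptions chosen to match the example above, not a limit imposed by the Actor.

```python
from typing import Iterator, Sequence


def batches(urls: Sequence[str], size: int = 1000) -> Iterator[list[str]]:
    """Yield consecutive slices of at most `size` URLs."""
    for start in range(0, len(urls), size):
        yield list(urls[start:start + size])


urls = [f"https://example.com/page/{n}" for n in range(2500)]
for batch in batches(urls):
    # Each batch would become the "items" array of one Actor run.
    print(len(batch))
```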
How much will it cost you?
This Actor uses pay-per-result pricing. Because normalization is rule-based and makes no external API calls, it costs half as much as LLM-based actors:
| URLs | Cost |
|---|---|
| 1,000 | $0.50 |
| 10,000 | $5.00 |
| 100,000 | $50.00 |
Volume discounts apply automatically:
- Bronze (100+ items): $0.00045/URL
- Silver (1,000+ items): $0.0004/URL
- Gold (10,000+ items): $0.00035/URL
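The arithmetic behind these figures is a per-URL rate multiplied by the number of URLs. The sketch below reproduces it with `Decimal` to avoid floating-point rounding. The `estimate_cost` helper and the `"base"` label for the $0.50 per 1,000 rate are assumptions for illustration; the caller picks the tier, since how a discount tier is assigned is not modeled here.

```python
from decimal import Decimal

# Per-URL rates as listed above; "base" is $0.50 per 1,000 URLs.
RATES = {
    "base": Decimal("0.0005"),
    "bronze": Decimal("0.00045"),
    "silver": Decimal("0.0004"),
    "gold": Decimal("0.00035"),
}


def estimate_cost(url_count: int, tier: str = "base") -> Decimal:
    """Return the estimated cost in USD for `url_count` URLs at the given tier."""
    return RATES[tier] * url_count


print(estimate_cost(10_000))            # 5.0000
print(estimate_cost(100_000, "gold"))   # 35.00000
```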
Input parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
| items | array | (none) | List of URLs to clean (one per line in the UI, or JSON array) |
| item | string | (none) | Single URL to clean; API shorthand for integration callers (Clay, Make, n8n). If both item and items are provided, item is prepended to the list |
| style | string | full | Output format: full (cleaned URL) or domain (domain only) |
| forceHttps | boolean | true | Convert http:// to https:// |
| removeTracking | boolean | true | Remove tracking parameters (utm_*, fbclid, etc.) |
Input example
{
  "items": [
    "https://www.example.com/?utm_source=linkedin&fbclid=abc123",
    "http://ACME.COM/about/",
    "example.com/contact",
    "not a valid url"
  ],
  "style": "full"
}
items also accepts objects, which is useful for API and MCP integrations:
{
  "items": [
    { "input": "https://www.example.com/?utm_source=linkedin&fbclid=abc123" },
    { "input": "http://ACME.COM/about/" }
  ],
  "style": "full"
}
For API and integration callers who want to clean a single value without wrapping it in an array, use the item shorthand:
{"item": "https://www.example.com/?utm_source=linkedin&fbclid=abc123","style": "full"}
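To make the three input shapes concrete, here is a small Python sketch of how they could be flattened into a single list of URLs. It follows the rules stated above (strings or `{"input": ...}` objects in `items`, with `item` prepended when both are given), but the `collect_inputs` helper is hypothetical and is not the Actor's own code.

```python
from typing import Any


def collect_inputs(payload: dict[str, Any]) -> list[str]:
    """Flatten `item` and `items` into one ordered list of URL strings."""
    urls: list[str] = []
    for entry in payload.get("items") or []:
        # Entries may be plain strings or objects with an "input" key.
        urls.append(entry["input"] if isinstance(entry, dict) else entry)
    if payload.get("item"):
        urls.insert(0, payload["item"])  # the single item goes first
    return urls


payload = {
    "item": "example.com/contact",
    "items": ["http://ACME.COM/about/", {"input": "https://www.example.com/"}],
}
print(collect_inputs(payload))
```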
During the Actor run
The Actor processes URLs quickly using rule-based parsing. You'll see progress updates as items are processed.
If you provide invalid input (e.g., an empty list), the Actor will stop immediately with an error message explaining what went wrong.
Results are available in real-time — you can start downloading cleaned URLs before the full run completes.
Output format
Results are saved to the default dataset. Each cleaned URL is a separate item.
You can export results as JSON, CSV, Excel, or other formats directly from Apify Console. Or access them programmatically via the API.
Output example
[
  {
    "id": 1,
    "input": "https://www.example.com/?utm_source=linkedin&fbclid=abc123",
    "output": "https://www.example.com",
    "domain": "example.com",
    "protocol": "https",
    "path": "",
    "query": "",
    "hash": "",
    "valid": true,
    "confidence": 0.9
  },
  {
    "id": 2,
    "input": "http://ACME.COM/about/",
    "output": "https://acme.com/about",
    "domain": "acme.com",
    "protocol": "https",
    "path": "/about",
    "query": "",
    "hash": "",
    "valid": true,
    "confidence": 0.9
  },
  {
    "id": 3,
    "input": "example.com/contact",
    "output": "https://example.com/contact",
    "domain": "example.com",
    "protocol": "https",
    "path": "/contact",
    "query": "",
    "hash": "",
    "valid": true,
    "confidence": 0.7
  },
  {
    "id": 4,
    "input": "not a valid url",
    "output": "not a valid url",
    "domain": "",
    "protocol": "",
    "path": "",
    "query": "",
    "hash": "",
    "valid": false,
    "confidence": 0
  }
]
| Field | Description |
|---|---|
| id | Row number (1-based, matches Apify's displayed row numbers) |
| input | Original URL before cleaning |
| output | Cleaned result (format depends on style) |
| domain | Extracted registrable domain (e.g., example.com) |
| protocol | Protocol (http or https) |
| path | URL path (e.g., /about/contact) |
| query | Query string without ? (e.g., foo=bar&baz=1) |
| hash | Fragment/anchor without # |
| valid | Whether the URL format is valid |
| confidence | Confidence score from 0 to 1 |
Confidence scores
- 1.0 — Valid URL, no changes needed
- 0.9 — Valid URL, tracking removed or normalized
- 0.7 — URL fixed (protocol added)
- 0.3 — Partially valid (domain extracted but issues remain)
- 0.0 — Invalid URL (couldn't parse)
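One way to use these scores downstream is to split results into rows that are safe to import and rows that need a manual look. The Python sketch below assumes the output fields documented above. The `split_by_confidence` helper and the 0.7 cutoff are illustrative choices, not recommendations from the Actor.

```python
from typing import Any

Row = dict[str, Any]


def split_by_confidence(rows: list[Row], threshold: float = 0.7) -> tuple[list[Row], list[Row]]:
    """Return (accepted, needs_review) based on `valid` and `confidence`."""
    accepted: list[Row] = []
    needs_review: list[Row] = []
    for row in rows:
        if row["valid"] and row["confidence"] >= threshold:
            accepted.append(row)
        else:
            needs_review.append(row)
    return accepted, needs_review


rows = [
    {"id": 1, "output": "https://www.example.com", "valid": True, "confidence": 0.9},
    {"id": 3, "output": "https://example.com/contact", "valid": True, "confidence": 0.7},
    {"id": 4, "output": "not a valid url", "valid": False, "confidence": 0},
]
accepted, needs_review = split_by_confidence(rows)
print([r["id"] for r in accepted])      # [1, 3]
print([r["id"] for r in needs_review])  # [4]
```

Raising the threshold to 0.9 would also send URLs that only had a protocol added (score 0.7) to review.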
Integrations
Superclean URLs works with any tool that can call Apify Actors:
- Clay — Add as an enrichment step in your Clay tables
- Make — Use the Apify module to run the Actor
- Zapier — Trigger runs and retrieve results automatically
- n8n — Self-hosted workflow automation
You can also use webhooks to trigger actions when a run completes — for example, send a Slack notification or automatically import results into your CRM.
Using Superclean URLs with the Apify API
The Apify API gives you programmatic access to run Actors, retrieve results, and manage datasets.
Node.js:
import { ApifyClient } from 'apify-client';

const client = new ApifyClient({ token: 'YOUR_API_TOKEN' });

// Start the Actor and wait for the run to finish
const run = await client.actor('superlativetech/superclean-urls').call({
  items: ['https://example.com/?utm_source=linkedin', 'http://ACME.COM/about/'],
  style: 'full',
});

// Read the cleaned URLs from the run's default dataset
const { items } = await client.dataset(run.defaultDatasetId).listItems();
console.log(items);
Python:
from apify_client import ApifyClient

client = ApifyClient('YOUR_API_TOKEN')

# Start the Actor and wait for the run to finish
run = client.actor('superlativetech/superclean-urls').call(run_input={
    'items': ['https://example.com/?utm_source=linkedin', 'http://ACME.COM/about/'],
    'style': 'full',
})

# Read the cleaned URLs from the run's default dataset
items = client.dataset(run['defaultDatasetId']).list_items().items
print(items)
Check out the Apify API reference for full details, or click the API tab above for more code examples.
Your feedback
We're always improving Superclean Actors. If you have feature requests, find a bug, or need help with a specific use case, please open an issue in the Actor's Issues tab.
When Apify asks to share your run data with us, we encourage you to opt in — it's the fastest way for us to spot edge cases and improve results. Sharing is completely optional (you can toggle it anytime under Account Settings → Privacy), and shared runs are automatically deleted by Apify based on your plan's data retention period. We only use shared data to debug issues and improve this Actor.
Leave a review
If Superclean URLs saves you time or improves your lead data, please leave a review. Your feedback helps other users discover the tool and helps us understand what's working well.
Built by Superlative