Selector Auto Fixer

Pricing

from $0.50 / 1,000 results

Automatically detect broken CSS selectors and generate validated replacement selectors. The Actor reloads pages, tests candidate selectors, and outputs ready-to-apply patches with confidence and stability evidence. No LLMs. Deterministic. Built for production scrapers.

Rating

0.0

(0)

Developer

Hayder Al-Khalissi

Maintained by Community

Actor stats

Bookmarked: 0
Total users: 2
Monthly active users: 1
Last modified: 2 days ago

Selector Auto-Fixer

Selector Auto-Fixer is an Apify Actor that detects broken CSS selectors on any website and automatically suggests validated replacement selectors. You provide Start URLs and field definitions (name, selector, optional fallbacks and hints); the Actor tests your selectors and, when they fail, repairs them using deterministic similarity heuristics and multi-reload validation—no LLMs, no third-party APIs. Use it for web scraping maintenance, selector drift detection, and CI checks so your scrapers keep working when sites change.


What does Selector Auto-Fixer do?

Selector Auto-Fixer evaluates and repairs CSS selectors used for web scraping. It loads your URLs and tries your selectors (and fallbacks) per field. When a selector no longer matches, it finds candidate elements, scores them with rule-based heuristics, generates new CSS selectors, and validates them across multiple page reloads. It does not scrape full pages for content; it focuses on keeping your field selectors valid so your existing scrapers or extractors keep working. The Actor is deterministic: the same input and page content yield the same results, with no LLM or paid external API calls.
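
The "try the original selector, then the fallbacks, then repair" order described above can be sketched as follows. This is a minimal illustration, not the Actor's source: the `Resolution` statuses, the `resolveSelector` function, and the `MatchCounter` callback (a stand-in for a real DOM query) are all hypothetical names.

```typescript
// Hypothetical field shape; property names mirror the Actor's input example.
interface FieldDef {
  name: string;
  selector: string;
  fallbackSelectors?: string[];
}

// Stand-in for a real DOM query: returns how many elements a selector matches.
type MatchCounter = (selector: string) => number;

// Hypothetical intermediate statuses (the Actor itself reports ok / repaired / failed).
type Resolution =
  | { status: "ok"; selector: string }
  | { status: "fallback"; selector: string }
  | { status: "needsRepair" };

function resolveSelector(field: FieldDef, matchCount: MatchCounter): Resolution {
  // 1. The original selector wins if it still matches anything.
  if (matchCount(field.selector) > 0) {
    return { status: "ok", selector: field.selector };
  }
  // 2. Otherwise try fallbacks in the order they were given.
  for (const fallback of field.fallbackSelectors ?? []) {
    if (matchCount(fallback) > 0) {
      return { status: "fallback", selector: fallback };
    }
  }
  // 3. Nothing matched: hand the field over to candidate search and repair.
  return { status: "needsRepair" };
}

// Usage with a fake page on which only `h1` matches.
const fakePage: Record<string, number | undefined> = { h1: 1 };
const result = resolveSelector(
  { name: "title", selector: "h1.product-title", fallbackSelectors: ["h1", ".title"] },
  (selector) => fakePage[selector] ?? 0,
);
```

Passing the DOM query in as a callback keeps the decision logic testable without a browser.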


Why use Selector Auto-Fixer? Why fix CSS selectors automatically?

Sites change markup, classes, and structure. When selectors break, scrapers fail or extract wrong data; manually finding new selectors is slow and error-prone. Selector Auto-Fixer automates repair: run it on critical URLs (e.g. after deployments or on a schedule) to get validated replacement selectors you can feed into your scraper config. You get transparent, deterministic suggestions with confidence and stability scores—no AI black box. Plus you get Apify platform benefits: scheduling, monitoring, API, and integrations so you can run selector checks in CI, nightly jobs, or pipelines without managing browsers yourself.


What can Selector Auto-Fixer do?

  • Try original selector and fallback selectors per field.
  • Validate extracted values with text hints (contains/regex), value type (price, date, number, text), and optional attribute (href, src, etc.).
  • Use proximity hints (near label, in container) to improve candidate scoring.
  • Generate robust CSS selectors (prefer data-testid, data-qa, stable classes; optional :nth-child).
  • Validate candidates across multiple reloads with configurable stability (same-node fingerprint, value-type checks).
  • Output patches (old → new selector) to the Dataset and Key-Value Store for easy use by other Actors or your app.
  • Process multiple URLs in parallel (default 3); block images/fonts/media for faster runs.
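
To make the hint-based value validation from the list above more concrete, here is a rough sketch. The patterns for `price`, `number`, and `date` are assumptions chosen for illustration; the Actor's actual rules are not documented on this page. Property names follow the example input.

```typescript
type ValueType = "text" | "price" | "date" | "number";

interface ValueHints {
  textHintContains?: string;
  textHintRegex?: string;
  type?: ValueType;
}

// Assumed type patterns, for illustration only.
function matchesType(value: string, type: ValueType): boolean {
  const v = value.trim();
  switch (type) {
    case "text":
      return v.length > 0;
    case "number":
      return /^-?\d+([.,]\d+)?$/.test(v);
    case "price":
      // A number with up to four non-digit characters (currency symbol or code) on either side.
      return /^\D{0,4}\d[\d.,\s]*\D{0,4}$/.test(v);
    case "date":
      // Only ISO-style dates (YYYY-MM-DD...) are accepted in this sketch.
      return /^\d{4}-\d{2}-\d{2}/.test(v) && !Number.isNaN(Date.parse(v));
  }
}

// A value passes only if it satisfies every hint that is set.
function validateValue(value: string, hints: ValueHints): boolean {
  if (hints.textHintContains !== undefined && !value.includes(hints.textHintContains)) {
    return false;
  }
  if (hints.textHintRegex !== undefined && !new RegExp(hints.textHintRegex).test(value)) {
    return false;
  }
  if (hints.type !== undefined && !matchesType(value, hints.type)) {
    return false;
  }
  return true;
}

// Usage: the price field from the example input.
const priceLooksValid = validateValue("29.99", { textHintRegex: "\\d+[.,]\\d{2}", type: "price" });
const placeholderRejected = !validateValue("Out of stock", { type: "price" });
```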

Selector Auto-Fixer + the Apify platform: monitoring, API access, scheduling, proxy rotation, and integrations come with every run. Use the platform to automate selector checks and keep your scrapers healthy.


What data does Selector Auto-Fixer extract and output?

The Actor does not scrape page content for end-user data; it evaluates and repairs selectors and returns metadata about each field. Main output:

Output | Description
Per-URL result | url, finalUrl (after redirects), durationMs, timestamp, statusSummary (okCount, repairedCount, failedCount), actorRunId (for pipeline tracing), fields, patches, errors.
Per-field status | ok (selector worked), repaired (new selector found), or failed.
Patches | field, from (old selector), to (new selector), confidence, stability, sample (for CI/automation).
Evidence | valueSamples, stability, metrics (textSimilarity, domPathSimilarity, etc.).
Key-Value Store | Key patches: object mapping each URL to its list of patches (old→new selectors).

You can download the dataset produced by Selector Auto-Fixer in JSON, CSV, Excel, or other formats from the Apify platform.
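
If you consume the dataset from TypeScript, typing the items can help. The interfaces below are inferred from the example output on this page, not taken from a published schema, so treat every optional marker and the `summarize` helper as assumptions.

```typescript
type FieldStatus = "ok" | "repaired" | "failed";

// Shapes inferred from the example output; not an official schema.
interface FieldResult {
  status: FieldStatus;
  oldSelector: string;
  newSelector: string | null;
  confidence: number;
  stability: number;
  valueSamples: string[];
  metrics?: Record<string, number>;
}

interface Patch {
  field: string;
  from: string;
  to: string;
  confidence: number;
  stability: number;
  sample: string;
}

interface UrlResult {
  url: string;
  timestamp?: string;
  fields: Record<string, FieldResult>;
  patches: Patch[];
  errors: unknown[];
}

interface StatusSummary {
  okCount: number;
  repairedCount: number;
  failedCount: number;
}

// Hypothetical helper: recompute the per-URL status counts from the fields map.
function summarize(result: UrlResult): StatusSummary {
  const summary: StatusSummary = { okCount: 0, repairedCount: 0, failedCount: 0 };
  for (const field of Object.values(result.fields)) {
    if (field.status === "ok") summary.okCount += 1;
    else if (field.status === "repaired") summary.repairedCount += 1;
    else summary.failedCount += 1;
  }
  return summary;
}

// Usage with a trimmed copy of the example output.
const sampleItem: UrlResult = {
  url: "https://example.com/product/123",
  fields: {
    title: { status: "ok", oldSelector: "h1.product-title", newSelector: null, confidence: 1, stability: 1, valueSamples: ["Product 123"] },
    price: { status: "repaired", oldSelector: ".price .amount", newSelector: '[data-testid="price"]', confidence: 0.82, stability: 1, valueSamples: ["29.99"] },
  },
  patches: [
    { field: "price", from: ".price .amount", to: '[data-testid="price"]', confidence: 0.82, stability: 1, sample: "29.99" },
  ],
  errors: [],
};
const sampleSummary = summarize(sampleItem);
```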


How do I use Selector Auto-Fixer to fix CSS selectors?

  1. Open the Actor on Apify and go to the Input tab.
  2. Add one or more Start URLs (the pages you want to check).
  3. Fill in Field definitions using the form: at least Field 1 - Name and Field 1 - CSS selector are required. You can use up to 5 field slots; add fallback selectors (comma-separated), text hints, attribute (text/href/src/content), and value type (text/price/date/number) to improve accuracy.
  4. (Optional) Adjust Validation (reloads, timeout, wait until, min stability) and Repair (max candidates, top K, selector options) if needed.
  5. Click Start and wait for the run to finish.
  6. In the Dataset tab, open the result: each item is one URL with fields, patches, and errors. In Key-Value Store, use the patches key to get the old→new selector map per URL.
  7. Use the patches output to update your scraper configuration or selector registry.
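
Step 7 could look roughly like the sketch below. It assumes your scraper keeps selectors in a plain field-to-selector map; the `SelectorConfig` type, the `applyPatches` function, and the default thresholds are hypothetical and only the patch property names come from the example output.

```typescript
// Patch properties as shown in the example output.
interface Patch {
  field: string;
  from: string;
  to: string;
  confidence: number;
  stability: number;
}

// Assumed scraper config shape: field name mapped to its CSS selector.
type SelectorConfig = Record<string, string>;

interface ApplyResult {
  config: SelectorConfig;
  applied: Patch[];
  skipped: Patch[];
}

function applyPatches(
  config: SelectorConfig,
  patches: Patch[],
  minConfidence = 0.8, // assumed threshold, tune for your own risk tolerance
  minStability = 1,
): ApplyResult {
  const next: SelectorConfig = { ...config }; // never mutate the caller's config
  const applied: Patch[] = [];
  const skipped: Patch[] = [];
  for (const patch of patches) {
    const trusted = patch.confidence >= minConfidence && patch.stability >= minStability;
    // Only patch a selector that is still the one the Actor saw as broken.
    const stillCurrent = next[patch.field] === patch.from;
    if (trusted && stillCurrent) {
      next[patch.field] = patch.to;
      applied.push(patch);
    } else {
      skipped.push(patch);
    }
  }
  return { config: next, applied, skipped };
}

// Usage with the patch from the example output.
const pricePatch: Patch = {
  field: "price",
  from: ".price .amount",
  to: '[data-testid="price"]',
  confidence: 0.82,
  stability: 1,
};
const updated = applyPatches({ title: "h1.product-title", price: ".price .amount" }, [pricePatch]);
```

Checking that the configured selector still equals `from` avoids overwriting a selector that someone already fixed by hand.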

How much does it cost to run Selector Auto-Fixer? Is fixing selectors free?

Selector Auto-Fixer runs on Apify’s consumption-based pricing (Compute Units). Cost depends on the number of URLs, fields, reloads, and concurrency. The Actor uses Playwright and blocks images/fonts/media to keep runs fast and resource-efficient. There are no LLM or third-party API calls, so you only pay for compute and platform usage. Check the Pricing section on the Actor page for current rates and free tier. For small runs (a few URLs, a few fields), usage typically stays within free-tier limits.


Input

Selector Auto-Fixer has the following input options. Click the Input tab on the Actor page for the full schema and tooltips.

  • Start URLs (required) — URLs to analyze (e.g. https://example.com/product/123). Use the built-in URL editor to add or bulk-edit.
  • Field definitions — Use the form: Field 1–5 each have Name, CSS selector, fallback selectors (comma-separated), text hint (contains/regex), attribute, value type, near label, in container. At least one field must have both name and selector filled.
  • Validation — Reloads, timeout (ms), wait until (DOM content loaded / load / network idle), min stability, require same node, require same value type.
  • Repair — Max candidates, top K, selector max depth, prefer data attributes, allow :nth-child, max selector length.
  • Debug — Save HTML snapshots, save screenshots, include candidate rankings.
  • Webhook URL — If set, each URL result is POSTed as JSON (url, finalUrl, patches, evidence, timestamp, durationMs, statusSummary, actorRunId) to this URL (e.g. n8n webhook).
  • Concurrency — Max URLs to process in parallel (default 3).
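
When building input programmatically, a quick pre-flight check can catch the two requirements stated above (Start URLs, and at least one field with both name and selector) before a run is started. The `validateInput` function and its messages are a hypothetical sketch; the Actor performs its own validation.

```typescript
interface FieldInput {
  name?: string;
  selector?: string;
  fallbackSelectors?: string[];
}

// Subset of the input, using the property names from the example input.
interface ActorInput {
  startUrls?: { url: string }[];
  fields?: FieldInput[];
  concurrency?: number;
}

// Returns a list of human-readable problems; an empty list means the input looks usable.
function validateInput(input: ActorInput): string[] {
  const problems: string[] = [];
  const urls = input.startUrls ?? [];
  if (urls.length === 0) {
    problems.push("startUrls: at least one URL is required");
  }
  for (const { url } of urls) {
    if (!/^https?:\/\//i.test(url)) {
      problems.push(`startUrls: not an http(s) URL: ${url}`);
    }
  }
  const completeFields = (input.fields ?? []).filter(
    (f) => (f.name ?? "").trim() !== "" && (f.selector ?? "").trim() !== "",
  );
  if (completeFields.length === 0) {
    problems.push("fields: at least one field needs both name and selector");
  }
  const c = input.concurrency;
  if (c !== undefined && (!Number.isInteger(c) || c < 1)) {
    problems.push("concurrency: must be a positive integer");
  }
  return problems;
}

// Usage
const inputProblems = validateInput({
  startUrls: [{ url: "https://example.com/product/123" }],
  fields: [{ name: "title", selector: "h1.product-title" }],
  concurrency: 3,
});
```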

Example input (via API; you can also use the form in the Console):

{
  "startUrls": [{ "url": "https://example.com/product/123" }],
  "fields": [
    {
      "name": "title",
      "selector": "h1.product-title",
      "fallbackSelectors": ["h1", ".title"],
      "textHintContains": "Product",
      "attribute": "text",
      "type": "text"
    },
    {
      "name": "price",
      "selector": ".price .amount",
      "textHintRegex": "\\d+[.,]\\d{2}",
      "type": "price",
      "nearLabel": "Price",
      "inContainer": ".product"
    }
  ],
  "validationReloads": 3,
  "validationTimeoutMs": 45000,
  "concurrency": 3
}

Output

You can download the dataset produced by Selector Auto-Fixer in formats such as JSON, CSV, or Excel from the Dataset tab. The Key-Value Store holds the patches key (old→new selectors per URL).

Example output (one dataset item per URL):

{
  "url": "https://example.com/product/123",
  "timestamp": "2025-02-22T12:00:00.000Z",
  "fields": {
    "title": {
      "status": "ok",
      "oldSelector": "h1.product-title",
      "newSelector": null,
      "confidence": 1,
      "stability": 1,
      "valueSamples": ["Product 123"],
      "metrics": { "textSimilarity": 1, "attributeMatch": 1, "domPathSimilarity": 1, "tagAndClassSimilarity": 1, "contextSimilarity": 1, "totalScore": 1 }
    },
    "price": {
      "status": "repaired",
      "oldSelector": ".price .amount",
      "newSelector": "[data-testid=\"price\"]",
      "confidence": 0.82,
      "stability": 1,
      "valueSamples": ["29.99"],
      "metrics": { "textSimilarity": 1, "attributeMatch": 0.5, "domPathSimilarity": 0.8, "tagAndClassSimilarity": 0.6, "contextSimilarity": 1, "totalScore": 0.82 }
    }
  },
  "patches": [
    { "field": "price", "from": ".price .amount", "to": "[data-testid=\"price\"]", "confidence": 0.82, "stability": 1, "sample": "29.99", "evidence": { "reloads": 3, "existsRate": "3/3", "sameNodeRate": "3/3", "valueTypeRate": "3/3" } }
  ],
  "errors": []
}
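
The `evidence` rates in the patch above (`existsRate`, `sameNodeRate`, `valueTypeRate`) read as "hits out of reloads". One plausible way to derive them from per-reload observations is sketched below. The `ReloadObservation` shape, the fingerprint comparison against the first successful reload, and the `buildEvidence` function are assumptions made for illustration; the Actor's internals may differ.

```typescript
// Hypothetical record of what one reload found for a candidate selector.
interface ReloadObservation {
  exists: boolean;            // did the selector match anything?
  fingerprint: string | null; // some stable identity of the matched node
  value: string | null;       // extracted text or attribute value
}

// Same keys as the `evidence` object in the example output.
interface Evidence {
  reloads: number;
  existsRate: string;
  sameNodeRate: string;
  valueTypeRate: string;
}

function buildEvidence(
  observations: ReloadObservation[],
  isExpectedType: (value: string) => boolean,
): Evidence {
  const total = observations.length;
  // Assumption: the first reload with a match defines the reference node.
  const reference =
    observations.find((o) => o.exists && o.fingerprint !== null)?.fingerprint ?? null;
  const exists = observations.filter((o) => o.exists).length;
  const sameNode = observations.filter(
    (o) => o.exists && reference !== null && o.fingerprint === reference,
  ).length;
  const typed = observations.filter(
    (o) => o.exists && o.value !== null && isExpectedType(o.value),
  ).length;
  const rate = (hits: number): string => `${hits}/${total}`;
  return {
    reloads: total,
    existsRate: rate(exists),
    sameNodeRate: rate(sameNode),
    valueTypeRate: rate(typed),
  };
}

// Usage: three reloads that all find the same node with a price-like value.
const looksLikePrice = (v: string): boolean => /\d+[.,]\d{2}/.test(v);
const steadyEvidence = buildEvidence(
  [
    { exists: true, fingerprint: "span#p1", value: "29.99" },
    { exists: true, fingerprint: "span#p1", value: "29.99" },
    { exists: true, fingerprint: "span#p1", value: "31.49" },
  ],
  looksLikePrice,
);
```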

Tips for best results

  • CI and nightly runs — Run the Actor on critical URLs after deployments or on a schedule; alert when status is failed or confidence drops below a threshold.
  • Store and apply patches — Read the Key-Value Store patches and feed approved replacements into your scraper configuration or selector registry.
  • Use hints — Fill text hint (contains/regex), near label, and in container in the form to improve candidate scoring on noisy pages.
  • Validation — Increase reloads and min stability for stricter acceptance; use “Network idle” only when necessary (slower).
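
For the CI and nightly-run tip, a small gate over the dataset items might look like this. It is a sketch under assumptions: `findAlerts` and the threshold are hypothetical, and the item shape is the subset of the example output that the check needs.

```typescript
// Subset of the dataset item shape from the example output.
interface CheckedField {
  status: "ok" | "repaired" | "failed";
  confidence: number;
}

interface CheckedItem {
  url: string;
  fields: Record<string, CheckedField>;
}

// Collects one message per field that failed or fell below the confidence threshold.
function findAlerts(items: CheckedItem[], minConfidence: number): string[] {
  const alerts: string[] = [];
  for (const item of items) {
    for (const [fieldName, field] of Object.entries(item.fields)) {
      if (field.status === "failed") {
        alerts.push(`${item.url} [${fieldName}]: selector failed and could not be repaired`);
      } else if (field.confidence < minConfidence) {
        alerts.push(
          `${item.url} [${fieldName}]: confidence ${field.confidence} is below ${minConfidence}`,
        );
      }
    }
  }
  return alerts;
}

// Usage: with a 0.9 threshold, the repaired price field from the example output is flagged.
const alerts = findAlerts(
  [
    {
      url: "https://example.com/product/123",
      fields: {
        title: { status: "ok", confidence: 1 },
        price: { status: "repaired", confidence: 0.82 },
      },
    },
  ],
  0.9,
);
```

In a CI job, a non-empty `alerts` array would typically fail the build or trigger a notification.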

Deterministic behavior and no LLM

  • Same input ⇒ same candidate order and results (for the same page content). No random sampling, no AI/LLM calls.
  • Scoring uses fixed weights and deterministic tie-breaks: total score → selector length → DOM order.
  • Selector generation prefers stable attributes (data-testid, data-qa, etc.), then tag + stable classes, then :nth-child only when needed.
  • Validation runs over N reloads and can require same-node fingerprint and value-type stability before accepting a selector.
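
The tie-break order listed above (total score → selector length → DOM order) can be expressed as a single comparator. The direction of each step is an assumption here (higher score first, then shorter selector, then earlier DOM position), and the `Candidate` shape and `rankCandidates` function are hypothetical.

```typescript
// Hypothetical candidate shape for illustration.
interface Candidate {
  selector: string;
  totalScore: number;
  domOrder: number; // position of the element in document order
}

// Because the comparator never depends on input order or randomness,
// the same candidates always produce the same ranking.
function rankCandidates(candidates: Candidate[]): Candidate[] {
  return [...candidates].sort(
    (a, b) =>
      b.totalScore - a.totalScore ||            // 1. higher total score first (assumed)
      a.selector.length - b.selector.length ||  // 2. then shorter selector (assumed)
      a.domOrder - b.domOrder,                  // 3. then earlier in the DOM (assumed)
  );
}

// Usage: three candidates tied on score.
const ranked = rankCandidates([
  { selector: "div.product > span.amount", totalScore: 0.82, domOrder: 4 },
  { selector: '[data-testid="price"]', totalScore: 0.82, domOrder: 9 },
  { selector: '[data-testid="total"]', totalScore: 0.82, domOrder: 2 },
  { selector: "span", totalScore: 0.4, domOrder: 1 },
]);
```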

Run locally and development

npm install
npm run build
npm start

Input is read from Apify’s default key-value store when run as an Actor. For a local test with sample input:

npm run test

This runs the Actor with sample-input.json and a mock Apify environment so you can test the pipeline without deploying.


FAQ, disclaimers, and support

Is Selector Auto-Fixer deterministic?
Yes. Same inputs and page content produce the same candidate order and outputs. No LLMs or random sampling are used.

Does it use an LLM or external APIs?
No. Repair is rule-based: DOM fingerprinting, similarity heuristics, and multi-reload validation only.

What if a field stays failed?
Increase max candidates or top K, add stronger text hint / near label / in container hints, or relax min stability / require same node if appropriate. Check errors in the result for runtime issues.

Where are patches stored?
In the Dataset (inside each URL result as patches) and in the Key-Value Store under the key patches (object mapping URL → list of patches).

Integrations and API
You can call the Actor via the Apify API, use schedules for recurring selector checks, and connect results to other tools using Apify integrations. Data can be transferred programmatically using the API.

Support
If you run into issues or have feedback, use the Issues tab on the Actor page or reach out through Apify support. We’re open to feedback and to creating custom solutions based on this Actor.