Deprecated Html Scanner

Pricing

$4.99/month + usage

Deprecated HTML scanner that detects obsolete elements and attributes on any website — helping developers, SEO teams, and QA engineers modernize code, improve search rankings, and ensure HTML5 compliance.

Rating: 0.0 (0)

Developer: ZeroBreak (Maintained by Community)

Actor stats: 0 bookmarked, 2 total users, 1 monthly active user

Last modified: 7 days ago

Deprecated HTML Scanner — Find Obsolete HTML Elements and Attributes

Deprecated HTML scanner that automatically detects outdated HTML elements and attributes on any website. Identify code quality issues, improve SEO rankings, and ensure your web pages follow modern HTML5 standards. Built with strict cost controls to prevent accidental overuse.

Use Cases

  • SEO auditing — Find deprecated HTML tags that search engines may penalize or ignore
  • Code quality checks — Identify legacy HTML that needs modernization during refactoring
  • Accessibility compliance — Detect outdated elements that may cause accessibility issues
  • Website migrations — Scan sites before migration to identify technical debt
  • Client reporting — Generate automated reports showing deprecated code issues
  • CI/CD integration — Add HTML quality checks to your deployment pipeline

Cost Protection Features

This actor includes multiple safeguards to prevent accidental overuse and unexpected charges:

| Safeguard | Default | Hard Limit | Description |
|---|---|---|---|
| maxUrls | 100 | 1,000 | Maximum URLs processed per run |
| timeoutSecs | 300 (5 min) | 3,600 (1 hour) | Total actor runtime limit |
| requestTimeoutSecs | 30 | 120 | Per-request timeout |

The actor will:

  • Stop gracefully when any limit is reached
  • Log warnings when limits are approached
  • Never exceed the hard limits regardless of input
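
As a rough illustration of how such limits could be enforced, here is a minimal sketch that clamps user input to the defaults and hard limits listed in the table above. The function name `clamp_limits` and the choice to raise values below 1 up to 1 are assumptions made for this example, not the actor's actual code.

```python
# Defaults and hard limits copied from the safeguard table above.
LIMITS = {
    "maxUrls": {"default": 100, "hard_max": 1000},
    "timeoutSecs": {"default": 300, "hard_max": 3600},
    "requestTimeoutSecs": {"default": 30, "hard_max": 120},
}


def clamp_limits(actor_input: dict) -> dict:
    """Return the effective limits for a run.

    Missing values fall back to the default; values above the hard
    limit are reduced to the hard limit.
    """
    effective = {}
    for name, bounds in LIMITS.items():
        requested = actor_input.get(name)
        if requested is None:
            requested = bounds["default"]
        # Assumption: values below 1 are raised to 1 rather than rejected.
        effective[name] = max(1, min(int(requested), bounds["hard_max"]))
    return effective


if __name__ == "__main__":
    print(clamp_limits({"maxUrls": 5000, "timeoutSecs": 180}))
```

With this approach a request for 5,000 URLs would be reduced to 1,000, while an unset requestTimeoutSecs would fall back to 30.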

Input

| Parameter | Type | Default | Description |
|---|---|---|---|
| url | string | - | Single URL to scan for deprecated HTML |
| urls | array | - | List of URLs to scan (one per line) |
| maxUrls | integer | 100 | Maximum URLs to process (hard max: 1000) |
| timeoutSecs | integer | 300 | Actor timeout in seconds (hard max: 3600) |
| requestTimeoutSecs | integer | 30 | Per-request timeout in seconds |
| includeWarnings | boolean | true | Include warnings for discouraged patterns |
| scanAttributes | boolean | true | Scan for deprecated HTML attributes |

Example Input

{
  "urls": [
    "https://example.com",
    "https://example.com/about",
    "https://example.com/contact"
  ],
  "maxUrls": 50,
  "timeoutSecs": 180,
  "includeWarnings": true,
  "scanAttributes": true
}

Output

The actor stores results in a dataset. Each entry contains:

{
  "url": "https://example.com",
  "status": "success",
  "totalIssues": 5,
  "deprecatedElements": [
    {
      "type": "deprecated_element",
      "element": "center",
      "recommendation": "Use CSS text-align: center instead",
      "count": 3
    },
    {
      "type": "deprecated_element",
      "element": "font",
      "recommendation": "Use CSS for styling",
      "count": 2
    }
  ],
  "deprecatedAttributes": [
    {
      "type": "deprecated_attribute",
      "attribute": "bgcolor",
      "element": "body",
      "recommendation": "Use CSS background-color",
      "count": 1
    }
  ],
  "warnings": [
    {
      "type": "warning",
      "attribute": "style",
      "count": 15,
      "recommendation": "Inline styles - consider using external CSS"
    }
  ],
  "error": null,
  "scannedAt": "2024-01-15T10:30:00.000Z",
  "processingTimeMs": 245
}

| Field | Type | Description |
|---|---|---|
| url | string | The URL that was scanned |
| status | string | Scan status: success, error, timeout |
| totalIssues | integer | Total deprecated elements + attributes + warnings |
| deprecatedElements | array | List of deprecated HTML elements found |
| deprecatedAttributes | array | List of deprecated HTML attributes found |
| warnings | array | Discouraged patterns (inline styles, event handlers) |
| error | string | Error message if scan failed |
| scannedAt | string | ISO timestamp of the scan |
| processingTimeMs | integer | Processing time in milliseconds |
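
For reporting across many pages, dataset entries with the fields described above can be rolled up into site-wide totals. The sketch below assumes the entries have already been loaded into a list of dictionaries; the `summarize` helper is a hypothetical name introduced here and is not part of the actor.

```python
from collections import Counter


def summarize(items: list[dict]) -> dict:
    """Aggregate dataset entries into per-status and per-name totals."""
    statuses = Counter()
    elements = Counter()
    attributes = Counter()
    for item in items:
        statuses[item.get("status", "unknown")] += 1
        # Failed scans may carry empty or missing lists, hence the `or []`.
        for found in item.get("deprecatedElements") or []:
            elements[found["element"]] += found.get("count", 0)
        for found in item.get("deprecatedAttributes") or []:
            attributes[found["attribute"]] += found.get("count", 0)
    return {
        "statuses": dict(statuses),
        "elements": dict(elements),
        "attributes": dict(attributes),
    }


if __name__ == "__main__":
    sample = [
        {
            "url": "https://example.com",
            "status": "success",
            "deprecatedElements": [{"element": "center", "count": 3}],
            "deprecatedAttributes": [{"attribute": "bgcolor", "element": "body", "count": 1}],
        },
        {"url": "https://example.com/slow", "status": "timeout"},
    ]
    print(summarize(sample))
```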

Deprecated Elements Detected

The scanner detects these obsolete HTML elements:

  • <font>, <center>, <big>, <strike>, <tt> — Use CSS instead
  • <marquee>, <blink> — Use CSS animations
  • <frame>, <frameset>, <noframes> — Use modern layouts or iframes
  • <applet> — Use <object> or <embed>
  • <acronym> — Use <abbr>
  • <dir> — Use <ul> with CSS
  • And 20+ more obsolete elements
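
To make the element check concrete, here is a minimal sketch that counts the elements named in the list above. It uses Python's standard-library `html.parser` to stay dependency-free, which is a substitution: the actor itself is described as using BeautifulSoup with lxml. The mapping covers only the elements listed here, and the names `DEPRECATED_ELEMENTS` and `scan_elements` are assumptions for illustration.

```python
from collections import Counter
from html.parser import HTMLParser

# Only the elements named in the list above; the actor's full list is longer.
DEPRECATED_ELEMENTS = {
    "font": "Use CSS instead",
    "center": "Use CSS instead",
    "big": "Use CSS instead",
    "strike": "Use CSS instead",
    "tt": "Use CSS instead",
    "marquee": "Use CSS animations",
    "blink": "Use CSS animations",
    "frame": "Use modern layouts or iframes",
    "frameset": "Use modern layouts or iframes",
    "noframes": "Use modern layouts or iframes",
    "applet": "Use <object> or <embed>",
    "acronym": "Use <abbr>",
    "dir": "Use <ul> with CSS",
}


class DeprecatedElementCounter(HTMLParser):
    """Counts start tags whose name appears in DEPRECATED_ELEMENTS."""

    def __init__(self):
        super().__init__()
        self.counts = Counter()

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag names before calling this method.
        if tag in DEPRECATED_ELEMENTS:
            self.counts[tag] += 1


def scan_elements(html: str) -> list[dict]:
    """Return one consolidated entry per deprecated element found."""
    parser = DeprecatedElementCounter()
    parser.feed(html)
    parser.close()
    return [
        {
            "type": "deprecated_element",
            "element": tag,
            "recommendation": DEPRECATED_ELEMENTS[tag],
            "count": count,
        }
        for tag, count in sorted(parser.counts.items())
    ]


if __name__ == "__main__":
    print(scan_elements("<center><font>a</font></center><FONT>b</FONT>"))
```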

Deprecated Attributes Detected

The scanner detects these obsolete HTML attributes:

  • bgcolor, background, text, link, vlink, alink — Use CSS colors
  • align, valign — Use CSS flexbox or grid
  • border, cellpadding, cellspacing — Use CSS borders and spacing
  • hspace, vspace, marginwidth, marginheight — Use CSS margins
  • width, height (except on <img>, <video>, <canvas>) — Use CSS
  • And 10+ more obsolete attributes
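
The attribute check, including the width/height exception, could look roughly like the sketch below. As with any standalone illustration, this is written against the standard-library `html.parser` rather than the BeautifulSoup/lxml stack the actor is described as using. Matching attributes regardless of which element carries them is an assumption of this example, as are the names `DEPRECATED_ATTRIBUTES`, `SIZE_ALLOWED_ON`, and `scan_attributes`.

```python
from collections import Counter
from html.parser import HTMLParser

# Only the attributes named in the list above; the actor's full list is longer.
DEPRECATED_ATTRIBUTES = {
    "bgcolor", "background", "text", "link", "vlink", "alink",
    "align", "valign",
    "border", "cellpadding", "cellspacing",
    "hspace", "vspace", "marginwidth", "marginheight",
}
SIZE_ATTRIBUTES = {"width", "height"}
SIZE_ALLOWED_ON = {"img", "video", "canvas"}  # exception stated in the list above


class DeprecatedAttributeCounter(HTMLParser):
    """Counts deprecated attributes, keyed by (attribute, element)."""

    def __init__(self):
        super().__init__()
        self.counts = Counter()

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases both tag and attribute names.
        for name, _value in attrs:
            if name in DEPRECATED_ATTRIBUTES:
                self.counts[(name, tag)] += 1
            elif name in SIZE_ATTRIBUTES and tag not in SIZE_ALLOWED_ON:
                self.counts[(name, tag)] += 1


def scan_attributes(html: str) -> list[dict]:
    """Return one consolidated entry per (attribute, element) pair found."""
    parser = DeprecatedAttributeCounter()
    parser.feed(html)
    parser.close()
    return [
        {"type": "deprecated_attribute", "attribute": attr, "element": tag, "count": count}
        for (attr, tag), count in sorted(parser.counts.items())
    ]


if __name__ == "__main__":
    page = '<body bgcolor="#fff"><table width="100"></table><img width="10"></body>'
    print(scan_attributes(page))
```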

How It Works

  1. Accepts a single URL or a batch of URLs as input
  2. Enforces cost limits (maxUrls, timeoutSecs) before processing
  3. Fetches each page with a configurable request timeout
  4. Parses HTML using BeautifulSoup with the lxml parser
  5. Scans for deprecated elements, attributes, and warning patterns
  6. Consolidates results by element/attribute type with counts
  7. Pushes results to the dataset immediately (streaming output)
  8. Stops gracefully when limits are reached
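
The control flow in the steps above can be sketched as a generator that yields one result per URL and stops once a limit is hit. This is a simplified illustration under stated assumptions: `fetch` and `scan` are caller-supplied placeholders standing in for the real HTTP and parsing steps, the `issues` field is a simplification of the actual output schema, and `run_scan` is a hypothetical name.

```python
import time
from typing import Callable, Iterator


def run_scan(
    urls: list[str],
    fetch: Callable[[str], str],
    scan: Callable[[str], list],
    max_urls: int = 100,
    timeout_secs: float = 300,
    clock: Callable[[], float] = time.monotonic,
) -> Iterator[dict]:
    """Yield one result per URL until max_urls or the run deadline is reached."""
    deadline = clock() + timeout_secs
    for url in urls[:max_urls]:  # step 2: cap the batch before processing
        if clock() >= deadline:
            break  # step 8: stop gracefully; earlier results were already yielded
        started = clock()
        try:
            html = fetch(url)  # step 3: caller applies the per-request timeout
            result = {"url": url, "status": "success", "issues": scan(html), "error": None}
        except TimeoutError as exc:
            result = {"url": url, "status": "timeout", "issues": [], "error": str(exc)}
        except Exception as exc:  # keep going after any other per-page failure
            result = {"url": url, "status": "error", "issues": [], "error": str(exc)}
        result["processingTimeMs"] = int((clock() - started) * 1000)
        yield result  # step 7: results stream out one at a time
```

Yielding results one at a time mirrors the streaming behavior described in step 7: a caller can push each result to storage as soon as it arrives, so a run cut short by the deadline still keeps everything produced before it.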

FAQ

How do I prevent accidental high charges?
The actor has built-in safeguards: a default limit of 100 URLs, a 5-minute timeout, and hard limits that cannot be exceeded. Always set maxUrls and timeoutSecs to values appropriate for your needs.

Can I scan JavaScript-rendered pages?
This actor uses HTTP requests and parses static HTML. For JavaScript-heavy SPAs, you would need a browser-based actor, for example one built on Playwright.

What counts as a "warning" vs "deprecated"?
Deprecated elements and attributes are officially obsolete in HTML5. Warnings cover patterns that still work but are discouraged, such as inline styles or onclick handlers.

How are duplicate URLs handled?
URLs are automatically deduplicated before processing. Trailing slashes are normalized.

What happens if a page times out?
The actor records a timeout status for that URL and continues with the next one. It won't crash or waste time on slow pages.
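
The URL deduplication mentioned in the FAQ above could work along the lines of this minimal sketch. It assumes that normalization means only trimming whitespace and stripping trailing slashes from the path; any further rules the actor applies (such as case folding or fragment removal) are not documented here. The helper names `normalize_url` and `dedupe_urls` are hypothetical.

```python
from urllib.parse import urlsplit, urlunsplit


def normalize_url(url: str) -> str:
    """Trim whitespace and strip trailing slashes from the URL path."""
    parts = urlsplit(url.strip())
    return urlunsplit(parts._replace(path=parts.path.rstrip("/")))


def dedupe_urls(urls: list[str]) -> list[str]:
    """Return normalized URLs with duplicates removed, keeping first-seen order."""
    seen = set()
    unique = []
    for url in urls:
        key = normalize_url(url)
        if key and key not in seen:
            seen.add(key)
            unique.append(key)
    return unique


if __name__ == "__main__":
    print(dedupe_urls([
        "https://example.com/",
        "https://example.com",
        "https://example.com/about/",
    ]))
```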

Integrations

Connect Deprecated HTML Scanner with other apps and services using Apify integrations. You can integrate with Make, Zapier, Slack, Airbyte, GitHub, Google Sheets, Google Drive, and many more. You can also use webhooks to trigger actions whenever results are available.