Html Lang Validator

Pricing

$4.99/month + usage

HTML lang validator that checks any webpage for missing or invalid lang attributes, so developers and SEO teams can fix language tag errors across large sites without clicking through pages one by one.

Developer

ZeroBreak

Maintained by Community

HTML Lang Validator: Check and Fix Missing Lang Attributes on Any Website

HTML Lang Validator checks every URL you give it for a missing or invalid lang attribute on the <html> tag. The lang attribute tells browsers and screen readers what language a page is written in. Leave it out and the page fails WCAG Success Criterion 3.1.1 (Language of Page), screen readers may apply the wrong pronunciation rules, and search engines and translation tools get a weaker signal about the page's language. Many site audits catch broken links and slow pages but skip the lang check entirely.

Point the actor at one URL or feed it a list of hundreds. It fetches each page, reads the lang attribute, and validates the value against BCP 47 rules. Results land in a dataset with the exact lang value found, HTTP status, page title, and a plain list of any issues detected.

Use cases

  • SEO auditing: find pages with missing lang attributes before they hurt international SEO performance
  • Accessibility compliance: flag lang attribute errors that break screen reader language detection
  • Site migrations: validate lang attributes across every page after a CMS or template change
  • Multilingual site QA: confirm each language variant has the correct lang value set
  • Bulk content audits: check hundreds of URLs without opening each page by hand

Input

| Parameter | Type | Default | Description |
|---|---|---|---|
| url | string | - | Single URL to validate |
| urls | array | - | List of URLs to validate, one per line |
| maxUrls | integer | 100 | Maximum URLs to process per run (max: 1000) |
| requestTimeoutSecs | integer | 30 | Per-request timeout in seconds |
| proxyConfiguration | object | Datacenter (Anywhere) | Proxy type and location for requests. Supports Datacenter, Residential, Special, and custom proxies. Optional. |

Example input

{
    "urls": [
        "https://apify.com",
        "https://apify.com/store",
        "https://apify.com/about"
    ],
    "maxUrls": 100,
    "requestTimeoutSecs": 30,
    "proxyConfiguration": { "useApifyProxy": true }
}
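
If you generate the input from a script, a small helper can enforce the documented limits before the run starts. This is a minimal sketch: build_run_input is a hypothetical helper name, not part of the actor or any Apify library, and it only assumes the parameter names and the 1000-URL ceiling from the Input table.

```python
import json

MAX_URLS_LIMIT = 1000  # ceiling taken from the Input table


def build_run_input(urls, max_urls=100, timeout_secs=30):
    """Return a dict shaped like the example input (hypothetical helper)."""
    if not 1 <= max_urls <= MAX_URLS_LIMIT:
        raise ValueError(f"maxUrls must be between 1 and {MAX_URLS_LIMIT}")
    # Drop blank entries, e.g. empty lines from a pasted URL list.
    cleaned = [u.strip() for u in urls if u and u.strip()]
    if not cleaned:
        raise ValueError("at least one URL is required")
    return {
        "urls": cleaned,
        "maxUrls": max_urls,
        "requestTimeoutSecs": timeout_secs,
        "proxyConfiguration": {"useApifyProxy": True},
    }


run_input = build_run_input(["https://apify.com", "  ", "https://apify.com/store"])
print(json.dumps(run_input, indent=4))
```

The printed JSON can be pasted into the input editor or passed to whichever client you use to start the run.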

What data does this actor collect?

Each result in the dataset contains:

{
    "url": "https://apify.com",
    "httpStatus": 200,
    "pageTitle": "Apify: Full-stack web scraping and data extraction platform",
    "langValue": "en",
    "xmlLangValue": null,
    "isLangPresent": true,
    "isLangValid": true,
    "issues": [],
    "hasIssues": false,
    "error": null,
    "checkedAt": "2025-03-04T10:23:41.123456+00:00"
}

| Field | Type | Description |
|---|---|---|
| url | string | Final URL after any redirects |
| httpStatus | integer | HTTP response status code |
| pageTitle | string | Page title from the <title> tag |
| langValue | string | Value of the lang attribute. Null if missing. |
| xmlLangValue | string | Value of xml:lang (used in XHTML documents). Null if not present. |
| isLangPresent | boolean | True if a lang attribute exists on the <html> tag |
| isLangValid | boolean | True if the lang value passes BCP 47 validation |
| issues | array | List of validation issues found for this URL |
| hasIssues | boolean | True if any issues were detected |
| error | string | Error message if the page could not be fetched. Null otherwise. |
| checkedAt | string | ISO 8601 timestamp of when the check ran |
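
Once the results are exported, the fields above are enough to triage a run. The sketch below is illustrative only: the sample records and the issue wording are made up, and summarize is a hypothetical helper that relies solely on the documented field names.

```python
from collections import Counter

# Sample records shaped like the dataset items above; values are invented.
items = [
    {"url": "https://example.com/", "langValue": "en",
     "hasIssues": False, "issues": [], "error": None},
    {"url": "https://example.com/fr", "langValue": None,
     "hasIssues": True, "issues": ["placeholder issue text"], "error": None},
    {"url": "https://example.com/down", "langValue": None,
     "hasIssues": False, "issues": [], "error": "placeholder fetch error"},
]


def summarize(records):
    """Group URLs into unfetched and flagged pages, and count lang values."""
    failed = [r["url"] for r in records if r.get("error")]
    flagged = [r["url"] for r in records
               if not r.get("error") and r.get("hasIssues")]
    lang_counts = Counter(r["langValue"] for r in records if r.get("langValue"))
    return {"failed": failed, "flagged": flagged, "langCounts": dict(lang_counts)}


summary = summarize(items)
print(summary)
```

Separating fetch failures from lang issues keeps a timeout from being mistaken for a missing attribute.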

How it works

  1. Collects URLs from url and urls inputs, deduplicates them, and caps at maxUrls
  2. Fetches each page with an async HTTP client using a realistic browser user-agent
  3. Parses the HTML with BeautifulSoup and reads the lang attribute from the <html> tag
  4. Validates the value against BCP 47 language tag rules
  5. Flags issues: missing attribute, empty value, or invalid format
  6. Pushes results to the dataset in real time as each URL finishes
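
Steps 1 and 3 can be sketched with the standard library alone. This is not the actor's code: it swaps BeautifulSoup for Python's built-in html.parser so the example has no dependencies, it skips the network fetch, and collect_urls, HtmlLangParser, and read_lang are hypothetical names.

```python
from html.parser import HTMLParser


def collect_urls(url=None, urls=None, max_urls=100):
    """Step 1: merge both inputs, drop duplicates in order, cap at max_urls."""
    merged = ([url] if url else []) + list(urls or [])
    unique = list(dict.fromkeys(u.strip() for u in merged if u and u.strip()))
    return unique[:max_urls]


class HtmlLangParser(HTMLParser):
    """Step 3: record lang and xml:lang from the first <html> start tag."""

    def __init__(self):
        super().__init__()
        self.seen_html = False
        self.lang = None
        self.xml_lang = None

    def handle_starttag(self, tag, attrs):
        # html.parser lowercases tag and attribute names before this call.
        if tag == "html" and not self.seen_html:
            self.seen_html = True
            attr_map = dict(attrs)
            self.lang = attr_map.get("lang")
            self.xml_lang = attr_map.get("xml:lang")


def read_lang(html_text):
    """Return (lang, xml_lang); each is None when the attribute is absent."""
    parser = HtmlLangParser()
    parser.feed(html_text)
    return parser.lang, parser.xml_lang


print(collect_urls("https://a.example", ["https://a.example", "https://b.example"]))
print(read_lang('<!DOCTYPE html><html lang="en-US"><head></head></html>'))
```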

Integrations

Connect HTML Lang Validator with other apps using Apify integrations. You can pipe results into Google Sheets, trigger Slack alerts via Make or Zapier, or connect to Airbyte for data warehouse ingestion. Use webhooks to trigger downstream actions as soon as a run completes.

FAQ

What counts as a valid lang attribute? The actor validates against BCP 47 language tags. Values like en, en-US, fr, zh-Hant, and pt-BR all pass. Values like english, EN_US, or an empty string fail.
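
To try values locally before a run, a simplified format check is often enough. The pattern below is an assumption, not the actor's validator: it covers only the common language[-script][-region][-variant] shape, limits the primary subtag to 2 or 3 letters, and ignores extlang, extension, private-use, and grandfathered tags that the full BCP 47 grammar allows. It also does not look subtags up in the IANA registry.

```python
import re

# Simplified BCP 47 shape (assumption): language[-script][-region][-variant...]
LANG_TAG = re.compile(
    r"[A-Za-z]{2,3}"                                # language: en, zh, fil
    r"(?:-[A-Za-z]{4})?"                            # script: Hant, Latn
    r"(?:-(?:[A-Za-z]{2}|[0-9]{3}))?"               # region: US, BR, 419
    r"(?:-(?:[A-Za-z0-9]{5,8}|[0-9][A-Za-z0-9]{3}))*"  # variants: 1901, rozaj
)


def is_well_formed(tag):
    """True if tag matches the simplified pattern; None and '' return False."""
    return bool(tag) and LANG_TAG.fullmatch(tag) is not None


for value in ["en", "en-US", "fr", "zh-Hant", "pt-BR", "english", "EN_US", ""]:
    print(repr(value), is_well_formed(value))
```

The first five values pass and the last three fail, matching the examples in the answer above.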

What happens if a page returns a 404 or 5xx error? The actor records the HTTP status and writes an error message to the result. It keeps processing the rest of the list rather than stopping.
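
The keep-going behaviour can be pictured as each URL being checked inside its own error boundary. The sketch below uses asyncio with a stand-in fetch function; check_all and fake_fetch are hypothetical names, the error strings are invented, and the actor's real HTTP client and messages may differ.

```python
import asyncio


async def check_all(urls, fetch):
    """Check every URL; a failure on one URL never cancels the others."""

    async def check_one(url):
        try:
            status = await fetch(url)
        except Exception as exc:  # timeout, DNS failure, connection reset
            return {"url": url, "httpStatus": None, "error": str(exc)}
        error = f"HTTP {status}" if status >= 400 else None
        return {"url": url, "httpStatus": status, "error": error}

    # gather returns results in the same order as the input URLs.
    return await asyncio.gather(*(check_one(u) for u in urls))


async def fake_fetch(url):
    """Stand-in for a real HTTP request (hypothetical, no network access)."""
    if "timeout" in url:
        raise TimeoutError("request timed out")
    return 404 if url.endswith("/missing") else 200


results = asyncio.run(check_all(
    ["https://example.com/", "https://example.com/missing", "https://example.com/timeout"],
    fake_fetch,
))
for row in results:
    print(row)
```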

Can it handle XHTML pages that use xml:lang instead of lang? Yes. The actor reads both lang and xml:lang. If only xml:lang is found, it flags the missing lang attribute and notes the xml:lang value in the result.
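
As a rough illustration of how lang and xml:lang could map onto the result fields, consider the sketch below. describe_lang is a hypothetical helper and the issue messages are invented; only the field names come from the output table.

```python
def describe_lang(lang, xml_lang):
    """Build the lang-related result fields (issue wording is illustrative)."""
    issues = []
    if lang is None:
        issues.append("lang attribute is missing on <html>")
        if xml_lang:
            issues.append(f"xml:lang is set to '{xml_lang}' but lang is not")
    elif not lang.strip():
        issues.append("lang attribute is empty")
    return {
        "langValue": lang,
        "xmlLangValue": xml_lang,
        "isLangPresent": lang is not None,
        "issues": issues,
        "hasIssues": bool(issues),
    }


print(describe_lang(None, "fr"))
print(describe_lang("en", "en"))
```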

How many URLs can it process in one run? Up to 1,000 per run. maxUrls defaults to 100, so raise it for larger lists or lower it to keep test runs small.

Does it follow redirects? Yes. The actor follows HTTP redirects automatically and reports the final URL after all hops.

Can I run this on a schedule? Yes. Set up a scheduled run in Apify Console to monitor your site's lang attribute compliance over time and catch regressions after deployments.


Run the HTML lang validator on your site and get a full report on missing or invalid lang attributes. Export results as JSON or CSV, or push them directly to Google Sheets for review.