W3C Html Reporter
3 days trial then $25.00/month - No credit card required now

service-paradis/w3c-html-reporter

Get HTML validity reports for web pages using the W3C HTML validator.

W3C HTML Validity Reporter

The W3C HTML Validity Reporter is an Apify actor that checks the HTML of the given web pages against the W3C HTML Validator. The actor takes web page URLs as input and produces a report with detailed information on the validity of each page's HTML.

Input

The actor takes the following input:

  • startUrls (Array, required): The URLs of the web pages to validate.
  • proxy (Object): Proxy configuration. You can edit this to use Apify proxy, or provide your own proxy servers. Default value is { "useApifyProxy": false }.
  • debug (Boolean): See detailed logs when activated. Default value is false.
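
The three input fields above can be assembled programmatically. The following is a minimal sketch, not part of the actor itself: the helper name build_run_input is hypothetical, and the defaults simply mirror the documented ones (useApifyProxy false, debug false).

```python
from typing import Any


def build_run_input(
    urls: list[str],
    use_apify_proxy: bool = False,
    debug: bool = False,
) -> dict[str, Any]:
    """Build an input object with the three documented fields.

    Hypothetical helper: the name and signature are illustrative only.
    """
    if not urls:
        # startUrls is the only required field.
        raise ValueError("startUrls is required: pass at least one URL")
    return {
        "startUrls": [{"url": url} for url in urls],
        "proxy": {"useApifyProxy": use_apify_proxy},
        "debug": debug,
    }


run_input = build_run_input(["https://example.com"])
```

Calling the helper with only a list of URLs yields the same shape as the default configuration documented on this page.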

Output

The actor generates a JSON report on the validity of each web page's HTML. The report includes:

  • A list of messages given by the validator

Usage

To use the actor, you'll need an Apify account. If you don't have one, sign up for free on the Apify website.

Once you have an account, you can run the actor by creating a new task with the following configuration:

{
  "startUrls": [
    {
      "url": "https://example.com"
    }
  ],
  "proxy": {
    "useApifyProxy": false
  },
  "debug": false
}

Replace "https://example.com" with the URL of the webpage you want to validate.

Please note that the W3C validator uses Cloudflare to protect its website against bots. You may need to enable Apify proxy for this actor to run successfully.
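
As an alternative to creating a task in the console, the actor could also be started from a script. The sketch below assumes the apify-client Python package, whose client exposes actor(...).call(run_input=...) and dataset(...).iterate_items(). The function name run_reporter is hypothetical, and the proxy is switched on here only because of the Cloudflare note above.

```python
from typing import Any

# Actor name as shown on this page.
ACTOR_ID = "service-paradis/w3c-html-reporter"


def run_reporter(client: Any, urls: list[str]) -> list[dict[str, Any]]:
    """Run the actor through an ApifyClient-like object and return the dataset items.

    Hypothetical helper: `client` is assumed to behave like apify_client.ApifyClient.
    """
    run_input = {
        "startUrls": [{"url": url} for url in urls],
        # Apify proxy enabled because the validator is protected by Cloudflare.
        "proxy": {"useApifyProxy": True},
        "debug": False,
    }
    # Start the run and wait for it to finish.
    run = client.actor(ACTOR_ID).call(run_input=run_input)
    if run is None:
        raise RuntimeError("the actor run did not return a result")
    # Each validator message is stored as one item in the run's default dataset.
    return list(client.dataset(run["defaultDatasetId"]).iterate_items())
```

With apify-client installed, a call might look like run_reporter(ApifyClient("<YOUR_API_TOKEN>"), ["https://example.com"]), where the token placeholder stands for your own API token.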

Results example

The output from scraping the W3C validator is stored in the dataset. Each message is stored as a separate item in the dataset. After the run finishes, you can download the scraped data to your computer or export it to any web app in various data formats (JSON, CSV, XML, RSS, HTML Table). Here are a few examples of the output you can get:

{
  "url": "https://apify.com",
  "language": "en",
  "severity": "info",
  "lastLine": 10,
  "firstColumn": 301,
  "lastColumn": 357,
  "message": "Trailing slash on void elements has no effect and interacts badly with unquoted attribute values.",
  "markup": "rowser.\"/><meta name=\"twitter:card\" content=\"summary_large_image\"/><meta ",
  "highlightIndex": 10,
  "highlightLength": 57
}

{
  "url": "https://apify.com",
  "language": "en",
  "severity": "warning",
  "firstLine": 614,
  "lastLine": 614,
  "firstColumn": 5684,
  "lastColumn": 5721,
  "message": "Section lacks heading. Consider using “h2”-“h6” elements to add identifying headings to all sections, or else use a “div” element instead for any cases where no heading is needed.",
  "markup": "-0 wwExY\"><section class=\"sc-1913faef-1 jYOdxN\"><div c",
  "highlightIndex": 10,
  "highlightLength": 38
}

{
  "url": "https://apify.com",
  "language": "en",
  "severity": "error",
  "lastLine": 10,
  "firstColumn": 1210,
  "lastColumn": 1272,
  "message": "A “meta” element with an “http-equiv” attribute whose value is “X-UA-Compatible” must have a “content” attribute with the value “IE=edge”.",
  "markup": "ent=\"24\"/><meta http-equiv=\"X-UA-Compatible\" content=\"IE=edge,chrome=1\"/><meta ",
  "highlightIndex": 10,
  "highlightLength": 63
}

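
Once the items are downloaded as JSON, they can be summarized with a few lines of code. The sketch below counts messages per severity and formats a line:column location for each message; the helper names summarize and location are hypothetical. It assumes that an item without firstLine starts on its lastLine, which matches two of the example items above but is not stated by the actor's documentation.

```python
from collections import Counter
from typing import Any


def summarize(items: list[dict[str, Any]]) -> Counter:
    """Count validator messages per severity level."""
    return Counter(item["severity"] for item in items)


def location(item: dict[str, Any]) -> str:
    """Format 'line:column' for a message; firstLine may be absent."""
    # Assumption: a missing firstLine means the message starts on lastLine.
    line = item.get("firstLine", item["lastLine"])
    return f"{line}:{item['firstColumn']}"


# Trimmed copies of the three example items above.
items = [
    {"severity": "info", "lastLine": 10, "firstColumn": 301},
    {"severity": "warning", "firstLine": 614, "lastLine": 614, "firstColumn": 5684},
    {"severity": "error", "lastLine": 10, "firstColumn": 1210},
]

counts = summarize(items)
error_locations = [location(item) for item in items if item["severity"] == "error"]
```

Filtering on severity equal to "error" is one way to fail a build only on hard validation errors while still reporting warnings and info messages.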
Developer
Maintained by Community
Actor metrics
  • Created in Jun 2023
  • Modified 11 months ago