Wikipedia Revision History Scraper
Deprecated

Pricing

$20.00/month + usage

Scrape the revision history of any Wikipedia page, including metadata and diffs for each revision.

Developer

ViewSource

Maintained by Community

9 months ago

Last modified

Scrapes the revision history of any Wikipedia page, including diffs and metadata for each revision.

Input

Field | Type | Required | Description
wikipediaPage | string | Yes | Wikipedia page URL (e.g. "https://en.wikipedia.org/wiki/LangChain")
limit | integer | No | Maximum number of revisions to fetch (default: 50)
includeDiff | boolean | No | Whether to include the diff between revisions in the output (default: true)

Example input

{
  "wikipediaPage": "https://en.wikipedia.org/wiki/LangChain",
  "limit": 10,
  "includeDiff": true
}
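
As a minimal sketch, the input object can be assembled and sanity-checked in Python before it is submitted. The helper name `build_actor_input` and its validation rules are assumptions made for illustration; the actor's own input checks are not documented here. Only the three field names and their defaults come from the table above.

```python
import json
from urllib.parse import urlparse


def build_actor_input(wikipedia_page: str, limit: int = 50, include_diff: bool = True) -> dict:
    """Assemble an input object from the three documented fields.

    The checks below are illustrative assumptions, not the actor's own rules.
    """
    parsed = urlparse(wikipedia_page)
    host = parsed.netloc.lower()
    if parsed.scheme not in ("http", "https"):
        raise ValueError(f"not an http(s) URL: {wikipedia_page!r}")
    if host != "wikipedia.org" and not host.endswith(".wikipedia.org"):
        raise ValueError(f"not a Wikipedia URL: {wikipedia_page!r}")
    # A page URL is expected to look like /wiki/<Title>.
    if not parsed.path.startswith("/wiki/") or len(parsed.path) <= len("/wiki/"):
        raise ValueError(f"URL does not point at a page: {wikipedia_page!r}")
    if limit < 1:
        raise ValueError("limit must be a positive integer")
    return {
        "wikipediaPage": wikipedia_page,
        "limit": limit,
        "includeDiff": include_diff,
    }


if __name__ == "__main__":
    actor_input = build_actor_input("https://en.wikipedia.org/wiki/LangChain", limit=10)
    print(json.dumps(actor_input, indent=2))
```

Omitting `limit` and `include_diff` falls back to the documented defaults (50 and true).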

Output

The actor outputs a JSON array where each object represents a revision. Each object contains:

  • revid: revision ID
  • parentid: parent revision ID
  • minor: whether the revision was a minor edit
  • user: username of the user who made the revision
  • timestamp: timestamp of the revision
  • comment: edit summary
  • tags: array of tags associated with the revision (e.g. ["visualeditor"])
  • size: size of the page after the revision, in bytes
  • size_diff: difference between the size of the page after and before the revision, in bytes
  • diff_raw: raw HTML diff between the revision and its parent (if includeDiff is true)
  • diff_parsed: array of line changes between the revision and its parent (if includeDiff is true)
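
The field list above maps naturally onto a typed record. The sketch below is one possible representation, not part of the actor: the `Revision` class, the `parse_revision` helper, and the `parent_size` property are hypothetical names. It assumes timestamps use the UTC form `YYYY-MM-DDTHH:MM:SSZ` and that `size_diff` is the size after the revision minus the size before it.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass
class Revision:
    revid: int
    parentid: int
    minor: bool
    user: str
    timestamp: datetime
    comment: str
    tags: list
    size: int
    size_diff: int
    # Present only when the actor was run with includeDiff set to true.
    diff_raw: Optional[str] = None
    diff_parsed: Optional[list] = None

    @property
    def parent_size(self) -> int:
        """Page size before this revision, assuming size_diff = after - before."""
        return self.size - self.size_diff


def parse_revision(record: dict) -> Revision:
    """Convert one output object into a Revision (timestamp format is assumed)."""
    timestamp = datetime.strptime(
        record["timestamp"], "%Y-%m-%dT%H:%M:%SZ"
    ).replace(tzinfo=timezone.utc)
    return Revision(
        revid=record["revid"],
        parentid=record["parentid"],
        minor=record["minor"],
        user=record["user"],
        timestamp=timestamp,
        comment=record["comment"],
        tags=list(record.get("tags", [])),
        size=record["size"],
        size_diff=record["size_diff"],
        diff_raw=record.get("diff_raw"),
        diff_parsed=record.get("diff_parsed"),
    )
```

Reading the two diff fields with `dict.get` keeps the parser usable for runs where `includeDiff` was false and those keys are absent.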

Example output

[
  {
    "revid": 123456789,
    "parentid": 123456788,
    "minor": false,
    "user": "ExampleUser",
    "timestamp": "2024-01-01T12:34:56Z",
    "comment": "Fixed typo",
    "tags": ["visualeditor"],
    "size": 12345,
    "size_diff": 10,
    "diff_raw": "<tr>...</tr>",
    "diff_parsed": [
      { "before": "old line", "after": "new line" }
    ]
  }
]
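
Once the output array is available as JSON, it can be summarized with the standard library alone. The sketch below reuses the example output shown above; the `summarize` helper is a hypothetical name, and it assumes each `diff_parsed` entry has the `before`/`after` shape shown in that example.

```python
import json
from collections import Counter

# The example output from above, embedded so the sketch runs on its own.
EXAMPLE_OUTPUT = """
[
  {
    "revid": 123456789,
    "parentid": 123456788,
    "minor": false,
    "user": "ExampleUser",
    "timestamp": "2024-01-01T12:34:56Z",
    "comment": "Fixed typo",
    "tags": ["visualeditor"],
    "size": 12345,
    "size_diff": 10,
    "diff_raw": "<tr>...</tr>",
    "diff_parsed": [
      { "before": "old line", "after": "new line" }
    ]
  }
]
"""


def summarize(revisions: list) -> dict:
    """Count edits per user, total the byte changes, and flatten line changes."""
    edits_per_user = Counter(rev["user"] for rev in revisions)
    net_bytes = sum(rev["size_diff"] for rev in revisions)
    changed_lines = [
        (rev["revid"], change.get("before"), change.get("after"))
        for rev in revisions
        # diff_parsed may be missing when includeDiff was false.
        for change in rev.get("diff_parsed") or []
    ]
    return {
        "edits_per_user": dict(edits_per_user),
        "net_bytes": net_bytes,
        "changed_lines": changed_lines,
    }


if __name__ == "__main__":
    print(summarize(json.loads(EXAMPLE_OUTPUT)))
```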