HTML Table to JSON/CSV Extractor

Pricing

from $1.00 / 1,000 tables extracted

Convert complex web tables into clean, structured JSON or CSV data. Automate data entry and reporting without writing custom parsers.

Developer

Andok

Maintained by Community

HTML Table Extractor

Pull structured table data from any web page and export it as clean JSON or CSV. No custom scraper needed — just provide URLs containing HTML tables and get rows and columns as structured data. Process multiple pages in a single run.

Features

  • Automatic table detection — finds all <table> elements on each page
  • Header recognition — detects <th> headers or uses the first row as column names
  • Bulk processing — extract tables from multiple URLs in one run
  • Clean JSON output — each table row becomes a structured object with named fields
  • Configurable concurrency — process 1 to 50 URLs in parallel
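
To make the detection and header rules above concrete, here is a minimal sketch using only Python's standard `html.parser`. It is not the actor's actual implementation: the `TableParser` class and `extract_tables` function are hypothetical names, and the sketch assumes well-formed markup with explicit closing tags, no nested tables, and no `colspan`/`rowspan` handling.

```python
from html.parser import HTMLParser


class TableParser(HTMLParser):
    """Collects every <table> as {"headers": [...], "rows": [[...], ...]}."""

    def __init__(self):
        super().__init__()
        self.tables = []
        self._rows = None          # (has_th, cells) pairs of the open table
        self._row = None           # cells of the open row
        self._cell = None          # text fragments of the open cell
        self._row_has_th = False

    def handle_starttag(self, tag, attrs):
        if tag == "table":
            self._rows = []
        elif tag == "tr" and self._rows is not None:
            self._row, self._row_has_th = [], False
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []
            self._row_has_th = self._row_has_th or tag == "th"

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            # Collapse runs of whitespace inside the cell text.
            self._row.append(" ".join("".join(self._cell).split()))
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self._rows.append((self._row_has_th, self._row))
            self._row = None
        elif tag == "table" and self._rows is not None:
            self.tables.append(self._finish(self._rows))
            self._rows = None

    @staticmethod
    def _finish(tagged_rows):
        if not tagged_rows:
            return {"headers": [], "rows": []}
        # Prefer the first row that contains a <th>; otherwise use row 0.
        header_idx = next(
            (i for i, (has_th, _) in enumerate(tagged_rows) if has_th), 0
        )
        headers = tagged_rows[header_idx][1]
        rows = [c for i, (_, c) in enumerate(tagged_rows) if i != header_idx]
        return {"headers": headers, "rows": rows}


def extract_tables(html):
    """Return a list of table dicts found in an HTML string."""
    parser = TableParser()
    parser.feed(html)
    parser.close()
    return parser.tables
```

The output shape mirrors the `headers`/`rows` structure shown in the Output Example; real pages with merged cells or nested tables would need more careful handling than this sketch provides.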

Input

Field          | Type    | Required | Default | Description
urls           | array   | No       | -       | List of webpage URLs to extract tables from
url            | string  | No       | -       | Single URL for backwards compatibility (use urls for bulk)
timeoutSeconds | integer | No       | 15      | Maximum seconds to wait for each URL response
concurrency    | integer | No       | 10      | Number of URLs to process in parallel (1-50)

Input Example

{
  "urls": [
    "https://en.wikipedia.org/wiki/List_of_countries_by_population_(United_Nations)"
  ]
}
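
If you build the input programmatically, a small helper can apply the documented defaults and the 1-50 concurrency range before a run is started. The sketch below is only illustrative: `build_run_input` is a hypothetical name, and the client-side checks (including requiring at least one URL) are assumptions, since the listing does not say how the actor itself treats missing or out-of-range values.

```python
def build_run_input(urls, timeout_seconds=15, concurrency=10):
    """Build an input dict using the field names from the Input table.

    Defaults (15 seconds, 10 parallel URLs) follow the documented defaults.
    The validation rules are assumptions made for this sketch.
    """
    if isinstance(urls, str):
        urls = [urls]  # accept a single URL for convenience
    urls = [u.strip() for u in urls if u and u.strip()]
    if not urls:
        raise ValueError("at least one URL is required")
    if not 1 <= concurrency <= 50:
        raise ValueError("concurrency must be between 1 and 50")
    if timeout_seconds <= 0:
        raise ValueError("timeout_seconds must be positive")
    return {
        "urls": urls,
        "timeoutSeconds": timeout_seconds,
        "concurrency": concurrency,
    }
```

The returned dict has the same shape as the input example and could be passed as the run input through whichever Apify client or API call you normally use.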

Output

Each URL produces one dataset item containing all tables found on the page.

Key output fields:

  • inputUrl (string) — the original URL provided
  • finalUrl (string) — the URL after following redirects
  • status (number) — HTTP status code
  • tableCount (number) — number of tables found on the page
  • tables (array) — array of table objects, each containing headers and rows
  • error (string) — error message if extraction failed, otherwise null
  • checkedAt (string) — ISO 8601 timestamp
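
For downstream code, the fields listed above can be written down as types and used to separate successful pages from failed ones. This is a sketch under assumptions: the `Table` and `PageResult` type names and the `split_by_outcome` function are hypothetical, and the listing does not say which fields are populated when extraction fails, so the code relies only on `error` being null on success.

```python
from typing import List, Optional, Tuple, TypedDict


class Table(TypedDict):
    headers: List[str]
    rows: List[List[str]]


class PageResult(TypedDict):
    inputUrl: str
    finalUrl: str
    status: int
    tableCount: int
    tables: List[Table]
    error: Optional[str]   # None (JSON null) when extraction succeeded
    checkedAt: str         # ISO 8601 timestamp


def split_by_outcome(
    items: List[PageResult],
) -> Tuple[List[PageResult], List[PageResult]]:
    """Return (succeeded, failed), treating a non-empty `error` as failure."""
    succeeded, failed = [], []
    for item in items:
        (failed if item.get("error") else succeeded).append(item)
    return succeeded, failed
```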

Output Example

{
  "inputUrl": "https://en.wikipedia.org/wiki/List_of_countries_by_population_(United_Nations)",
  "finalUrl": "https://en.wikipedia.org/wiki/List_of_countries_by_population_(United_Nations)",
  "status": 200,
  "tableCount": 2,
  "tables": [
    {
      "headers": ["Rank", "Country", "Population"],
      "rows": [
        ["1", "India", "1,450,935,791"],
        ["2", "China", "1,419,321,278"]
      ]
    }
  ],
  "error": null,
  "checkedAt": "2025-01-15T10:30:00.000Z"
}
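
In the example above, each table carries `headers` plus `rows` as arrays of cell strings. If you need objects keyed by column name, or CSV text for a spreadsheet, a short post-processing step is enough. The following is a minimal sketch using Python's standard `csv` module; `rows_as_dicts` and `table_to_csv` are hypothetical helper names, not part of the actor.

```python
import csv
import io


def rows_as_dicts(table):
    """Pair each row with the table headers to get one dict per row.

    zip() stops at the shorter sequence, so extra cells in ragged rows are
    dropped, and duplicate header names overwrite earlier values.
    """
    headers = table["headers"]
    return [dict(zip(headers, row)) for row in table["rows"]]


def table_to_csv(table):
    """Serialize one table (headers first, then rows) as CSV text."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(table["headers"])
    writer.writerows(table["rows"])
    return buf.getvalue()
```

Cells that contain commas, such as the population figures in the example, are quoted automatically by the `csv` writer, so they stay in a single column when imported into Excel or Google Sheets.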

Pricing

Event           | Cost
Table Extracted | Pay-per-event (see actor pricing page)

Use Cases

  • Data collection — grab financial data, sports stats, or product specs from web pages without writing a scraper
  • Spreadsheet import — convert HTML tables to CSV or JSON for import into Excel or Google Sheets
  • Research automation — extract tabular data from Wikipedia, government sites, or academic pages
  • Price monitoring — pull pricing tables from competitor websites

Related Actors

Actor                                   | What it adds
Web Page to Markdown Converter for LLMs | Converts full page content to Markdown, including tables
JSON-LD Schema Extractor                | Extracts structured data from Schema.org markup