Harris County Court Records Scraper

Pricing

Pay per event

Scrape public Harris County District Clerk new civil and criminal filing records into structured datasets.

Developer

Stas Persiianenko

Maintained by Community

Extract public Harris County District Clerk court filing records from the anonymous public records portal. The actor focuses on fast, reliable new-filing monitoring for civil and criminal listings without logging in and without attempting to bypass CAPTCHA-protected document downloads.

What does Harris County Court Records Scraper do?

Harris County Court Records Scraper collects rows from the public Harris County District Clerk record listings.

It currently supports:

  • ⚖️ Today's new civil filings
  • 🚔 Today's new criminal filings
  • 📄 Case number and style fields
  • 📅 File dates
  • 🏛️ Court and case-region columns
  • 🔎 Type of action or offense
  • 🧾 Optional public viewer tokens found in listing buttons

Who is it for?

Legal operations teams can monitor new Harris County matters every day.

Litigation researchers can collect newly filed civil case metadata for intake, market mapping, or docket monitoring.

Journalists and public-record researchers can build repeatable datasets from county-level court records.

Compliance teams can watch local criminal filing activity without manually opening the District Clerk site.

Data vendors can enrich broader public-record pipelines with county-specific filing rows.

Why use this actor?

The public portal is an ASP.NET WebForms site. Manual scraping requires handling large HTML pages, public listing tables, inconsistent whitespace, and entity decoding.

This actor wraps the public pages into a simple Apify dataset.

You get structured rows, clear source provenance, and a repeatable run input.

Data source

The actor reads anonymous public pages from:

https://www.hcdistrictclerk.com/edocs/public/Search.aspx

The new-filing listing URLs are:

  • Search.aspx?NewSuits=0 (civil new filings)
  • Search.aspx?NewSuits=1 (criminal new filings)

The actor does not log in.

The actor does not solve CAPTCHA.

The actor does not download protected documents.

Data fields

| Field | Description |
| --- | --- |
| recordType | Civil or criminal new-filing mode. |
| sourceUrl | Public listing URL used for the row. |
| county | Harris. |
| state | Texas. |
| courtSystem | Harris County District Clerk. |
| caseNumber | Case/cause number as displayed. |
| style | Case style, such as plaintiff vs. defendant or State of Texas vs. defendant. |
| fileDate | Filing date displayed by the portal. |
| court | Court column value. |
| caseRegion | Civil or Criminal region value. |
| typeOfActionOrOffense | Civil action type or criminal offense text. |
| detailToken | Optional encrypted public case-detail token from the listing HTML. |
| documentImageToken | Optional encrypted public image-viewer token when present. |
| scrapedAt | ISO timestamp when the row was scraped. |
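
For downstream processing, the fields above can be mirrored in a small typed structure. This is a minimal sketch and not part of the actor: `FilingRecord` and `parse_record` are hypothetical names, and the date formats are assumed from the example output in this README (`M/D/YYYY` for `fileDate`, ISO 8601 with a trailing `Z` for `scrapedAt`).

```python
from dataclasses import dataclass
from datetime import date, datetime
from typing import Optional


@dataclass
class FilingRecord:
    """Typed view of one dataset row (hypothetical helper, not part of the actor)."""
    record_type: str
    case_number: str
    style: str
    file_date: date
    court: str  # kept as text so values like "061" keep their leading zero
    case_region: str
    type_of_action_or_offense: str
    scraped_at: datetime
    detail_token: Optional[str] = None
    document_image_token: Optional[str] = None


def parse_record(row: dict) -> FilingRecord:
    """Convert a raw dataset row into a FilingRecord.

    Assumes fileDate looks like "6/12/2026" and scrapedAt looks like
    "2026-06-13T08:26:29.085Z", as in the example output.
    """
    return FilingRecord(
        record_type=row["recordType"],
        case_number=row["caseNumber"],
        style=row["style"],
        file_date=datetime.strptime(row["fileDate"], "%m/%d/%Y").date(),
        court=row["court"],
        case_region=row["caseRegion"],
        type_of_action_or_offense=row["typeOfActionOrOffense"],
        # Swap the trailing "Z" for an explicit offset so older Python versions can parse it.
        scraped_at=datetime.fromisoformat(row["scrapedAt"].replace("Z", "+00:00")),
        # Token fields are only present when includeViewerTokens is enabled.
        detail_token=row.get("detailToken"),
        document_image_token=row.get("documentImageToken"),
    )
```

Keeping `court` as a string is deliberate: converting it to an integer would drop the leading zero shown in the example output.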

How much does it cost to scrape Harris County court records?

The actor uses pay-per-event pricing.

There is a small start event per run and a per-record event for each saved row.

Use a low maxItems value for testing.

Increase maxItems for daily monitoring or back-office exports.

Final platform pricing is visible on the Apify Store page before you start a run.
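
The two billing events combine into a simple linear estimate. The sketch below shows only the arithmetic; `estimate_run_cost` is a hypothetical helper, and the prices passed to it are placeholders that you would replace with the values shown on the Apify Store page.

```python
def estimate_run_cost(saved_rows: int, start_price: float, record_price: float) -> float:
    """Estimate one run's cost: one start event plus one event per saved row.

    start_price and record_price are supplied by the caller; they are
    placeholders here, not the actor's real rates.
    """
    if saved_rows < 0:
        raise ValueError("saved_rows must be non-negative")
    return start_price + saved_rows * record_price
```

Because `maxItems` caps the number of saved rows, calling this with `saved_rows` set to your `maxItems` value gives an upper bound for a single run.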

Input options

recordTypes

Choose which listing pages to scrape.

Allowed values:

  • civilNewFilings
  • criminalNewFilings

maxItems

Maximum records saved across all selected listing pages.

Default: 50.

Maximum: 500.

includeViewerTokens

Advanced option.

When enabled, the actor includes encrypted public tokens from the site's case-detail and document-viewer buttons.

These tokens are useful for traceability and debugging.

They are not direct stable document URLs.
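
If you start runs from code, it can help to validate the input before sending it. The following is a minimal sketch: `build_run_input` is a hypothetical helper, and the lower bound of 1 for `maxItems` and the requirement of at least one record type are assumptions, since only the default and maximum are documented above.

```python
ALLOWED_RECORD_TYPES = {"civilNewFilings", "criminalNewFilings"}
DEFAULT_MAX_ITEMS = 50   # documented default
MAX_ITEMS_LIMIT = 500    # documented maximum


def build_run_input(record_types, max_items=DEFAULT_MAX_ITEMS, include_viewer_tokens=False):
    """Validate the three documented options and return a run-input dict."""
    record_types = list(record_types)
    unknown = [t for t in record_types if t not in ALLOWED_RECORD_TYPES]
    # Assumption: at least one record type must be selected.
    if not record_types or unknown:
        raise ValueError(
            f"recordTypes must be a non-empty subset of {sorted(ALLOWED_RECORD_TYPES)}"
        )
    # Assumption: the minimum is 1; only the maximum of 500 is documented.
    if not 1 <= max_items <= MAX_ITEMS_LIMIT:
        raise ValueError(f"maxItems must be between 1 and {MAX_ITEMS_LIMIT}")
    return {
        "recordTypes": record_types,
        "maxItems": max_items,
        "includeViewerTokens": bool(include_viewer_tokens),
    }
```

The returned dictionary has the same shape as the example input and can be passed as `run_input` to the Apify client.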

Example input

{
    "recordTypes": ["civilNewFilings", "criminalNewFilings"],
    "maxItems": 25,
    "includeViewerTokens": false
}

Example output

{
    "recordType": "civilNewFilings",
    "sourceUrl": "https://www.hcdistrictclerk.com/edocs/public/Search.aspx?NewSuits=0",
    "county": "Harris",
    "state": "Texas",
    "courtSystem": "Harris County District Clerk",
    "caseNumber": "202639824- 7",
    "style": "LE, AHN vs. COLONIAL COUNTY MUTUAL INSURANCE COMPANY",
    "fileDate": "6/12/2026",
    "court": "061",
    "caseRegion": "Civil",
    "typeOfActionOrOffense": "Motor Vehicle Accident",
    "scrapedAt": "2026-06-13T08:26:29.085Z"
}

How to run

  1. Open the actor on Apify.
  2. Select one or both record types.
  3. Set maxItems.
  4. Leave viewer tokens disabled unless you need them.
  5. Click Start.
  6. Export the dataset as JSON, CSV, Excel, or via API.

Tips for better results

Run the actor after the county portal has posted new records for the day.

Select both civil and criminal modes for broad daily monitoring.

Use separate scheduled runs if you want separate datasets per record type.

Keep maxItems small for smoke tests.

Use scrapedAt to compare daily snapshots.

Integrations

Use this actor with Apify schedules for daily filing monitoring.

Connect the dataset to Google Sheets for paralegal review queues.

Send dataset exports to a data warehouse for public-record trend analysis.

Trigger a webhook when new rows are scraped.

Feed records into a deduplication workflow keyed by caseNumber and recordType.
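
A deduplication step keyed this way can be sketched in a few lines. `deduplicate` is a hypothetical helper that operates on exported dataset rows; stripping surrounding whitespace from `caseNumber` before comparing is an assumption, not documented actor behavior.

```python
def deduplicate(rows):
    """Keep the first row seen for each (caseNumber, recordType) pair."""
    seen = set()
    unique = []
    for row in rows:
        # Assumption: surrounding whitespace in caseNumber is not significant.
        key = (row["caseNumber"].strip(), row["recordType"])
        if key in seen:
            continue
        seen.add(key)
        unique.append(row)
    return unique
```

Including `recordType` in the key keeps a civil and a criminal row apart even if they happen to share a case number.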

API usage with Node.js

import { ApifyClient } from 'apify-client';
// Read the API token from the environment rather than hard-coding it.
const client = new ApifyClient({ token: process.env.APIFY_TOKEN });
// Start the actor and wait for the run to finish.
const run = await client.actor('automation-lab/harris-county-court-records-scraper').call({
    recordTypes: ['civilNewFilings', 'criminalNewFilings'],
    maxItems: 25,
});
// Fetch the saved rows from the run's default dataset.
const { items } = await client.dataset(run.defaultDatasetId).listItems();
console.log(items);

API usage with Python

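from apify_client import ApifyClient
# Replace the placeholder with your own Apify API token.
client = ApifyClient('MY-APIFY-TOKEN')
# Start the actor and wait for the run to finish.
run = client.actor('automation-lab/harris-county-court-records-scraper').call(run_input={
    'recordTypes': ['civilNewFilings', 'criminalNewFilings'],
    'maxItems': 25,
})
# Fetch the saved rows from the run's default dataset.
items = client.dataset(run['defaultDatasetId']).list_items().items
print(items)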

API usage with cURL

curl -X POST "https://api.apify.com/v2/acts/automation-lab~harris-county-court-records-scraper/runs?token=$APIFY_TOKEN" \
    -H 'Content-Type: application/json' \
    -d '{"recordTypes":["civilNewFilings","criminalNewFilings"],"maxItems":25}'

MCP usage

Use the Apify MCP server to call this actor from Claude Code, Claude Desktop, Cursor, or VS Code.

MCP URL:

https://mcp.apify.com/?tools=automation-lab/harris-county-court-records-scraper

Claude Code setup

Add the Apify MCP server with HTTP transport:

claude mcp add apify-harris-county-court-records --transport http "https://mcp.apify.com/?tools=automation-lab/harris-county-court-records-scraper"

Claude Desktop setup

Add this server to your Claude Desktop MCP configuration:

{
    "mcpServers": {
        "apify-harris-county-court-records": {
            "url": "https://mcp.apify.com/?tools=automation-lab/harris-county-court-records-scraper"
        }
    }
}

Restart Claude Desktop after saving the configuration.

Cursor setup

In Cursor, open MCP settings, add a new HTTP server, name it apify-harris-county-court-records, and use this URL:

https://mcp.apify.com/?tools=automation-lab/harris-county-court-records-scraper

VS Code setup

In VS Code with an MCP-capable assistant extension, add a new HTTP MCP server named apify-harris-county-court-records and paste the same Apify MCP URL:

https://mcp.apify.com/?tools=automation-lab/harris-county-court-records-scraper

Example prompts:

  • "Run the Harris County Court Records Scraper for today's civil filings and summarize action types."
  • "Collect today's criminal filings and group them by court."
  • "Compare yesterday's dataset with today's new Harris County filings."
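
The grouping that these prompts ask for can also be done locally on exported rows. The sketch below uses a hypothetical `count_by` helper; pass `"court"` to group by court or `"typeOfActionOrOffense"` to summarize action types.

```python
from collections import Counter


def count_by(rows, field):
    """Count dataset rows per value of `field`, most common first."""
    # Rows that lack the field are counted under an empty string.
    return Counter(row.get(field, "") for row in rows).most_common()
```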

Scheduling

A common setup is one daily run for civil filings and one daily run for criminal filings.

Use Apify schedules to run after business hours or in the early morning.

Store historical datasets for daily deltas.
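
A daily delta is the set of rows in today's dataset whose key was not present in the previous one. This is a minimal sketch with a hypothetical `new_filings` helper; it assumes both snapshots are lists of dataset rows and that `caseNumber` plus `recordType` identifies a filing.

```python
def new_filings(previous_rows, current_rows):
    """Return rows from current_rows whose key is absent from previous_rows."""
    def key(row):
        # Assumption: caseNumber plus recordType identifies a filing.
        return (row["caseNumber"], row["recordType"])

    known = {key(row) for row in previous_rows}
    return [row for row in current_rows if key(row) not in known]
```

Passing an empty list as `previous_rows` returns every current row, which is the expected result for the first run in a series.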

Limitations

The actor extracts public listing rows only.

It does not access private accounts.

It does not solve CAPTCHA.

It does not download court documents.

It does not guarantee that the county website has posted new records at the time of your run.

FAQ

Does this actor require a Harris County account?

No. It uses anonymous public listing pages only.

Does it download court documents?

No. Document flows can include additional controls, so this actor limits scope to public listing metadata.

Can I monitor civil and criminal filings together?

Yes. Select both civilNewFilings and criminalNewFilings in recordTypes.

Why do I see fewer records than maxItems?

The public portal may expose fewer records than requested for the selected listing pages at run time.

Troubleshooting

If the run returns fewer rows than expected, increase maxItems or check whether the public portal currently lists enough filings.

If the run returns zero rows, the county portal may be temporarily unavailable or the public table markup may have changed.

If you need document downloads, this actor is not the right tool because the portal protects document flows with additional controls.

Legality and responsible use

This actor is designed for publicly available court-record metadata.

Review the Harris County District Clerk terms and applicable laws before using data in production.

Do not use scraped data for unlawful discrimination, harassment, or prohibited background-check workflows.

Respect privacy and data-retention requirements that apply to your organization.

Dataset exports

Apify datasets can be exported as:

  • JSON
  • JSONL
  • CSV
  • Excel
  • XML
  • RSS

Use CSV or Excel for legal operations review.

Use JSONL for data pipelines.
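
JSONL stores one JSON object per line, which lets a pipeline process rows one at a time. If you already have the rows in memory, for example from the Python client, a hypothetical `to_jsonl` helper like the one sketched here produces the same layout locally.

```python
import json


def to_jsonl(rows):
    """Serialize rows as JSON Lines: one JSON object per line."""
    # ensure_ascii=False keeps non-ASCII characters readable in the output.
    return "".join(json.dumps(row, ensure_ascii=False) + "\n" for row in rows)
```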

Reliability notes

The scraper uses HTTP requests instead of a browser.

This keeps runs inexpensive and quick.

The source pages are large, so the parser targets the public searchTable element directly.

Whitespace and HTML entities are normalized before records are saved.
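
The actor's own parser is not shown here, but this kind of normalization can be sketched with the Python standard library. `normalize_cell` is a hypothetical helper that illustrates one way to do it: decode entities first, then collapse whitespace.

```python
import html
import re


def normalize_cell(raw: str) -> str:
    """Decode HTML entities, then collapse runs of whitespace to single spaces."""
    text = html.unescape(raw)
    # \s also matches the non-breaking space that &nbsp; decodes to.
    return re.sub(r"\s+", " ", text).strip()
```

Decoding before collapsing matters: an `&nbsp;` entity only becomes whitespace after it has been decoded.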

Version notes

Version 0.1 focuses on new civil and criminal filing listings.

Future versions may add more public search modes if they can be handled without login, CAPTCHA bypass, or unstable browser automation.