Harris County Court Records Scraper
Pricing
Pay per event
Scrape public Harris County District Clerk new civil and criminal filing records into structured datasets.
Developer
Stas Persiianenko
Maintained by Community
Extract public Harris County District Clerk court filing records from the anonymous public records portal. The actor focuses on fast, reliable new-filing monitoring for civil and criminal listings without logging in and without attempting to bypass CAPTCHA-protected document downloads.
What does Harris County Court Records Scraper do?
Harris County Court Records Scraper collects rows from the public Harris County District Clerk record listings.
It currently supports:
- ⚖️ Today's new civil filings
- 🚔 Today's new criminal filings
- 📄 Case number and style fields
- 📅 File dates
- 🏛️ Court and case-region columns
- 🔎 Type of action or offense
- 🧾 Optional public viewer tokens found in listing buttons
Who is it for?
Legal operations teams can monitor new Harris County matters every day.
Litigation researchers can collect newly filed civil case metadata for intake, market mapping, or docket monitoring.
Journalists and public-record researchers can build repeatable datasets from county-level court records.
Compliance teams can watch local criminal filing activity without manually opening the District Clerk site.
Data vendors can enrich broader public-record pipelines with county-specific filing rows.
Why use this actor?
The public portal is an ASP.NET WebForms site. Manual scraping requires handling large HTML pages, public listing tables, inconsistent whitespace, and entity decoding.
This actor wraps the public pages into a simple Apify dataset.
You get structured rows, clear source provenance, and a repeatable run input.
Data source
The actor reads anonymous public pages from:
https://www.hcdistrictclerk.com/edocs/public/Search.aspx
The new-filing listing URLs are:
- Search.aspx?NewSuits=0
- Search.aspx?NewSuits=1
The actor does not log in.
The actor does not solve CAPTCHA.
The actor does not download protected documents.
Data fields
| Field | Description |
|---|---|
| recordType | Civil or criminal new-filing mode. |
| sourceUrl | Public listing URL used for the row. |
| county | Harris. |
| state | Texas. |
| courtSystem | Harris County District Clerk. |
| caseNumber | Case/cause number as displayed. |
| style | Case style, such as plaintiff vs. defendant or State of Texas vs. defendant. |
| fileDate | Filing date displayed by the portal. |
| court | Court column value. |
| caseRegion | Civil or Criminal region value. |
| typeOfActionOrOffense | Civil action type or criminal offense text. |
| detailToken | Optional encrypted public case-detail token from the listing HTML. |
| documentImageToken | Optional encrypted public image-viewer token when present. |
| scrapedAt | ISO timestamp when the row was scraped. |
How much does it cost to scrape Harris County court records?
The actor uses pay-per-event pricing.
There is a small start event per run and a per-record event for each saved row.
Use a low maxItems value for testing.
Increase maxItems for daily monitoring or back-office exports.
Final platform pricing is visible on the Apify Store page before you start a run.
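The pricing model above reduces to one start charge plus one charge per saved row. A minimal sketch of that arithmetic follows; the function name and both prices are hypothetical placeholders, not the actor's actual rates, which are shown on the Apify Store page.

```python
from decimal import Decimal


def estimate_run_cost(saved_rows: int, start_price: Decimal, record_price: Decimal) -> Decimal:
    """Estimate one run's cost: a single start event plus one event per saved row."""
    if saved_rows < 0:
        raise ValueError("saved_rows must be non-negative")
    return start_price + record_price * saved_rows


# Hypothetical prices for illustration only; real prices are on the Store page.
EXAMPLE_START_PRICE = Decimal("0.01")
EXAMPLE_RECORD_PRICE = Decimal("0.002")

example_cost = estimate_run_cost(50, EXAMPLE_START_PRICE, EXAMPLE_RECORD_PRICE)
```

Because the per-record term scales with saved rows, lowering maxItems is the direct way to cap the cost of a test run.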
Input options
recordTypes
Choose which listing pages to scrape.
Allowed values:
- civilNewFilings
- criminalNewFilings
maxItems
Maximum records saved across all selected listing pages.
Default: 50.
Maximum: 500.
includeViewerTokens
Advanced option.
When enabled, the actor includes encrypted public tokens from the site's case-detail and document-viewer buttons.
These tokens are useful for traceability and debugging.
They are not direct stable document URLs.
Example input
    {
        "recordTypes": ["civilNewFilings", "criminalNewFilings"],
        "maxItems": 25,
        "includeViewerTokens": false
    }
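If you build the run input in code, it can help to check it against the documented limits before starting a run. The sketch below is a hypothetical client-side helper, not part of the actor. It assumes that at least one record type must be selected and that maxItems must be at least 1; the documentation only states the default of 50 and the maximum of 500.

```python
ALLOWED_RECORD_TYPES = {"civilNewFilings", "criminalNewFilings"}
DEFAULT_MAX_ITEMS = 50   # documented default
MAX_ITEMS_LIMIT = 500    # documented maximum


def build_run_input(record_types, max_items=DEFAULT_MAX_ITEMS, include_viewer_tokens=False):
    """Return a run-input dict, rejecting values outside the documented limits."""
    record_types = list(record_types)
    # Assumption: an empty selection is not useful, so it is rejected here.
    if not record_types:
        raise ValueError("select at least one record type")
    unknown = set(record_types) - ALLOWED_RECORD_TYPES
    if unknown:
        raise ValueError(f"unknown record types: {sorted(unknown)}")
    # Assumption: the lower bound of 1 is not stated in the documentation.
    if not 1 <= max_items <= MAX_ITEMS_LIMIT:
        raise ValueError(f"maxItems must be between 1 and {MAX_ITEMS_LIMIT}")
    return {
        "recordTypes": record_types,
        "maxItems": max_items,
        "includeViewerTokens": include_viewer_tokens,
    }
```

The returned dict has the same shape as the example input and can be passed as the run input in the API examples.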
Example output
    {
        "recordType": "civilNewFilings",
        "sourceUrl": "https://www.hcdistrictclerk.com/edocs/public/Search.aspx?NewSuits=0",
        "county": "Harris",
        "state": "Texas",
        "courtSystem": "Harris County District Clerk",
        "caseNumber": "202639824- 7",
        "style": "LE, AHN vs. COLONIAL COUNTY MUTUAL INSURANCE COMPANY",
        "fileDate": "6/12/2026",
        "court": "061",
        "caseRegion": "Civil",
        "typeOfActionOrOffense": "Motor Vehicle Accident",
        "scrapedAt": "2026-06-13T08:26:29.085Z"
    }
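The two date fields in the example output use different formats: fileDate is kept as the portal displays it, while scrapedAt is an ISO timestamp. A minimal sketch for turning both into Python date objects follows. The helper names are hypothetical, and the month/day/year reading of fileDate is an assumption inferred from the sample row.

```python
from datetime import date, datetime, timezone


def parse_file_date(value: str) -> date:
    """Parse a portal-style date such as '6/12/2026' (assumed month/day/year)."""
    return datetime.strptime(value.strip(), "%m/%d/%Y").date()


def parse_scraped_at(value: str) -> datetime:
    """Parse an ISO 8601 timestamp such as '2026-06-13T08:26:29.085Z' as UTC."""
    # Swap the trailing 'Z' for an explicit offset so this also works
    # on Python versions older than 3.11.
    parsed = datetime.fromisoformat(value.replace("Z", "+00:00"))
    return parsed.astimezone(timezone.utc)
```

Parsing both fields up front makes it easier to sort rows by filing date or to bucket snapshots by scrape day.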
How to run
- Open the actor on Apify.
- Select one or both record types.
- Set maxItems.
- Leave viewer tokens disabled unless you need them.
- Click Start.
- Export the dataset as JSON, CSV, Excel, or via API.
Tips for better results
Run the actor after the county portal has posted new records for the day.
Select both civil and criminal modes for broad daily monitoring.
Use separate scheduled runs if you want separate datasets per record type.
Keep maxItems small for smoke tests.
Use scrapedAt to compare daily snapshots.
Integrations
Use this actor with Apify schedules for daily filing monitoring.
Connect the dataset to Google Sheets for paralegal review queues.
Send dataset exports to a data warehouse for public-record trend analysis.
Trigger a webhook when new rows are scraped.
Feed records into a deduplication workflow keyed by caseNumber and recordType.
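The deduplication step mentioned above can be sketched as follows. This is an illustrative downstream helper with a hypothetical name, not part of the actor, and it assumes case numbers are compared exactly as displayed.

```python
def dedupe_records(records):
    """Keep the first row seen for each (caseNumber, recordType) pair."""
    seen = set()
    unique = []
    for row in records:
        key = (row.get("caseNumber"), row.get("recordType"))
        if key in seen:
            continue  # a row with this key was already kept
        seen.add(key)
        unique.append(row)
    return unique
```

Keeping the first occurrence preserves the earliest scrapedAt value when rows are processed in scrape order.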
API usage with Node.js
    import { ApifyClient } from 'apify-client';

    const client = new ApifyClient({ token: process.env.APIFY_TOKEN });
    const run = await client.actor('automation-lab/harris-county-court-records-scraper').call({
        recordTypes: ['civilNewFilings', 'criminalNewFilings'],
        maxItems: 25,
    });
    const { items } = await client.dataset(run.defaultDatasetId).listItems();
    console.log(items);
API usage with Python
    from apify_client import ApifyClient

    client = ApifyClient('MY-APIFY-TOKEN')
    run = client.actor('automation-lab/harris-county-court-records-scraper').call(run_input={
        'recordTypes': ['civilNewFilings', 'criminalNewFilings'],
        'maxItems': 25,
    })
    items = client.dataset(run['defaultDatasetId']).list_items().items
    print(items)
API usage with cURL
    curl -X POST "https://api.apify.com/v2/acts/automation-lab~harris-county-court-records-scraper/runs?token=$APIFY_TOKEN" \
      -H 'Content-Type: application/json' \
      -d '{"recordTypes":["civilNewFilings","criminalNewFilings"],"maxItems":25}'
MCP usage
Use the Apify MCP server to call this actor from Claude Code, Claude Desktop, Cursor, or VS Code.
MCP URL:
https://mcp.apify.com/?tools=automation-lab/harris-county-court-records-scraper
Claude Code setup
Add the Apify MCP server with HTTP transport:
    claude mcp add apify-harris-county-court-records --transport http "https://mcp.apify.com/?tools=automation-lab/harris-county-court-records-scraper"
Claude Desktop setup
Add this server to your Claude Desktop MCP configuration:
    {
        "mcpServers": {
            "apify-harris-county-court-records": {
                "url": "https://mcp.apify.com/?tools=automation-lab/harris-county-court-records-scraper"
            }
        }
    }
Restart Claude Desktop after saving the configuration.
Cursor setup
In Cursor, open MCP settings, add a new HTTP server, name it apify-harris-county-court-records, and use this URL:
https://mcp.apify.com/?tools=automation-lab/harris-county-court-records-scraper
VS Code setup
In VS Code with an MCP-capable assistant extension, add a new HTTP MCP server named apify-harris-county-court-records and paste the same Apify MCP URL:
https://mcp.apify.com/?tools=automation-lab/harris-county-court-records-scraper
Example prompts:
- "Run the Harris County Court Records Scraper for today's civil filings and summarize action types."
- "Collect today's criminal filings and group them by court."
- "Compare yesterday's dataset with today's new Harris County filings."
Scheduling
A common setup is one daily run for civil filings and one daily run for criminal filings.
Use Apify schedules to run after business hours or early morning.
Store historical datasets for daily deltas.
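A daily delta between two stored datasets can be computed with a set of known keys. The sketch below assumes both datasets were exported as lists of row dicts and that caseNumber plus recordType identifies a filing; the function name is hypothetical.

```python
def new_filings(previous_rows, current_rows):
    """Return rows from current_rows whose key is absent from previous_rows."""
    known = {(row["caseNumber"], row["recordType"]) for row in previous_rows}
    return [
        row
        for row in current_rows
        if (row["caseNumber"], row["recordType"]) not in known
    ]
```

For example, pass yesterday's exported items as previous_rows and today's as current_rows to get only the filings that were not present in the earlier snapshot.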
Limitations
The actor extracts public listing rows only.
It does not access private accounts.
It does not solve CAPTCHA.
It does not download court documents.
It does not guarantee that the county website has posted new records at the time of your run.
FAQ
Does this actor require a Harris County account?
No. It uses anonymous public listing pages only.
Does it download court documents?
No. Document flows can include additional controls, so this actor limits scope to public listing metadata.
Can I monitor civil and criminal filings together?
Yes. Select both civilNewFilings and criminalNewFilings in recordTypes.
Why do I see fewer records than maxItems?
The public portal may expose fewer records than requested for the selected listing pages at run time.
Troubleshooting
If the run returns fewer rows than expected, increase maxItems or check whether the public portal currently lists enough filings.
If the run returns zero rows, the county portal may be temporarily unavailable or the public table markup may have changed.
If you need document downloads, this actor is not the right tool because the portal protects document flows with additional controls.
Legality and responsible use
This actor is designed for publicly available court-record metadata.
Review the Harris County District Clerk terms and applicable laws before using data in production.
Do not use scraped data for unlawful discrimination, harassment, or prohibited background-check workflows.
Respect privacy and data-retention requirements that apply to your organization.
Related scrapers
Other Automation Lab actors that may be useful:
- https://apify.com/automation-lab/court-records-scraper
- https://apify.com/automation-lab/google-search-scraper
- https://apify.com/automation-lab/website-health-report
Dataset exports
Apify datasets can be exported as:
- JSON
- JSONL
- CSV
- Excel
- XML
- RSS
Use CSV or Excel for legal operations review.
Use JSONL for data pipelines.
Reliability notes
The scraper uses HTTP requests instead of a browser.
This keeps runs inexpensive and quick.
The source pages are large, so the parser targets the public searchTable element directly.
Whitespace and HTML entities are normalized before records are saved.
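The normalization step can be illustrated with the Python standard library. This is a sketch of the general technique, not the actor's actual code; the function name is hypothetical, and it assumes the cell text has already been extracted from the table markup.

```python
import html
import re

_WHITESPACE = re.compile(r"\s+")


def normalize_cell(raw: str) -> str:
    """Decode HTML entities, then collapse whitespace runs to single spaces."""
    text = html.unescape(raw)  # e.g. '&amp;' becomes '&', '&nbsp;' becomes U+00A0
    # \s also matches non-breaking spaces in Python 3 string patterns.
    return _WHITESPACE.sub(" ", text).strip()
```

Decoding entities before collapsing whitespace matters because a decoded non-breaking space would otherwise survive as a stray character in the saved value.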
Version notes
Version 0.1 focuses on new civil and criminal filing listings.
Future versions may add more public search modes if they can be handled without login, CAPTCHA bypass, or unstable browser automation.