Senator Financial Disclosures Scraper

Pricing

$10.00 / 1,000 results

Developed by

Kenny Yona

Maintained by Community

This actor scrapes and downloads U.S. House of Representatives financial disclosure PDFs by member last name and/or filing year. Perfect for journalists, researchers, and compliance professionals seeking fast, structured access to official disclosure documents.


Last modified

5 months ago

Senator Disclosures Scraper

This project is an Apify Actor that scrapes financial disclosure PDFs from the U.S. House of Representatives Financial Disclosure Reports site, filtered by member last name and/or filing year.

Features

  • Automated scraping of disclosure PDFs by member last name and/or filing year
  • Form-based input UI in Apify Console (no need to edit JSON)
  • Outputs results to the Apify dataset for further processing
  • Built with Apify SDK, got, and jsdom

Inputs

The Actor expects the following input (via Apify input UI or JSON):

| Field    | Type    | Description             | Required |
|----------|---------|-------------------------|----------|
| lastName | string  | Member's last name      | No       |
| year     | integer | Filing year (1994–2025) | No       |

You can fill in either field or both. If neither is provided, the Actor exits with a warning.
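The input rules above can be sketched as a small validation helper. This is a minimal illustration, not the Actor's actual code: the function name validateInput, the shape of its return value, and the warning messages are all assumptions. Only the field names and the 1994–2025 range come from the input table.

```javascript
// Hypothetical helper: not taken from the Actor's source code.
const MIN_YEAR = 1994; // lower bound from the input table
const MAX_YEAR = 2025; // upper bound from the input table

function validateInput(input) {
    const { lastName, year } = input ?? {};
    // Treat a missing or whitespace-only last name as "not provided".
    const name = typeof lastName === 'string' ? lastName.trim() : '';
    const hasName = name.length > 0;
    const hasYear = year !== undefined && year !== null;

    // Neither field filled: report a warning instead of scraping.
    if (!hasName && !hasYear) {
        return { ok: false, warning: 'Provide lastName, year, or both.' };
    }
    // A provided year must be an integer inside the documented range.
    if (hasYear && (!Number.isInteger(year) || year < MIN_YEAR || year > MAX_YEAR)) {
        return { ok: false, warning: `year must be an integer between ${MIN_YEAR} and ${MAX_YEAR}.` };
    }

    // Keep only the filters that were actually supplied.
    const filters = {};
    if (hasName) filters.lastName = name;
    if (hasYear) filters.year = year;
    return { ok: true, filters };
}

console.log(validateInput({ lastName: 'Pelosi', year: 2025 }));
console.log(validateInput({}));
```

Returning a result object rather than throwing keeps the "exit with a warning" behavior in the caller's hands, which is one plausible way to structure it.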

Outputs

The Actor pushes results to the default dataset. Each item has the following structure (despite its name, the senator field holds the House member's name):

{
  "senator": "Pelosi, Hon.. Nancy",
  "year": 2025,
  "url": "https://disclosures-clerk.house.gov/public_disc/ptr-pdfs/2025/20026590.pdf"
}
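As a rough sketch of how one scraped result row could be turned into an item of this shape, the helper below resolves a link against the site's base URL and normalizes the year to a number. The function name toDatasetItem, its parameters, and the assumption that the results page uses relative links are hypothetical. The base URL is inferred from the example item above.

```javascript
// Hypothetical helper: the Actor's real mapping code is not shown in this README.
// Base URL inferred from the example item above.
const BASE_URL = 'https://disclosures-clerk.house.gov/';

function toDatasetItem(memberName, year, href) {
    return {
        // Field name matches the documented output, even though it holds a House member.
        senator: memberName.trim(),
        // Accept the year as text (e.g. from a table cell) and store it as a number.
        year: Number(year),
        // new URL() resolves relative links and leaves absolute ones unchanged.
        url: new URL(href, BASE_URL).href,
    };
}

const item = toDatasetItem(
    'Pelosi, Hon.. Nancy',
    '2025',
    'public_disc/ptr-pdfs/2025/20026590.pdf',
);
console.log(JSON.stringify(item, null, 2));
```

In the Actor itself, objects like this would be what gets pushed to the default dataset.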

Getting Started

  1. Install dependencies:
    npm install
  2. Run locally:
    apify run
    # or
    npm start
  3. Deploy to Apify:
    • Log in: apify login
    • Deploy: apify push

Deploy to GitHub

  1. Commit your code:
    git add .
    git commit -m "Initial commit"
  2. Create a new GitHub repository and push to it (e.g., using GitHub CLI):
    gh repo create sen-disclosures-scraper --public --source=. --remote=origin --push
  3. If you created the repository another way, push manually:
    git push -u origin main

Project Structure

  • src/main.js — Main Actor logic
  • .actor/actor.json — Apify Actor definition
  • .actor/input_schema.json — Input schema for Apify UI
  • storage/ — Local Apify/Crawlee storage (git-ignored)
  • README.md — Project documentation

Notes

  • The project requires Node.js 18 or later.
  • The Actor is lightweight; 256–512 MB of memory is sufficient.
  • Input is entered through a form in Apify Console, so editing JSON by hand is optional.

Resources