Lobbying Disclosure Scraper - Senate LDA Filings

Pricing

Pay per event

Scrape US federal lobbying disclosure filings from the Senate LDA database. Extract registrants, clients, lobbyists, activities, issue areas, and government entities. Filter by year, period, issue, registrant, or client.

Developer

BowTiedRaccoon

Maintained by Community

Senate LDA Lobbying Disclosure Scraper

Extract US federal lobbying disclosure filings from the Senate Lobbying Disclosure Act database. Covers roughly 1.94 million filings from 1999 to present — registrants, clients, lobbyists, issue areas, government entities contacted, foreign entities, and reported income and expenses.

Lobbying Disclosure Crawler Features

  • Extracts every filing field the Senate LDA publishes — registrant contact info, client details, lobbying activities, issue codes, and financials
  • Flattens nested structures into scalar fields and clean string arrays, so you can pipe the output straight into a spreadsheet or database
  • Filters by year, period, filing type, registrant name, client name, client state, issue area code, specific-issue text search, or date-posted range
  • Returns deduplicated lobbyist rosters and government-entity contact lists across all activities in a filing
  • Pure JSON API — no HTML scraping, no browser, no proxies, no authentication
  • Pay-per-event pricing at about $0.001 per filing
  • Honors the Senate's rate conventions with a 250ms courtesy delay between pages

Who Uses Senate LDA Lobbying Data?

  • Political intelligence firms — track new registrations and issue-area trends for clients monitoring specific legislation
  • Compliance teams — screen vendors and counterparties against the canonical federal lobbying registry before contracts close
  • Investigative journalists — follow the money from foreign entities, affiliated orgs, and quarterly expense reports across years
  • Policy researchers — build issue-area datasets spanning decades of filings without paying for commercial aggregators
  • Advocacy and public-interest orgs — surface which firms represent which clients on which issues, with source links back to the original filings

How the Senate LDA Crawler Works

  1. You pick at least one narrowing filter — a filing year, a registrant name, a client name, a state, an issue code, or a date window.
  2. The crawler queries the Senate LDA API at lda.senate.gov/api/v1/filings/, paginating through the standard DRF envelope until it reaches your maxItems cap or the last page.
  3. Each raw filing is flattened: registrant and client objects collapse into scalar fields, lobbying activities turn into a formatted string array, and nested lobbyists get deduplicated across activities.
  4. Results save to your Apify dataset with a stable schema, one row per filing.
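Step 2 above can be sketched roughly as follows. This is a minimal illustration, not the actor's actual code: it assumes the standard DRF page envelope (`count`, `next`, `previous`, `results`), and `iter_filings` and `fetch_page` are hypothetical names. The fetcher is injected so the loop can be exercised offline with fake pages.

```python
import time
from typing import Callable, Iterator, Optional


def iter_filings(
    fetch_page: Callable[[str], dict],
    start_url: str,
    max_items: int,
    delay_seconds: float = 0.25,  # the courtesy delay between pages
) -> Iterator[dict]:
    """Yield raw filings, following the `next` link until the cap or the last page."""
    url: Optional[str] = start_url
    yielded = 0
    while url and yielded < max_items:
        page = fetch_page(url)  # assumed shape: {"count", "next", "previous", "results"}
        for filing in page.get("results", []):
            if yielded >= max_items:
                return
            yield filing
            yielded += 1
        url = page.get("next")  # None on the last page
        if url and yielded < max_items:
            time.sleep(delay_seconds)


# Offline demo: two fake pages stand in for the real API.
FAKE_PAGES = {
    "page1": {"count": 3, "next": "page2", "previous": None,
              "results": [{"filing_uuid": "a"}, {"filing_uuid": "b"}]},
    "page2": {"count": 3, "next": None, "previous": "page1",
              "results": [{"filing_uuid": "c"}]},
}

first_two = [f["filing_uuid"]
             for f in iter_filings(FAKE_PAGES.__getitem__, "page1", 2, delay_seconds=0)]
all_three = [f["filing_uuid"]
             for f in iter_filings(FAKE_PAGES.__getitem__, "page1", 100, delay_seconds=0)]
print(first_two, all_three)
```

With a cap of 2 the loop stops after the first page and never requests the second one, which is the behaviour you want from a maxItems limit.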

Input

Basic: pull 2024 Q1 filings

{
  "filingYear": 2024,
  "filingPeriod": "first_quarter",
  "maxItems": 100
}

Search by client name

{
  "clientName": "GOOGLE",
  "maxItems": 50
}

Registrations only, filtered by year

{
  "filingYear": 2024,
  "filingType": "RR",
  "maxItems": 200
}

Issue-area slice (all Health filings from a given year)

{
  "filingYear": 2024,
  "issueAreaCode": "HCR",
  "maxItems": 500
}

Date window on when filings were posted

{
  "datePostedFrom": "2024-01-01",
  "datePostedTo": "2024-03-31",
  "maxItems": 1000
}

Input Parameters

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| filingYear | integer | 2024 | Calendar year the filing reports on. Leave empty if using a different narrowing filter. Valid range: 1999-present. |
| filingPeriod | string | "" | Reporting period. One of first_quarter, second_quarter, third_quarter, fourth_quarter, mid_year, year_end, undetermined, or empty for all periods. |
| filingType | string | "" | Filing type code. RR = new registration, Q1-Q4 = quarterly reports, 1A-4A = amendments, 1T-4T = terminations. Empty returns all types. |
| registrantName | string | "" | Case-insensitive contains match on the lobbying firm name. |
| clientName | string | "" | Case-insensitive contains match on the client organization name. |
| clientState | string | "" | Two-letter US state code filtering the client's state. |
| issueAreaCode | string | "" | Three-letter general issue area code (e.g., HCR, BUD, TEC, DEF, TAX). |
| specificIssueSearch | string | "" | Full-text search on the "specific lobbying issues" description. |
| datePostedFrom | string | "" | Return filings posted on or after this date (YYYY-MM-DD). |
| datePostedTo | string | "" | Return filings posted on or before this date (YYYY-MM-DD). |
| maxItems | integer | 100 | Maximum number of filings to return. The API serves 25 per page, so small values finish in one or two requests. |
| proxyConfiguration | object | { useApifyProxy: false } | Proxy settings. The Senate LDA API is public and does not require proxies. |

At least one narrowing filter is required. Running the full corpus unfiltered is blocked — that would be about 78,000 pages and nobody's day goes well after that.
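The narrowing-filter rule can be checked before a run is started. The sketch below is a hypothetical pre-flight check, not the actor's own validation. It counts only the filters named in "How the Senate LDA Crawler Works" as narrowing; whether filingPeriod, filingType, or specificIssueSearch alone would also qualify is not stated, so leaving them out is an assumption.

```python
from typing import List

# Assumption: these are the inputs that count as "narrowing".
NARROWING_FILTERS = (
    "filingYear", "registrantName", "clientName", "clientState",
    "issueAreaCode", "datePostedFrom", "datePostedTo",
)

# Allowed filingPeriod values, taken from the parameter table.
VALID_PERIODS = {
    "", "first_quarter", "second_quarter", "third_quarter",
    "fourth_quarter", "mid_year", "year_end", "undetermined",
}


def validate_input(run_input: dict) -> List[str]:
    """Return a list of problems; an empty list means the input looks runnable."""
    errors = []
    if not any(run_input.get(key) not in (None, "") for key in NARROWING_FILTERS):
        errors.append("at least one narrowing filter is required")
    if run_input.get("filingPeriod", "") not in VALID_PERIODS:
        errors.append("filingPeriod is not a recognised reporting period")
    year = run_input.get("filingYear")
    if isinstance(year, int) and year < 1999:
        errors.append("filingYear must be 1999 or later")
    return errors


print(validate_input({"maxItems": 100}))
print(validate_input({"filingYear": 2024, "filingPeriod": "first_quarter", "maxItems": 100}))
```

Returning a list of messages rather than raising on the first problem lets a caller show every issue at once.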

Lobbying Disclosure Crawler Output Fields

Example Output

{
  "filing_uuid": "467e4a97-6351-4902-8ffa-dd51632e156b",
  "filing_type": "Q1",
  "filing_type_display": "1st Quarter - Report",
  "filing_year": 2024,
  "filing_period": "first_quarter",
  "filing_period_display": "1st Quarter (Jan 1 - Mar 31)",
  "filing_document_url": "https://lda.senate.gov/filings/public/filing/467e4a97-6351-4902-8ffa-dd51632e156b/print/",
  "dt_posted": "2024-01-02T13:14:26-05:00",
  "effective_date": "2023-08-01",
  "termination_date": "",
  "posted_by_name": "Sean Farrell",
  "income": 30000,
  "income_amount": "30000.00",
  "expenses": null,
  "expense_amount": "",
  "expenses_method": "",
  "registrant_id": 401107792,
  "registrant_name": "EAST CAPITOL ADVISORS LLC",
  "registrant_description": "",
  "registrant_address": "921 H Street, NE, #252",
  "registrant_city": "Washington",
  "registrant_state": "District of Columbia",
  "registrant_zip": "20002",
  "registrant_country": "United States of America",
  "registrant_contact_name": "SEAN FARRELL",
  "registrant_contact_phone": "+1 202-944-0520",
  "registrant_house_id": 56170,
  "client_id": 56764,
  "client_name": "CTIA - THE WIRELESS ASSOCIATION",
  "client_description": "CTIA is the trade association of the cellular/wireless industry.",
  "client_state": "District of Columbia",
  "client_country": "United States of America",
  "client_ppb_state": "District of Columbia",
  "client_ppb_country": "United States of America",
  "client_self_select": false,
  "client_is_government_entity": false,
  "lobbyists": [
    "SEAN FARRELL"
  ],
  "lobbying_activities": [
    "TEC - Telecommunications: H.R.3949, End Cells in Cells Act, a bill to increase criminal penalties for contraband cell phones in prisons and jails."
  ],
  "government_entities": [
    "HOUSE OF REPRESENTATIVES"
  ],
  "issue_area_codes": [
    "TEC"
  ],
  "foreign_entities": [],
  "affiliated_organizations": [],
  "conviction_disclosures": [],
  "api_url": "https://lda.senate.gov/api/v1/filings/467e4a97-6351-4902-8ffa-dd51632e156b/",
  "document_url": "https://lda.senate.gov/filings/public/filing/467e4a97-6351-4902-8ffa-dd51632e156b/print/"
}
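As one way to consume rows like the example above, the sketch below totals reported income per client using the raw `income_amount` strings and `Decimal`, which avoids float rounding. It assumes `income_amount` is an empty string when no income was reported, by analogy with `expense_amount` in the example; `income_by_client` is a hypothetical helper, and the second client is a made-up row.

```python
from collections import defaultdict
from decimal import Decimal
from typing import Dict, Iterable


def income_by_client(rows: Iterable[dict]) -> Dict[str, Decimal]:
    """Sum income_amount per client_name, treating missing or empty values as zero."""
    totals: Dict[str, Decimal] = defaultdict(Decimal)  # Decimal() is 0
    for row in rows:
        amount = row.get("income_amount") or "0"
        totals[row["client_name"]] += Decimal(amount)
    return dict(totals)


rows = [
    {"client_name": "CTIA - THE WIRELESS ASSOCIATION", "income_amount": "30000.00"},
    {"client_name": "CTIA - THE WIRELESS ASSOCIATION", "income_amount": ""},
    {"client_name": "EXAMPLE CLIENT", "income_amount": "12500.50"},  # made-up row
]

totals = income_by_client(rows)
print(totals)
```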

Output Field Reference

| Field | Type | Description |
| --- | --- | --- |
| filing_uuid | string | Unique UUID of the filing |
| filing_type | string | Filing type code (RR, Q1, Q2, Q3, Q4, MM, YE, 1A, 2T, etc.) |
| filing_type_display | string | Human-readable filing type (e.g., "Registration", "1st Quarter - Report") |
| filing_year | integer | Calendar year the filing reports on |
| filing_period | string | Reporting period code |
| filing_period_display | string | Human-readable reporting period |
| filing_document_url | string | Filer-submitted document URL |
| dt_posted | string | When the filing was posted to the LDA database (ISO 8601) |
| effective_date | string | Effective date of the client-registrant relationship (ISO 8601) |
| termination_date | string | When the relationship was terminated, if applicable |
| posted_by_name | string | Name of the person who submitted the filing |
| income | number | Reported lobbying income in USD, or null |
| income_amount | string | Raw decimal string for precision-sensitive downstream systems |
| expenses | number | Reported lobbying expenses in USD, or null |
| expense_amount | string | Raw decimal string for expenses |
| expenses_method | string | Method used to calculate expenses (a, b, or c) |
| registrant_id | integer | Internal ID of the registrant |
| registrant_name | string | Registrant (lobbying firm) name |
| registrant_description | string | Registrant's self-description of its business |
| registrant_address | string | Registrant street address (line 1 + line 2, comma-joined) |
| registrant_city | string | Registrant city |
| registrant_state | string | Registrant state/region name |
| registrant_zip | string | Registrant postal code |
| registrant_country | string | Registrant country (display name) |
| registrant_contact_name | string | Registrant primary contact name |
| registrant_contact_phone | string | Registrant primary contact phone |
| registrant_house_id | integer | Registrant's ID in the companion House system, if available |
| client_id | integer | Internal ID of the client organization |
| client_name | string | Client organization name |
| client_description | string | Client's self-description / general business |
| client_state | string | Client state/region name |
| client_country | string | Client country (display name) |
| client_ppb_state | string | Client's principal place of business state |
| client_ppb_country | string | Client's principal place of business country |
| client_self_select | boolean | True if the client self-registered (vs. represented by a firm) |
| client_is_government_entity | boolean | True if the client is itself a government entity |
| lobbyists | string[] | Deduplicated list of lobbyists, each as First Last (covered_position) [NEW] |
| lobbying_activities | string[] | Formatted activity records: <code> - <display>: <description> |
| government_entities | string[] | Deduplicated list of government entities contacted across all activities |
| issue_area_codes | string[] | Unique general issue area codes covered by this filing |
| foreign_entities | string[] | Foreign entities with a financial interest, with country, ownership, and contribution where available |
| affiliated_organizations | string[] | Affiliated organizations that contribute to the lobbying, with city/state/country |
| conviction_disclosures | string[] | Conviction disclosures: Name - offense - date |
| api_url | string | Absolute URL to this filing's own Senate LDA API record |
| document_url | string | Public Senate LDA viewer URL for the filing document |
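The string formats used by `lobbyists`, `lobbying_activities`, `government_entities`, and `issue_area_codes` could be produced along these lines. This is a sketch under assumptions, not the actor's code: the raw key names (`general_issue_code`, `general_issue_code_display`, `lobbyist`, `first_name`, `last_name`, `covered_position`, `new`, `name`) are guesses at the upstream shape, `flatten_activities` is a hypothetical name, and the second activity in the demo is made up.

```python
def flatten_activities(raw_filing: dict) -> dict:
    """Collapse nested activities into the four string arrays, deduplicating in order."""
    lobbyists, entities, codes, activities = [], [], [], []
    for act in raw_filing.get("lobbying_activities", []):
        code = act.get("general_issue_code", "")
        display = act.get("general_issue_code_display", "")
        activities.append(f"{code} - {display}: {act.get('description') or ''}")
        if code and code not in codes:
            codes.append(code)
        for entry in act.get("lobbyists", []):
            person = entry.get("lobbyist", {})
            label = f"{person.get('first_name', '')} {person.get('last_name', '')}".strip()
            if entry.get("covered_position"):
                label += f" ({entry['covered_position']})"
            if entry.get("new"):
                label += " [NEW]"
            if label not in lobbyists:  # same person on several activities appears once
                lobbyists.append(label)
        for entity in act.get("government_entities", []):
            name = entity.get("name", "")
            if name and name not in entities:
                entities.append(name)
    return {
        "lobbyists": lobbyists,
        "lobbying_activities": activities,
        "government_entities": entities,
        "issue_area_codes": codes,
    }


sean = {"lobbyist": {"first_name": "SEAN", "last_name": "FARRELL"},
        "covered_position": None, "new": False}
raw = {"lobbying_activities": [
    {"general_issue_code": "TEC", "general_issue_code_display": "Telecommunications",
     "description": "H.R.3949, End Cells in Cells Act", "lobbyists": [sean],
     "government_entities": [{"name": "HOUSE OF REPRESENTATIVES"}]},
    # Made-up second activity to show deduplication across activities.
    {"general_issue_code": "TAX", "general_issue_code_display": "Taxation",
     "description": "Example tax issue", "lobbyists": [sean],
     "government_entities": [{"name": "HOUSE OF REPRESENTATIVES"}, {"name": "SENATE"}]},
]}

flat = flatten_activities(raw)
print(flat)
```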

FAQ

How do I scrape Senate lobbying disclosures?

Pick a narrowing filter (filing year, client name, registrant name, issue area code, or a date range) and run the actor. The Senate LDA API's filing filters are exposed as the input fields listed under Input Parameters, and the output comes back as flat JSON ready for a dataset, CSV, or database.

How many lobbying filings does the Senate LDA Crawler cover?

About 1.94 million filings from 1999 through today, with roughly 97,000 new filings per year. The actor streams straight from the canonical Senate source, so new filings show up as they're posted.

What filters work on the Senate LDA API?

The actor supports filingYear, filingPeriod, filingType, registrantName, clientName, clientState, issueAreaCode, specificIssueSearch, datePostedFrom, and datePostedTo. Combine them as needed — the API applies AND-semantics. At least one filter is required to prevent accidental full-corpus runs.
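To make the AND-semantics concrete, here is a small client-side illustration that applies the same idea to rows already in a dataset. It is not how the API evaluates filters internally, and `matches` is a hypothetical helper; it only mirrors the documented behaviour (every supplied filter must hold, name filters are case-insensitive contains matches).

```python
from typing import Optional


def matches(row: dict, *, filing_year: Optional[int] = None,
            client_name: str = "", registrant_name: str = "") -> bool:
    """True only if the row satisfies every filter that was supplied (AND)."""
    if filing_year is not None and row.get("filing_year") != filing_year:
        return False
    if client_name and client_name.lower() not in row.get("client_name", "").lower():
        return False
    if registrant_name and registrant_name.lower() not in row.get("registrant_name", "").lower():
        return False
    return True


row = {
    "filing_year": 2024,
    "client_name": "CTIA - THE WIRELESS ASSOCIATION",
    "registrant_name": "EAST CAPITOL ADVISORS LLC",
}

print(matches(row, filing_year=2024, client_name="ctia"))    # both filters hold
print(matches(row, filing_year=2023, client_name="ctia"))    # year fails, so the row is excluded
```

Filters that are left empty are skipped, which matches the convention in the parameter table that an empty value means "no restriction".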

Do I need proxies or an API key?

No. The Senate LDA API is public, unauthenticated, and free. Proxies are disabled by default.

How much does it cost to run?

About $0.10 to start plus roughly $0.001 per filing returned. Pulling a thousand filings lands near $1.10. A full year slice (about 97,000 filings) lands near $97.
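The arithmetic behind those figures, as a quick sketch. The two rates are the approximate ones quoted above, so treat the result as an estimate rather than a bill; `estimate_cost` is a hypothetical helper.

```python
from decimal import Decimal

START_FEE = Decimal("0.10")    # approximate per-run start charge quoted above
PER_FILING = Decimal("0.001")  # approximate charge per filing returned


def estimate_cost(filings: int) -> Decimal:
    """Estimated run cost in USD, rounded to cents."""
    return (START_FEE + PER_FILING * filings).quantize(Decimal("0.01"))


print(estimate_cost(1000))    # 1.10
print(estimate_cost(97000))   # 97.10
```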

What's the deal with the lda.gov deprecation notice?

The Senate-hosted API at lda.senate.gov carries a deprecation header naming lda.gov/api/v1/ as the successor, with a sunset of 2026-06-30. The successor host isn't open to the public yet. The crawler uses the working Senate host today, and the migration is a one-line change once the new host goes live.

How fast does the crawler run?

The Senate API returns 25 records per page. Small queries (10–100 filings) need at most four pages and complete in under 10 seconds. A thousand filings is 40 pages and takes a few minutes, including the 250ms polite delay between pages.
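A rough way to size a run before starting it, assuming 25 records per page and one 250ms pause between consecutive pages. The helper names are hypothetical, and the delay figure is only a floor: actual run time is dominated by how long each API request takes, which this sketch does not model.

```python
import math

PAGE_SIZE = 25        # records per page, as stated above
DELAY_SECONDS = 0.25  # courtesy delay between pages


def pages_needed(max_items: int) -> int:
    """Number of page requests required to collect max_items filings."""
    return math.ceil(max_items / PAGE_SIZE)


def minimum_delay_seconds(max_items: int) -> float:
    """Total time spent in courtesy delays (one pause between each pair of pages)."""
    return max(pages_needed(max_items) - 1, 0) * DELAY_SECONDS


print(pages_needed(100), pages_needed(1000))   # 4 40
print(minimum_delay_seconds(1000))             # 9.75
print(pages_needed(1_940_000))                 # 77600
```

The last line is where the "about 78,000 pages" figure for an unfiltered run comes from.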

Need More Features?

Need custom fields, additional filters, or a different data source? File an issue or get in touch.

Why Use the Senate LDA Lobbying Disclosure Crawler?

  • Canonical source — Reads the Senate's own JSON API, so output tracks whatever the filer reported, not an aggregator's interpretation.
  • Priced per record — About $0.001 per filing. A thousand filings costs a little over a dollar, which is what you might call "reasonable."
  • Clean output schema — Nested activities, lobbyists, and government entities get flattened and deduplicated into scalar fields and string arrays. No post-processing required before you load it into a warehouse.
  • No proxy overhead — Public US government API with no anti-bot measures, so your run cost is just compute and records.