# U.S. Senate Trading Pipeline (`seralifatih/congress-trading-pipeline`) Actor

Fetches U.S. Senate Periodic Transaction Reports (PTRs) directly from the official efdsearch.senate.gov source. Normalizes filings into a clean, deduplicated dataset with politician, ticker, asset, type, amount range, dates, and owner. No third-party vendors. Public domain data. STOCK Act compliant.

- **URL**: https://apify.com/seralifatih/congress-trading-pipeline.md
- **Developed by:** [Fatih İlhan](https://apify.com/seralifatih) (community)
- **Categories:** Developer tools, Jobs, News
- **Stats:** 2 total users, 1 monthly user, 100.0% runs succeeded
- **User rating**: No ratings yet

## Pricing

from $1.50 / 1,000 transaction records

This Actor is paid per event and usage. You are charged a fixed price for specific events, plus the cost of Apify platform usage.

Learn more: https://docs.apify.com/platform/actors/running/actors-in-store#pay-per-event

## What's an Apify Actor?

Actors are software tools running on the Apify platform, for all kinds of web data extraction and automation use cases.
In Batch mode, an Actor accepts a well-defined JSON input, performs an action that can take anything from a few seconds to a few hours,
and optionally produces a well-defined JSON output, datasets with results, or files in a key-value store.
In Standby mode, an Actor provides a web server that can be used as a website, API, or MCP server.
Actors are written with a capital "A".

## How to integrate an Actor?

If asked about integration, you help developers integrate Actors into their projects.
You adapt to their stack and deliver integrations that are safe, well-documented, and production-ready.
The best way to integrate Actors is as follows.

In JavaScript/TypeScript projects, use the official [JavaScript/TypeScript client](https://docs.apify.com/api/client/js.md):

```bash
npm install apify-client
```

In Python projects, use the official [Python client library](https://docs.apify.com/api/client/python.md):

```bash
pip install apify-client
```

In shell scripts, use [Apify CLI](https://docs.apify.com/cli/docs.md):

```bash
# macOS / Linux
curl -fsSL https://apify.com/install-cli.sh | bash
# Windows
irm https://apify.com/install-cli.ps1 | iex
```

In AI frameworks, you might use the [Apify MCP server](https://docs.apify.com/platform/integrations/mcp.md).

If your project is in a different language, use the [REST API](https://docs.apify.com/api/v2.md).

For usage examples, see the [API](#api) section below.

For more details, see Apify documentation as [Markdown index](https://docs.apify.com/llms.txt) and [Markdown full-text](https://docs.apify.com/llms-full.txt).


# README

## Congress Trading Pipeline — API

Ingests U.S. Senate Periodic Transaction Reports (PTRs) directly from the Senate Electronic Financial Disclosures office, normalizes them, and exposes a JSON API compatible with the existing frontend — replacing the QuiverQuant dependency entirely.

No third-party data vendor required. No scraping. Source is public domain U.S. government disclosure data.

---

### Prerequisites

- Node.js 18+
- No external services, databases, or API keys required for MVP

---

### Setup

```bash
npm install
cp .env.example .env   # edit as needed; all vars have defaults
npm run dev
````

Server starts on `http://localhost:3001`.\
On first boot the scheduler runs the pipeline immediately, then every 6 hours.

#### Environment variables

| Variable | Default | Description |
|---|---|---|
| `PORT` | `3001` | HTTP listen port |
| `DB_PATH` | `./data/pipeline.db` | SQLite file path (created automatically) |
| `LOG_LEVEL` | `info` | `debug` / `info` / `warn` / `error` |
| `NODE_ENV` | `development` | Set to `production` for JSON-lines log output |
| `CRON_SCHEDULE` | `0 */6 * * *` | node-cron schedule expression |
| `FETCH_DAYS_BACK` | `90` | Rolling window of PTRs to fetch |
| `CRON_SECRET` | *(empty)* | Shared secret for `/api/cron` and `/api/sync-committees` |
| `FRONTEND_ORIGIN` | `http://localhost:3000` | Allowed CORS origin when running standalone |
| `LAST_RUN_PATH` | `./data/last_run.json` | Persisted last-run stats file |

***

### Pipeline architecture

```
┌──────────┐   ┌──────────┐   ┌─────────────┐   ┌────────┐   ┌────────┐
│  Fetch   │──▶│  Parse   │──▶│  Transform  │──▶│  Dedup │──▶│ Store  │
│          │   │          │   │             │   │        │   │        │
│ Senate   │   │ JSON     │   │ type        │   │ key:   │   │ SQLite │
│ EFD API  │   │ primary  │   │ amount      │   │ name + │   │ INSERT │
│ GET      │   │          │   │ dates       │   │ date + │   │ OR     │
│ 100/page │   │ HTML     │   │ owner       │   │ asset +│   │ IGNORE │
│          │   │ fallback │   │ ticker      │   │ amount │   │        │
└──────────┘   └──────────┘   └─────────────┘   └────────┘   └────────┘
                                                                   │
                                                                   ▼
                                                           ┌──────────────┐
                                                           │  Express API │
                                                           │  :3001       │
                                                           └──────────────┘
```
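
The Dedup and Store stages in the diagram can be sketched as follows. This is a minimal illustration in Python, not the pipeline's actual Node.js code: the table layout, the `dedup_key` and `store` helpers, the SHA-256 hashing, and the case/whitespace normalization are all assumptions. Only the key components (name, date, asset, amount) and the use of `INSERT OR IGNORE` come from the diagram.

```python
import hashlib
import sqlite3


def dedup_key(name: str, date: str, asset: str, amount_low: int, amount_high: int) -> str:
    """Hash the four dedup components from the diagram into a stable row id."""
    # Lower-casing and trimming are assumptions, not confirmed by the README.
    parts = [name.strip().lower(), date, asset.strip().lower(), f"{amount_low}-{amount_high}"]
    return hashlib.sha256("|".join(parts).encode("utf-8")).hexdigest()


def store(conn: sqlite3.Connection, rows: list[dict]) -> tuple[int, int]:
    """Insert rows with INSERT OR IGNORE and return (inserted, skipped)."""
    # Hypothetical schema: the real table definition is not shown in the README.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS transactions ("
        "id TEXT PRIMARY KEY, filer_name TEXT, trade_date TEXT, "
        "asset_name TEXT, amount_low INTEGER, amount_high INTEGER)"
    )
    inserted = 0
    for row in rows:
        key = dedup_key(row["filer_name"], row["trade_date"], row["asset_name"],
                        row["amount_low"], row["amount_high"])
        cur = conn.execute(
            "INSERT OR IGNORE INTO transactions VALUES (?, ?, ?, ?, ?, ?)",
            (key, row["filer_name"], row["trade_date"], row["asset_name"],
             row["amount_low"], row["amount_high"]),
        )
        inserted += cur.rowcount  # 1 if the row was stored, 0 if the key already existed
    conn.commit()
    return inserted, len(rows) - inserted
```

Because the primary key is derived from the record's content, re-running the pipeline over an overlapping date window leaves existing rows untouched and only adds new ones.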

**Source endpoint:** `GET https://efts.senate.gov/LATEST/search-index`\
**Pagination:** 100 records/page, loops until `hits.total` exhausted\
**Fallback:** if JSON parse yields empty `asset_name` on all rows, re-parses raw HTML\
**Retry:** 3 attempts with exponential backoff + ±25% jitter on all HTTP calls
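
The pagination and retry behaviour described above could look roughly like this. It is a sketch under stated assumptions: the 1-second base delay, the `fetch_page` callable, and the `hits.hits` list of records are hypothetical, while the 100-record page size, the `hits.total` stop condition, the 3 attempts, and the ±25% jitter come from the text.

```python
import random
import time
from typing import Callable, Iterator


def backoff_delay(attempt: int, base: float = 1.0, jitter: float = 0.25) -> float:
    """Delay before the retry that follows failed attempt `attempt` (0-based)."""
    # Exponential growth (base * 2**attempt), scaled by a random factor in [0.75, 1.25].
    return base * (2 ** attempt) * random.uniform(1 - jitter, 1 + jitter)


def with_retry(call: Callable[[], dict], attempts: int = 3,
               sleep: Callable[[float], None] = time.sleep) -> dict:
    """Run `call`, retrying on any exception for up to `attempts` total tries."""
    for attempt in range(attempts):
        try:
            return call()
        except Exception:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last error
            sleep(backoff_delay(attempt))
    raise ValueError("attempts must be at least 1")


def paginate(fetch_page: Callable[[int], dict], page_size: int = 100) -> Iterator[dict]:
    """Yield records page by page until `hits.total` records have been seen."""
    offset = 0
    while True:
        page = with_retry(lambda: fetch_page(offset))
        records = page["hits"]["hits"][:page_size]  # assumed response shape
        yield from records
        offset += len(records)
        if not records or offset >= page["hits"]["total"]:
            break
```

Jitter spreads retries from concurrent runs so they do not hit the upstream server at the same instant after an outage.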

***

### API reference

#### `GET /health`

```bash
curl http://localhost:3001/health
```

```json
{
  "status": "ok",
  "db_count": 847,
  "last_run": "2026-04-29T14:23:00.000Z"
}
```

***

#### `GET /api/refresh`

Returns the timestamp of the most recently stored record. Called by the frontend on every page mount.

```bash
curl http://localhost:3001/api/refresh
```

```json
{ "lastUpdated": "2026-04-29T14:23:00.000Z" }
```

`lastUpdated` is `null` if no records exist yet.

***

#### `POST /api/refresh`

Triggers a full pipeline run. Called when the user clicks "Refresh Data" in the frontend.

```bash
curl -X POST http://localhost:3001/api/refresh
```

```json
{ "ok": true, "signals": 14, "lastUpdated": "2026-04-29T14:23:00.000Z" }
```

On failure:

```json
{ "ok": false, "error": "Fetch failed: HTTP 503 Service Unavailable" }
```

***

#### `GET /api/cron`

Same pipeline run as `POST /api/refresh`, protected by `CRON_SECRET`. Called by an external scheduler (Cloudflare Worker, cron job, etc.).

```bash
curl -H "x-cron-secret: your-secret" http://localhost:3001/api/cron
# or
curl "http://localhost:3001/api/cron?secret=your-secret"
```

```json
{
  "ok": true,
  "summary": {
    "ingested": 340,
    "newTrades": 14,
    "signalsGenerated": 14,
    "topScore": null,
    "topScoreTicker": null,
    "runAt": "2026-04-29T14:23:00.000Z"
  }
}
```

Returns `401` if the secret is missing or wrong.
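
An external scheduler written in Python might call this endpoint as sketched below. The helper names `build_cron_request` and `run_cron` and the 300-second timeout are illustrative assumptions. The header name, path, and `401` behaviour are taken from the description above.

```python
import json
import urllib.error
import urllib.request


def build_cron_request(base_url: str, secret: str) -> urllib.request.Request:
    """Build a GET request for /api/cron that carries the secret in a header."""
    return urllib.request.Request(
        base_url.rstrip("/") + "/api/cron",
        headers={"x-cron-secret": secret},
        method="GET",
    )


def run_cron(base_url: str, secret: str, timeout: float = 300.0) -> dict:
    """Trigger a pipeline run and return the parsed JSON body."""
    request = build_cron_request(base_url, secret)
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return json.load(response)
    except urllib.error.HTTPError as err:
        if err.code == 401:
            # The server rejects a missing or wrong CRON_SECRET with 401.
            raise PermissionError("CRON_SECRET missing or wrong") from err
        raise
```

Sending the secret in the `x-cron-secret` header rather than the `?secret=` query string keeps it out of URLs, which are more likely to end up in access logs.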

***

#### `GET /api/sync-committees`

Syncs congressional committee membership. Protected by `CRON_SECRET`. Run once on setup, then weekly.

```bash
curl -H "x-cron-secret: your-secret" http://localhost:3001/api/sync-committees
```

```json
{ "ok": true, "synced": 0 }
```

***

#### `GET /api/transactions`

Queryable read endpoint. Returns transactions serialized to match the frontend `Signal` field names.

```bash
# All recent transactions (default limit 500)
curl http://localhost:3001/api/transactions

# Filter by ticker
curl "http://localhost:3001/api/transactions?ticker=AAPL"

# Filter by politician (LIKE match, case-insensitive)
curl "http://localhost:3001/api/transactions?politician=Pelosi"

# Date range
curl "http://localhost:3001/api/transactions?date_from=2026-04-01&date_to=2026-04-30"

# Type + owner + pagination
curl "http://localhost:3001/api/transactions?type=buy&owner=joint&limit=50&offset=0"
```

```json
{
  "count": 1,
  "data": [
    {
      "id": "a3f...c1",
      "filer_name": "Nancy Pelosi",
      "filer_type": "congress",
      "trade_type": "purchase",
      "ticker": "NVDA",
      "asset_name": "NVIDIA Corporation",
      "asset_type": "Stock",
      "amount_low": 1000001,
      "amount_high": 5000000,
      "amount_midpoint": 3000000,
      "trade_date": "2026-04-29",
      "filing_date": "2026-04-29",
      "owner": "joint",
      "is_active": true
    }
  ]
}
```
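
The three amount fields in the response can be derived from a disclosed range string. The sketch below is an assumption about how that might be done: the input format (`"$1,000,001 - $5,000,000"`), the `parse_amount_range` helper, and the use of floor division are not confirmed by this README, although floor division does reproduce the `amount_midpoint` shown in the sample.

```python
import re


def parse_amount_range(text: str) -> dict:
    """Turn a range like "$1,000,001 - $5,000,000" into low/high/midpoint integers."""
    # Grab every run of digits and commas, then strip the thousands separators.
    numbers = [int(n.replace(",", "")) for n in re.findall(r"\d[\d,]*", text)]
    if len(numbers) != 2 or numbers[0] > numbers[1]:
        raise ValueError(f"unrecognized amount range: {text!r}")
    low, high = numbers
    # Floor division is an assumption; it matches the sample midpoint of 3000000.
    return {"amount_low": low, "amount_high": high, "amount_midpoint": (low + high) // 2}
```

Open-ended ranges with a single bound are rejected by this sketch instead of being guessed at.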

**Query parameters:**

| Param | Type | Description |
|---|---|---|
| `politician` | string | Substring match (LIKE) |
| `ticker` | string | Exact match, auto-uppercased |
| `date_from` | YYYY-MM-DD | Inclusive lower bound on `transaction_date` |
| `date_to` | YYYY-MM-DD | Inclusive upper bound on `transaction_date` |
| `type` | `buy` \| `sell` | Exact match |
| `owner` | `self` \| `joint` \| `spouse` \| `child` | Exact match |
| `limit` | integer 1–1000 | Default 500 |
| `offset` | integer ≥ 0 | Default 0 |

Invalid params return `400`:

```json
{ "error": { "date_from": ["Must be YYYY-MM-DD"] } }
```
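
A client can apply the same rules before sending a request. The following sketch builds a query URL from the parameter table and raises an error shaped like the `400` body. The `transactions_url` helper is hypothetical, and every message except `"Must be YYYY-MM-DD"` is invented wording, not the server's.

```python
import re
from urllib.parse import urlencode

DATE_RE = re.compile(r"^\d{4}-\d{2}-\d{2}$")
CHOICES = {"type": {"buy", "sell"}, "owner": {"self", "joint", "spouse", "child"}}
KNOWN = {"politician", "ticker", "date_from", "date_to", "type", "owner", "limit", "offset"}


def transactions_url(base_url: str = "http://localhost:3001", **params) -> str:
    """Validate params against the documented table and return the request URL."""
    errors: dict[str, list[str]] = {}
    query = {}
    for name, value in params.items():
        if name not in KNOWN:
            errors[name] = ["Unknown parameter"]
        elif name in ("date_from", "date_to") and not DATE_RE.match(str(value)):
            errors[name] = ["Must be YYYY-MM-DD"]
        elif name in CHOICES and value not in CHOICES[name]:
            errors[name] = [f"Must be one of {sorted(CHOICES[name])}"]
        elif name == "limit" and not (isinstance(value, int) and 1 <= value <= 1000):
            errors[name] = ["Must be an integer from 1 to 1000"]
        elif name == "offset" and not (isinstance(value, int) and value >= 0):
            errors[name] = ["Must be an integer >= 0"]
        else:
            # The server uppercases tickers itself; doing it here keeps URLs canonical.
            query[name] = value.upper() if name == "ticker" else value
    if errors:
        raise ValueError({"error": errors})
    url = base_url.rstrip("/") + "/api/transactions"
    return f"{url}?{urlencode(query)}" if query else url
```

For example, `transactions_url(type="buy", owner="joint", limit=50, offset=0)` yields the same URL as the last `curl` call in this section.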

***

#### `GET /api/debug`

Dev diagnostics. No auth. Returns DB count and 2 sample records.

```bash
curl http://localhost:3001/api/debug
```

***

### Cron schedule

Default: `0 */6 * * *` (every 6 hours).

Change via `CRON_SCHEDULE` env var — any valid [node-cron](https://github.com/node-cron/node-cron) expression.

```bash
CRON_SCHEDULE="0 */2 * * *" npm run dev   # every 2 hours
CRON_SCHEDULE="0 8 * * *" npm run dev     # once daily at 08:00
```

Last-run stats (timestamp, inserted, skipped, errors) are persisted after each run to the file set by `LAST_RUN_PATH` (default `./data/last_run.json`).

***

### Seeding and smoke test

Load 20 realistic fake records covering edge cases (null tickers, spouse/child owners, large amounts, same-day multi-trades, clusters):

```bash
npm run seed
```

Verify the running server responds correctly:

```bash
# Terminal 1
npm run dev

# Terminal 2
npm run smoke
```

The smoke test exits with code 0 if every check passes and 1 if any check fails.

***

### Phase 2 roadmap

House of Representatives disclosures (efd.house.gov) use a different filing format and will be added after Senate coverage is stable. Planned additions:

- PDF parsing for older PTRs that lack structured data
- Ticker enrichment via OpenFIGI or a static CUSIP mapping table, resolving the `ticker: null` cases currently stored as-is
- A scoring engine that ranks transactions by conviction signal (cluster detection, filing delay, filer track record)
- Telegram/email alerts for high-score transactions

Multi-tenant auth (Supabase RLS + Paddle billing) is tracked separately under the SaaS roadmap.

***

### Data source

All data is sourced from the [U.S. Senate Electronic Financial Disclosures](https://efdsearch.senate.gov) system, a public government database. Senate PTR filings are required under the STOCK Act and are public domain. This pipeline does not scrape third-party aggregators.

# Actor input Schema

## `fetchDaysBack` (type: `integer`):

Rolling window of PTRs to fetch (default 90).

## `fromDate` (type: `string`):

Explicit start date. Overrides fetchDaysBack if set.

## `toDate` (type: `string`):

Explicit end date. Defaults to today.

## `debugPtrLimit` (type: `integer`):

If > 0, fetch detail HTML for only the first N PTRs. Useful for diagnostics.

## Actor input object example

```json
{
  "fetchDaysBack": 90,
  "debugPtrLimit": 0
}
```
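
How these fields combine into a fetch window can be illustrated with a short sketch. The `resolve_window` helper is hypothetical, and counting `fetchDaysBack` backwards from `toDate` (or from today when `toDate` is absent) is an assumption. The 1 to 365 bounds and the `YYYY-MM-DD` pattern come from the OpenAPI schema further down.

```python
import re
from datetime import date, timedelta

DATE_RE = re.compile(r"^\d{4}-\d{2}-\d{2}$")


def resolve_window(actor_input: dict, today: date) -> tuple[date, date]:
    """Return the (from, to) date window implied by an Actor input object."""
    days_back = actor_input.get("fetchDaysBack", 90)
    if not 1 <= days_back <= 365:
        raise ValueError("fetchDaysBack must be between 1 and 365")
    for key in ("fromDate", "toDate"):
        value = actor_input.get(key)
        if value is not None and not DATE_RE.match(value):
            raise ValueError(f"{key} must be YYYY-MM-DD")
    # toDate defaults to today when it is not supplied.
    to_date = date.fromisoformat(actor_input["toDate"]) if actor_input.get("toDate") else today
    if actor_input.get("fromDate"):
        from_date = date.fromisoformat(actor_input["fromDate"])  # overrides fetchDaysBack
    else:
        from_date = to_date - timedelta(days=days_back)  # assumed anchor: toDate
    if from_date > to_date:
        raise ValueError("fromDate must not be after toDate")
    return from_date, to_date
```

Running the same check before starting a run catches a malformed date locally instead of after the Actor has started.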

# Actor output Schema

## `transactions` (type: `string`):

All normalized trades from PTRs filed in the requested date window.

## `runStats` (type: `string`):

Pipeline run statistics: inserted, skipped, errors. Stored under OUTPUT key.
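
Reading the run statistics after a run might look like the sketch below. The `read_run_stats` helper is hypothetical. It relies on the official Python client's `key_value_store(...).get_record(...)` call and on the `defaultKeyValueStoreId` field of the run object, and it assumes the run is returned as a dictionary, as in the Python example in the API section.

```python
def read_run_stats(client, run: dict):
    """Return the value stored under the OUTPUT key of the run's key-value store."""
    store = client.key_value_store(run["defaultKeyValueStoreId"])
    record = store.get_record("OUTPUT")  # None when the key does not exist
    return record["value"] if record else None


# Typical wiring with the official client (requires `pip install apify-client`):
#   from apify_client import ApifyClient
#   client = ApifyClient("<YOUR_API_TOKEN>")
#   run = client.actor("seralifatih/congress-trading-pipeline").call(run_input={"fetchDaysBack": 30})
#   print(read_run_stats(client, run))
```

Taking the client as a parameter keeps the helper easy to test with a stub in place of a live Apify connection.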

# API

You can run this Actor programmatically using our API. Below are code examples in JavaScript, Python, and CLI, as well as the OpenAPI specification and MCP server setup.

## JavaScript example

```javascript
import { ApifyClient } from 'apify-client';

// Initialize the ApifyClient with your Apify API token
// Replace the '<YOUR_API_TOKEN>' with your token
const client = new ApifyClient({
    token: '<YOUR_API_TOKEN>',
});

// Prepare Actor input
const input = {};

// Run the Actor and wait for it to finish
const run = await client.actor("seralifatih/congress-trading-pipeline").call(input);

// Fetch and print Actor results from the run's dataset (if any)
console.log('Results from dataset');
console.log(`💾 Check your data here: https://console.apify.com/storage/datasets/${run.defaultDatasetId}`);
const { items } = await client.dataset(run.defaultDatasetId).listItems();
items.forEach((item) => {
    console.dir(item);
});

// 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/js/docs

```

## Python example

```python
from apify_client import ApifyClient

# Initialize the ApifyClient with your Apify API token
# Replace '<YOUR_API_TOKEN>' with your token.
client = ApifyClient("<YOUR_API_TOKEN>")

# Prepare the Actor input
run_input = {}

# Run the Actor and wait for it to finish
run = client.actor("seralifatih/congress-trading-pipeline").call(run_input=run_input)

# Fetch and print Actor results from the run's dataset (if there are any)
print("💾 Check your data here: https://console.apify.com/storage/datasets/" + run["defaultDatasetId"])
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(item)

# 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/python/docs/quick-start

```

## CLI example

```bash
echo '{}' |
apify call seralifatih/congress-trading-pipeline --silent --output-dataset

```

## MCP server setup

```json
{
    "mcpServers": {
        "apify": {
            "command": "npx",
            "args": [
                "mcp-remote",
                "https://mcp.apify.com/?tools=seralifatih/congress-trading-pipeline",
                "--header",
                "Authorization: Bearer <YOUR_API_TOKEN>"
            ]
        }
    }
}

```

## OpenAPI specification

```json
{
    "openapi": "3.0.1",
    "info": {
        "title": "U.S. Senate Trading Pipeline",
        "description": "Fetches U.S. Senate Periodic Transaction Reports (PTRs) directly from the official efdsearch.senate.gov source. Normalizes filings into a clean, deduplicated dataset with politician, ticker, asset, type, amount range, dates, and owner. No third-party vendors. Public domain data. STOCK Act compliant.",
        "version": "0.0",
        "x-build-id": "6DLbjfE36dBxWtak9"
    },
    "servers": [
        {
            "url": "https://api.apify.com/v2"
        }
    ],
    "paths": {
        "/acts/seralifatih~congress-trading-pipeline/run-sync-get-dataset-items": {
            "post": {
                "operationId": "run-sync-get-dataset-items-seralifatih-congress-trading-pipeline",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for its completion, and returns Actor's dataset items in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        },
        "/acts/seralifatih~congress-trading-pipeline/runs": {
            "post": {
                "operationId": "runs-sync-seralifatih-congress-trading-pipeline",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor and returns information about the initiated run in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK",
                        "content": {
                            "application/json": {
                                "schema": {
                                    "$ref": "#/components/schemas/runsResponseSchema"
                                }
                            }
                        }
                    }
                }
            }
        },
        "/acts/seralifatih~congress-trading-pipeline/run-sync": {
            "post": {
                "operationId": "run-sync-seralifatih-congress-trading-pipeline",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for completion, and returns the OUTPUT from Key-value store in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        }
    },
    "components": {
        "schemas": {
            "inputSchema": {
                "type": "object",
                "properties": {
                    "fetchDaysBack": {
                        "title": "Days back to fetch",
                        "minimum": 1,
                        "maximum": 365,
                        "type": "integer",
                        "description": "Rolling window of PTRs to fetch (default 90).",
                        "default": 90
                    },
                    "fromDate": {
                        "title": "From date (YYYY-MM-DD)",
                        "pattern": "^\\d{4}-\\d{2}-\\d{2}$",
                        "type": "string",
                        "description": "Explicit start date. Overrides fetchDaysBack if set."
                    },
                    "toDate": {
                        "title": "To date (YYYY-MM-DD)",
                        "pattern": "^\\d{4}-\\d{2}-\\d{2}$",
                        "type": "string",
                        "description": "Explicit end date. Defaults to today."
                    },
                    "debugPtrLimit": {
                        "title": "Debug: limit PTR detail fetches",
                        "minimum": 0,
                        "type": "integer",
                        "description": "If > 0, fetch detail HTML for only the first N PTRs. Useful for diagnostics.",
                        "default": 0
                    }
                }
            },
            "runsResponseSchema": {
                "type": "object",
                "properties": {
                    "data": {
                        "type": "object",
                        "properties": {
                            "id": {
                                "type": "string"
                            },
                            "actId": {
                                "type": "string"
                            },
                            "userId": {
                                "type": "string"
                            },
                            "startedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "finishedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "status": {
                                "type": "string",
                                "example": "READY"
                            },
                            "meta": {
                                "type": "object",
                                "properties": {
                                    "origin": {
                                        "type": "string",
                                        "example": "API"
                                    },
                                    "userAgent": {
                                        "type": "string"
                                    }
                                }
                            },
                            "stats": {
                                "type": "object",
                                "properties": {
                                    "inputBodyLen": {
                                        "type": "integer",
                                        "example": 2000
                                    },
                                    "rebootCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "restartCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "resurrectCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "computeUnits": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "options": {
                                "type": "object",
                                "properties": {
                                    "build": {
                                        "type": "string",
                                        "example": "latest"
                                    },
                                    "timeoutSecs": {
                                        "type": "integer",
                                        "example": 300
                                    },
                                    "memoryMbytes": {
                                        "type": "integer",
                                        "example": 1024
                                    },
                                    "diskMbytes": {
                                        "type": "integer",
                                        "example": 2048
                                    }
                                }
                            },
                            "buildId": {
                                "type": "string"
                            },
                            "defaultKeyValueStoreId": {
                                "type": "string"
                            },
                            "defaultDatasetId": {
                                "type": "string"
                            },
                            "defaultRequestQueueId": {
                                "type": "string"
                            },
                            "buildNumber": {
                                "type": "string",
                                "example": "1.0.0"
                            },
                            "containerUrl": {
                                "type": "string"
                            },
                            "usage": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "integer",
                                        "example": 1
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "usageTotalUsd": {
                                "type": "number",
                                "example": 0.00005
                            },
                            "usageUsd": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "number",
                                        "example": 0.00005
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            }
                        }
                    }
                }
            }
        }
    }
}
```
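
For projects that call the REST API directly, the `run-sync-get-dataset-items` operation from the specification above can be invoked with the standard library alone. This is a minimal sketch: the `build_sync_request` helper is hypothetical, while the base URL, path, `token` query parameter, and JSON request body follow the specification.

```python
import json
import urllib.request
from urllib.parse import urlencode

API_BASE = "https://api.apify.com/v2"
ACTOR_PATH = "/acts/seralifatih~congress-trading-pipeline"


def build_sync_request(token: str, actor_input: dict) -> urllib.request.Request:
    """Build the POST request that runs the Actor and returns its dataset items."""
    query = urlencode({"token": token})
    return urllib.request.Request(
        f"{API_BASE}{ACTOR_PATH}/run-sync-get-dataset-items?{query}",
        data=json.dumps(actor_input).encode("utf-8"),  # Actor input as the JSON body
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

Passing the returned request to `urllib.request.urlopen` and decoding the response with `json.load` should give the dataset items once the run finishes. The call blocks for the duration of the run, so a generous timeout is advisable.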
