
Hacker News Data Scraper
epctex/hackernews-scraper
Extract data from Y Combinator's Hacker News based on any search criteria. Crawl the front page, Show HN, Ask HN, news, job listings, and historical data. Get links, titles, comments, ratings, and more!
The code examples below show how to run the Actor and get its results. To run the code, you need to have an Apify account. Replace <YOUR_API_TOKEN> in the code with your API token, which you can find under Settings > Integrations in Apify Console.
from apify_client import ApifyClient
# Initialize the ApifyClient with your API token
client = ApifyClient("<YOUR_API_TOKEN>")
# Prepare the Actor input
run_input = {
    "startUrls": [
        {"url": "https://news.ycombinator.com/front"},
        {"url": "https://news.ycombinator.com/show"},
        {"url": "https://news.ycombinator.com/jobs"},
        {"url": "https://news.ycombinator.com"},
        {"url": "https://news.ycombinator.com/item?id=26566373"},
    ],
    "maxItems": 50,
    "endPage": 1,
    "extendOutputFunction": "($) => { return {} }",
    "proxy": {"useApifyProxy": True},
}
# Run the Actor and wait for it to finish
run = client.actor("epctex/hackernews-scraper").call(run_input=run_input)
# Fetch and print Actor results from the run's dataset (if there are any)
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(item)
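
Printing each item is fine for a quick check, but the results usually need to be kept. The following is a minimal sketch of one way to do that, writing each item to a JSON Lines file. The helper name `save_items_jsonl`, the output file name, and the sample field names are all hypothetical and are not part of the Actor or of `apify_client`; the real item fields depend on the Actor's output.

```python
import json
import tempfile
from pathlib import Path
from typing import Iterable


def save_items_jsonl(items: Iterable[dict], path: Path) -> int:
    """Write each item as one JSON object per line; return the number written."""
    count = 0
    with path.open("w", encoding="utf-8") as fh:
        for item in items:
            fh.write(json.dumps(item, ensure_ascii=False) + "\n")
            count += 1
    return count


# Stand-in data so the sketch runs without an Apify account.
# These field names are hypothetical, not the Actor's documented output.
sample_items = [
    {"title": "Example story", "url": "https://example.com"},
    {"title": "Another story", "url": "https://example.org"},
]

# Hypothetical output location in the system temp directory.
out_path = Path(tempfile.gettempdir()) / "hackernews_items.jsonl"
written = save_items_jsonl(sample_items, out_path)
print(f"Saved {written} items to {out_path}")
```

With a real run, the iterator from the example above could be passed in place of `sample_items`, for instance `save_items_jsonl(client.dataset(run["defaultDatasetId"]).iterate_items(), out_path)`.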