Bloomberg News Extractor

Pricing

Pay per usage

Bloomberg news scraper that pulls headlines, body text, authors, and tags from article pages, so your data pipelines get financial news without the copy-paste.

Developer

Kawsar

Maintained by Community

Extract structured article data from Bloomberg.com. Paste one or more Bloomberg article URLs and the actor returns a clean dataset with headlines, authors, publish dates, full article body text, image URLs, content tags, categories, reading time, and more.

Body text is scraped for every article with no extra configuration. On paywalled articles it may be truncated (see Notes on premium articles).


What it extracts

Every article record contains the following fields:

| Field | Type | Description |
| --- | --- | --- |
| url | string | Canonical Bloomberg article URL |
| articleId | string | Bloomberg internal SUID identifier |
| headline | string | Main article headline |
| seoHeadline | string | SEO-optimised version of the headline |
| byline | string | Author name as it appears on the article |
| authorName | string | First credited author full name |
| authorTwitter | string | Author Twitter handle (without @) |
| publishedAt | string | ISO 8601 UTC publish timestamp |
| updatedAt | string | ISO 8601 UTC last-update timestamp |
| articleSummary | string | Article lede or summary paragraph |
| bodyText | string | Full plain-text article body |
| imageUrl | string | Main article image URL |
| imageCaption | string | Caption for the main image |
| imageCredit | string | Photographer or agency credit |
| section | string | Bloomberg section: markets, technology, politics, etc. |
| categories | string | Comma-separated list of section categories |
| tags | string | Comma-separated list of content tag names |
| isPremium | boolean | True if the article requires a Bloomberg subscription |
| readingTimeMinutes | number | Estimated reading time in minutes |
| slug | string | URL date-slug (e.g. 2026-05-23/article-title) |
| scrapedAt | string | ISO 8601 UTC timestamp of when the record was collected |
| error | string | Error message if the article failed to scrape, null on success |

How to use it

1. Open the actor on Apify

Go to the actor page and click Try for free to open the input editor.

2. Add Bloomberg article URLs

Paste one or more full Bloomberg article URLs into the Start URLs field. Use the full article URL with the /news/articles/ path:

https://www.bloomberg.com/news/articles/2026-05-22/abrego-garcia-wins-dismissal-of-us-human-smuggling-case
https://www.bloomberg.com/news/articles/2026-05-23/india-raises-diesel-gasoline-prices-for-third-time-in-eight-days

Query string parameters like ?srnd=phx-markets are stripped automatically before scraping.

3. Set your limits

  • Max articles — cap on total articles processed per run (default: 50, max: 1000)
  • Request timeout — per-request timeout in seconds (default: 30)

4. Run and download

Click Start. The actor processes each URL and pushes results to the dataset. Download as JSON, CSV, Excel, or XML from the Storage tab when the run finishes.


Input reference

{
  "startUrls": [
    "https://www.bloomberg.com/news/articles/2026-05-22/abrego-garcia-wins-dismissal-of-us-human-smuggling-case",
    "https://www.bloomberg.com/news/articles/2026-05-23/india-raises-diesel-gasoline-prices-for-third-time-in-eight-days"
  ],
  "maxArticles": 50,
  "requestTimeoutSecs": 30
}

| Field | Required | Default | Description |
| --- | --- | --- | --- |
| startUrls | Yes | | List of Bloomberg article URLs (/news/articles/... paths) |
| maxArticles | No | 50 | Maximum articles to process per run (1–1000) |
| requestTimeoutSecs | No | 30 | Per-request timeout in seconds (5–120) |
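
If you build the input programmatically, it can help to check the documented ranges before starting a run. The following is a minimal sketch, not part of the actor: build_run_input is a hypothetical helper, and the limits are copied from the table above.

```python
from typing import Any

# Limits copied from the input reference table (assumed to be current).
MAX_ARTICLES_RANGE = (1, 1000)
TIMEOUT_RANGE_SECS = (5, 120)


def build_run_input(
    start_urls: list[str],
    max_articles: int = 50,
    request_timeout_secs: int = 30,
) -> dict[str, Any]:
    """Return an input object for the actor, or raise ValueError.

    Hypothetical helper for illustration; the actor itself does not ship it.
    """
    if not start_urls:
        raise ValueError("startUrls is required and must not be empty")
    low, high = MAX_ARTICLES_RANGE
    if not low <= max_articles <= high:
        raise ValueError(f"maxArticles must be between {low} and {high}")
    low, high = TIMEOUT_RANGE_SECS
    if not low <= request_timeout_secs <= high:
        raise ValueError(f"requestTimeoutSecs must be between {low} and {high}")
    return {
        "startUrls": list(start_urls),
        "maxArticles": max_articles,
        "requestTimeoutSecs": request_timeout_secs,
    }
```

The returned dict can be serialized with json.dumps and pasted into the input editor, or passed as the run input if you start the actor from a script.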

URL format

Use the full article URL. The path must contain /news/articles/:

https://www.bloomberg.com/news/articles/YYYY-MM-DD/article-slug

Any query parameters (?srnd=..., ?utm_source=...) are removed automatically.
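
The actor strips query parameters itself, but normalizing URLs on your side makes it easier to deduplicate a list and to match input URLs against the url field in the output. A rough sketch of that cleanup, assuming the accepted hosts are www.bloomberg.com and bloomberg.com (an assumption, as is dropping the fragment); normalize_article_url is a hypothetical name:

```python
from urllib.parse import urlsplit, urlunsplit

# Hosts accepted by this sketch; this list is an assumption, not actor behaviour.
ALLOWED_HOSTS = ("www.bloomberg.com", "bloomberg.com")


def normalize_article_url(url: str) -> str:
    """Drop the query string and fragment from a Bloomberg article URL.

    Raises ValueError if the URL does not look like an article URL.
    """
    parts = urlsplit(url.strip())
    if parts.netloc.lower() not in ALLOWED_HOSTS:
        raise ValueError(f"not a Bloomberg URL: {url}")
    if "/news/articles/" not in parts.path:
        raise ValueError(f"path must contain /news/articles/: {url}")
    # Rebuild the URL with empty query and fragment components.
    return urlunsplit((parts.scheme, parts.netloc, parts.path, "", ""))
```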


Example output record

{
  "url": "https://www.bloomberg.com/news/articles/2026-05-23/india-raises-diesel-gasoline-prices-for-third-time-in-eight-days",
  "articleId": "TFGSTUKK3NYD00",
  "headline": "India Raises Diesel, Gasoline Prices for Third Time in Eight Days",
  "seoHeadline": "India Raises Diesel, Gasoline Prices for Third Time in Eight Days",
  "byline": "Rakesh Sharma",
  "authorName": "Rakesh Sharma",
  "authorTwitter": "journorakesh",
  "publishedAt": "2026-05-23T01:30:56.367Z",
  "updatedAt": "2026-05-23T03:42:27.089Z",
  "articleSummary": "India's state-run refiners raised retail prices again of diesel and gasoline on Saturday to help processors cut losses on discounted sales and to control a spike in demand.",
  "bodyText": "India's state-run refiners raised retail prices again of diesel and gasoline on Saturday...",
  "imageUrl": "https://assets.bwbx.io/images/users/iqjWHBFdfxIU/itJ0yPa0NDcg/v0/-1x-1.webp",
  "imageCaption": "A fuel station in New Delhi.",
  "imageCredit": "Photographer: Anindito Mukherjee/Bloomberg",
  "section": "markets",
  "categories": "markets",
  "tags": "Retail, Government, Taxes, Energy, India",
  "isPremium": false,
  "readingTimeMinutes": 2.5,
  "slug": "2026-05-23/india-raises-diesel-gasoline-prices-for-third-time-in-eight-days",
  "scrapedAt": "2026-05-23T05:12:00.000Z",
  "error": null
}
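
Because timestamps arrive as ISO 8601 strings and tags and categories as comma-separated strings, most downstream code starts by parsing them. A small sketch using only the standard library; parse_timestamp and split_list are hypothetical helper names, and the sample values are taken from the record above:

```python
from datetime import datetime
from typing import Optional


def parse_timestamp(value: str) -> datetime:
    """Parse an ISO 8601 UTC timestamp such as 2026-05-23T01:30:56.367Z."""
    # Swapping the trailing "Z" for an explicit offset keeps this working
    # on Python versions older than 3.11.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def split_list(value: Optional[str]) -> list[str]:
    """Split a comma-separated field (tags, categories) into a clean list."""
    if not value:
        return []
    return [item.strip() for item in value.split(",") if item.strip()]


record = {
    "publishedAt": "2026-05-23T01:30:56.367Z",
    "updatedAt": "2026-05-23T03:42:27.089Z",
    "tags": "Retail, Government, Taxes, Energy, India",
}

tags = split_list(record["tags"])
# Time between first publication and the last edit, as a timedelta.
edit_delay = parse_timestamp(record["updatedAt"]) - parse_timestamp(record["publishedAt"])
```

Note that split_list would split a tag name that itself contains a comma; this sketch assumes tag names do not.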

Notes on premium articles

The isPremium field is true for subscriber-only articles. Metadata fields — headline, author, publish date, summary, image URL, tags — are always collected regardless of subscription status. Full body text on paywalled articles may be truncated; the isPremium flag lets you identify and filter these records downstream.
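
One way to apply that downstream filter is sketched below, assuming records have been loaded as a list of dicts from the JSON export. split_by_premium is a hypothetical helper; it skips failed records, since their isPremium value is null.

```python
def split_by_premium(records: list[dict]) -> tuple[list[dict], list[dict]]:
    """Return (free, premium) lists of successfully scraped records."""
    free: list[dict] = []
    premium: list[dict] = []
    for record in records:
        if record.get("error") is not None:
            continue  # failed records carry no usable isPremium value
        if record.get("isPremium"):
            premium.append(record)  # bodyText may be truncated here
        else:
            free.append(record)
    return free, premium
```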


Output formats

The dataset can be downloaded from Apify in several formats:

| Format | Best for |
| --- | --- |
| JSON | Database ingestion, APIs, Python/Node scripts |
| CSV | Excel, Google Sheets, pandas DataFrames |
| JSONL | Streaming pipelines, BigQuery, S3 |
| XML | Legacy system integrations |
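
If you already hold records in memory and need JSONL for a streaming pipeline, the conversion is short. This is an illustrative sketch with hypothetical helper names, not something the actor provides:

```python
import json
from typing import Iterable, Iterator


def to_jsonl(records: Iterable[dict]) -> str:
    """Serialize records as JSON Lines: one JSON object per line."""
    return "".join(json.dumps(r, ensure_ascii=False) + "\n" for r in records)


def iter_jsonl(text: str) -> Iterator[dict]:
    """Yield one record per non-empty line of JSONL text."""
    # Split on "\n" only: json.dumps escapes newlines inside strings,
    # so every "\n" in the text is a record boundary.
    for line in text.split("\n"):
        if line.strip():
            yield json.loads(line)
```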

Use cases

Financial research — bulk-scrape Bloomberg articles on a specific market sector and run sentiment analysis or topic modeling across the corpus.

News monitoring — paste a fresh set of article URLs daily and track how Bloomberg covers specific companies, geopolitical events, or industries over time.

Competitive intelligence — collect article metadata at scale and filter by tags, section, or authorName to understand Bloomberg's editorial focus on a topic.

Data journalism — pull authorship and publication patterns across hundreds of articles for investigative or academic research.

News aggregation pipelines — feed clean structured Bloomberg data into internal dashboards, Slack alerts, or downstream NLP systems.


How to get Bloomberg article URLs

Bloomberg article URLs follow this pattern:

https://www.bloomberg.com/news/articles/YYYY-MM-DD/article-slug

Ways to collect them:

  • Browse any Bloomberg section (Markets, Technology, Politics, etc.) and copy article links from the page
  • Use Bloomberg's own search at bloomberg.com/search to find articles by keyword, then copy the URLs
  • Monitor Bloomberg's RSS feeds or Twitter/X account for article links
  • Use another actor or script to collect article URLs from Bloomberg section pages and pass them as input here
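
For the last two options, a regular expression is usually enough to pull article links out of feed or page text you have already fetched. The pattern below is a sketch based on the URL pattern shown above; the set of characters allowed in the slug is an assumption.

```python
import re

# Matches https://www.bloomberg.com/news/articles/YYYY-MM-DD/article-slug.
# The slug character class is an assumption based on the example URLs.
ARTICLE_URL_RE = re.compile(
    r"https://www\.bloomberg\.com/news/articles/\d{4}-\d{2}-\d{2}/[A-Za-z0-9-]+"
)


def extract_article_urls(text: str) -> list[str]:
    """Return unique article URLs found in text, in order of first appearance."""
    # The match stops before "?", so query strings are left out.
    return list(dict.fromkeys(ARTICLE_URL_RE.findall(text)))
```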

Performance tips

  • Increase requestTimeoutSecs to 60 if you see timeout errors on slow article pages.
  • Use maxArticles to cap scope during test runs before processing a large batch.
  • For batches over 200 articles, consider splitting into multiple runs of 100–200 each.
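
Splitting a large URL list into runs of 100–200 can be done with a few lines of Python. A minimal sketch; the default of 150 is an arbitrary value inside the suggested range, and chunk_urls is a hypothetical helper:

```python
def chunk_urls(urls: list[str], size: int = 150) -> list[list[str]]:
    """Split urls into consecutive batches of at most size items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    return [urls[i:i + size] for i in range(0, len(urls), size)]
```

Each batch can then be used as the startUrls value of a separate run.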

Scheduling

Use Apify's built-in Schedules feature to run this actor on a recurring basis:

  1. Go to Schedules in your Apify account
  2. Click Create new schedule
  3. Select this actor and configure your article URL list
  4. Choose a cron expression, e.g. 0 8 * * * for daily at 8am UTC
  5. Results accumulate in the dataset automatically with each run

This works well for monitoring a fixed list of Bloomberg articles for updates — the updatedAt field tells you when Bloomberg last edited each piece.
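
To act on updatedAt, you can compare the records from two runs. The sketch below assumes both runs were exported as lists of dicts and that url is stable between runs; find_updated is a hypothetical helper.

```python
def find_updated(previous: list[dict], current: list[dict]) -> list[str]:
    """Return URLs whose updatedAt changed between two runs."""
    # Failed records have null fields, so skip anything without a URL or timestamp.
    last_seen = {
        r["url"]: r["updatedAt"]
        for r in previous
        if r.get("url") and r.get("updatedAt")
    }
    updated = []
    for record in current:
        url, stamp = record.get("url"), record.get("updatedAt")
        if url in last_seen and stamp and stamp != last_seen[url]:
            updated.append(url)
    return updated
```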


Error handling

Each article is processed independently. If one URL fails (network error, page not found, parse failure), the actor logs the error and continues to the next URL. Failed records appear in the dataset with error set to a message string and all other fields set to null. The run does not stop on individual article failures.
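
Since failed records have every field except error set to null, the failed URL has to be recovered by comparing your input list against the URLs that succeeded. A sketch of that comparison, assuming the input URLs were already stripped of query parameters so they match the canonical url field; urls_to_retry is a hypothetical helper:

```python
def urls_to_retry(input_urls: list[str], records: list[dict]) -> list[str]:
    """Return input URLs that have no successful record in the dataset."""
    succeeded = {
        r["url"]
        for r in records
        if r.get("error") is None and r.get("url")
    }
    return [url for url in input_urls if url not in succeeded]


def error_messages(records: list[dict]) -> list[str]:
    """Collect the error strings from failed records, for logging."""
    return [r["error"] for r in records if r.get("error") is not None]
```

The returned list can be passed as startUrls in a follow-up run, optionally with a higher requestTimeoutSecs.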