Substack Posts Scraper 📚

2-hour trial, then $19.99/month. No credit card required now.

easyapi/substack-posts-scraper

Scrape Substack posts and articles by keywords. Extract comprehensive post data including title, author, publication details, podcast information, reactions, and more. Perfect for content analysis and research.

Developer
Maintained by Community

Actor Metrics

  • 3 monthly users

  • No reviews yet

  • No bookmarks yet

  • >99% runs succeeded

  • Created in Feb 2025

  • Modified 7 days ago

This actor extracts posts and articles from Substack by keyword search and returns detailed information about each post, its publication, and its authors.

Features ✨

  • 🔍 Search posts by keywords
  • 📊 Extract comprehensive post metadata
  • 🎙️ Support for podcast episode data
  • 👥 Get author and publication details
  • ❤️ Capture engagement metrics (reactions, comments)
  • 🔄 Auto-scrolling for pagination
  • ⚡ High-performance with Puppeteer
  • 🛡️ Built-in anti-detection mechanisms

Output Data Structure 📋

The actor provides rich post data including:

  • Post title, subtitle, and description
  • Publication details
  • Author information
  • Podcast episode data (if applicable)
  • Cover images and media
  • Engagement metrics
  • Tags and categories
  • Publication timestamps
  • Other fields shown in the output sample, such as word count and cover image palette

Usage 💡

Simply provide:

  1. Keywords to search for
  2. Maximum number of items to scrape (optional)

The actor will automatically:

  • Search Substack for your keywords
  • Scroll through results
  • Extract detailed post information
  • Handle pagination
  • Export structured JSON data

Use Cases 🎯

  • Content Research
  • Market Analysis
  • Topic Monitoring
  • Audience Engagement Analysis
  • Content Aggregation
  • Newsletter Analytics
  • Competitive Analysis

Limitations ⚠️

  • Respects Substack's terms of service
  • Public posts only
  • Rate limiting applied for stability

Input Example

An example input in JSON:

{
    "keywords": [
        "ai"
    ],
    "maxItems": 50
}
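
A minimal sketch of how the input above might be built and submitted from Python. The `keywords` and `maxItems` fields come from the example; the helper names `build_run_input` and `run_scraper` are hypothetical, the validation rules are assumptions, and the submission step assumes the `apify-client` package and an Apify API token, neither of which this page specifies.

```python
from __future__ import annotations

from typing import Any, Optional


def build_run_input(keywords: list[str], max_items: Optional[int] = None) -> dict[str, Any]:
    """Build the actor input shown above; maxItems is optional."""
    # Assumption: blank keywords are dropped and at least one must remain.
    cleaned = [k.strip() for k in keywords if k.strip()]
    if not cleaned:
        raise ValueError("at least one non-empty keyword is required")
    run_input: dict[str, Any] = {"keywords": cleaned}
    if max_items is not None:
        # Assumption: a limit below 1 is treated as a mistake.
        if max_items < 1:
            raise ValueError("maxItems must be a positive integer")
        run_input["maxItems"] = max_items
    return run_input


def run_scraper(token: str, run_input: dict[str, Any]) -> list[dict[str, Any]]:
    """Assumption: submit the input through the apify-client package."""
    # Imported lazily so the helper above works without the package installed.
    from apify_client import ApifyClient

    client = ApifyClient(token)
    run = client.actor("easyapi/substack-posts-scraper").call(run_input=run_input)
    return list(client.dataset(run["defaultDatasetId"]).iterate_items())
```

Calling `build_run_input(["ai"], max_items=50)` reproduces the JSON input above; passing its result to `run_scraper` together with a valid token would start a run and collect the dataset items.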

Output sample

The results are stored in a dataset, which you can always find in the Storage tab. You can choose the format in which to download your data: JSON, JSONL, Excel spreadsheet, HTML table, CSV, or XML. Here is an excerpt, in JSON, of the data you would get with the input parameters above:

[
    {
        "keyword": "ai",
        "id": 156491923,
        "editor_v2": false,
        "publication_id": 2270667,
        "title": "New AI image models, free AI music generators, GPT can THINK now, new top AI models, DeepSeek Janus",
        "social_title": null,
        "search_engine_title": null,
        "search_engine_description": null,
        "type": "podcast",
        "slug": "new-ai-image-models-free-ai-music",
        "post_date": "2025-02-04T23:10:38.722Z",
        "audience": "everyone",
        "podcast_duration": 2684.9436,
        "video_upload_id": null,
        "podcast_upload_id": "06e2c81a-16e8-4c32-a936-a1f89d596005",
        "write_comment_permissions": "everyone",
        "should_send_free_preview": false,
        "free_unlock_required": false,
        "default_comment_sort": null,
        "canonical_url": "https://aisearch.substack.com/p/new-ai-image-models-free-ai-music",
        "section_id": null,
        "top_exclusions": [],
        "pins": [],
        "is_section_pinned": false,
        "section_slug": null,
        "section_name": null,
        "reactions": {
            "❤": 0
        },
        "restacked_post_id": null,
        "restacked_post_slug": null,
        "restacked_pub_name": null,
        "restacked_pub_logo_url": null,
        "position": 1,
        "subtitle": "Welcome to the AI Search podcast. Here are the top highlights in AI this week.",
        "cover_image": "https://substack-post-media.s3.amazonaws.com/public/images/1d231857-e4ee-468b-b626-0deb428ee7d6_1400x1400.png",
        "cover_image_is_square": true,
        "cover_image_is_explicit": false,
        "podcast_episode_image_url": "https://substack-post-media.s3.amazonaws.com/public/images/1d231857-e4ee-468b-b626-0deb428ee7d6_1400x1400.png",
        "podcast_episode_image_info": {
            "url": "https://substack-post-media.s3.amazonaws.com/public/images/1d231857-e4ee-468b-b626-0deb428ee7d6_1400x1400.png",
            "isDefaultArt": false,
            "isDefault": false
        },
        "podcast_url": "https://api.substack.com/api/v1/audio/upload/06e2c81a-16e8-4c32-a936-a1f89d596005/src",
        "videoUpload": null,
        "podcastFields": {
            "post_id": 156491923,
            "podcast_episode_number": null,
            "podcast_season_number": null,
            "podcast_episode_type": null,
            "should_syndicate_to_other_feed": null,
            "syndicate_to_section_id": null,
            "hide_from_feed": false,
            "free_podcast_url": null,
            "free_podcast_duration": null
        },
        "podcast_preview_upload_id": null,
        "podcastUpload": {
            "id": "06e2c81a-16e8-4c32-a936-a1f89d596005",
            "name": "news-24.mp3",
            "created_at": "2025-02-04T23:09:13.865Z",
            "uploaded_at": "2025-02-04T23:09:23.872Z",
            "publication_id": 2270667,
            "state": "transcoded",
            "post_id": 156491923,
            "user_id": 191014175,
            "duration": 2684.9436,
            "height": null,
            "width": null,
            "thumbnail_id": 1,
            "preview_start": null,
            "preview_duration": null,
            "media_type": "audio",
            "primary_file_size": "42959560",
            "is_mux": false,
            "mux_asset_id": null,
            "mux_playback_id": null,
            "mux_preview_asset_id": null,
            "mux_preview_playback_id": null,
            "mux_rendition_quality": null,
            "mux_preview_rendition_quality": null,
            "explicit": false,
            "copyright_infringement": null,
            "src_media_upload_id": null,
            "live_stream_id": null,
            "transcription": {
                "media_upload_id": "06e2c81a-16e8-4c32-a936-a1f89d596005",
                "created_at": "2025-02-04T23:10:05.279Z",
                "requested_by": 191014175,
                "status": "transcribed",
                "modal_call_id": "fc-01JK9KMQDG2KRT327JTSR976DS",
                "approved_at": "2025-02-04T23:12:43.876Z",
                "transcript_url": "s3://substack-video/video_upload/post/156491923/06e2c81a-16e8-4c32-a936-a1f89d596005/1738710621/transcription.json",
                "attention_vocab": null,
                "speaker_map": null,
                "captions_map": {
                    "en": {
                        "url": "s3://substack-video/video_upload/post/156491923/06e2c81a-16e8-4c32-a936-a1f89d596005/1738710621/en.vtt",
                        "language": "en",
                        "original": true
                    }
                },
                "cdn_url": "https://substackcdn.com/video_upload/post/156491923/06e2c81a-16e8-4c32-a936-a1f89d596005/1738710621/transcription.json?Expires=1739315665&Key-Pair-Id=APKAIVDA3NPSMPSPESQQ&Signature=kkyvVNWtOpL1VD3Y4n0LkIioiMU3r10tYrBXXlrNC927uzauCnLvzbp4j4-VU5UBwCm5HTayghaSnVeUPWfr6GPD-YTYgeHufNrkwCtmgqinTner3DwKh7z4EsvxbTkH58qXOAR82qLG8MHuu~iSTsXJ5CARuEeGPTW121bHK74poh6QH6jMT3iW-8qqRv4VP4aioSWL8OQyolUxoalTWSiejR6RE9RTxdRUMUbg8pk60GN3nzq3NTRff0qiZtnwJuvh~-A0L4FiTCiNtdFJsHOfYmcieyRydEDj7rHLsgY7yzuFXnsQx2qau9aoF79XAsJ5s4T1EySb~vg7fMTPNQ__",
                "cdn_unaligned_url": "https://substackcdn.com/video_upload/post/156491923/06e2c81a-16e8-4c32-a936-a1f89d596005/1738710621/unaligned_transcription.json?Expires=1739315665&Key-Pair-Id=APKAIVDA3NPSMPSPESQQ&Signature=TpwJtOqaEBWXzOc2orXqEOjPumCJZzfg0wCNb22pdu~T8GQ2PCRcJaPAd6qZZzEPMFLsHbtiEvxC23ZIiD4D6ft94TvlQVYJV~TKKfTy0nC8Ut77ni9FJvzfRQhfYw1tUPdQ2mQEx2s5~TVCKlplUoaceWJ03B55xSURcT9apy4-8X2MjZk57O9Z-almjQ2QtkvxOUxNWvGiM1HN4RGKCPNfu211OpEn1rVMbmU~0WdZ3Sz7QaaXRnJbc3~tQKkev4MWXyq-E8lYa88lpNFj1LJnpF9piIAwfJUtkPym8SXdqiMLFHj0B7pBc4Th15eTUeL1MgdM145boHfnuAlP~w__",
                "signed_captions": [
                    {
                        "language": "en",
                        "url": "https://substackcdn.com/video_upload/post/156491923/06e2c81a-16e8-4c32-a936-a1f89d596005/1738710621/en.vtt?Expires=1739315665&Key-Pair-Id=APKAIVDA3NPSMPSPESQQ&Signature=JpSxDJtOUrhIB698DRiCsap2PZTwPNbQaH1zBYxivUyFGmr8oTHIU80z0SELPqmnsaJmpRTgXjuAKJMbBBMeZlg7FwCNAPxijb9jhS5Oai-SZrDKH4jG21RJwh2TiF0Yg0yp9kF0z8xpK56RzqeS7JCHUJG8iVQmjqSNrrvCVtOGBGN3fBIHlr7Z3RKioxD0grIAatAkDbCwjkc49~~XWa9hd-awKhqByx2o2w6uHlNfuC5ZRRpwnaiX8Ju6rxU9MwW24QQCKCpjbND6S6kaGJ-Z~N90shdh-fD31FymCJ4quq6M1JaEyonZIjcwyUc6FduiKeQi25x-MvUIITtrow__",
                        "original": true
                    }
                ]
            }
        },
        "podcastPreviewUpload": null,
        "voiceover_upload_id": null,
        "voiceoverUpload": null,
        "has_voiceover": false,
        "description": "Welcome to the AI Search podcast. Here are the top highlights in AI this week.",
        "body_json": null,
        "body_html": null,
        "truncated_body_text": "INSANE AI news: OpenAI o3-mini, DeepSeek Janus-Pro, Qwen2.5-Max, Riffusion FUZZ, YuE AI music generator, Doubao 1.5 Pro, Google Daily Listen, Tulu 3",
        "wordcount": 38,
        "postTags": [
            {
                "id": "06ec7467-035f-41d3-aa2a-f2dafd005ba2",
                "publication_id": 2270667,
                "name": "research",
                "slug": "research",
                "hidden": false
            },
            {
                "id": "134583b8-fd61-4289-83f8-2768f0e74637",
                "publication_id": 2270667,
                "name": "machine learning",
                "slug": "machine-learning",
                "hidden": false
            },
            {
                "id": "3b762592-4665-4885-b8c3-5c01a13fbd93",
                "publication_id": 2270667,
                "name": "artificial intelligence",
                "slug": "artificial-intelligence",
                "hidden": false
            },
            {
                "id": "aae204af-3183-4835-9184-ac27d860a342",
                "publication_id": 2270667,
                "name": "science",
                "slug": "science",
                "hidden": false
            },
            {
                "id": "d07870f1-c05d-4f6b-9a7e-c94b7e0ba2c5",
                "publication_id": 2270667,
                "name": "tech",
                "slug": "tech",
                "hidden": false
            },
            {
                "id": "e9d24f2a-02b0-4774-bc85-8b311ff1ab12",
                "publication_id": 2270667,
                "name": "ai",
                "slug": "ai",
                "hidden": false
            }
        ],
        "teaser_post_eligible": true,
        "postCountryBlocks": [],
        "coverImagePalette": {
            "Vibrant": {
                "rgb": [
                    60,
                    180,
                    252
                ],
                "population": 3621
            },
            "DarkVibrant": {
                "rgb": [
                    109,
                    52,
                    68
                ],
                "population": 14
            },
            "LightVibrant": {
                "rgb": [
                    100,
                    196,
                    252
                ],
                "population": 5
            },
            "Muted": {
                "rgb": [
                    164,
                    87,
                    108
                ],
                "population": 5
            },
            "DarkMuted": {
                "rgb": [
                    86,
                    47,
                    61
                ],
                "population": 115
            },
            "LightMuted": {
                "rgb": [
                    218,
                    174,
                    194
                ],
                "population": 99
            }
        },
        "publishedBylines": [
            {
                "id": 191014175,
                "name": "AI Search",
                "handle": "aisearch",
                "previous_name": null,
                "photo_url": "https://substack-post-media.s3.amazonaws.com/public/images/e1ef43b4-d382-41ad-8e0b-86080c6f0b2a_1400x1400.png",
                "bio": "Stay up to date with AI news, tech, & research",
                "profile_set_up_at": "2024-01-18T18:32:07.954Z",
                "publicationUsers": [
                    {
                        "id": 2288577,
                        "user_id": 191014175,
                        "publication_id": 2270667,
                        "role": "admin",
                        "public": true,
                        "is_primary": false,
                        "publication": {
                            "id": 2270667,
                            "name": "AI Search",
                            "subdomain": "aisearch",
                            "custom_domain": null,
                            "custom_domain_optional": false,
                            "hero_text": "Welcome to the AI Search newsletter. We bring you the highlights in AI every week. No fluff, just the interesting stuff. \n\nSubscribe and get a FREE cheat sheet on the top 50 most useful AI tools!",
                            "logo_url": "https://substack-post-media.s3.amazonaws.com/public/images/2905d20a-608c-4fa8-9bdf-7af0c3792e1a_1280x1280.png",
                            "author_id": 191014175,
                            "theme_var_background_pop": "#FF0000",
                            "created_at": "2024-01-18T18:32:47.729Z",
                            "rss_website_url": null,
                            "email_from_name": "AI Search",
                            "copyright": "AI Search",
                            "founding_plan_name": null,
                            "community_enabled": false,
                            "invite_only": false,
                            "payments_state": "disabled",
                            "language": null,
                            "explicit": false,
                            "is_personal_mode": false
                        }
                    }
                ],
                "is_guest": false,
                "bestseller_tier": null
            }
        ],
        "reaction": null,
        "reaction_count": 0,
        "comment_count": 0,
        "child_comment_count": 0,
        "is_geoblocked": false,
        "hasCashtag": false,
        "scrapedAt": "2025-02-10T05:26:49.500Z"
    },
    ...
]
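
Each item in the output above is deeply nested, so a small helper that picks out the fields of interest can make downstream analysis easier. The sketch below is one possible approach, not part of the actor: `summarize_post` and `signed_url_expiry` are hypothetical names, the field names are taken from the sample, and reading the `Expires` query parameter of the signed CDN URLs as a Unix timestamp in seconds is an assumption based on how such signed URLs are commonly built.

```python
from __future__ import annotations

from datetime import datetime, timezone
from typing import Any, Optional
from urllib.parse import parse_qs, urlparse


def summarize_post(post: dict[str, Any]) -> dict[str, Any]:
    """Pick a few fields from one dataset item, tolerating missing or null values."""
    duration = post.get("podcast_duration")  # given in seconds in the sample
    return {
        "title": post.get("title"),
        "type": post.get("type"),
        "url": post.get("canonical_url"),
        "published": post.get("post_date"),
        "authors": [b.get("name") for b in post.get("publishedBylines") or []],
        "tags": [t.get("name") for t in post.get("postTags") or []],
        "reactions": post.get("reaction_count", 0),
        "comments": post.get("comment_count", 0),
        "podcast_minutes": round(duration / 60, 1) if duration else None,
    }


def signed_url_expiry(url: str) -> Optional[datetime]:
    """Assumption: the Expires query parameter is a Unix timestamp in seconds."""
    values = parse_qs(urlparse(url).query).get("Expires")
    if not values:
        return None
    return datetime.fromtimestamp(int(values[0]), tz=timezone.utc)
```

If that assumption holds, the signed `cdn_url` and caption links in a scraped item stop working after their expiry time, so it may be worth checking `signed_url_expiry` before trying to fetch a transcript from an older dataset.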