BlueSky Feed Scraper

1 day trial then $25.00/month - No credit card required now

harvest/bluesky-feed-scraper

Scrapes data from a specified Bluesky feed URL and outputs detailed information about the posts, including metadata, authors, embedded media, and statistics such as likes, replies, and reposts.

Bluesky Feed Scraper for Apify

This is an Apify actor that scrapes data from a specified Bluesky feed URL and outputs detailed information about the posts, including metadata, authors, embedded media, and statistics such as likes, replies, and reposts.

Features

  • Scrapes Bluesky feed posts from a given feed URL.
  • Extracts detailed post data, including:
    • Author details (DID, handle, display name, avatar URL, etc.).
    • Post text, tags, and languages.
    • Embedded images, with metadata (alt text, aspect ratio, URLs).
    • Engagement statistics (likes, replies, reposts, quotes).
    • Thread and reply information.
    • Record metadata, including creation and indexing timestamps.

Input

The actor requires the following input:

Field | Type   | Description
url   | String | The URL of the Bluesky feed you want to scrape. Example: https://bsky.app/profile/username/feed.

Example Input

{
  "url": "https://bsky.app/profile/c3rmen.bsky.social/feed"
}
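
As a minimal sketch, the input object can be built and sanity-checked in Python before a run is started. The helper name build_run_input and the URL checks are assumptions made for illustration; the actor itself only documents the url field.

```python
import json
from urllib.parse import urlparse


def build_run_input(feed_url: str) -> dict:
    """Return the actor input object after a light sanity check on the URL.

    Hypothetical helper: the checks below assume feed URLs are https links
    on bsky.app under /profile/, as in the example input.
    """
    parsed = urlparse(feed_url)
    if parsed.scheme != "https" or parsed.netloc != "bsky.app":
        raise ValueError(f"not a bsky.app URL: {feed_url!r}")
    if not parsed.path.startswith("/profile/"):
        raise ValueError(f"not a profile feed URL: {feed_url!r}")
    # The only documented input field is "url".
    return {"url": feed_url}


if __name__ == "__main__":
    run_input = build_run_input("https://bsky.app/profile/c3rmen.bsky.social/feed")
    print(json.dumps(run_input, indent=2))
```

Rejecting obviously wrong URLs locally avoids spending a run on input the actor cannot use.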

Output

The actor produces a JSON array where each object represents a post from the feed. The structure includes:

  • uri and cid: Unique identifiers for the post.
  • author: Details about the author (DID, handle, avatar, etc.).
  • record: Post text, tags, languages, and embedded media.
  • embed: View-ready image metadata (e.g., thumbnails, full-size URLs).
  • Engagement metrics (replyCount, repostCount, likeCount, quoteCount).
  • Thread and reply-related data.
  • Timestamps (createdAt, indexedAt).
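
The fields listed above can be flattened into one summary row per post, for example before exporting to CSV. The following is a sketch only: summarize_post and its output keys are hypothetical names, and the web link assumes the usual bsky.app pattern of /profile/<handle>/post/<record key>, where the record key is the last segment of the post's at:// URI.

```python
from typing import Any


def summarize_post(post: dict[str, Any]) -> dict[str, Any]:
    """Flatten one scraped post object into a small summary row."""
    author = post.get("author", {})
    record = post.get("record", {})
    handle = author.get("handle", "")
    # The record key is the final path segment of the at:// URI.
    rkey = post.get("uri", "").rsplit("/", 1)[-1]
    return {
        "handle": handle,
        "display_name": author.get("displayName"),
        "text": record.get("text", ""),
        "created_at": record.get("createdAt"),
        "indexed_at": post.get("indexedAt"),
        # Engagement counters default to 0 when a field is missing.
        "replies": post.get("replyCount", 0),
        "reposts": post.get("repostCount", 0),
        "likes": post.get("likeCount", 0),
        "quotes": post.get("quoteCount", 0),
        # Assumed web URL pattern; None when handle or record key is absent.
        "post_url": (
            f"https://bsky.app/profile/{handle}/post/{rkey}"
            if handle and rkey
            else None
        ),
    }
```

Using .get() with defaults keeps the helper tolerant of posts that lack optional fields such as displayName.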

Example Output

[
  {
    "uri": "at://did:plc:z72i7hdynmk6r22z27h6tvur/app.bsky.feed.post/3lbsizxfxa22r",
    "cid": "bafyreifohcetdw6e5mudaz6anigzsm5ssjpm3oreyxu4a2l665k7hpxo4q",
    "author": {
      "did": "did:plc:z72i7hdynmk6r22z27h6tvur",
      "handle": "bsky.app",
      "displayName": "Bluesky",
      "avatar": "https://cdn.bsky.app/img/avatar/plain/did:plc:z72i7hdynmk6r22z27h6tvur/bafkreihagr2cmvl2jt4mgx3sppwe2it3fwolkrbtjrhcnwjk4jdijhsoze@jpeg",
      "associated": {
        "chat": {
          "allowIncoming": "none"
        }
      },
      "labels": [],
      "createdAt": "2023-04-12T04:53:57.057Z"
    },
    "record": {
      "createdAt": "2024-11-25T21:52:30.840Z",
      "embed": {
        "external": {
          "description": "Bluesky is social media as it should be. Find your community among millions of users, unleash your creativity, and have some fun again. https://bsky.app",
          "thumb": {
            "ref": {
              "$link": "bafkreihh7dthuxfqel6zwcmxapcu47tr34rat7thjtxlfmrwidvxfsmqne"
            },
            "mimeType": "image/jpeg",
            "size": 384236,
            "$type": "blob"
          },
          "title": "BlueskySocial - Twitch",
          "uri": "https://www.twitch.tv/blueskysocial"
        },
        "$type": "app.bsky.embed.external"
      },
      "facets": [
        {
          "features": [
            {
              "did": "did:plc:qjeavhlw222ppsre4rscd3n2",
              "$type": "app.bsky.richtext.facet#mention"
            }
          ],
          "index": {
            "byteEnd": 55,
            "byteStart": 40
          },
          "$type": "app.bsky.richtext.facet"
        },
        {
          "features": [
            {
              "did": "did:plc:ragtjsm2j2vknwkz3zp4oxrd",
              "$type": "app.bsky.richtext.facet#mention"
            }
          ],
          "index": {
            "byteEnd": 76,
            "byteStart": 64
          },
          "$type": "app.bsky.richtext.facet"
        },
        {
          "features": [
            {
              "did": "did:plc:4ewnpnebeh7zuk5pbardaxqz",
              "$type": "app.bsky.richtext.facet#mention"
            }
          ],
          "index": {
            "byteEnd": 226,
            "byteStart": 203
          },
          "$type": "app.bsky.richtext.facet"
        }
      ],
      "langs": [
        "en"
      ],
      "text": "Join us for another livestream with COO @rose.bsky.team and CTO @pfrazee.com, where they'll share team updates, the story of how Bluesky began, and what’s next. \n\nPlus, a special guest appearance from @flavorflav.bsky.social! 🎉\n\nToday 11/25 @ 5 pm PT / 8 pm ET / 1 am GMT / 10am JST",
      "$type": "app.bsky.feed.post"
    },
    "embed": {
      "external": {
        "uri": "https://www.twitch.tv/blueskysocial",
        "title": "BlueskySocial - Twitch",
        "description": "Bluesky is social media as it should be. Find your community among millions of users, unleash your creativity, and have some fun again. https://bsky.app",
        "thumb": "https://cdn.bsky.app/img/feed_thumbnail/plain/did:plc:z72i7hdynmk6r22z27h6tvur/bafkreihh7dthuxfqel6zwcmxapcu47tr34rat7thjtxlfmrwidvxfsmqne@jpeg"
      },
      "$type": "app.bsky.embed.external#view"
    },
    "replyCount": 324,
    "repostCount": 1041,
    "likeCount": 9147,
    "quoteCount": 84,
    "indexedAt": "2024-11-25T21:52:35.058Z",
    "labels": []
  },
  // ...more posts
]
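
The byteStart and byteEnd values under record.facets are offsets into the UTF-8 encoding of record.text, not character positions, so slicing the Python string directly can give wrong results once the text contains non-ASCII characters. A small sketch of resolving mention facets follows; mention_spans is a hypothetical helper name, and only the mention feature type seen in the example output is handled.

```python
MENTION_TYPE = "app.bsky.richtext.facet#mention"


def mention_spans(record: dict) -> list[tuple[str, str]]:
    """Return (mention text, DID) pairs for the mention facets of a record."""
    # Facet indices are byte offsets, so slice the encoded text.
    raw = record.get("text", "").encode("utf-8")
    spans = []
    for facet in record.get("facets", []):
        index = facet.get("index", {})
        start, end = index.get("byteStart"), index.get("byteEnd")
        if start is None or end is None:
            continue  # skip facets without a usable byte range
        snippet = raw[start:end].decode("utf-8", errors="replace")
        for feature in facet.get("features", []):
            if feature.get("$type") == MENTION_TYPE:
                spans.append((snippet, feature.get("did", "")))
    return spans
```

Applied to the record in the example output, the first facet (bytes 40 to 55) covers the mention "@rose.bsky.team".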

Usage

  1. Deploy the Actor: Use the Apify console to set up and deploy this actor.
  2. Provide Input: Supply the url in the input configuration.
  3. Run the Actor: Start the actor, and it will scrape the feed URL and return the posts as JSON.
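
The same steps can also be scripted. The sketch below assumes the apify-client Python package is installed, that an API token is available in an APIFY_TOKEN environment variable (a name chosen here for illustration), and that the actor stores its posts in the run's default dataset; fetch_posts is a hypothetical helper, not part of the actor.

```python
import os

ACTOR_ID = "harvest/bluesky-feed-scraper"


def fetch_posts(client, feed_url: str) -> list[dict]:
    """Run the actor for one feed URL and return the scraped posts.

    `client` is expected to behave like apify_client.ApifyClient.
    """
    # Start the actor and wait for the run to finish.
    run = client.actor(ACTOR_ID).call(run_input={"url": feed_url})
    if run is None:
        raise RuntimeError("actor run did not return a result")
    # Assumption: results are written to the run's default dataset.
    dataset = client.dataset(run["defaultDatasetId"])
    return list(dataset.iterate_items())


def main() -> None:
    # Imported here so the helper above stays usable without the package.
    from apify_client import ApifyClient  # pip install apify-client

    client = ApifyClient(os.environ["APIFY_TOKEN"])
    posts = fetch_posts(client, "https://bsky.app/profile/c3rmen.bsky.social/feed")
    for post in posts:
        print(post.get("uri"), post.get("likeCount"))


# Only runs when a token is configured in the environment.
if __name__ == "__main__" and os.environ.get("APIFY_TOKEN"):
    main()
```

Passing the client in as an argument keeps fetch_posts easy to exercise with a stub object, without network access or a real token.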

Notes

  • Ensure the feed at url is publicly accessible.
  • The actor fetches only publicly visible posts; posts from private or restricted feeds are not included.

Feel free to suggest additional features or report any issues! 🚀

Developer
Maintained by Community

Actor Metrics

  • 3 monthly users

  • 1 star

  • >99% runs succeeded

  • Created in Nov 2024

  • Modified 10 days ago