Pricing

Pay per event

Live Context Feed for Paperclip

Keep Paperclip memory fresh with sanitized daily signals from Reddit and GitHub.

Developer

Solutions Smart

Last modified: a month ago

What does Live Context Feed for Paperclip do?

Live Context Feed for Paperclip builds automated, regularly refreshed memory feeds for AI agents and decision systems without filling them with raw scraped content.

Instead of pushing unfiltered posts or pages into agent memory, it:

Collects recent items from Reddit and GitHub
Normalizes every item into a common internal structure
Removes duplicates and near-duplicates
Sanitizes raw text before storage
Extracts repeated themes and useful signals
Generates a Markdown daily note ready for agent consumption
Generates a structured JSON feed for programmatic use
Optionally posts the final feed to a callback URL
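
The normalization and deduplication steps above can be illustrated with a minimal sketch. It is not the Actor's actual implementation: the `Item` structure, the word-overlap (Jaccard) comparison on titles, and the `0.8` threshold are all assumptions chosen for illustration.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Item:
    """Hypothetical common structure; the Actor's real internal schema is not documented."""
    source: str
    title: str
    url: str


def tokens(text: str) -> frozenset[str]:
    """Lowercase the text and split it into a set of alphanumeric words."""
    return frozenset(re.findall(r"[a-z0-9]+", text.lower()))


def jaccard(a: frozenset[str], b: frozenset[str]) -> float:
    """Share of words that two token sets have in common (0.0 to 1.0)."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def dedupe(items: list[Item], threshold: float = 0.8) -> list[Item]:
    """Keep the first item of every group whose titles overlap at or above the threshold."""
    kept: list[Item] = []
    kept_tokens: list[frozenset[str]] = []
    for item in items:
        current = tokens(item.title)
        if any(jaccard(current, seen) >= threshold for seen in kept_tokens):
            continue  # exact or near-duplicate of an item that was already kept
        kept.append(item)
        kept_tokens.append(current)
    return kept


items = [
    Item("reddit", "Technician scheduling is too slow", "https://example.com/1"),
    Item("reddit", "technician scheduling is TOO slow!", "https://example.com/2"),
    Item("github", "Add route planning export", "https://example.com/3"),
]
unique = dedupe(items)
print([item.url for item in unique])
```

Comparing titles keeps the sketch short; a real pipeline would probably also compare body text and URLs.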

Live Context Feed for Paperclip can collect:

Reddit discussions by subreddit and topic
GitHub issues and repositories by search query
Engagement metrics and freshness signals
Thematic patterns and repeated pain points across sources

Why use Live Context Feed for Paperclip?

Agent memory systems often rely on static setup-time knowledge, making them slow to react to market shifts, competitor moves, and ecosystem changes. Reddit and GitHub are rich sources of real-world signals, pain points, new tools, and emerging practices, but raw scraped content creates noise and bloat in memory systems.

Live Context Feed for Paperclip bridges that gap. It keeps agent memory fresh and focused by delivering only sanitized, deduplicated, thematically organized signals.

Here are some ways you could use it:

Competitor intelligence: Monitor Reddit communities for pain points your competitors' customers face
Ecosystem monitoring: Track GitHub for new tools, libraries, and practices relevant to your domain
Market research: Build daily context feeds for startup or niche research teams
Agent memory refresh: Automate Paperclip-style memory updates without manual curation
Trend detection: Spot emerging themes across multiple communities before they become mainstream

How to scrape with Live Context Feed for Paperclip

Using Live Context Feed for Paperclip is straightforward:

Click Try for free
Fill in your topic, for example "HVAC competitors in Germany"
Select your sources: Reddit, GitHub, or both
Enter subreddits and GitHub search queries you want to monitor
Adjust optional filters such as time window, Reddit backend, sanitization mode, and output format
Click Run
When the Actor finishes, preview or download your Markdown and JSON feed from the Key-Value Store tab
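
If you prefer to fetch the results programmatically instead of using the Key-Value Store tab, the sketch below shows one possible approach using the Apify API's key-value store record endpoint and only the Python standard library. The store ID and API token are placeholders you must supply, and the helper names are hypothetical. The record keys `FEED_MARKDOWN` and `FEED_JSON` are the ones listed under "Output and results".

```python
import json
import urllib.parse
import urllib.request

API_BASE = "https://api.apify.com/v2"


def record_url(store_id: str, key: str) -> str:
    """Build the URL of one record in a key-value store."""
    store = urllib.parse.quote(store_id, safe="")
    record = urllib.parse.quote(key, safe="")
    return f"{API_BASE}/key-value-stores/{store}/records/{record}"


def fetch_record(store_id: str, key: str, token: str) -> bytes:
    """Download the raw bytes of one record, authenticating with a bearer token."""
    request = urllib.request.Request(
        record_url(store_id, key),
        headers={"Authorization": f"Bearer {token}"},
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return response.read()


def fetch_feed(store_id: str, token: str) -> tuple[str, dict]:
    """Return the Markdown note and the parsed JSON feed of a finished run."""
    markdown = fetch_record(store_id, "FEED_MARKDOWN", token).decode("utf-8")
    feed = json.loads(fetch_record(store_id, "FEED_JSON", token))
    return markdown, feed
```

Call `fetch_feed(store_id, token)` with the default key-value store ID of the finished run. If you requested only one output format, the other record may not exist.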

Example input:

{
  "topic": "AI coding assistants",
  "sources": ["reddit", "github"],
  "subreddits": ["LocalLLaMA", "artificial", "programming"],
  "githubQueries": ["AI coding assistant", "code generation agent"],
  "redditBackend": "painFinder",
  "maxItemsPerSource": 10,
  "daysBack": 3,
  "outputFormat": ["markdown", "json"],
  "includeSourceLinks": true,
  "sanitizeMode": "strict"
}

Input parameters

Parameter	Type	Description	Example
`topic`	string	Topic or context label for this feed	`"HVAC competitors in Germany"`
`sources`	array	Which sources to monitor (`reddit`, `github`)	`["reddit", "github"]`
`subreddits`	array	Subreddits to scan (`reddit` only)	`["smallbusiness", "hvac"]`
`githubQueries`	array	Search queries for GitHub issues and repositories	`["field service management"]`
`redditBackend`	string	Reddit source backend: `painFinder` or `practicaltoolsApi`. `painFinder` is the recommended default for current cloud runs.	`"painFinder"`
`maxItemsPerSource`	number	Maximum items to keep per source after scoring and filtering	`20`
`daysBack`	number	How many days of history to scan	`3`
`outputFormat`	array	Output formats: `markdown`, `json`, or both	`["markdown", "json"]`
`includeSourceLinks`	boolean	Include URLs to original posts and issues	`true`
`sanitizeMode`	string	Sanitization level: `strict` or `balanced`	`"strict"`
`callbackUrl`	string	Optional webhook URL to POST the final feed	`"https://example.com/memory-feed"`
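
Before starting a run, you may want to check an input object against the table above. The following sketch is a hypothetical client-side check, not part of the Actor. The allowed values come from the table, but which fields are required and the "positive number" rule are assumptions.

```python
ALLOWED_SOURCES = {"reddit", "github"}
ALLOWED_FORMATS = {"markdown", "json"}
ALLOWED_ENUMS = {
    "redditBackend": {"painFinder", "practicaltoolsApi"},
    "sanitizeMode": {"strict", "balanced"},
}


def validate_input(run_input: dict) -> list[str]:
    """Return a list of problems; an empty list means the input looks consistent."""
    problems: list[str] = []

    topic = run_input.get("topic")
    if not isinstance(topic, str) or not topic.strip():
        problems.append("topic must be a non-empty string")  # assumed to be required

    sources = run_input.get("sources")
    if not isinstance(sources, list) or not sources:
        problems.append("sources must be a non-empty array")  # assumed to be required
    elif not all(isinstance(s, str) and s in ALLOWED_SOURCES for s in sources):
        problems.append("sources may only contain: github, reddit")

    # Enumerated string values from the parameter table, checked only when present.
    for name, allowed in ALLOWED_ENUMS.items():
        if name in run_input:
            value = run_input[name]
            if not isinstance(value, str) or value not in allowed:
                problems.append(f"{name} must be one of: {', '.join(sorted(allowed))}")

    formats = run_input.get("outputFormat")
    if formats is not None:
        if not isinstance(formats, list) or not all(
            isinstance(f, str) and f in ALLOWED_FORMATS for f in formats
        ):
            problems.append("outputFormat may only contain: json, markdown")

    # Numeric limits: assumed to be positive numbers (booleans are rejected).
    for name in ("maxItemsPerSource", "daysBack"):
        value = run_input.get(name)
        if value is not None:
            is_number = isinstance(value, (int, float)) and not isinstance(value, bool)
            if not is_number or value <= 0:
                problems.append(f"{name} must be a positive number")

    return problems


example = {
    "topic": "AI coding assistants",
    "sources": ["reddit", "github"],
    "redditBackend": "painFinder",
    "maxItemsPerSource": 10,
    "daysBack": 3,
    "outputFormat": ["markdown", "json"],
    "sanitizeMode": "strict",
}
print(validate_input(example))
```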

Output and results

The Actor generates:

FEED_MARKDOWN: A Markdown-formatted daily note, ready for Paperclip or other agent systems
FEED_JSON: A structured JSON object with signals, themes, and metadata
RUN_SUMMARY: Metadata about the run, including source stats, dedupe counts, callback status, and Reddit backend diagnostics

Example Markdown output

## Daily Context Feed - 2026-03-29

### Topic
HVAC competitors in Germany

### Top signals
- Repeated complaints around technician scheduling surfaced in 4 recent items across reddit and github.
- Fresh momentum around route planning appeared in 3 recent items across github.

### Recommended actions
- Review scheduling and dispatch workflows related to technician scheduling.
- Track ecosystem movement around route planning against your roadmap.

Example JSON output

{
  "topic": "HVAC competitors in Germany",
  "generatedAt": "2026-03-29T09:00:00.000Z",
  "topSignals": [
    "Repeated complaints around technician scheduling surfaced in 4 recent items across reddit and github."
  ],
  "topThemes": [
    {
      "label": "technician scheduling",
      "signalCount": 4,
      "weight": 6.21,
      "sources": ["github", "reddit"],
      "dominantCategory": "pain_point"
    }
  ],
  "recommendedActions": [
    "Review scheduling and dispatch workflows related to technician scheduling."
  ],
  "signals": [
    {
      "type": "pain_point",
      "source": "reddit",
      "sourceType": "reddit_post",
      "title": "Users complain about slow technician scheduling",
      "summary": "Users complain about slow technician scheduling. Repeated frustration around long wait times and missed scheduling windows.",
      "relevanceScore": 0.89,
      "freshnessDays": 1,
      "sourceUrl": "https://reddit.com/...",
      "author": "example-user",
      "createdAt": "2026-03-28T12:00:00.000Z",
      "updatedAt": "2026-03-28T12:00:00.000Z",
      "tags": ["hvac"],
      "engagement": 37,
      "subreddit": "hvac"
    }
  ]
}
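
As a sketch of how a consumer might use this feed, the snippet below ranks themes by `weight` and selects recent, relevant signals. The field names follow the example above. The feed is abridged, and the helper names and thresholds (`0.8` relevance, 2 days) are illustrative assumptions rather than recommended values.

```python
# Abridged copy of the example feed; a real feed would be loaded from FEED_JSON.
feed = {
    "topic": "HVAC competitors in Germany",
    "topThemes": [
        {"label": "technician scheduling", "signalCount": 4, "weight": 6.21},
    ],
    "signals": [
        {
            "type": "pain_point",
            "source": "reddit",
            "title": "Users complain about slow technician scheduling",
            "relevanceScore": 0.89,
            "freshnessDays": 1,
        },
    ],
}


def top_themes(feed: dict, limit: int = 3) -> list[tuple[str, float]]:
    """Return (label, weight) pairs for the highest-weight themes."""
    themes = sorted(feed.get("topThemes", []), key=lambda t: t["weight"], reverse=True)
    return [(theme["label"], theme["weight"]) for theme in themes[:limit]]


def fresh_signals(feed: dict, min_relevance: float = 0.8, max_age_days: int = 2) -> list[str]:
    """Return titles of signals that are both relevant and recent."""
    return [
        signal["title"]
        for signal in feed.get("signals", [])
        if signal["relevanceScore"] >= min_relevance
        and signal["freshnessDays"] <= max_age_days
    ]


print(top_themes(feed))
print(fresh_signals(feed))
```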

How much will it cost?

Apify gives you $5 in free usage credits every month on the Apify Free plan. Live Context Feed for Paperclip is designed to stay efficient for small to medium monitoring tasks, especially when you start with a narrow topic, a short time window, and a limited number of subreddits or GitHub queries.

Your total run cost depends on two things:

The compute and storage used by this Actor
The Reddit backend you choose, since supported Reddit backends can have their own pricing models

In practice:

GitHub-only runs are typically the cheapest
Reddit + GitHub runs depend heavily on the selected redditBackend
Narrow daily runs are much cheaper than broad exploratory runs across many communities and queries

For regular, higher-volume monitoring across many subreddits and queries, we recommend the $49/month Starter plan or higher. Check the Apify pricing page for the latest rates and plan details.

Supported sources in v1

Reddit: Public subreddit discussions and posts collected through a selectable backend
GitHub: Public issues and repositories

Web scraping beyond these sources is intentionally out of scope in v1.

Supported Reddit backends in v1:

solutionssmart/reddit-pain-finder — recommended default for current cloud runs
practicaltools/apify-reddit-api — optional backend, but availability may depend on Actor/account permissions in cloud runs

Tips for using Live Context Feed for Paperclip

Start narrow: Monitor 2-3 subreddits or search queries first, then expand once you see what signals matter
Adjust daysBack: Shorter windows, such as 1-2 days, work best for daily automation
Use strict sanitization: Removes more noise and risky raw content; choose balanced only if you need higher recall
Callback URLs: Use webhooks to automatically push feeds into your agent memory system or Slack
Backend choice matters: painFinder is the safer default for current cloud runs, while practicaltoolsApi is an optional backend when your Actor/account permissions allow it
Theme weights: Themes with a higher weight recur more often across the collected items; use them to spot emerging trends early
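
For the callback tip above, a receiving endpoint could look roughly like the following sketch, which uses only the Python standard library. It assumes the Actor POSTs the JSON feed as the request body. The exact payload shape is not documented here, so treat the field access as an assumption. The handler, helper names, and port are hypothetical.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def summarize_feed(body: bytes) -> str:
    """Parse a posted feed and return a one-line summary (assumes a JSON object body)."""
    feed = json.loads(body)
    topic = feed.get("topic", "unknown topic")
    signals = feed.get("topSignals", [])
    return f"{topic}: {len(signals)} top signal(s)"


class FeedHandler(BaseHTTPRequestHandler):
    """Accepts POST requests and logs a summary of each received feed."""

    def do_POST(self) -> None:
        length = int(self.headers.get("Content-Length", 0))
        body = self.rfile.read(length)
        try:
            print(summarize_feed(body))
            self.send_response(204)  # accepted, nothing to return
        except (ValueError, AttributeError):
            self.send_response(400)  # body was not a JSON object
        self.end_headers()


def serve(port: int = 8080) -> None:
    """Start the receiver; blocks until interrupted."""
    HTTPServer(("0.0.0.0", port), FeedHandler).serve_forever()
```

Call `serve()` to start listening, then point `callbackUrl` at the public address of that endpoint. In production you would also want authentication and HTTPS in front of it.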

Reddit backend note

This Actor lets you choose between two Reddit backends:

painFinder
practicaltoolsApi

For current cloud deployments, painFinder is the recommended default because it has been the more reliable backend in real Actor runs.

practicaltoolsApi can still be useful, but some cloud environments may fail to call it due to Actor/account permission restrictions. If that happens, the run will degrade cleanly and continue with other available sources, but Reddit data from that backend will not be included.

Is it legal to scrape Reddit and GitHub?

Reddit and GitHub generally permit automated access to public content, subject to their Terms of Service and API policies, provided you:

Respect rate limits and robots.txt where applicable
Do not scrape personal data such as names, emails, or phone numbers without consent
Use collected data in compliance with privacy laws such as GDPR and CCPA
Attribute content and respect intellectual property

Note that personal data is protected by GDPR in the European Union and by other regulations around the world. You should not scrape personal data unless you have a legitimate reason to do so. If you're unsure whether your use case is legitimate, consult your lawyers.

We also recommend reading our blog post: Is web scraping legal?

What this is not

This Actor is not a chatbot or conversational AI system
It does not require an external LLM in v1
It is designed for sanitized context feeds, not raw data dumps
It does not perform real-time monitoring; it works best for scheduled runs such as daily refreshes

Questions or feedback?

If this Actor helps you, please:

leave a review on the Apify Store
share feedback about which Reddit backend worked best for your use case
suggest source types or output improvements you want next

For issues, feature requests, or support, please contact the Actor maintainer or open an issue on the Apify Community.
