Product Vision

CuratedFeed: Building an LLM-Native Feed Curation Portal

How we're rethinking content consumption for tech professionals, with AI at every layer of the stack, from ingestion to personalized delivery.

Product Management Perspective · May 2026 · 12 min read

Every morning, the average tech professional opens a dozen tabs: Hacker News, Reddit, TechCrunch, a few newsletters, maybe a Substack or two. They skim headlines, open articles, abandon half of them, and still walk away feeling like they missed something important. We're building CuratedFeed to make that ritual obsolete.

The Problem We're Solving

Information overload isn't a new problem, but it's getting worse. The number of high-quality tech content sources has exploded: blogs, subreddits, newsletters, podcasts, research papers. The signal-to-noise ratio keeps dropping. And every source has its own UI, its own account, its own email cadence.

The result? Tech professionals spend more time finding information than actually absorbing it. The context-switching cost is enormous. And the best content, the article that would've changed how you think about a problem, gets buried under the noise.

CuratedFeed is our answer: a single portal that crawls the sources you care about, uses LLMs to summarize, score, cluster, and personalize every piece of content, and delivers it in a format optimized for your time. Not a dumb aggregator. An intelligent reading companion.


Who We're Building For

  • 💻 The Tech Professional — software engineers, ML engineers, and DevOps practitioners who want a daily digest of what actually matters, without the tab-hopping.
  • 💼 The Tech Leader — engineering managers and CTOs who need a high-level pulse on industry trends, not the full firehose.
  • 🎓 The Curious Learner — students and career-switchers who want curated, high-quality learning material in one place.

What unites these personas is a shared constraint: limited attention, unlimited information. They don't need more sources. They need a smarter way to consume the sources that already exist.


The Content Universe

We're launching with five feed categories, each backed by carefully chosen sources that our users already trust. The source list is admin-configurable: no code changes needed to add a new blog or subreddit.

  • 📰 News: Wired, TechCrunch, Ars Technica, The Verge
  • 🤖 AI / ML: Hugging Face Blog, r/LocalLLaMA, r/MachineLearning, r/artificial
  • 🔨 Development: GitHub Blog, Hacker News, r/programming, r/webdev, r/devops
  • 💬 Reddit: all the above communities, unified into a single Reddit feed
  • ✉ Newsletters: a16z, TLDR, Interconnects, select Substacks

This isn't about casting the widest net. It's about curating the right net: sources our users already read, just made dramatically easier to consume.


Where LLM Meets Product: Our Intelligence Layer

Here's where CuratedFeed diverges from every RSS reader or link aggregator you've used. We're not just collecting links and slapping them on a page. Every piece of content passes through a multi-stage LLM pipeline that transforms raw articles into structured, scored, summarized, and clustered intelligence. From a product standpoint, this is the moat.

Let me walk through the nine LLM-powered capabilities we're shipping in v1 and why each one matters for our users.

1. Relevance Scoring & Auto-Filtering

Not everything published is worth reading. Before any article reaches the summarization pipeline, an LLM scores it 0–10 for relevance and quality. Articles below a configurable threshold (default: 4) are automatically filtered out.

Why this matters: Users don't see the noise. The feed they open is already the top quartile of what was published. This is the difference between an inbox and a curated feed, and it's invisible to the user. They just notice the feed feels consistently good.

For admins, we surface relevance score distribution charts so they can tune the threshold: are we filtering too aggressively? Not aggressively enough? The data is transparent.
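A minimal sketch of that relevance gate, with the LLM call abstracted behind a `score_fn` callable so the filtering logic itself is visible. All names here are illustrative, not CuratedFeed's actual API; in production, `score_fn` would wrap a prompt to the scoring model.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Article:
    title: str
    body: str
    relevance: float = 0.0


def filter_by_relevance(
    articles: List[Article],
    score_fn: Callable[[Article], float],
    threshold: float = 4.0,  # admin-configurable; default 4 per the PRD
) -> List[Article]:
    """Score each article 0-10 and keep only those at or above the threshold."""
    kept = []
    for article in articles:
        article.relevance = score_fn(article)
        if article.relevance >= threshold:
            kept.append(article)
    return kept
```

Keeping the scorer injectable also makes the admin-tunable threshold trivial to test against historical score distributions.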

2. Multi-Level Summaries

Different users want different depths. A CTO scanning headlines during a commute needs a one-liner. An ML engineer evaluating a new paper wants a full paragraph. We generate three tiers for every article:

  • Headline — a single-line distillation of the core takeaway
  • Short summary — 2–3 sentences capturing the key points
  • Detailed summary — a full paragraph with nuance and context

Feed cards default to the short summary with expand/collapse. Users can set their preferred default depth in settings. This is a small interaction detail that dramatically changes the reading experience.
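One plausible shape for the three tiers and the per-user default lookup; field and function names are hypothetical, not our actual schema.

```python
from dataclasses import dataclass


@dataclass
class SummaryTiers:
    headline: str  # single-line takeaway
    short: str     # 2-3 sentences
    detailed: str  # full paragraph with nuance and context


def summary_for(tiers: SummaryTiers, preferred_depth: str = "short") -> str:
    """Return the tier matching the user's preferred default depth setting."""
    return {
        "headline": tiers.headline,
        "short": tiers.short,
        "detailed": tiers.detailed,
    }[preferred_depth]
```

Generating all three tiers in one LLM pass and storing them together is what makes the expand/collapse interaction instant: no extra model call when the user changes depth.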

3. Semantic Deduplication & Story Clustering

When Apple announces a new product, five of our sources will cover it. Without dedup, the user sees five near-identical summaries. That's noise, not signal.

We generate vector embeddings for every article and use cosine similarity to detect semantically similar content, even when headlines differ. Similar articles are clustered into a single "story" with a primary article and linked source chips. No more scrolling past the same news from four angles.
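The clustering step can be sketched as a greedy single pass over embedding vectors: each article joins the first existing story it is close enough to, or starts a new one. The 0.85 similarity threshold is an assumption for illustration, not our tuned value, and real embeddings would come from an embedding model.

```python
import math
from typing import List


def cosine(a: List[float], b: List[float]) -> float:
    """Cosine similarity between two plain float vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def cluster(embeddings: List[List[float]], threshold: float = 0.85) -> List[List[int]]:
    """Group article indices whose embeddings are semantically close."""
    clusters: List[List[int]] = []
    for i, emb in enumerate(embeddings):
        for group in clusters:
            # Compare against the cluster's first (primary) article.
            if cosine(emb, embeddings[group[0]]) >= threshold:
                group.append(i)
                break
        else:
            clusters.append([i])  # no close story found; start a new one
    return clusters
```

A single-pass approach keeps the ingestion pipeline cheap; the trade-off is order sensitivity, which matters little when near-duplicates arrive within hours of each other.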

4. Cross-Source Synthesis

Story clustering is half the picture. The other half is synthesis. For clustered stories, we don't just pick the "best" article; we generate a new, unified summary that weaves together perspectives from all sources and cites each one.

Why this matters: This is something no human reader can do efficiently. When TechCrunch emphasizes the business angle, The Verge focuses on design, and Hacker News commenters dissect the technical architecture, our synthesis brings all three perspectives into a single coherent briefing. Users get a more complete picture in less time.
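Here's roughly how a synthesis prompt could be assembled from a story cluster. The prompt wording and the `(source, summary)` pair shape are illustrative; the key product requirement it encodes is that every source gets a citable marker.

```python
from typing import List, Tuple


def build_synthesis_prompt(story: List[Tuple[str, str]]) -> str:
    """story is a list of (source_name, article_summary) pairs for one cluster."""
    sources = "\n".join(
        f"[{i + 1}] {name}: {summary}" for i, (name, summary) in enumerate(story)
    )
    return (
        "Write one unified briefing that weaves together the perspectives "
        "below. Cite each source inline with its [n] marker.\n\n" + sources
    )
```

Numbered markers let the frontend turn each citation back into a link chip on the story card.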

5. Trend Detection with Explainers

Every 6 hours, we analyze tag and topic frequency across sliding windows of 24 hours, 7 days, and 30 days. When mentions of a topic spike significantly, we surface it as a trend card with the percentage increase, an LLM-generated explainer, and links to related posts.

The explainer is critical. If "edge AI" suddenly trends and you've never heard of it, a raw trend badge is useless. Our LLM writes a 2–3 sentence context paragraph: what it is, why it's trending now, and why you should care. It turns a data point into understanding.

6. Daily AI Briefing

This is the feature I'm most excited about from a product perspective. Every morning, our system generates a coherent narrative briefing: not a bullet list, but an actual newsletter-style editorial stitching together the top 5–10 stories of the past 24 hours.

Think of it as your personal tech correspondent who read everything so you don't have to. It's available as a "Today's Briefing" page in the portal and doubles as the body of the daily email digest. Users who subscribe to specific categories get per-category briefings (e.g., "Your AI/ML Daily").

Product insight: The daily briefing converts the portal from a "visit when you feel like it" tool into a "start your day here" habit. It's the hook for daily retention.

7. Ask the Feed (RAG-Based Q&A)

Sometimes you don't want to browse. You want to ask: "What happened with OpenAI this week?" or "Compare the new React and Svelte releases."

Ask the Feed is a RAG-powered conversational interface. It retrieves relevant posts via embedding similarity, feeds them to an LLM as grounded context, and generates an answer with inline citations linking back to the source posts. Users can scope queries by time range, category, or source. Multi-turn follow-ups are supported within a session.
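Stripped of the actual embedding and generation calls, the retrieval-and-grounding loop looks roughly like this. It's a sketch under stated assumptions (plain float vectors, `(text, embedding)` post tuples), not our production code.

```python
import math
from typing import List, Tuple


def cosine(a: List[float], b: List[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def retrieve(
    query_emb: List[float], posts: List[Tuple[str, List[float]]], k: int = 3
) -> List[str]:
    """Rank stored posts by embedding similarity and return the top-k texts."""
    ranked = sorted(posts, key=lambda p: cosine(query_emb, p[1]), reverse=True)
    return [text for text, _ in ranked[:k]]


def grounded_prompt(question: str, contexts: List[str]) -> str:
    """Ground the LLM in retrieved posts, with numbered inline citations."""
    cited = "\n".join(f"[{i + 1}] {c}" for i, c in enumerate(contexts))
    return f"Answer using only the sources below, citing [n].\n\n{cited}\n\nQ: {question}"
```

Time-range, category, and source scoping fit naturally here as a pre-filter on `posts` before ranking.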

From a product lens, this transforms CuratedFeed from a feed reader into a knowledge base. The value of every article we ingest compounds over time because it becomes searchable, queryable, and referenceable through natural language.

8. Personalized Feed Ranking

The default feed is chronological. But users can toggle to a "For You" view where posts are re-ranked based on their interest profile.

We build that profile passively: what articles they click, how long they spend reading, what they save, and what they subscribe to. The user's reading history is encoded into an interest vector, and posts are scored by cosine similarity to that vector. Posts with high affinity get a subtle "Recommended for you" badge.

Crucially, we make the personalization transparent. Users can view their inferred interest profile in settings as a natural-language summary (e.g., "You're mostly interested in LLM infrastructure, Rust tooling, and startup funding news"). No black-box algorithm: you can see why you're seeing what you're seeing.
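One common way to maintain such an interest vector, and the one sketched here, is an exponential moving average over the embeddings of articles the user engages with; the decay constant and function names are illustrative.

```python
from typing import List


def update_interest(
    profile: List[float], article_emb: List[float], alpha: float = 0.1
) -> List[float]:
    """Blend a newly read article's embedding into the user's interest vector.

    alpha controls how fast the profile adapts: higher alpha weights recent
    reads more heavily; lower alpha keeps long-term interests stable.
    """
    return [(1 - alpha) * p + alpha * a for p, a in zip(profile, article_emb)]
```

Posts are then scored by cosine similarity against this vector, exactly as in the clustering step; weighting alpha by dwell time or saves is a natural refinement.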

9. Smart Narrated Email Digests

Traditional email digests are lists of links. Nobody reads them. Our digests are LLM-narrated: the system selects the most important stories from your subscriptions, orders them by significance, and writes transitions and context. The digest reads like a mini-newsletter, not a database dump.

Users choose daily or weekly cadence. Each digest starts with the daily briefing narrative and then covers stories matching their subscriptions.


The Ingestion-to-Delivery Pipeline

Understanding the architecture helps explain why each feature exists and how they compose. Here's the full pipeline, from crawl to delivery:

1. Crawl & Store — per-source crawlers (RSS, web scraping, Reddit API) run on configurable schedules. Raw content is stored with title, body, URL, and metadata.

2. Dedup (URL + Hash) — exact duplicates are caught immediately via URL matching and content hashing. This is fast and cheap.

3. Embed & Score — every article gets a vector embedding. The LLM scores it for relevance (0–10). Articles below the quality threshold are filtered out.

4. Cluster — semantic similarity groups related articles into story clusters. Near-duplicates from different sources converge.

5. Summarize & Tag — each article gets multi-level summaries. Story clusters get cross-source synthesis. Tags, categories, and sub-categories are extracted.

6. Trend & Brief — periodic jobs detect trending topics and generate explainers. Daily briefings are composed from the top stories.

7. Index & Serve — processed posts are indexed in pgvector for semantic search and RAG retrieval. The API serves the feed, trends, briefings, and Q&A.

8. Personalize & Deliver — user activity updates interest vectors. The "For You" feed re-ranks based on affinity. Smart digests are narrated and emailed.
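The exact-duplicate check in step 2 can be as simple as remembering seen URLs plus a hash of the normalized body; a sketch with illustrative names.

```python
import hashlib


def content_hash(body: str) -> str:
    """Hash a whitespace-collapsed, lowercased form of the body."""
    normalized = " ".join(body.split()).lower()
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def is_exact_duplicate(
    url: str, body: str, seen_urls: set, seen_hashes: set
) -> bool:
    """Catch reposts by URL and syndicated copies by content hash."""
    h = content_hash(body)
    if url in seen_urls or h in seen_hashes:
        return True
    seen_urls.add(url)
    seen_hashes.add(h)
    return False
```

Because this check needs no embeddings and no LLM call, it runs before the expensive stages and keeps them from paying for content we already have.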

Simplified Data Flow

Sources → Crawlers → Embed + Score → Cluster → Summarize → Feed API → Portal + Digest

The User Experience

All the LLM intelligence in the world means nothing if the experience isn't right. Here's how it comes together in the UI:

The Home Page

The home page is a river-of-news feed with the daily briefing link and trending topics pinned at the top. Category tabs let users filter by News, AI/ML, Development, Reddit, and Newsletters. A prominent "Ask" input bar invites natural-language questions.

Each feed card shows the short summary by default with source favicon, published date, tags, and a relevance badge. Users can expand for the detailed summary or click through to the original. Story clusters appear as unified cards with multiple source chips.

Search That Understands You

Search supports both keyword and semantic queries. Typing "edge computing for ML inference" returns results by meaning, not just keyword match. Filters for category, sub-category, tags, and date range refine results further.
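One way to merge the keyword and semantic result lists behind a single endpoint, sketched here with reciprocal-rank fusion. The fusion technique and the `k = 60` constant are assumptions for illustration, not confirmed details of our search implementation.

```python
from typing import Dict, List


def merge_results(
    keyword_hits: List[str], semantic_hits: List[str], k: int = 60
) -> List[str]:
    """Fuse two ranked lists of post IDs by reciprocal-rank score.

    A post appearing high in either list scores well; a post appearing in
    both lists scores best, so neither ranking dominates the other.
    """
    scores: Dict[str, float] = {}
    for ranking in (keyword_hits, semantic_hits):
        for rank, post_id in enumerate(ranking):
            scores[post_id] = scores.get(post_id, 0.0) + 1.0 / (k + rank + 1)
    return sorted(scores, key=scores.get, reverse=True)
```

Rank-based fusion sidesteps the hard problem of comparing keyword relevance scores with cosine similarities on a common scale.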

Personalization That's Transparent

Users toggle between "Latest" and "For You" feed views. The "For You" ranking is based on their reading behavior, but they can always see why: their interest profile is viewable as a natural-language summary in settings. No hidden algorithm. Trust through transparency.

Saving and Subscribing

One-click save. Subscriptions to categories, tags, or even trending topics. Delivery preferences (in-app, daily email, weekly email) per subscription. The email digest isn't a link dump; it's a narrated briefing.


Technical Decisions and Trade-offs

From a product management perspective, several technical decisions deserve explanation because they directly impact user experience, cost, and our ability to iterate:

  • Tiered LLM usage (GPT-4o-mini for scoring/tagging, GPT-4o for synthesis/briefings): 90% of LLM calls are scoring and tagging; using a lighter model there cuts cost dramatically while reserving quality for high-value outputs.
  • pgvector over a separate vector DB (embeddings stored in PostgreSQL via pgvector): one database for relational and vector data. Simpler ops, simpler joins, no sync headaches. We can graduate to a dedicated vector store if we outgrow it.
  • Semantic search alongside keyword search (both in the same search endpoint): users don't know the difference; they just type. We run both approaches and merge results. The experience is "it just finds what I meant."
  • Graceful LLM degradation (queue and retry; surface raw title + truncated body): LLM APIs go down. When they do, the feed still works; articles show up with raw titles, and summaries backfill when the API recovers.
  • Transparent personalization (interest profile shown as natural language): users distrust black-box algorithms. Showing "You're interested in X, Y, Z" builds trust and gives users agency over their feed.
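The graceful-degradation decision can be sketched as a try-and-fall-back wrapper around the summarization call; names and the truncation length are illustrative.

```python
from typing import Callable, List, Tuple

Post = Tuple[str, str]  # (title, body)


def summarize_or_degrade(
    post: Post,
    summarize: Callable[[str], str],  # wraps the LLM call; may raise on outage
    retry_queue: List[Post],
    max_body_chars: int = 280,
) -> str:
    """Return a summary, or a raw-title fallback when the LLM is unavailable."""
    title, body = post
    try:
        return summarize(body)
    except Exception:
        retry_queue.append(post)  # backfill the summary when the API recovers
        return f"{title}: {body[:max_body_chars]}"
```

The product consequence is the one the table names: an LLM outage degrades summary quality, never feed availability.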

What Success Looks Like

We're measuring CuratedFeed against four metrics that map directly to user value:

  • 500+ daily active users within 3 months
  • Under 1 hour crawl-to-summary latency (p90)
  • 40%+ weekly retention rate
  • 4+ / 5 average summary quality rating

The retention metric is the one that keeps me up at night. If users come back every week, it means the feed is genuinely better than their existing workflow. If they don't, we've built a novelty, not a habit. The daily briefing and smart digest are designed specifically to drive this: they give users a reason to return every morning.


The Roadmap Beyond v1

V1 is ambitious by design: we're shipping nine LLM-powered features from day one because they compound. Relevance scoring makes summarization better (less noise to summarize). Embeddings power dedup, clustering, RAG, personalization, and semantic search simultaneously. The daily briefing reuses the same summaries and trend data. Each feature makes the others stronger.

But there's plenty we're deliberately deferring to v2:

  • "Explain Like I'm 5" mode β€” Regenerate any summary in simpler language
  • Chat with an article β€” Deep-dive Q&A using the full original text as context
  • Clickbait & bias detection β€” Score articles for sensationalism and flag potential bias
  • Fact-check signals β€” Cross-reference claims across sources and flag disagreements
  • User-submitted sources β€” Let users add their own blogs and feeds
  • Slack / Discord integration β€” Deliver subscriptions to team channels
  • Mobile app β€” Native or PWA for on-the-go reading
  • Subscription recommendations β€” "Based on your reading, you might like..."

Closing Thoughts

The thesis behind CuratedFeed is simple: LLMs are good enough now to be trusted with the entire content pipeline, not just summarization, but scoring, clustering, synthesizing, trend-spotting, personalization, and narrative generation. The technology is ready. The user pain is acute. The gap in the market is clear.

Every existing tool does one piece of this. RSS readers aggregate but don't summarize. AI newsletter tools summarize but don't personalize. Search engines find but don't brief. We're building the product that does all of it, coherently, in one place.

The feed should be smarter than you are busy. That's the promise.

We're building CuratedFeed in the open. Follow along as we go from PRD to production. The next post will cover the technical architecture in detail, including our approach to tiered LLM cost optimization, pgvector schema design, and the crawl-to-summary pipeline.

Product Management · LLM · Content Curation · RAG · Feed Aggregation · Personalization · AI Product

CuratedFeed: Product Requirements Document · May 2026