Under the Hood

How Rethread works — end to end.

A complete walkthrough of how Rethread observes your AI conversations, extracts structured memories, organizes them, and quietly re-injects them into every new chat, without uploading anything to a server unless you turn on optional sync.

Add to Chrome — Free

The four-stage pipeline

Conceptually, Rethread is a small, sandboxed pipeline that runs entirely inside your browser:

  1. Capture — observe what's said on the page
  2. Extract — distill conversations into structured memories
  3. Organize — buckets, tags, folders, search index
  4. Recall — pick the right memories and inject them into a new chat

Optional Sync sits behind the whole pipeline as a zero-knowledge encrypted layer. Let's walk through each stage.

01

Capture: reading what's already on the page

Rethread is a Manifest V3 Chrome extension built on WXT. On each of the six supported AI platforms (chatgpt.com, claude.ai, gemini.google.com, grok.com, www.perplexity.ai, chat.deepseek.com), a content script attaches a lightweight DOM observer.
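
As a rough illustration of how the supported-platform list could be expressed, the sketch below keeps the six hostnames from the paragraph above in one array and checks a URL against it. The names `SUPPORTED_HOSTS`, `MATCH_PATTERNS`, and `isSupportedPage` are hypothetical and are not taken from Rethread's source; in a real extension the same hosts would typically appear as match patterns in the content script declaration.

```typescript
// Hostnames come from the text above; all identifiers are illustrative assumptions.
const SUPPORTED_HOSTS: readonly string[] = [
  "chatgpt.com",
  "claude.ai",
  "gemini.google.com",
  "grok.com",
  "www.perplexity.ai",
  "chat.deepseek.com",
];

// Chrome-style match patterns derived from the host list, e.g. "https://claude.ai/*".
const MATCH_PATTERNS: string[] = SUPPORTED_HOSTS.map((host) => `https://${host}/*`);

// Returns true when the URL is an https page on one of the supported hosts.
function isSupportedPage(url: string): boolean {
  let parsed: URL;
  try {
    parsed = new URL(url);
  } catch {
    return false; // not a valid absolute URL
  }
  return parsed.protocol === "https:" && SUPPORTED_HOSTS.includes(parsed.hostname);
}
```

Keeping the list in one place means the content script declaration and any runtime check cannot drift apart.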

When you send a message and the AI responds, the observer reads the new turns from the rendered conversation — exactly as they appear to you. No network requests are intercepted, no API tokens are read, and the page is not modified beyond a small Recall button and a context preview UI.

A debounced job picks up the new turns and forwards them to the extraction worker.
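
The debouncing step can be sketched as a small batcher that collects turns and hands them over once the page has been quiet for a moment. This is a minimal sketch under stated assumptions: the `Turn` shape, the `TurnBatcher` class, and the 500 ms default are all hypothetical, and the DOM observer that would call `add` is not shown.

```typescript
// One rendered message as an observer might report it (hypothetical shape).
interface Turn {
  role: "user" | "assistant";
  text: string;
}

// Collects turns and delivers them as one batch after a quiet period.
class TurnBatcher {
  private pending: Turn[] = [];
  private timer: ReturnType<typeof setTimeout> | undefined = undefined;
  private readonly onBatch: (turns: Turn[]) => void;
  private readonly quietMs: number;

  // quietMs defaults to 500 ms, an assumed value rather than a documented one.
  constructor(onBatch: (turns: Turn[]) => void, quietMs: number = 500) {
    this.onBatch = onBatch;
    this.quietMs = quietMs;
  }

  // Queue a turn and restart the quiet-period timer.
  add(turn: Turn): void {
    this.pending.push(turn);
    if (this.timer !== undefined) clearTimeout(this.timer);
    this.timer = setTimeout(() => this.flush(), this.quietMs);
  }

  // Deliver everything queued so far; does nothing when the queue is empty.
  flush(): void {
    if (this.timer !== undefined) clearTimeout(this.timer);
    this.timer = undefined;
    if (this.pending.length === 0) return;
    const batch = this.pending;
    this.pending = [];
    this.onBatch(batch);
  }
}
```

Restarting the timer on every `add` means a streamed response that updates many times still produces a single batch once it settles.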

02

Extract: distilling raw chat into structured memories

Raw chat logs are noisy and rarely useful as long-term memory. Rethread runs each new conversation chunk through a lightweight extractor that produces structured "memories" of four types:

  • Fact — durable information about you ("I'm a backend engineer using Go and PostgreSQL")
  • Preference — stylistic / behavioral choices ("I prefer functional over OOP")
  • Decision — outcomes you've committed to ("We chose Cloudflare Workers for the v2 backend")
  • Context — current situational state ("My project uses Next.js 14 with App Router")

Each memory is timestamped, tagged with its source platform and conversation, and assigned a confidence score. You can edit any memory afterwards, change its type, add tags, or move it to a different bucket.
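
To make the description above concrete, here is one way such a record could be modeled. The four type names come from the list above; every field name, the 0 to 1 confidence range, and the `createMemory` and `editMemory` helpers are assumptions made for illustration, not Rethread's actual schema.

```typescript
type MemoryType = "fact" | "preference" | "decision" | "context";

// Hypothetical record layout; field names are illustrative only.
interface Memory {
  id: string;
  type: MemoryType;
  text: string;
  createdAt: number;      // Unix epoch milliseconds
  sourcePlatform: string; // e.g. "claude.ai"
  conversationId: string;
  confidence: number;     // assumed to lie in the range 0..1
  tags: string[];
  bucket?: string;
}

let nextId = 0;

// Builds a memory, rejecting confidence values outside the assumed range.
function createMemory(input: Omit<Memory, "id" | "tags"> & { tags?: string[] }): Memory {
  if (!(input.confidence >= 0 && input.confidence <= 1)) {
    throw new RangeError("confidence must be between 0 and 1");
  }
  nextId += 1;
  return { ...input, id: `mem-${nextId}`, tags: input.tags ?? [] };
}

// Editing returns a new record and leaves the original untouched.
function editMemory(memory: Memory, changes: Partial<Omit<Memory, "id">>): Memory {
  return { ...memory, ...changes };
}
```

For example, `editMemory(m, { type: "context", bucket: "Work" })` would cover the "change its type" and "move it to a different bucket" edits in a single call.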

Local extraction. By default, extraction runs on-device: your conversations never leave your browser for the purpose of building memories.

03

Organize: buckets, folders, tags, snapshots

A flat memory list works for the first 50 entries. After that you need structure. Rethread gives you several layers:

  • Buckets — color-coded top-level groups (e.g. "Work", "Side project", "Personal")
  • Folders — nested groupings inside buckets
  • Tags — free-form labels for cross-cutting concerns ("typescript", "client-feedback")
  • Snapshots — immutable copies of any conversation at a point in time, with word-level diff against later snapshots or the live conversation
  • Time range filters — slice by "last 7 days", "this month", custom range
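
The word-level diff mentioned for Snapshots can be illustrated with a classic longest-common-subsequence comparison. This is a generic textbook technique shown as a sketch; the source does not say which diff algorithm Rethread uses, and `diffWords` and `DiffOp` are hypothetical names.

```typescript
type DiffOp = { kind: "same" | "added" | "removed"; word: string };

// Word-level diff based on the longest common subsequence (LCS) of the two word lists.
function diffWords(before: string, after: string): DiffOp[] {
  const a = before.split(/\s+/).filter(Boolean);
  const b = after.split(/\s+/).filter(Boolean);

  // lcs[i][j] = LCS length of a[i..] and b[j..]; the extra row and column stay 0.
  const lcs: number[][] = Array.from({ length: a.length + 1 }, () =>
    new Array<number>(b.length + 1).fill(0),
  );
  for (let i = a.length - 1; i >= 0; i--) {
    for (let j = b.length - 1; j >= 0; j--) {
      lcs[i][j] =
        a[i] === b[j] ? lcs[i + 1][j + 1] + 1 : Math.max(lcs[i + 1][j], lcs[i][j + 1]);
    }
  }

  // Walk both lists, following the table to decide between keep, remove, and add.
  const ops: DiffOp[] = [];
  let i = 0;
  let j = 0;
  while (i < a.length && j < b.length) {
    if (a[i] === b[j]) {
      ops.push({ kind: "same", word: a[i] });
      i++;
      j++;
    } else if (lcs[i + 1][j] >= lcs[i][j + 1]) {
      ops.push({ kind: "removed", word: a[i] });
      i++;
    } else {
      ops.push({ kind: "added", word: b[j] });
      j++;
    }
  }
  while (i < a.length) ops.push({ kind: "removed", word: a[i++] });
  while (j < b.length) ops.push({ kind: "added", word: b[j++] });
  return ops;
}
```

Comparing "use Next.js 13 with Pages Router" against "use Next.js 14 with App Router" marks `13` and `Pages` as removed, `14` and `App` as added, and the remaining words as unchanged. The table costs memory proportional to the product of the two word counts, so very long snapshots would need a more economical algorithm.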

Bulk operations let you multi-select memories or whole conversations and re-bucket, re-tag, export, or delete them in one sweep.
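
A time-range filter and a bulk re-bucket and re-tag operation could look roughly like this. The `StoredMemory` shape and both function names are assumptions made for the example, and the real store is IndexedDB rather than an in-memory array.

```typescript
// Minimal record for this example (hypothetical field names).
interface StoredMemory {
  id: string;
  bucket: string;
  tags: string[];
  createdAt: number; // Unix epoch milliseconds
}

const DAY_MS = 24 * 60 * 60 * 1000;

// Keep memories created within the last `days` days, measured back from `now`.
function withinLastDays(
  memories: StoredMemory[],
  days: number,
  now: number = Date.now(),
): StoredMemory[] {
  const cutoff = now - days * DAY_MS;
  return memories.filter((m) => m.createdAt >= cutoff && m.createdAt <= now);
}

// Move every selected memory to `bucket` and add `tag`; unselected records pass through.
function bulkUpdate(
  memories: StoredMemory[],
  selectedIds: Set<string>,
  bucket: string,
  tag: string,
): StoredMemory[] {
  return memories.map((m) =>
    selectedIds.has(m.id)
      ? { ...m, bucket, tags: m.tags.includes(tag) ? m.tags : [...m.tags, tag] }
      : m,
  );
}
```

Returning new records instead of mutating in place keeps an undo step simple, because the previous array is still intact.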

04

Recall: Selective Recall and context injection

The most-used surface in Rethread is Selective Recall. Press Alt+Shift+R on any of the six AI platforms (or click the floating Recall button) and a panel opens with:

  • A search box that ranks memories by relevance to your current page / current draft
  • Filters by bucket, tag, type, time range, source platform
  • Per-memory toggles for inclusion + per-memory edit-in-place
  • A live token estimate that updates as you check / uncheck memories — so you never blow your context window
  • A choice between Summary mode (compact bullet list) and Detailed mode (full memory bodies)
  • Per-memory feedback (helpful / hide) to teach Rethread which memories are signal vs. noise
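
The relevance ranking and the live token estimate from the list above can be approximated with very little code. Both parts of this sketch are stand-ins: the four-characters-per-token figure is a common rule of thumb for English text rather than a real tokenizer, the word-overlap score is a deliberately simple ranking, and none of the names come from Rethread itself.

```typescript
interface RecallItem {
  id: string;
  text: string;
  selected: boolean;
}

// Rough heuristic: about four characters per token for English text.
// Real tokenizers differ by model, so treat the result as an estimate only.
function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4);
}

// Sum the estimate over the memories currently checked in the panel.
function selectedTokenTotal(items: RecallItem[]): number {
  return items
    .filter((item) => item.selected)
    .reduce((sum, item) => sum + estimateTokens(item.text), 0);
}

// Lowercased alphanumeric words of a string.
function wordsOf(text: string): string[] {
  return text.toLowerCase().match(/[a-z0-9]+/g) ?? [];
}

// Score = fraction of distinct query words that also appear in the memory text.
function relevance(query: string, memoryText: string): number {
  const queryWords = new Set(wordsOf(query));
  if (queryWords.size === 0) return 0;
  const memoryWords = new Set(wordsOf(memoryText));
  let hits = 0;
  queryWords.forEach((word) => {
    if (memoryWords.has(word)) hits += 1;
  });
  return hits / queryWords.size;
}

// Highest-scoring memories first; ties keep their original order.
function rankByRelevance(query: string, items: RecallItem[]): RecallItem[] {
  return [...items].sort((x, y) => relevance(query, y.text) - relevance(query, x.text));
}
```

Recomputing `selectedTokenTotal` on every toggle is cheap, since it only touches the checked items.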

Click Inject and Rethread drops a clean context block into the AI's prompt area. From the AI's perspective, this is just regular text — no hidden hacks.
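
Since the injected context is plain text, building it is ordinary string formatting. The sketch below shows how Summary and Detailed modes might differ; the header wording, the `title` and `body` fields, and `buildContextBlock` are assumptions, and the platform-specific step of placing the text in the prompt area is not shown.

```typescript
type RecallMode = "summary" | "detailed";

// Hypothetical shape: a short one-line title plus the full memory body.
interface InjectableMemory {
  type: string;
  title: string;
  body: string;
}

// Formats the selected memories as a plain-text block for the prompt area.
function buildContextBlock(memories: InjectableMemory[], mode: RecallMode): string {
  if (memories.length === 0) return "";
  const lines = memories.map((m) =>
    mode === "summary"
      ? `- ${m.title}`                          // compact bullet list
      : `- [${m.type}] ${m.title}\n  ${m.body}`, // full memory bodies
  );
  return ["Context from earlier conversations:", ...lines].join("\n");
}
```

Because the output is an ordinary string, the same block can be previewed, edited, or copied before it is injected.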

Optional: zero-knowledge cloud sync

Pro users can layer end-to-end encrypted sync on top of the local pipeline.

Read the full deep dive on the encrypted sync architecture →

What Rethread does not do

Performance considerations

Rethread is engineered to feel invisible. The DOM observer uses a debounced, batched listener; extraction runs in a background worker; the IndexedDB store is indexed on the fields you actually query (text, type, tags, buckets, timestamp); and the Recall UI uses virtualized lists for libraries with thousands of memories.
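
List virtualization, mentioned above for the Recall UI, comes down to rendering only the rows near the viewport. The function below is a generic sketch of that calculation for fixed-height rows; the name `visibleRange`, the fixed row height, and the overscan default of 3 are assumptions and say nothing about how Rethread's UI is actually built.

```typescript
interface VisibleRange {
  start: number; // index of the first row to render (inclusive)
  end: number;   // index one past the last row to render (exclusive)
}

// Computes which fixed-height rows intersect the viewport, plus a few extra
// rows above and below (overscan) so that fast scrolling does not show gaps.
function visibleRange(
  scrollTop: number,
  viewportHeight: number,
  rowHeight: number,
  rowCount: number,
  overscan: number = 3,
): VisibleRange {
  if (rowHeight <= 0 || rowCount <= 0) return { start: 0, end: 0 };
  const first = Math.floor(scrollTop / rowHeight);
  const last = Math.ceil((scrollTop + viewportHeight) / rowHeight);
  return {
    start: Math.max(0, first - overscan),
    end: Math.min(rowCount, last + overscan),
  };
}
```

With 5,000 memories, 40-pixel rows, and a 600-pixel panel, this renders about 20 rows at any scroll position instead of all 5,000.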

Even on a $200 Chromebook with a 5-year-old battery, Rethread adds only a fraction of a second to page interactions, a delay you would not notice in a blind test.

See it for yourself.

Free to install, no account required, works in 60 seconds.
