Future
Everything below is designed for but not yet built. The architecture is already pluggable — these are the natural extensions.
Cloud sync backend
The WyrenSync interface has two methods: pull() and push(). GitSync is the default. A CloudSync implementation would slot in with zero changes to hooks.
Candidate backends:
| Backend | Pros | Cons |
|---|---|---|
| Cloudflare Worker + KV | 15-min deploy, global edge, free tier. | Shared secret distribution. |
| Supabase | Auth + realtime + postgres in one. | Overkill for just memory sync. |
| Firebase Realtime DB | Dead simple, free tier. | Locked to Google. |
| S3 + conditional writes | Cheap, durable, simple. | No realtime, need polling. |
Preferred: Cloudflare Worker + KV. Simple, fast, cheap.
Interface:

    class CloudSync extends WyrenSync {
      constructor({ endpoint, teamId, secret }) { super(); /* ... */ }
      async pull() { /* GET /memory/<teamId> */ }
      async push(memory) { /* PUT /memory/<teamId> with If-Match ETag */ }
    }

ETag conditional writes give optimistic concurrency control without app-level locks.
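To make the If-Match idea concrete, here is a minimal sketch of the read-modify-write loop a CloudSync push might use. FakeMemoryStore and pushWithRetry are hypothetical names, not part of Wyren; the store is an in-memory stand-in for the remote endpoint, and the sketch assumes the real backend can enforce the ETag check atomically.

```javascript
// Hypothetical in-memory stand-in for the remote store.
// It is NOT part of Wyren; it only mimics If-Match semantics.
class FakeMemoryStore {
  constructor() {
    this.body = "";
    this.version = 0;
  }
  get etag() {
    return `"v${this.version}"`;
  }
  // Stands in for GET /memory/<teamId>.
  get() {
    return { status: 200, etag: this.etag, body: this.body };
  }
  // Stands in for PUT /memory/<teamId> with If-Match: the write only lands
  // if the caller's ETag still matches the stored version.
  put(body, ifMatch) {
    if (ifMatch !== this.etag) return { status: 412, etag: this.etag };
    this.body = body;
    this.version += 1;
    return { status: 200, etag: this.etag };
  }
}

// Read-modify-write loop: on 412 Precondition Failed, re-pull and retry.
function pushWithRetry(store, merge, maxAttempts = 3) {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const current = store.get();
    const result = store.put(merge(current.body), current.etag);
    if (result.status === 200) return { ok: true, attempts: attempt };
  }
  return { ok: false, attempts: maxAttempts };
}
```

A writer holding a stale ETag gets a 412 instead of silently overwriting a teammate's update, which is the property that makes app-level locks unnecessary.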
MCP server for on-demand transcript RAG
Current Wyren injects distilled memory. A separate MCP server could expose tools for on-demand retrieval:

    wyren_search(query: string) -> top-K transcript chunks from teammates
    wyren_history(topic: string) -> timeline of decisions on a topic
    wyren_who_decided(question: string) -> session/turn citation

Implementation: index teammate transcripts into a local embedding store (SQLite + sqlite-vec), expose via MCP. Claude calls the tool when it needs deep context: on-demand, not always-on.
Complements (doesn’t replace) the SessionStart injection path.
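As a rough illustration of what wyren_search could do internally, the sketch below ranks chunks by cosine similarity and returns the top K. The function names, the chunk shape, and the 3-dimensional toy embeddings are all assumptions; a real version would get vectors from an embedding model and query sqlite-vec rather than scanning an array.

```javascript
// Cosine similarity between two equal-length numeric vectors.
function cosine(a, b) {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return na === 0 || nb === 0 ? 0 : dot / Math.sqrt(na * nb);
}

// Hypothetical core of wyren_search: score every indexed chunk against the
// query embedding, sort by descending score, keep the top K.
function searchChunks(index, queryEmbedding, k = 3) {
  return index
    .map((chunk) => ({ ...chunk, score: cosine(chunk.embedding, queryEmbedding) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k);
}

// Toy index: embeddings are made-up vectors, not real model output.
const toyIndex = [
  { session: "7a2e", text: "Picked SQLite over Postgres", embedding: [1, 0, 0] },
  { session: "3f1b", text: "Moved to Postgres", embedding: [0.9, 0.1, 0] },
  { session: "c41d", text: "Renamed the auth middleware", embedding: [0, 0, 1] },
];
```

Calling searchChunks(toyIndex, [1, 0, 0], 2) returns the two database-related chunks and leaves the unrelated one out.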
Permissions & multi-team
Current Wyren assumes a single trust model: everyone is on the same team. A real product needs:
- Per-user memory visibility. “Alice can see what Bob decided on /auth/* but not on private sessions.”
- Team boundaries. Memory is scoped to a team, not a repo.
- Audit log. Who saw what memory, when.
All of this requires a server (or at least a signed-identity layer over git). Out of scope for the current release.
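The per-user visibility rule could look something like the filter below. This is only a sketch: the entry shape ({ author, path, private, text }) and the rule shape ({ viewer, author, pathPrefix }) are invented for illustration, and real enforcement would have to happen server-side rather than in client code.

```javascript
// Hypothetical entry shape: { author, path, private, text }.
// Hypothetical rule shape:  { viewer, author, pathPrefix }.
function visibleEntries(entries, rules, viewer) {
  return entries.filter((e) => {
    if (e.author === viewer) return true; // own entries are always visible
    if (e.private) return false;          // private sessions never leak
    // Otherwise an explicit grant for this viewer/author/path is required.
    return rules.some(
      (r) => r.viewer === viewer && r.author === e.author && e.path.startsWith(r.pathPrefix)
    );
  });
}
```

With a single rule granting Alice access to Bob's entries under /auth/, Alice sees Bob's non-private /auth/ decisions and nothing else of his.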
Dashboard / web UI
A browser view of:
- Current memory per repo.
- Recent distillations per teammate.
- Decisions timeline.
- Skills broadcast history.
- Memory diff view (before/after each distill).
Not a core value driver — memory.md in a text editor is already fine — but great for stakeholder pitches and async teams.
Most hackable version: a tiny Astro site that reads .wyren/* and renders. GitHub Pages is enough.
Cursor / Windsurf / other editors
Wyren is Claude Code-specific today (uses Claude Code hooks). Generalizing requires:
- A “transcript watcher” daemon that understands each editor’s log format.
- A unified memory format (already markdown — done).
- A shared sync layer (already pluggable — done).
The plumbing is there. It’s a matter of writing adapters.
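One way to picture the adapter layer is a small registry that turns each editor's log into a common list of turns. The contract below (parse(rawLog) returning { role, text } objects) is an assumption, and both log formats are invented for the example; neither is claimed to match any real editor.

```javascript
// Hypothetical adapter contract: parse(rawLog) -> [{ role, text }].
// Neither format below is a real editor's log format; both are invented.
const adapters = {
  // Invented format 1: JSON Lines with { role, content }.
  jsonl: (raw) =>
    raw
      .split("\n")
      .filter((line) => line.trim() !== "")
      .map((line) => JSON.parse(line))
      .map(({ role, content }) => ({ role, text: content })),
  // Invented format 2: plain text lines like "user: hello".
  prefixed: (raw) =>
    raw
      .split("\n")
      .filter((line) => line.includes(": "))
      .map((line) => {
        const i = line.indexOf(": ");
        return { role: line.slice(0, i), text: line.slice(i + 2) };
      }),
};

// The watcher picks an adapter by name and hands normalized turns downstream.
function normalize(format, raw) {
  const parse = adapters[format];
  if (!parse) throw new Error(`no adapter for format: ${format}`);
  return parse(raw);
}
```

Because everything downstream only sees normalized turns, supporting another editor would mean adding one more entry to the registry.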
Self-hosted / offline distillation
If a team doesn’t want any data to leave their network:
- Tier 0 already runs locally.
- Tier 1 could swap Haiku for a local model (Llama 3.1 8B via Ollama, Qwen 2.5, Gemma 3n).
- Tier 2 could call a self-hosted Claude via an ANTHROPIC_BASE_URL override.
Quality drops with small local models, but the pipeline is identical.
Temporal memory / decision timeline
Current memory is a snapshot: latest state only. A future version could keep a timeline:

    ## Timeline of decisions

    ### 2026-04-21 10:32
    - Picked SQLite over Postgres [session 7a2e]

    ### 2026-04-21 14:45
    - Reversed: moved to Postgres (hit SQLite concurrency limits in /api/bulk) [session 3f1b]

Useful for retrospectives, post-mortems, demo narratives. Implementation: append-only log alongside the snapshot file.
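A minimal sketch of the append-only log, assuming each decision is a small record with at, text, and session fields (illustrative names, not an existing Wyren schema). Appending never mutates earlier records, and the timeline markdown is derived from the log on demand.

```javascript
// Append-only decision log: returns a new array and never edits old records.
// Field names (at, text, session) are illustrative assumptions.
function appendDecision(log, entry) {
  return [...log, Object.freeze({ ...entry })];
}

// Render the log as a markdown timeline: one heading per timestamp,
// one bullet per decision, with the session ID as a citation.
function renderTimeline(log) {
  const lines = ["## Timeline of decisions", ""];
  for (const { at, text, session } of log) {
    lines.push(`### ${at}`, `- ${text} [session ${session}]`, "");
  }
  return lines.join("\n").trimEnd() + "\n";
}
```

Keeping the log as the source of truth means the snapshot file and the timeline can both be regenerated from it.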
Persistent memory graph
Current Wyren culls. When memory.md fills up, the distiller drops the least load-bearing entries to stay under the 60-line cap. Old-but-valid decisions fall out.
The long-term direction: replace culling with rendering. memory.md becomes a viewport into a persistent knowledge graph. Nothing is ever discarded — entries outside the current view remain in the graph, connected and queryable.
What changes:
- The 60-line cap becomes a rendering limit, not a storage limit.
- Decisions persist through connections, not manual retention. A decision made in session 12 resurfaces in session 47 when relevant context reappears — without anyone explicitly keeping it alive.
- Reasoning is queryable: not just what is known, but why, what it impacts, and what’s been tried.
- Cross-time, cross-domain connectivity: a rejected approach from three months ago resurfaces when the same problem recurs, regardless of who originally tried it.
Mental model shift:
| Stage | Mental model |
|---|---|
| Before | Memory = what survived culling |
| After | Memory = everything, selectively rendered by relevance |
Backwards compatibility: memory.md stays as the rendered output format. The graph is the backing store. v1 consumers see no change.
Implementation sketch: a local graph DB (SQLite + edges table) stores every distilled entry with session ID, timestamp, and source turns. On session start, a relevance query against current cwd and recent transcript signal determines the rendered slice. The distiller writes to the graph first, then renders memory.md from it.
Not yet designed in detail. Building this requires settling on a storage layer and query model before touching the distiller.
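Since the design is still open, the following is only a toy model of "render, don't cull". MemoryGraph is a hypothetical in-memory stand-in for the SQLite entries and edges tables, and the relevance score (does the entry touch a path under cwd?) is a placeholder assumption for whatever query model is eventually chosen.

```javascript
// Minimal in-memory stand-in for the graph store. Plain collections keep
// this sketch dependency-free; a real version would use SQLite.
class MemoryGraph {
  constructor() {
    this.entries = new Map(); // id -> { id, text, paths, timestamp }
    this.edges = [];          // { from, to, kind }
  }
  addEntry(entry) {
    this.entries.set(entry.id, entry);
  }
  link(from, to, kind) {
    this.edges.push({ from, to, kind });
  }
  // Placeholder relevance signal: 1 if any path the entry touches is under cwd.
  score(entry, cwd) {
    return entry.paths.some((p) => p.startsWith(cwd)) ? 1 : 0;
  }
  // Render a slice: rank by relevance, then recency, and cap the line count.
  // Nothing is deleted; unrendered entries stay in this.entries.
  render(cwd, maxLines = 60) {
    return [...this.entries.values()]
      .sort((a, b) => this.score(b, cwd) - this.score(a, cwd) || b.timestamp - a.timestamp)
      .slice(0, maxLines)
      .map((e) => `- ${e.text}`)
      .join("\n");
  }
}
```

The cap only limits what render() returns, so an entry that misses the cut today can reappear later when the working directory or ranking changes.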
Per-module memory
For larger codebases, split memory by directory:

    .wyren/
    ├── memory.md          # cross-cutting
    ├── memory/
    │   ├── auth.md        # scoped to /src/auth
    │   ├── api.md         # scoped to /src/api
    │   └── frontend.md    # scoped to /src/frontend

The distiller routes entries to the right module based on file paths mentioned in the transcript. Injection picks the relevant ones based on cwd at session start.
Out of scope for now — most projects don’t need it early. Obvious addition as the codebase scales.
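The routing and injection rules could be sketched as two small functions. Everything here is an assumption for illustration: the MODULES table, the function names, the choice to send multi-module entries to the cross-cutting file, and the convention that cwd is expressed as a repo-relative path such as /src/api/bulk.

```javascript
// Hypothetical module map mirroring the directory layout described above.
const MODULES = [
  { file: "memory/auth.md", scope: "/src/auth" },
  { file: "memory/api.md", scope: "/src/api" },
  { file: "memory/frontend.md", scope: "/src/frontend" },
];

// True when path is the scope directory itself or something inside it.
function inScope(path, scope) {
  return path === scope || path.startsWith(scope + "/");
}

// Distiller side: pick a target file from the paths an entry mentions.
// Entries touching zero or several modules fall back to the cross-cutting file.
function routeEntry(mentionedPaths) {
  const hits = MODULES.filter((m) => mentionedPaths.some((p) => inScope(p, m.scope)));
  return hits.length === 1 ? hits[0].file : "memory.md";
}

// Injection side: always load memory.md, plus any module whose scope contains cwd.
function filesForCwd(cwd) {
  const scoped = MODULES.filter((m) => inScope(cwd, m.scope)).map((m) => m.file);
  return ["memory.md", ...scoped];
}
```

Matching on scope + "/" rather than a bare prefix avoids treating a sibling directory like /src/authz as part of /src/auth.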
Evaluation harness
A regression test suite for the distiller prompt:
- 10-20 real transcripts annotated with “what a good memory would contain.”
- CI runs distiller on each, diffs against gold memory, reports drift.
- Gates prompt changes on quality metrics.
Prevents prompt regressions when iterating.
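A bare-bones version of the diff-and-gate step might look like this. The metric is an assumption chosen for simplicity: exact-match recall of gold bullet lines, with a hypothetical 0.8 threshold. Exact matching is crude, so a real harness would likely need fuzzy matching or model-based grading.

```javascript
// Normalize a memory file into a set of trimmed, lowercased bullet lines.
function bulletSet(markdown) {
  return new Set(
    markdown
      .split("\n")
      .map((l) => l.trim().toLowerCase())
      .filter((l) => l.startsWith("- "))
  );
}

// Hypothetical drift metric: fraction of gold bullets present in the output.
function recall(goldMd, actualMd) {
  const gold = bulletSet(goldMd);
  const actual = bulletSet(actualMd);
  if (gold.size === 0) return 1;
  let hit = 0;
  for (const line of gold) if (actual.has(line)) hit++;
  return hit / gold.size;
}

// Gate: the run fails when any case drops below the threshold.
function gate(cases, threshold = 0.8) {
  const failures = cases.filter((c) => recall(c.gold, c.actual) < threshold);
  return { passed: failures.length === 0, failures: failures.map((c) => c.name) };
}
```

In CI, each case would pair an annotated gold memory with the distiller's fresh output for the same transcript, and a failed gate would block the prompt change.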
Integration with existing tools
- Linear / Jira: a wyren_sync_to_linear tool that pushes open questions into tickets.
- Slack: a bot posts memory diffs to a team channel.
- GitHub Actions: .wyren/memory.md gets rendered into PR descriptions automatically.
All low-hanging fruit once the core memory loop works.