Deep Dive: How OpenClaw's Memory System Works
A comprehensive look at OpenClaw's file-first memory system, exploring its hybrid search architecture, automatic memory flush, and implementation details.
Introduction
OpenClaw is an open-source AI agent framework that stands out for its sophisticated memory system. Unlike traditional RAG (Retrieval-Augmented Generation) systems that rely on vector databases, OpenClaw takes a file-first approach: Markdown files are the source of truth, and the memory system is designed to help AI agents remember context across conversations.
In this deep dive, we'll explore how OpenClaw's memory system works under the hood, examining its architecture, implementation details, and unique innovations that make it production-ready.
Architecture Overview
OpenClaw implements a file-based, Markdown-driven memory system with semantic search capabilities. The core philosophy is simple yet powerful: files are the source of truth — the AI agent only retains what gets written to disk.

Key Components
Markdown Storage Layer: Plain text files in the workspace directory
Vector Search Engine: SQLite-based with hybrid (BM25 + vector) retrieval
Embedding Providers: Auto-selection between local/OpenAI/Gemini
Automatic Memory Flush: Pre-compaction trigger to persist context
Memory Types & Storage Structure
OpenClaw uses a two-tier memory design to balance short-term context with long-term knowledge:
1. Ephemeral Memory (Daily Logs)
Location: memory/YYYY-MM-DD.md
Daily logs are append-only files that capture day-to-day activities, decisions, and context. The system automatically:
Creates a new file each day
Loads today's and yesterday's logs at session start
Provides a running context window for recent work
Evidence from memory.md:
"Daily log (append-only). Read today + yesterday at session start."
2. Durable Memory (Curated Knowledge)
Location: MEMORY.md
This is the curated long-term memory file containing:
Important decisions and preferences
Project conventions and patterns
Long-term todos and goals
Critical facts that should persist
Important: MEMORY.md is only loaded in private sessions, never in group contexts, to protect sensitive information.
3. Session Memory
Location: sessions/YYYY-MM-DD-<slug>.md
When starting a new session, OpenClaw can automatically save the previous conversation to a timestamped file with a descriptive slug (generated by LLM). These session transcripts are indexed and searchable, allowing agents to recall past conversations.

Core Implementation: MemoryIndexManager
The central class managing all memory operations is MemoryIndexManager (manager.ts:119-232).
Key responsibilities:
Singleton pattern with caching: Prevents duplicate indexes (INDEX_CACHE)
Per-agent isolation: Separate SQLite stores via agentId
File watching: Debounced sync on file changes
Provider fallback chain: Graceful degradation across embedding providers
Session integration: Tracks and indexes conversation transcripts
Markdown Chunking Algorithm
One of the critical aspects of any memory system is how content is chunked before embedding. OpenClaw uses a sophisticated sliding window algorithm with overlap preservation.
Algorithm Details
Source: internal.ts:144-215
Characteristics:
Target: ~400 tokens per chunk (~1600 chars approximation)
Overlap: 80 tokens (~320 chars) between consecutive chunks
Line-aware: Preserves line boundaries with line numbers
Hash-based deduplication: Each chunk gets SHA-256 hash for cache lookup
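The behavior above can be sketched as a line-aware sliding window. This is a minimal illustration assuming the characteristics listed here (size target, overlap, line tracking, SHA-256 hashing); the names and exact constants are not OpenClaw's actual implementation in internal.ts.

```typescript
import { createHash } from "node:crypto";

const TARGET_CHARS = 1600; // ~400 tokens at ~4 chars/token
const OVERLAP_CHARS = 320; // ~80 tokens of trailing context

interface Chunk {
  text: string;
  startLine: number; // 1-based, inclusive
  endLine: number;
  hash: string; // SHA-256 of the chunk text, used as the cache key
}

function chunkMarkdown(content: string): Chunk[] {
  const lines = content.split("\n");
  const chunks: Chunk[] = [];
  let start = 0;
  while (start < lines.length) {
    // Grow the window line by line until we reach the size target.
    let end = start;
    let size = 0;
    while (end < lines.length && size < TARGET_CHARS) {
      size += lines[end].length + 1;
      end++;
    }
    const text = lines.slice(start, end).join("\n");
    chunks.push({
      text,
      startLine: start + 1,
      endLine: end,
      hash: createHash("sha256").update(text).digest("hex"),
    });
    if (end >= lines.length) break;
    // Step back far enough to carry ~OVERLAP_CHARS into the next chunk.
    let back = end;
    let overlap = 0;
    while (back > start + 1 && overlap < OVERLAP_CHARS) {
      back--;
      overlap += lines[back].length + 1;
    }
    start = back;
  }
  return chunks;
}
```

Because chunks never split a line, each one carries an exact (startLine, endLine) range for source attribution.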

Why This Approach?
Overlap prevents context loss: Related information at chunk boundaries stays connected
Line numbers: Enable precise source attribution (path + line range)
Token approximation: 4 chars ≈ 1 token is reasonable for English text
Hash stability: Same content → same hash → cache hit → no re-embedding
Hybrid Search: BM25 + Vector
OpenClaw doesn't rely solely on vector similarity. Instead, it uses weighted score fusion combining two complementary retrieval methods:
1. Vector Search (Semantic Similarity)
Great for conceptual matches:
"gateway host" ≈ "machine running gateway"
"authentication flow" ≈ "login process"
Uses cosine similarity with embeddings stored in SQLite via sqlite-vec extension.
2. BM25 Search (Lexical Matching)
Excellent for exact tokens:
Error codes: ERR_CONNECTION_REFUSED
Function names: handleUserAuth()
IDs and unique identifiers
Uses SQLite's FTS5 (Full-Text Search) virtual tables.
Hybrid Merge Algorithm
Source: hybrid.ts:39-111
Default weights: 70% vector + 30% text
BM25 score normalization (hybrid.ts:34-37):
This converts BM25 rank (lower is better) to a score in [0, 1] range for fusion.
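The fusion step can be sketched as follows. The 70/30 weighting is the default quoted above; the specific normalization formula (a reciprocal-rank-style mapping) and function shapes are assumptions, not a copy of hybrid.ts.

```typescript
interface VectorHit { id: string; score: number } // cosine similarity
interface TextHit { id: string; rank: number }    // BM25 rank, lower is better

// Convert a rank-like value (lower is better) into a score in [0, 1].
function normalizeBm25(rank: number): number {
  return 1 / (1 + Math.max(0, rank));
}

function hybridMerge(vector: VectorHit[], text: TextHit[], vectorWeight = 0.7) {
  const merged = new Map<string, number>();
  for (const v of vector) {
    merged.set(v.id, (merged.get(v.id) ?? 0) + vectorWeight * v.score);
  }
  for (const t of text) {
    merged.set(t.id, (merged.get(t.id) ?? 0) + (1 - vectorWeight) * normalizeBm25(t.rank));
  }
  return [...merged.entries()]
    .map(([id, score]) => ({ id, score }))
    .sort((a, b) => b.score - a.score);
}
```

A document that appears in both result sets accumulates contributions from both channels, so it ranks above one that matched only lexically or only semantically with a similar score.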

Embedding Provider System
OpenClaw supports three embedding providers with intelligent auto-selection:
Auto-Selection Chain
Source: embeddings.ts:135-167
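The selection order (local, then OpenAI, then Gemini) can be sketched as a simple probe chain. The provider names match the article; the probe functions and error handling below are illustrative assumptions, not the embeddings.ts code.

```typescript
type ProviderName = "local" | "openai" | "gemini";

interface ProviderProbe {
  name: ProviderName;
  available: () => boolean; // e.g. model file on disk, API key present
}

function selectProvider(probes: ProviderProbe[]): ProviderName {
  for (const p of probes) {
    try {
      if (p.available()) return p.name;
    } catch {
      // A failing probe must not break the chain; try the next provider.
    }
  }
  throw new Error("no embedding provider available");
}

// Example: the local model is unavailable, so the chain falls through.
const chosen = selectProvider([
  { name: "local", available: () => false }, // e.g. model file missing
  { name: "openai", available: () => true }, // e.g. OPENAI_API_KEY set
  { name: "gemini", available: () => true },
]);
```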
Provider Implementations
1. Local Provider
Source: embeddings.ts:65-111
Uses node-llama-cpp for local inference
Default model: hf:ggml-org/embeddinggemma-300M-GGUF/embeddinggemma-300M-Q8_0.gguf (~600MB)
Auto-downloads missing models
Requires: pnpm approve-builds (native compilation)
Pros: Privacy, no API costs, offline operation
Cons: Requires ~1GB of disk space, slower than cloud APIs
2. OpenAI Provider
Source: embeddings-openai.ts
Default model: text-embedding-3-small (1536 dimensions)
Supports the Batch API for bulk indexing (50% cost reduction)
Fast and reliable
3. Gemini Provider
Source: embeddings-gemini.ts
Default model: gemini-embedding-001 (768 dimensions)
Async batch endpoint support
Free tier available
Batch Embedding Optimization
For large memory files, embedding every chunk individually would be expensive and slow. OpenClaw implements batch processing with caching to optimize this.
Cache-First Strategy
Source: manager.ts:1769-1848
Batch Features
SHA-256 hash-based deduplication: Same content → same embedding (cache hit)
OpenAI Batch API: 50% cost reduction compared to sync API
Gemini async batches: Similar cost savings
Failure tolerance: Auto-disable after 2 failures, fallback to sync
Concurrency: Default 2 parallel batch jobs
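The cache-first strategy can be sketched like this: hash each chunk, look the hash up, and only send the misses to the batch API. The cache here is an in-memory Map for illustration; as described above, OpenClaw persists its cache in SQLite, and the function shape is an assumption rather than the manager.ts code.

```typescript
import { createHash } from "node:crypto";

type Embedding = number[];
const cache = new Map<string, Embedding>();

const sha256 = (text: string) =>
  createHash("sha256").update(text).digest("hex");

async function embedBatch(
  texts: string[],
  embedMisses: (texts: string[]) => Promise<Embedding[]>, // batch API call
): Promise<Embedding[]> {
  const hashes = texts.map(sha256);
  // Collect the indices whose hash is not cached yet.
  const missIdx = hashes
    .map((h, i) => (cache.has(h) ? -1 : i))
    .filter((i) => i >= 0);
  if (missIdx.length > 0) {
    const fresh = await embedMisses(missIdx.map((i) => texts[i]));
    missIdx.forEach((i, j) => cache.set(hashes[i], fresh[j]));
  }
  // Every hash is now cached; answer entirely from the cache.
  return hashes.map((h) => cache.get(h)!);
}
```

Re-indexing the same content becomes free: the second pass over identical chunks produces zero API calls.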
Cost Savings Example
Indexing 10,000 chunks with text-embedding-3-small:
Sync API: 10,000 × $0.00002 = $0.20
Batch API: 10,000 × $0.00001 = $0.10
With 50% cache hit: 5,000 × $0.00001 = $0.05
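Checking the arithmetic (the per-chunk prices are the illustrative figures from this example, not quoted API pricing):

```typescript
const chunks = 10_000;
const syncPerChunk = 0.00002;  // example sync-API price per chunk
const batchPerChunk = 0.00001; // 50% batch discount

const syncCost = chunks * syncPerChunk;              // $0.20
const batchCost = chunks * batchPerChunk;            // $0.10
const batchWithCache = chunks * 0.5 * batchPerChunk; // $0.05 at a 50% hit rate
```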
SQLite Schema & Vector Storage
OpenClaw uses SQLite as its storage backend with several specialized tables:
Core Tables
Source: memory-schema.ts:9-75
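The schema itself is not reproduced here, but a hypothetical DDL matching the tables this section describes (chunked text with line ranges and hashes, an FTS5 mirror, an embedding cache, file hashes for delta tracking) might look like the following. All table and column names are assumptions, not copied from memory-schema.ts.

```typescript
const schema = `
CREATE TABLE IF NOT EXISTS chunks (
  id         INTEGER PRIMARY KEY,
  path       TEXT NOT NULL,       -- source Markdown file
  start_line INTEGER NOT NULL,
  end_line   INTEGER NOT NULL,
  hash       TEXT NOT NULL,       -- SHA-256 of the chunk text
  text       TEXT NOT NULL
);

-- Lexical index over chunk text (BM25 via FTS5).
CREATE VIRTUAL TABLE IF NOT EXISTS chunks_fts USING fts5(text, content='chunks');

-- Embedding cache keyed by content hash + model, so identical content
-- is never re-embedded.
CREATE TABLE IF NOT EXISTS embedding_cache (
  hash      TEXT NOT NULL,
  model     TEXT NOT NULL,
  embedding BLOB NOT NULL,
  PRIMARY KEY (hash, model)
);

-- File-level hashes for delta detection on incremental sync.
CREATE TABLE IF NOT EXISTS files (
  path TEXT PRIMARY KEY,
  hash TEXT NOT NULL
);
`;
```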
Vector Acceleration
Source: manager.ts:677-689
OpenClaw uses the sqlite-vec extension for in-database vector similarity queries:
Stores embeddings as FLOAT[] in a virtual table
Performs cosine similarity search entirely in SQL
Falls back to a JavaScript implementation if the extension is unavailable
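That fallback can be as simple as brute-force cosine similarity over the stored embeddings. A minimal sketch of the idea (not the actual fallback code):

```typescript
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  // Guard against zero vectors to avoid dividing by zero.
  return dot / (Math.sqrt(na) * Math.sqrt(nb) || 1);
}

function topK(query: number[], rows: { id: string; vec: number[] }[], k: number) {
  return rows
    .map((r) => ({ id: r.id, score: cosineSimilarity(query, r.vec) }))
    .sort((a, b) => b.score - a.score)
    .slice(0, k);
}
```

This is O(n) per query, which is fine for personal-scale memory stores; the sqlite-vec path exists to keep larger indexes fast.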
Schema Design Benefits
Embedding cache: Prevents re-embedding identical content across files
FTS5: Fast lexical search without external dependencies
Virtual tables: Efficient vector operations without loading all data into memory
Delta tracking: File hash comparison for incremental updates
Automatic Memory Flush
One of OpenClaw's most innovative features is automatic memory flush before context compaction.
The Problem
Long conversations eventually hit the context window limit. When this happens, the system must "compact" (summarize or truncate) older messages. Without intervention, valuable context gets lost.
The Solution
Source: memory.md:37-74
When a session is close to auto-compaction, OpenClaw triggers a silent, agentic turn that reminds the model to write durable memory before the context is compacted.
The flush behavior is configurable.
Trigger Logic
Memory flush activates when the session's token usage approaches the auto-compaction threshold. For a 200K context window, that still leaves the agent enough headroom to write durable memory before truncation begins.
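An illustrative version of the trigger check: fire the flush once usage crosses a soft threshold below the hard compaction limit. The 80% ratio and the field names are assumptions, not OpenClaw's actual defaults.

```typescript
interface SessionUsage {
  contextWindow: number;    // e.g. 200_000 tokens
  tokensUsed: number;
  flushedThisCycle: boolean;
  readOnlySandbox: boolean;
}

const FLUSH_RATIO = 0.8; // assumed soft threshold

function shouldFlushMemory(s: SessionUsage): boolean {
  if (s.flushedThisCycle) return false; // one flush per compaction cycle
  if (s.readOnlySandbox) return false;  // no file write access
  return s.tokensUsed >= s.contextWindow * FLUSH_RATIO;
}
```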
Behavior
Usually silent (NO_REPLY response) if there is nothing important to save
One flush per compaction cycle to avoid spam
Skipped in read-only sandbox mode (no file write access)
Gives the agent a final chance to extract insights before truncation

Session Memory Integration
OpenClaw can automatically save and index past conversations, making them searchable in future sessions.
Session Save Handler
Source: session-memory handler.ts:64-183
Session Indexing
Source: manager.ts:1101-1197
Features
JSONL parsing: Extracts user/assistant messages from session transcripts
Delta-based incremental indexing: Only processes new messages
Debounced background sync: Default thresholds of 100KB of new data, or 50 new messages
LLM-generated slugs: Descriptive filenames like 2026-01-30-memory-system-research.md
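The delta thresholds reduce to a simple decision: schedule a background sync only once enough new session data has accumulated. The constants are the defaults quoted above; the function shape is an assumption.

```typescript
const SYNC_BYTES_THRESHOLD = 100 * 1024; // 100KB of new data
const SYNC_MESSAGES_THRESHOLD = 50;      // or 50 new messages

function shouldScheduleSync(newBytes: number, newMessages: number): boolean {
  // Either threshold alone is enough to trigger an incremental sync.
  return newBytes >= SYNC_BYTES_THRESHOLD || newMessages >= SYNC_MESSAGES_THRESHOLD;
}
```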
Why Session Indexing Matters
Imagine working on a project for weeks. You might ask:
"When did we decide to use TypeScript?"
"What was that bug we fixed in the auth flow?"
"What approach did we try for caching last week?"
With session indexing, the agent can search past conversations and recall decisions made weeks ago.
Memory Search Tools
The memory system exposes two tools to agents:
1. memory_search
Source: memory-tool.ts:22-69
Returns: Snippets (~700 chars) with:
File path
Line range (start_line, end_line)
Relevance score
Snippet text
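Put together, a result might carry a shape like the following. The interface and field names are hypothetical, chosen to mirror the fields listed above rather than memory-tool.ts.

```typescript
interface MemorySearchResult {
  path: string;      // source file, e.g. a daily log or MEMORY.md
  startLine: number;
  endLine: number;
  score: number;     // fused relevance score
  snippet: string;   // ~700-char excerpt
}

function formatResult(r: MemorySearchResult): string {
  // "path:start-end (score)" header followed by the snippet text.
  return `${r.path}:${r.startLine}-${r.endLine} (${r.score.toFixed(2)})\n${r.snippet}`;
}
```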
Use cases:
"What did I decide about the API design?"
"When did we last discuss authentication?"
"What are my current todos?"
2. memory_get
Reads specific memory files with optional line range filtering.
Use cases:
Reading the full MEMORY.md for comprehensive context
Fetching a specific daily log
Retrieving exact lines after search narrows down the location
Tool Availability
Both tools are only enabled when memorySearch.enabled resolves to true in the agent configuration.
Key Innovations
OpenClaw's memory system introduces several novel design decisions:
1. File-First Philosophy
No database as source of truth — just Markdown files. This means:
Human-readable, version-controllable memory
Easy backup and migration
Debuggable with standard text tools
No vendor lock-in
2. Hybrid Retrieval
Combining BM25 + vector search gives balanced precision/recall:
Vector search catches semantic matches
BM25 catches exact terms and rare tokens
Weighted fusion prevents either from dominating
3. Provider Auto-Selection
Local → OpenAI → Gemini fallback chain with:
Graceful degradation
User transparency (tool results show provider used)
No manual configuration required for most users
4. Batch Optimization
Uses discounted Batch APIs for:
50% cost reduction on bulk indexing
Better resource utilization
Automatic fallback to sync on failure
5. Cache-First Embedding
SHA-256 hash deduplication prevents re-embedding:
Same paragraph across files → embed once
Session replay with same messages → cache hit
Significant cost savings on repeated content
6. Delta-Based Sync
Incremental session indexing with:
Byte/message thresholds
Debounced background sync
No full reindex on every message
7. Pre-Compaction Flush
Automatic context → memory transfer before truncation:
Prevents context loss
No manual intervention required
Silent when nothing important to save
8. Per-Agent Isolation
Separate SQLite stores per agent ID:
Multi-agent workflows don't cross-contaminate
Each agent has its own memory namespace
Supports different embedding models per agent
Performance Characteristics
Tuning constants are defined in manager.ts:92-110.
Typical performance:
Local embedding: ~50 tokens/sec (node-llama-cpp on M1 Mac)
OpenAI embedding: ~1000 tokens/sec (with batching)
Search latency: <100ms for 10K chunks (hybrid search)
Index size: ~5KB per 1K tokens (with 1536-dim embeddings)
Comparison with Traditional RAG
How does OpenClaw's approach differ from typical RAG systems?
| Aspect | Traditional RAG | OpenClaw |
| --- | --- | --- |
| Source of truth | Vector database | Markdown files |
| Search method | Vector only | Hybrid (BM25 + vector) |
| Storage | Pinecone/Weaviate/Chroma | SQLite |
| Embedding | Always remote API | Local-first with fallback |
| Chunking | Fixed-size | Line-aware with overlap |
| Caching | Usually none | SHA-256 hash-based |
| Updates | Full reindex | Delta-based incremental |
| Context preservation | Manual | Automatic pre-compaction flush |
| Human-readable | No | Yes (plain Markdown) |
| Cost optimization | Limited | Batch API + caching |
Use Cases
OpenClaw's memory system shines in scenarios where:
Long-running projects: Work on a codebase for weeks/months with persistent context
Personal AI assistants: Remember preferences, habits, and long-term goals
Research workflows: Accumulate knowledge over time, build on past insights
Multi-agent systems: Each agent maintains its own memory space
Offline-first applications: Local embedding provider works without internet
Limitations & Trade-offs
No system is perfect. OpenClaw's memory system has trade-offs:
Storage Growth
Daily logs and session transcripts accumulate over time. A year of daily use could generate:
~365 daily logs
~1000 session files
~500MB SQLite index
Mitigation: Manual archiving, or implement retention policies.
Embedding Drift
Different providers use different embedding models (1536-dim vs 768-dim). Switching providers requires reindexing.
Mitigation: The system tracks the embedding model per chunk and handles mismatches gracefully.
FTS5 vs Modern Search
SQLite FTS5 is solid but lacks features like:
Fuzzy matching
Typo tolerance
Advanced ranking signals
Why it's acceptable: Most queries are semantic (vector) anyway, and BM25 handles exact matches well.
No Cross-File Context
Each chunk is embedded independently. A concept spanning multiple files might not be connected.
Mitigation: Use section headers and explicit cross-references in Markdown.
Future Directions
Based on the codebase, potential enhancements could include:
Graph-based memory: Link related memories explicitly
Importance scoring: Prioritize frequently-accessed memories
Automatic summarization: Compress old daily logs periodically
Multi-modal embeddings: Index images, code, diagrams
Federated memory: Share curated memories across agents/teams
Retention policies: Auto-archive old sessions
Conclusion
OpenClaw's memory system represents a thoughtful evolution of RAG architecture. By prioritizing file-first storage, hybrid search, and automatic context preservation, it addresses real pain points in long-running AI agent workflows.
Key takeaways:
Files are the source of truth: Human-readable, version-controllable memory
Hybrid retrieval works: BM25 + vector gives better results than either alone
Cache everything: SHA-256 deduplication prevents redundant embedding costs
Incremental is better: Delta-based sync scales to large memory stores
Automate memory management: Pre-compaction flush prevents context loss
For developers building AI agents, OpenClaw's memory system offers a production-ready blueprint that balances performance, cost, and developer experience.
References
This analysis is based on OpenClaw commit f99e3dd (January 2026).