Video-based AI memory library. Store millions of text chunks in MP4 files with lightning-fast semantic search. No database needed.
Early-access notice
Memvid v1 is still experimental. The file format and API may change until we lock in a stable release.
Memvid v2 – what’s next
- Living-Memory Engine – keep adding new data and let LLMs remember it across sessions.
- Capsule Context – shareable `.mv2` capsules, each with its own rules and expiry.
- Time-Travel Debugging – rewind or branch any chat to review or test.
- Smart Recall – local cache guesses what you’ll need and loads it in under 5 ms.
- Codec Intelligence – auto-tunes AV1 now and future codecs later, so files keep shrinking.
- CLI & Dashboard – simple tools for branching, analytics, and one-command cloud publish.
Sneak peek of Memvid v2: a living memory engine you can use to chat with your knowledge base.
Memvid compresses an entire knowledge base into MP4 files while keeping millisecond-level semantic search. Think of it as SQLite for AI memory: portable, efficient, and self-contained. By encoding text as QR codes in video frames, we deliver 50-100× smaller storage than vector databases with zero infrastructure.
| What it enables | How video codecs make it possible |
|---|---|
| 50-100× smaller storage | Modern video codecs compress repetitive visual patterns (QR codes) far better than raw embeddings |
| Sub-100ms retrieval | Direct frame seek via index → QR decode → your text. No server round-trips |
| Zero infrastructure | Just Python and MP4 files; no DB clusters, no Docker, no ops |
| True portability | Copy or stream memory.mp4; it works anywhere video plays |
| Offline-first design | After encoding, everything runs without internet |
Text → QR → Frame
Each text chunk becomes a QR code, packed into video frames. Modern codecs excel at compressing these repetitive patterns.
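To make this step concrete, here is a minimal sketch of the idea rather than Memvid's actual encoder internals. It assumes the `qrcode`, `numpy`, and `opencv-python` packages, and writes a toy `toy_memory.mp4` in which frame i holds chunk i:

```python
# Illustrative sketch only - not Memvid's internal code.
import cv2
import numpy as np
import qrcode

chunks = ["NASA founded 1958", "Apollo 11 landed 1969", "ISS launched 1998"]
frame_size = 256  # one square QR frame per chunk

writer = cv2.VideoWriter(
    "toy_memory.mp4", cv2.VideoWriter_fourcc(*"mp4v"), 30, (frame_size, frame_size)
)
for chunk in chunks:
    qr = qrcode.QRCode(box_size=4, border=2)
    qr.add_data(chunk)
    qr.make(fit=True)
    modules = np.array(qr.get_matrix(), dtype=np.uint8)   # 1 = dark module
    gray = ((1 - modules) * 255).astype(np.uint8)          # dark modules -> black pixels
    gray = cv2.resize(gray, (frame_size, frame_size), interpolation=cv2.INTER_NEAREST)
    writer.write(cv2.cvtColor(gray, cv2.COLOR_GRAY2BGR))   # frame index == chunk index
writer.release()
```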
Smart indexing
Embeddings map queries → frame numbers. One seek, one decode, millisecond results.
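The retrieval path can be sketched the same way. The snippet below is only a conceptual illustration of "embedding → frame number → seek → decode"; Memvid's real index lives in the JSON file built by the encoder. It assumes `sentence-transformers` and OpenCV's built-in `QRCodeDetector`, and reuses the toy video from the sketch above:

```python
# Conceptual sketch of index-based retrieval - not Memvid's actual internals.
import cv2
import numpy as np
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")

# In this toy version the "index" is an in-memory embedding matrix where
# row i corresponds to frame i of toy_memory.mp4 (see the encoding sketch above).
chunks = ["NASA founded 1958", "Apollo 11 landed 1969", "ISS launched 1998"]
chunk_vecs = model.encode(chunks, normalize_embeddings=True)

def retrieve(query: str, video_path: str = "toy_memory.mp4") -> str:
    q = model.encode([query], normalize_embeddings=True)[0]
    frame_no = int(np.argmax(chunk_vecs @ q))        # nearest chunk -> frame number

    cap = cv2.VideoCapture(video_path)               # seek straight to that frame
    cap.set(cv2.CAP_PROP_POS_FRAMES, frame_no)
    ok, frame = cap.read()
    cap.release()

    text, _, _ = cv2.QRCodeDetector().detectAndDecode(frame)  # QR -> original text
    return text

print(retrieve("When did humans land on the moon?"))
```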
Codec leverage
30 years of video R&D means your text gets compressed better than any custom algorithm could achieve.
Future-proof
Next-gen codecs (AV1, H.266) automatically make your memories smaller and faster, with no code changes needed.
pip install memvid
# For PDF support
pip install memvid PyPDF2
from memvid import MemvidEncoder, MemvidChat
# Create video memory from text
chunks = ["NASA founded 1958", "Apollo 11 landed 1969", "ISS launched 1998"]
encoder = MemvidEncoder()
encoder.add_chunks(chunks)
encoder.build_video("space.mp4", "space_index.json")
# Chat with your memory
chat = MemvidChat("space.mp4", "space_index.json")
response = chat.chat("When did humans land on the moon?")
print(response) # References Apollo 11 in 1969
from memvid import MemvidEncoder
import os
encoder = MemvidEncoder(chunk_size=512)
# Index all markdown files
for file in os.listdir("docs"):
    if file.endswith(".md"):
        with open(f"docs/{file}") as f:
            encoder.add_text(f.read(), metadata={"file": file})
encoder.build_video("docs.mp4", "docs_index.json")
# Index multiple PDFs
encoder = MemvidEncoder()
encoder.add_pdf("deep_learning.pdf")
encoder.add_pdf("machine_learning.pdf")
encoder.build_video("ml_library.mp4", "ml_index.json")
# Semantic search across all books
from memvid import MemvidRetriever
retriever = MemvidRetriever("ml_library.mp4", "ml_index.json")
results = retriever.search("backpropagation", top_k=5)
from memvid import MemvidInteractive
# Launch at http://localhost:7860
interactive = MemvidInteractive("knowledge.mp4", "index.json")
interactive.run()
# Maximum compression for huge datasets
encoder.build_video(
"compressed.mp4",
"index.json",
fps=60, # More frames/second
frame_size=256, # Smaller QR codes
video_codec='h265', # Better compression
crf=28 # Quality tradeoff
)
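# Use a custom embedding model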
from sentence_transformers import SentenceTransformer
model = SentenceTransformer('all-mpnet-base-v2')
encoder = MemvidEncoder(embedding_model=model)
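# Encode large chunk lists in parallel across multiple workers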
encoder = MemvidEncoder(n_workers=8)
encoder.add_chunks_parallel(million_chunks)
# Process documents
python examples/file_chat.py --input-dir /docs --provider openai
# Advanced codecs
python examples/file_chat.py --files doc.pdf --codec h265
# Load existing
python examples/file_chat.py --load-existing output/memory
Memvid is redefining AI memory. Join us: