Introduction

OpenViking is an open-source context database designed specifically for AI Agents. OpenViking unifies the management of context (memory, resources, and skills) that Agents need through a file system paradigm, enabling hierarchical context delivery and self-iteration. The ultimate goal is to lower the barrier for Agent development, allowing developers to focus on business innovation rather than underlying context management.

Why OpenViking

In the AI era, data is abundant, but high-quality context is scarce. When building AI Agents, developers often face these challenges:

Context Fragmentation: Memory in code, resources in vector databases, skills scattered everywhere — difficult to manage uniformly
Context Explosion: Long-running Agent tasks generate context with each execution; simple truncation or compression leads to information loss
Poor Retrieval Quality: Traditional RAG uses flat storage, lacking global perspective and struggling to understand complete context
Context Opacity: Traditional RAG's implicit retrieval pipeline is like a black box, making debugging difficult
Limited Memory Iteration: Current memory systems only record user memories, lacking Agent-related task memories

OpenViking is designed to solve these pain points.

Core Features

1. File System Management Paradigm

Moving away from traditional flat database thinking, all context is organized as a virtual file system. Agents no longer rely solely on vector search to find data — they can locate and browse data through deterministic paths and standard file system commands.

Unified URI Identification: Each context is assigned a unique viking:// URI, enabling precise location and access to resources stored in different locations.

viking://
├── resources/              # Resources: project docs, code repos, web pages
│   └── my_project/
├── user/
│   └── {user_id}/          # Private context for the current user
│       ├── memories/       # User memories
│       ├── resources/      # Private user resources
│       ├── skills/         # Private user skills (default)
│       ├── peers/
│       │   └── {peer_id}/
│       │       ├── memories/
│       │       └── resources/
│       └── sessions/
└── agent/
    └── skills/             # Optional account-wide shared skills
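Because every context entry has a deterministic URI, client code can build and inspect paths without a search step. The following is a minimal sketch of that idea in plain Python. The names `VikingURI`, `parse_viking_uri`, and `user_memories_uri` are hypothetical helpers written for this illustration and are not part of the OpenViking API; only the `viking://` scheme and the directory layout come from the tree above.

```python
from typing import NamedTuple


class VikingURI(NamedTuple):
    """Hypothetical helper type, not part of OpenViking."""
    scope: str        # top-level directory: "resources", "user", or "agent"
    segments: tuple   # remaining path components, in order


def parse_viking_uri(uri: str) -> VikingURI:
    """Split a viking:// URI into its top-level scope and path segments."""
    prefix = "viking://"
    if not uri.startswith(prefix):
        raise ValueError(f"not a viking URI: {uri!r}")
    # Drop empty pieces so a trailing slash does not create an empty segment.
    parts = [p for p in uri[len(prefix):].split("/") if p]
    if not parts:
        return VikingURI(scope="", segments=())
    return VikingURI(scope=parts[0], segments=tuple(parts[1:]))


def user_memories_uri(user_id: str) -> str:
    """Build the memories directory URI for a user, following the layout above."""
    return f"viking://user/{user_id}/memories/"


parsed = parse_viking_uri(user_memories_uri("alice"))
print(parsed.scope, parsed.segments)
```
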

Three Context Types:

Type	Purpose	Lifecycle
Resource	Knowledge and rules (docs, code, FAQ)	Long-term, relatively static
Memory	Agent's cognition (user preferences, learned experiences)	Long-term, dynamically updated
Skill	Callable capabilities (tools, MCP)	Long-term, static

Unix-like API: Familiar command-style operations

```python
# `client` is an already-initialized OpenViking client (see Quick Start).
client.find("user authentication")       # Semantic search
client.ls("viking://resources/")         # List directory
client.read("viking://resources/doc")    # Read content
client.abstract("viking://...")          # Get L0 abstract
client.overview("viking://...")          # Get L1 overview
```
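To make the "browse by path" idea concrete without depending on a running server, here is a small dict-backed stand-in that mimics `ls` and `read` semantics. `ToyContextFS` is an assumption made for illustration only: it is not the OpenViking client, and the real client's return types may differ.

```python
class ToyContextFS:
    """Dict-backed stand-in for a virtual file system; NOT the OpenViking client."""

    def __init__(self, files: dict):
        self.files = files  # maps full URI -> content

    def ls(self, directory: str) -> list:
        """Return the immediate children of a directory URI."""
        if not directory.endswith("/"):
            directory += "/"
        children = set()
        for uri in self.files:
            if uri.startswith(directory):
                rest = uri[len(directory):]
                if not rest:
                    continue
                name, sep, _ = rest.partition("/")
                children.add(name + sep)  # a trailing "/" marks a subdirectory
        return sorted(children)

    def read(self, uri: str) -> str:
        """Return the content stored at an exact URI."""
        return self.files[uri]


fs = ToyContextFS({
    "viking://resources/my_project/docs/api.md": "API reference text",
    "viking://resources/my_project/README.md": "Project readme",
})
print(fs.ls("viking://resources/my_project/"))
print(fs.read("viking://resources/my_project/docs/api.md"))
```

The point of the sketch is that listing and reading are exact, repeatable operations on paths, which complements semantic search rather than replacing it.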

2. Hierarchical Context On-Demand Loading

Stuffing massive context into prompts all at once is not only expensive but also risks exceeding model windows and introducing noise. OpenViking automatically processes context into three levels upon ingestion:

Level	Name	Token Limit	Purpose
L0	Abstract	~100 tokens	Vector search, quick filtering
L1	Overview	~2k tokens	Rerank, content navigation
L2	Detail	Unlimited	Full content, on-demand loading

viking://resources/my_project/
├── .abstract.md               # L0 layer: abstract
├── .overview.md               # L1 layer: overview
├── docs/
│   ├── .abstract.md          # Each directory has L0/L1 layers
│   ├── .overview.md
│   └── api.md                # L2 layer: full content
└── src/
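One way a caller might use the three levels is to load the most detailed one that still fits the remaining prompt budget. The sketch below shows that policy under stated assumptions: `rough_tokens` is a crude whitespace-based estimate standing in for a real tokenizer, and `load_within_budget` is a hypothetical helper, not an OpenViking function.

```python
def rough_tokens(text: str) -> int:
    """Crude token estimate (whitespace split); a real system would use a tokenizer."""
    return len(text.split())


def load_within_budget(levels: dict, budget: int) -> tuple:
    """Return (level_name, text) for the most detailed level that fits the budget."""
    for name in ("L2", "L1", "L0"):
        text = levels[name]
        if rough_tokens(text) <= budget:
            return name, text
    # Nothing fits: fall back to the abstract, the smallest level.
    return "L0", levels["L0"]


# Illustrative content only; real L0/L1 layers are generated at ingestion.
levels = {
    "L0": "Auth service summary.",
    "L1": "Auth service overview: login, token refresh, and logout endpoints.",
    "L2": "Full API document. " * 500,
}

name, _ = load_within_budget(levels, budget=50)
print(name)
```

With a budget of 50 estimated tokens the full document is too large, so the overview is chosen; a larger budget would return the L2 detail.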

3. Directory Recursive Retrieval

A single pass of vector retrieval struggles with complex query intents. OpenViking instead implements a directory recursive retrieval strategy:

Intent Analysis: Generate multiple retrieval conditions through intent analysis
Initial Positioning: Use vector retrieval to quickly locate high-scoring directories
Fine Exploration: Perform secondary retrieval within directories, updating candidate sets with high-scoring results
Recursive Descent: If subdirectories exist, recursively repeat the secondary retrieval
Result Aggregation: Return the most relevant context

This "lock onto high-scoring directories first, then explore content in detail" strategy not only finds semantically matching fragments but also preserves the surrounding context in which those fragments appear.
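The descent described above can be sketched on a toy tree. This is a simplified illustration, not OpenViking's implementation: it skips intent analysis, replaces vector similarity with a word-overlap `score`, and uses a hypothetical `Node` type whose `abstract` field stands in for the L0 layer.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Node:
    name: str
    abstract: str                             # stands in for the L0 abstract
    children: Optional[List["Node"]] = None   # None means this node is a file


def score(query: str, text: str) -> float:
    """Toy relevance: fraction of query words present in the text.
    A real system would compare embeddings instead."""
    words = query.lower().split()
    if not words:
        return 0.0
    text_words = set(text.lower().split())
    return sum(w in text_words for w in words) / len(words)


def retrieve(query: str, root: Node, base: str = "viking://resources/",
             top_dirs: int = 1, threshold: float = 0.5) -> list:
    """Follow the best-scoring directories recursively and collect matching files."""
    results = []

    def explore(directory: Node, path: str) -> None:
        ranked = sorted(directory.children or [],
                        key=lambda c: score(query, c.abstract), reverse=True)
        dirs_followed = 0
        for child in ranked:
            s = score(query, child.abstract)
            if child.children is not None:
                # Directory: descend only into the top-scoring few.
                if dirs_followed < top_dirs and s > 0:
                    dirs_followed += 1
                    explore(child, f"{path}{child.name}/")
            elif s >= threshold:
                results.append((s, f"{path}{child.name}"))

    explore(root, base)
    return sorted(results, reverse=True)


root = Node("resources", "all resources", [
    Node("my_project", "project docs about user authentication and login", [
        Node("docs", "api docs covering user authentication tokens", [
            Node("api.md", "user authentication endpoints and tokens"),
            Node("changelog.md", "release notes"),
        ]),
        Node("src", "source code", []),
    ]),
    Node("cooking", "recipes for pasta", [Node("pasta.md", "how to cook pasta")]),
])

print(retrieve("user authentication", root))
```

Raising `top_dirs` widens the search at each level at the cost of more scoring calls; the `threshold` controls which files are kept as candidates.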

4. Visualized Retrieval Traces

OpenViking organizes context as a hierarchical virtual file system. All context is stored in a unified format and each entry corresponds to a unique URI, in contrast to traditional flat, black-box management.

Retrieval follows the directory recursive strategy, and each retrieval preserves a complete trace of the directories browsed and the files located. These traces make it possible to see where a problem originated and to guide optimization of the retrieval logic.
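As a rough picture of what such a trace could contain, the sketch below records one step per visited URI and renders the steps as an indented listing. The `TraceStep` fields, the action names, and the scores are all assumptions chosen for illustration; the source does not specify OpenViking's actual trace format.

```python
from dataclasses import dataclass


@dataclass
class TraceStep:
    uri: str
    action: str    # hypothetical labels: "enter_dir" or "select_file"
    score: float   # illustrative relevance score


def format_trace(steps: list) -> str:
    """Render a trace as one line per step, indented by directory depth."""
    lines = []
    for step in steps:
        depth = step.uri.removeprefix("viking://").rstrip("/").count("/")
        lines.append(f"{'  ' * depth}{step.action:<12} {step.score:.2f}  {step.uri}")
    return "\n".join(lines)


# Made-up steps for one retrieval, ordered as they were visited.
trace = [
    TraceStep("viking://resources/", "enter_dir", 1.0),
    TraceStep("viking://resources/my_project/", "enter_dir", 1.0),
    TraceStep("viking://resources/my_project/docs/", "enter_dir", 1.0),
    TraceStep("viking://resources/my_project/docs/api.md", "select_file", 1.0),
]
print(format_trace(trace))
```
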

5. Automatic Session Management

OpenViking includes a memory self-iteration loop. After a session is committed, the system asynchronously analyzes task outcomes and user feedback, then updates memory for the current user or Peer according to the active memory policy.

Built-in memory types:

Purpose	Built-in types	Description
User and environment understanding	`profile`, `preferences`, `entities`, `events`	User profile, preferences, entities, and events
Assistant identity and continuity	`identity`, `soul`	Assistant identity, boundaries, style, and continuity
Task execution and learning	`cases`, `trajectories`, `experiences`, `tools`, `skills`	Trainable cases, execution traces, reusable experience, and tool/skill usage knowledge

OpenViking lets applications extend or adjust memory types for their own needs.

This loop lets Agents become "smarter with use" through interaction with the world, achieving self-evolution.
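The built-in memory types can be pictured as a small registry grouped by purpose, which an application then extends. The sketch below is illustrative only: the group keys, `purpose_of`, `with_custom_type`, and the custom type name `playbooks` are hypothetical, and OpenViking's real extension mechanism may look quite different. Only the built-in type names come from the table above.

```python
# Built-in type names from the table above; the group keys are labels
# invented for this sketch.
BUILTIN_MEMORY_TYPES = {
    "user_and_environment": ["profile", "preferences", "entities", "events"],
    "assistant_identity": ["identity", "soul"],
    "task_learning": ["cases", "trajectories", "experiences", "tools", "skills"],
}


def purpose_of(memory_type: str, registry: dict = BUILTIN_MEMORY_TYPES):
    """Return the purpose group a memory type belongs to, or None if unknown."""
    for purpose, types in registry.items():
        if memory_type in types:
            return purpose
    return None


def with_custom_type(registry: dict, purpose: str, memory_type: str) -> dict:
    """Return a copy of the registry with one extra type added to a purpose group."""
    extended = {p: list(types) for p, types in registry.items()}
    extended.setdefault(purpose, []).append(memory_type)
    return extended


custom = with_custom_type(BUILTIN_MEMORY_TYPES, "task_learning", "playbooks")
print(purpose_of("soul"), purpose_of("playbooks", custom))
```

Returning a copy keeps the built-in registry unchanged, so several applications can layer their own types on top of the same defaults.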

Next Steps

Quick Start - Get started in 5 minutes
Architecture Overview - Understand system design
Context Types - Deep dive into three context types
Retrieval Mechanism - Learn about retrieval flow
