Configuration
OpenViking uses a JSON configuration file (~/.openviking/ov.conf) for settings.
For a first-time setup, the recommended flow is:
openviking-server init
openviking-server doctoropenviking-server init prompts for embedding and VLM settings separately. For API-based VLM choices such as OpenAI, Volcengine, Kimi, and GLM, enter the VLM API key when prompted. If you want to use Codex as the VLM provider, choose OpenAI Codex; the wizard can import existing Codex auth or guide you through login directly.
Configuration File
Create ~/.openviking/ov.conf in your project directory:
{
"storage": {
"workspace": "./data",
"vectordb": {
"name": "context",
"backend": "local"
},
"agfs": {
"backend": "local"
}
},
"embedding": {
"dense": {
"api_base" : "<api-endpoint>",
"api_key" : "<your-api-key>",
"provider" : "<provider-type>",
"dimension": 1024,
"model" : "<model-name>"
}
},
"vlm": {
"api_base" : "<api-endpoint>",
"api_key" : "<your-api-key>",
"provider" : "<provider-type>",
"model" : "<model-name>"
}
}For provider: "openai-codex", vlm.api_key is optional when Codex OAuth is already available.
Configuration Examples
Volcengine (Doubao Models)
{
"embedding": {
"dense": {
"api_base" : "https://ark.cn-beijing.volces.com/api/v3",
"api_key" : "your-volcengine-api-key",
"provider" : "volcengine",
"dimension": 1024,
"model" : "doubao-embedding-vision-251215",
"input": "multimodal"
}
},
"vlm": {
"api_base" : "https://ark.cn-beijing.volces.com/api/v3",
"api_key" : "your-volcengine-api-key",
"provider" : "volcengine",
"model" : "doubao-seed-2-0-pro-260215"
}
}OpenAI Models
{
"embedding": {
"dense": {
"api_base" : "https://api.openai.com/v1",
"api_key" : "your-openai-api-key",
"provider" : "openai",
"dimension": 1536,
"model" : "text-embedding-3-small"
}
},
"vlm": {
"api_base" : "https://api.openai.com/v1",
"api_key" : "your-openai-api-key",
"provider" : "openai",
"model" : "gpt-5.4"
}
}Volcengine Embedding + Codex VLM
Use openviking-server init to complete the Codex login/import step, then run openviking-server doctor.
{
"embedding": {
"dense": {
"api_base" : "https://ark.cn-beijing.volces.com/api/v3",
"api_key" : "your-volcengine-api-key",
"provider" : "volcengine",
"dimension": 1024,
"model" : "doubao-embedding-vision-251215"
}
},
"vlm": {
"provider" : "openai-codex",
"model" : "gpt-5.4",
"api_base" : "https://chatgpt.com/backend-api/codex"
}
}Volcengine Embedding + Kimi Coding VLM
{
"embedding": {
"dense": {
"api_base" : "https://ark.cn-beijing.volces.com/api/v3",
"api_key" : "your-volcengine-api-key",
"provider" : "volcengine",
"dimension": 1024,
"model" : "doubao-embedding-vision-251215"
}
},
"vlm": {
"provider" : "kimi",
"model" : "kimi-code",
"api_key" : "your-kimi-subscription-api-key",
"api_base" : "https://api.kimi.com/coding"
}
}kimi applies the Kimi Coding defaults automatically, including the default Kimi Coding user agent.
Volcengine Embedding + GLM Coding Plan VLM
{
"embedding": {
"dense": {
"api_base" : "https://ark.cn-beijing.volces.com/api/v3",
"api_key" : "your-volcengine-api-key",
"provider" : "volcengine",
"dimension": 1024,
"model" : "doubao-embedding-vision-251215"
}
},
"vlm": {
"provider" : "glm",
"model" : "glm-4.6v",
"api_key" : "your-zai-api-key",
"api_base" : "https://api.z.ai/api/coding/paas/v4"
}
}Use a vision-capable GLM model such as glm-4.6v or glm-5v-turbo when OpenViking needs image understanding.
Configuration Sections
embedding
Embedding model configuration for vector search, supporting dense, sparse, and hybrid modes.
Dense Embedding
{
"embedding": {
"max_concurrent": 10,
"max_retries": 3,
"dense": {
"provider": "volcengine",
"api_key": "your-api-key",
"model": "doubao-embedding-vision-251215",
"dimension": 1024,
"input": "multimodal"
}
}
}Parameters
| Parameter | Type | Description |
|---|---|---|
max_concurrent | int | Maximum concurrent embedding requests (embedding.max_concurrent, default: 10) |
max_retries | int | Maximum retry attempts for transient embedding provider errors (embedding.max_retries, default: 3; 0 disables retry) |
provider | str | "volcengine", "openai", "vikingdb", "jina", "voyage", "dashscope", or "gemini" |
api_key | str | API key |
model | str | Model name |
dimension | int | Vector dimension. For Voyage, this maps to output_dimension |
input | str | Input type: "text" or "multimodal" |
batch_size | int | Batch size for embedding requests |
embedding.max_retries only applies to transient errors such as 429, 5xx, timeouts, and connection failures. Permanent errors such as 400, 401, 403, and AccountOverdue are not retried automatically. The backoff strategy is exponential backoff with jitter, starting at 0.5s and capped at 8s.
Embedding Circuit Breaker
When the embedding provider experiences consecutive transient failures (e.g. 429, 5xx), OpenViking opens a circuit breaker to temporarily stop calling the provider and re-enqueue embedding tasks. After the base reset_timeout, it allows a probe request (HALF_OPEN). If the probe fails, the next reset_timeout is doubled (capped by max_reset_timeout).
{
"embedding": {
"circuit_breaker": {
"failure_threshold": 5,
"reset_timeout": 60,
"max_reset_timeout": 600
}
}
}| Parameter | Type | Description |
|---|---|---|
circuit_breaker.failure_threshold | int | Consecutive failures required to open the breaker (default: 5) |
circuit_breaker.reset_timeout | float | Base reset timeout in seconds (default: 60) |
circuit_breaker.max_reset_timeout | float | Maximum reset timeout in seconds when backing off (default: 600) |
Available Models
| Model | Dimension | Input Type | Notes |
|---|---|---|---|
doubao-embedding-vision-251215 | 1024 | multimodal | Recommended |
doubao-embedding-250615 | 1024 | text | Text only |
With input: "multimodal", OpenViking can embed text, images (PNG, JPG, etc.), and mixed content.
Supported providers:
openai: OpenAI Embedding APIvolcengine: Volcengine Embedding APIvikingdb: VikingDB Embedding APIjina: Jina AI Embedding APIvoyage: Voyage AI Embedding APIminimax: MiniMax Embedding APIgemini: Google Gemini Embedding API (text-only; requiresgoogle-genai>=1.0.0)
minimax provider example:
{
"embedding": {
"dense": {
"provider": "minimax",
"api_key": "your-minimax-api-key",
"model": "embo-01",
"dimension": 1536,
"query_param": "query",
"document_param": "db",
"extra_headers": {
"GroupId": "your-group-id"
}
}
}
}vikingdb provider example:
{
"embedding": {
"dense": {
"provider": "vikingdb",
"model": "bge_large_zh",
"ak": "your-access-key",
"sk": "your-secret-key",
"region": "cn-beijing",
"dimension": 1024
}
}
}jina provider example:
{
"embedding": {
"dense": {
"provider": "jina",
"api_key": "jina_xxx",
"model": "jina-embeddings-v5-text-small",
"dimension": 1024
}
}
}Available Jina models:
jina-embeddings-v5-text-small: 677M params, 1024 dim, max seq 32768 (default)jina-embeddings-v5-text-nano: 239M params, 768 dim, max seq 8192
Get your API key at https://jina.ai
voyage provider example:
{
"embedding": {
"dense": {
"provider": "voyage",
"api_key": "pa-xxx",
"api_base": "https://api.voyageai.com/v1",
"model": "voyage-4-lite",
"dimension": 1024
}
}
}Supported Voyage text embedding models include:
voyage-4-litevoyage-4voyage-4-largevoyage-code-3voyage-context-3voyage-3voyage-3.5voyage-3.5-litevoyage-finance-2voyage-law-2
If dimension is omitted, OpenViking uses the model's default output dimension when creating the vector schema.
OpenViking also expects dense float vectors throughout storage and retrieval, so Voyage quantized output dtypes are not exposed in config.
Local deployment (GGUF/MLX): Jina embedding models are open-weight and available in GGUF and MLX formats on Hugging Face. You can run them locally with any OpenAI-compatible server (e.g. llama.cpp, MLX, vLLM) and point the api_base to your local endpoint:
{
"embedding": {
"dense": {
"provider": "jina",
"api_key": "local",
"api_base": "http://localhost:8080/v1",
"model": "jina-embeddings-v5-text-nano",
"dimension": 768
}
}
}gemini provider example:
Note: Requires
pip install "google-genai>=1.0.0". For async batching:pip install "openviking[gemini-async]".
{
"embedding": {
"dense": {
"provider": "gemini",
"api_key": "your-google-api-key",
"model": "gemini-embedding-2-preview",
"dimension": 3072
}
}
}Available Gemini embedding models:
gemini-embedding-2-preview: 8192 token input limit, 1–3072 output dimension (MRL)gemini-embedding-001: 2048 token input limit, 1–3072 output dimension (MRL)text-embedding-004: 2048 token input limit, 768 output dimension (fixed)
Recommended dimensions: 768, 1536, or 3072 (default: 3072).
Get your API key at https://aistudio.google.com/apikey
DashScope (Alibaba Tongyi) provider:
{
"embedding": {
"dense": {
"provider": "dashscope",
"api_key": "${DASHSCOPE_API_KEY}",
"model": "text-embedding-v4",
"dimension": 1024
}
}
}Available DashScope models:
| Model | Dimension | Input Type | Notes |
|---|---|---|---|
text-embedding-v3 | 1024 | text | Optimized for Chinese |
text-embedding-v4 | 1024 | text | Optimized for Chinese |
tongyi-embedding-vision-plus | 1152 | multimodal | Supports fusion via enable_fusion |
tongyi-embedding-vision-flash | 768 | multimodal | Faster, lower cost |
qwen3-vl-embedding | 2560 | multimodal | Text + image + video |
qwen2.5-vl-embedding | 1024 | multimodal | Text + image + video |
Multimodal parameters (text+image/video models only):
| Parameter | Type | Default | Description |
|---|---|---|---|
input_type | str | "multimodal" or "text" | Embedding mode (default: "multimodal") |
enable_fusion | bool | false | Enable fusion vectors for tongyi-embedding-vision-* models |
res_level | int | 2 | Image resolution level (1=high, 2=medium, 3=low) |
max_video_frames | int | 16 | Maximum video frames to embed |
Endpoint selection — DashScope provides api_base defaults for China (cn) and international (intl) regions:
| Region | api_base | Notes |
|---|---|---|
| China | https://dashscope.aliyuncs.com (default) | Recommended for users in mainland China |
| International | https://dashscope-intl.aliyuncs.com | For users outside China |
Custom endpoint URLs are also supported by setting a full URL.
Get your API key at https://dashscope.console.aliyun.com/api-key
Non-symmetric retrieval (different task types for indexing vs. query):
{
"embedding": {
"dense": {
"provider": "gemini",
"api_key": "your-google-api-key",
"model": "gemini-embedding-2-preview",
"dimension": 3072,
"query_param": "RETRIEVAL_QUERY",
"document_param": "RETRIEVAL_DOCUMENT"
}
}
}Supported task types: RETRIEVAL_QUERY, RETRIEVAL_DOCUMENT, SEMANTIC_SIMILARITY, CLASSIFICATION, CLUSTERING, CODE_RETRIEVAL_QUERY, QUESTION_ANSWERING, FACT_VERIFICATION.
Sparse Embedding
Note: Volcengine sparse embedding is supported starting from model
doubao-embedding-vision-251215.
{
"embedding": {
"sparse": {
"provider": "volcengine",
"api_key": "your-api-key",
"model": "doubao-embedding-vision-251215"
}
}
}Hybrid Embedding
Two approaches are supported:
Option 1: Single hybrid model
{
"embedding": {
"hybrid": {
"provider": "volcengine",
"api_key": "your-api-key",
"model": "doubao-embedding-hybrid",
"dimension": 1024
}
}
}Option 2: Combine dense + sparse
{
"embedding": {
"dense": {
"provider": "volcengine",
"api_key": "your-api-key",
"model": "doubao-embedding-vision-251215",
"dimension": 1024
},
"sparse": {
"provider": "volcengine",
"api_key": "your-api-key",
"model": "doubao-embedding-vision-251215"
}
}
}vlm
Vision Language Model for semantic extraction (L0/L1 generation).
{
"vlm": {
"api_key": "your-api-key",
"model": "doubao-seed-2-0-pro-260215",
"api_base": "https://ark.cn-beijing.volces.com/api/v3",
"max_retries": 3
}
}Parameters
| Parameter | Type | Description |
|---|---|---|
api_key | str | API key. Optional for openai-codex when Codex OAuth is available |
model | str | Model name |
api_base | str | API endpoint (optional) |
thinking | bool | Enable thinking mode for VolcEngine models (default: false) |
max_concurrent | int | Maximum concurrent semantic LLM calls (default: 100) |
max_retries | int | Maximum retry attempts for transient VLM provider errors (default: 3; 0 disables retry) |
timeout | float | Per-request HTTP timeout in seconds passed to the underlying OpenAI/LiteLLM client. Increase for slow endpoints (e.g., DashScope, local inference). Must be > 0 (default: 60.0) |
extra_headers | object | Custom HTTP headers for compatible HTTP providers. kimi also accepts header overrides, but already injects the required subscription headers by default |
stream | bool | Enable streaming mode (for OpenAI-compatible providers, default: false) |
vlm.max_retries only applies to transient errors such as 429, 5xx, timeouts, and connection failures. Permanent authentication, authorization, and billing errors are not retried automatically. The backoff strategy is exponential backoff with jitter, starting at 0.5s and capped at 8s.
Available Models
| Model | Notes |
|---|---|
doubao-seed-2-0-pro-260215 | Recommended for semantic extraction |
doubao-pro-32k | For longer context |
When resources are added, VLM generates:
- L0 (Abstract): ~100 token summary
- L1 (Overview): ~2k token overview with navigation
If VLM is not configured, L0/L1 will be generated from content directly (less semantic), and multimodal resources may have limited descriptions.
Supported providers:
volcengine: Volcengine VLM APIopenai: OpenAI-compatible VLM APIopenai-codex: Codex VLM via ChatGPT/Codex OAuthkimi: Kimi Coding subscription endpoint with built-in provider defaultsglm: Z.AI GLM Coding Plan endpoint with OpenAI-compatible requests
For openai-codex, authenticate through openviking-server init, then verify with openviking-server doctor.
Custom HTTP Headers
For OpenAI-compatible providers (e.g., OpenRouter), you can add custom HTTP headers via extra_headers:
{
"vlm": {
"provider": "openai",
"api_key": "your-api-key",
"model": "gpt-4o",
"api_base": "https://openrouter.ai/api/v1",
"extra_headers": {
"HTTP-Referer": "https://your-site.com",
"X-Title": "Your App Name"
}
}
}Common use cases:
- OpenRouter: Requires
HTTP-RefererandX-Titleto identify your application - Kimi Coding: Override or extend the default subscription headers when you need a custom user agent
- Custom proxies: Add authentication or tracing headers
- API gateways: Add version or routing identifiers
Streaming Mode
For OpenAI-compatible providers that return SSE (Server-Sent Events) format responses, enable stream mode:
{
"vlm": {
"provider": "openai",
"api_key": "your-api-key",
"model": "gpt-4o",
"api_base": "https://api.example.com/v1",
"stream": true
}
}Note: The OpenAI SDK requires
stream=trueto properly parse SSE responses. When using providers that force SSE format, you must set this option totrue.
feishu
Configuration for Feishu/Lark cloud document parsing. See Resources for supported URL patterns.
{
"feishu": {
"app_id": "",
"app_secret": "",
"domain": "https://open.feishu.cn",
"max_rows_per_sheet": 1000,
"max_records_per_table": 1000
}
}| Parameter | Type | Description |
|---|---|---|
app_id | str | Feishu app ID (can also be set via FEISHU_APP_ID env var) |
app_secret | str | Feishu app secret (can also be set via FEISHU_APP_SECRET env var) |
domain | str | Feishu API domain. Use https://open.larksuite.com for Lark international |
max_rows_per_sheet | int | Maximum rows to import per spreadsheet sheet (default: 1000) |
max_records_per_table | int | Maximum records to import per bitable table (default: 1000) |
Dependency: pip install 'openviking[bot-feishu]'
Lark international: For Lark URLs (*.larksuite.com), set domain to https://open.larksuite.com.
code
Controls how code files are summarized via code_summary_mode. Both config formats are equivalent:
{
"code": {
"code_summary_mode": "ast"
}
}{
"parsers": {
"code": {
"code_summary_mode": "ast"
}
}
}Set code_summary_mode to one of:
| Value | Description | Default |
|---|---|---|
"ast" | Extract AST skeleton (class names, method signatures, first-line docstrings, imports) for files ≥100 lines, skip LLM calls. Recommended for large-scale code indexing | ✓ |
"llm" | Always use LLM for summarization (higher cost) | |
"ast_llm" | Extract AST skeleton (with full docstrings) first, then pass it as context to LLM (highest quality, moderate cost) |
AST extraction supports: Python, JavaScript/TypeScript, Rust, Go, Java, C/C++. Other languages, extraction failures, or empty skeletons automatically fall back to LLM.
See Code Skeleton Extraction for details.
rerank
Reranking model for search result refinement. Supports VikingDB (Volcengine), Cohere, and OpenAI-compatible APIs.
Volcengine (VikingDB):
{
"rerank": {
"provider": "vikingdb",
"ak": "your-access-key",
"sk": "your-secret-key",
"model_name": "doubao-seed-rerank",
"model_version": "251028"
}
}OpenAI-compatible provider (e.g. DashScope):
{
"rerank": {
"provider": "openai",
"api_key": "your-api-key",
"api_base": "https://dashscope.aliyuncs.com/compatible-api/v1/reranks",
"model": "qwen3-vl-rerank",
"threshold": 0.1
}
}Parameters
| Parameter | Type | Description |
|---|---|---|
provider | str | "vikingdb", "cohere", or "openai". Auto-detected if omitted. |
ak | str | VikingDB Access Key (vikingdb provider only) |
sk | str | VikingDB Secret Key (vikingdb provider only) |
model_name | str | Model name (vikingdb provider only, default: doubao-seed-rerank) |
api_key | str | API key (for openai or cohere providers) |
api_base | str | Endpoint URL (for openai provider) |
model | str | Model name (for openai providers) |
threshold | float | Score threshold between 0.0 and 1.0; results below this are filtered out. Default: 0.1 |
extra_headers | object | Custom HTTP headers (for OpenAI-compatible providers, optional) |
Supported providers:
vikingdb: Volcengine VikingDB Rerank API (uses AK/SK)cohere: Cohere Rerank APIopenai: OpenAI-compatible Rerank API
If rerank is not configured, search uses vector similarity only.
storage
Storage configuration for context data, including file storage (RAGFS) and vector database storage (VectorDB).
Root Configuration
| Parameter | Type | Description | Default |
|---|---|---|---|
workspace | str | Local data storage path (main configuration) | "./data" |
agfs | object | RAGFS (Rust-based AGFS) configuration | {} |
vectordb | object | Vector database storage configuration | {} |
{
"storage": {
"workspace": "./data",
"agfs": {
"backend": "local",
"timeout": 10
},
"vectordb": {
"backend": "local"
}
}
}agfs (RAGFS)
| Parameter | Type | Description | Default |
|---|---|---|---|
backend | str | "local", "s3", or "memory" | "local" |
timeout | float | Request timeout in seconds | 10.0 |
queue_db_path | str (optional) | Override path of the queuefs sqlite database file. Defaults to {storage.workspace}/_system/queue/queue.db when not set. Useful when the workspace volume does not support sqlite (e.g. some network filesystems) | null |
s3 | object | S3 backend configuration (when backend is 's3') | - |
Configuration Examples
RAGFS uses Rust binding mode by default, directly accessing the file system through the Rust implementation.
S3 Backend Configuration
| Parameter | Type | Description | Default |
|---|---|---|---|
bucket | str | S3 bucket name | null |
region | str | AWS region where the bucket is located (e.g., us-east-1, cn-beijing) | null |
access_key | str | S3 access key ID | null |
secret_key | str | S3 secret access key corresponding to the access key ID | null |
endpoint | str | Custom S3 endpoint, required for S3-compatible services like MinIO or LocalStack. Accepts a full URL (https://... or http://...) or a bare hostname; bare hostnames are auto-prefixed with https:// or http:// based on use_ssl | null |
prefix | str | Optional key prefix for namespace isolation | "" |
use_ssl | bool | Enable/disable SSL (HTTPS) for S3 connections. Also controls the scheme auto-prefixed onto bare-hostname endpoint values | true |
use_path_style | bool | true for PathStyle used by MinIO and some S3-compatible services; false for VirtualHostStyle used by TOS and some S3-compatible services | true |
directory_marker_mode | str | How to persist directory markers: none, empty, or nonempty | "empty" |
normalize_encoding_chars | str | Characters to escape in S3 object keys as !HH hexadecimal bytes; empty string disables normalization | "?#%+@" |
directory_marker_mode controls how RAGFS materializes directory objects in S3:
emptyis the default. RAGFS writes a zero-byte directory marker and preserves empty-directory semantics.nonemptywrites a non-empty marker payload. Use this for S3-compatible services such as TOS that reject zero-byte directory markers.noneswitches RAGFS to prefix-style S3 semantics. RAGFS does not create directory marker objects, so empty directories are not persisted and may not be discoverable until they contain at least one child object.
Typical choices:
- For MinIO, SeaweedFS, and most PathStyle backends, keep the default
empty. - For TOS or other VirtualHostStyle backends that reject zero-byte directory markers, use
nonempty. - If you want pure prefix-style behavior and do not need persisted empty directories, use
none.
normalize_encoding_chars controls which characters RAGFS rewrites before issuing S3 requests:
- The default value is
"?#%+@", so only?,#,%,+, and@are escaped. - Escaped bytes are encoded as
!HH, whereHHis the uppercase hexadecimal value of the byte. - Characters not listed in
normalize_encoding_chars, including Chinese and other Unicode characters, remain unchanged. - Set
normalize_encoding_charsto""to keep original path segments in object keys.
PathStyle S3
Supports S3 storage in PathStyle mode, such as MinIO, SeaweedFS.{
"storage": {
"agfs": {
"backend": "s3",
"s3": {
"bucket": "my-bucket",
"endpoint": "s3.amazonaws.com",
"region": "us-east-1",
"access_key": "your-ak",
"secret_key": "your-sk",
"normalize_encoding_chars": "?#%+@"
}
}
}
}VirtualHostStyle S3
Supports S3 storage in VirtualHostStyle mode, such as TOS.{
"storage": {
"agfs": {
"backend": "s3",
"s3": {
"bucket": "my-bucket",
"endpoint": "s3.amazonaws.com",
"region": "us-east-1",
"access_key": "your-ak",
"secret_key": "your-sk",
"use_path_style": false,
"directory_marker_mode": "nonempty",
"normalize_encoding_chars": "?#%+@"
}
}
}
}vectordb
Vector database storage configuration
| Parameter | Type | Description | Default |
|---|---|---|---|
backend | str | VectorDB backend type: 'local' (file-based), 'http' (remote service), 'volcengine' (cloud VikingDB), or 'vikingdb' (private deployment) | "local" |
name | str | VectorDB collection name | "context" |
url | str | Remote service URL for 'http' type (e.g., 'http://localhost:5000') | null |
project_name | str | Project name (alias project) | "default" |
distance_metric | str | Distance metric for vector similarity search (e.g., 'cosine', 'l2', 'ip') | "cosine" |
dimension | int | Vector embedding dimension | 0 |
sparse_weight | float | Sparse weight for hybrid vector search, only effective when using hybrid index | 0.0 |
volcengine | object | 'volcengine' type VikingDB configuration | - |
vikingdb | object | 'vikingdb' type private deployment configuration | - |
Default local mode
{
"storage": {
"vectordb": {
"backend": "local"
}
}
}volcengine vikingDB
Supports cloud-deployed VikingDB on Volcengine{
"storage": {
"vectordb": {
"name": "context",
"backend": "volcengine",
"project": "default",
"volcengine": {
"region": "cn-beijing",
"ak": "your-access-key",
"sk": "your-secret-key"
}
}
}Config Files
OpenViking uses two config files:
| File | Purpose | Default Path |
|---|---|---|
ov.conf | SDK embedded mode + server config | ~/.openviking/ov.conf |
ovcli.conf | HTTP client and CLI connection to remote server | ~/.openviking/ovcli.conf |
When config files are at the default path, OpenViking loads them automatically — no additional setup needed.
If config files are at a different location, there are two ways to specify:
# Option 1: Environment variable
export OPENVIKING_CONFIG_FILE=/path/to/ov.conf
export OPENVIKING_CLI_CONFIG_FILE=/path/to/ovcli.conf
# Option 2: Command-line argument (serve command only)
openviking-server --config /path/to/ov.confov.conf
The config sections documented above (embedding, vlm, rerank, storage) all belong to ov.conf. SDK embedded mode and server share this file.
For memory-related settings, add a memory section in ov.conf:
{
"memory": {
"agent_scope_mode": "user+agent"
}
}| Field | Description | Default |
|---|---|---|
agent_scope_mode | Deprecated and ignored. Kept only for backward compatibility with older ov.conf files. Agent/user namespace behavior is now controlled by per-account namespace policy. | "user+agent" |
agent_scope_mode no longer changes namespace behavior. The server now uses account-level namespace policy to choose between viking://agent/{agent_id}/... and viking://agent/{agent_id}/user/{user_id}/....
ovcli.conf
Config file for the HTTP client (SyncHTTPClient / AsyncHTTPClient) and CLI to connect to a remote server:
{
"url": "http://localhost:1933",
"api_key": "your-secret-key",
"account": "acme",
"user": "alice",
"agent_id": "my-agent",
"output": "table",
"upload": {
"ignore_dirs": "node_modules,.cache,.nx",
"include": "*.md,*.pdf",
"exclude": "*.tmp,*.log"
}
}| Field | Description | Default |
|---|---|---|
url | Server address | (required) |
api_key | API key for authentication (root key or user key) | null (no auth) |
account | Default account sent as X-OpenViking-Account | null |
user | Default user sent as X-OpenViking-User | null |
agent_id | Agent identifier for agent space isolation | null |
output | Default output format: "table" or "json" | "table" |
upload.ignore_dirs | Default directory ignore list for add-resource (CSV) | null |
upload.include | Default include patterns for add-resource (CSV) | null |
upload.exclude | Default exclude patterns for add-resource (CSV) | null |
CLI flags can override these identity fields per command:
openviking --account acme --user alice --agent-id assistant-2 ls viking://For add-resource, upload filter flags are merged additively with ovcli.conf defaults:
# ovcli.conf: upload.exclude="*.log"
openviking add-resource ./docs --exclude "*.tmp"
# effective exclude sent to server: "*.log,*.tmp"See Deployment for details.
server Section
When running OpenViking as an HTTP service, add a server section to ov.conf:
{
"server": {
"host": "127.0.0.1",
"port": 1933,
"auth_mode": "api_key",
"root_api_key": "your-secret-root-key",
"cors_origins": ["*"]
}
}| Field | Type | Description | Default |
|---|---|---|---|
host | str | Bind address | 127.0.0.1 |
port | int | Bind port | 1933 |
auth_mode | str | Authentication mode: "api_key" or "trusted". Default is "api_key" | "api_key" |
root_api_key | str | Root API key for multi-tenant auth in api_key mode. In trusted mode it is optional on localhost, but required for any non-localhost deployment; it does not become the source of user identity | null |
cors_origins | list | Allowed CORS origins | ["*"] |
api_key mode uses API keys and is the default. trusted mode trusts X-OpenViking-Account / X-OpenViking-User headers from a trusted gateway or internal caller.
When root_api_key is configured in api_key mode, the server enables multi-tenant authentication. Use the Admin API to create accounts and user keys. In trusted mode, ordinary requests do not require user registration first; each request is resolved as USER from the injected identity headers. However, skipping root_api_key in trusted mode is allowed only on localhost. Development mode only applies when auth_mode = "api_key" and root_api_key is not set.
For startup and deployment details see Deployment, for authentication see Authentication.
storage.transaction Section
Path locks are enabled by default and usually require no configuration. The default behavior is no-wait: if the target path is already locked by another operation, the operation fails immediately with LockAcquisitionError. Set lock_timeout to a positive value to allow polling/retry.
{
"storage": {
"transaction": {
"lock_timeout": 5.0,
"lock_expire": 300.0
}
}
}| Parameter | Type | Description | Default |
|---|---|---|---|
lock_timeout | float | Path lock acquisition timeout (seconds). 0 = fail immediately if locked (default). > 0 = wait/retry up to this many seconds, then raise LockAcquisitionError. | 0.0 |
lock_expire | float | Lock inactivity threshold (seconds). Locks not refreshed within this window are treated as stale and reclaimed. | 300.0 |
For details on the lock mechanism, see Path Locks and Crash Recovery.
encryption Section
Enable at-rest data encryption to ensure data security and isolation in multi-tenant environments. Encryption is completely transparent to users with no API changes.
{
"encryption": {
"enabled": true,
"provider": "local|vault|volcengine_kms"
}
}| Parameter | Type | Description | Default |
|---|---|---|---|
enabled | bool | Whether encryption is enabled | false |
provider | str | Key provider: "local", "vault", or "volcengine_kms" | - |
api_key_hashing.enabled | bool | Whether to apply Argon2id one-way hashing to API key values (independent of file-level enabled); see Encryption Guide | false |
Local (File)
Suitable for development environments and single-node deployments:
{
"encryption": {
"enabled": true,
"provider": "local",
"local": {
"key_file": "~/.openviking/master.key"
}
}
}| Parameter | Type | Description | Default |
|---|---|---|---|
local.key_file | str | Root key file path | ~/.openviking/master.key |
Vault (HashiCorp Vault)
Suitable for production and multi-cloud deployments:
{
"encryption": {
"enabled": true,
"provider": "vault",
"vault": {
"address": "https://vault.example.com:8200",
"token": "vault-token-xxx",
"mount_point": "transit",
"key_name": "openviking-root"
}
}
}| Parameter | Type | Description | Default |
|---|---|---|---|
vault.address | str | Vault service address | - |
vault.token | str | Vault access token | - |
vault.mount_point | str | Transit engine mount point | "transit" |
vault.key_name | str | Root key name | "openviking-root" |
Volcengine KMS
Suitable for Volcengine cloud deployments:
{
"encryption": {
"enabled": true,
"provider": "volcengine_kms",
"volcengine_kms": {
"key_id": "kms-key-id-xxx",
"region": "cn-beijing",
"access_key": "AKLTxxxxxxxx",
"secret_key": "Tmpxxxxxxxx"
}
}
}| Parameter | Type | Description | Default |
|---|---|---|---|
volcengine_kms.key_id | str | KMS key ID | - |
volcengine_kms.region | str | Region | "cn-beijing" |
volcengine_kms.access_key | str | Volcengine Access Key | - |
volcengine_kms.secret_key | str | Volcengine Secret Key | - |
For detailed encryption explanations, see Data Encryption. For complete usage instructions, see Encryption Guide.
Full Schema
{
"embedding": {
"max_concurrent": 10,
"max_retries": 3,
"dense": {
"provider": "volcengine",
"api_key": "string",
"model": "string",
"dimension": 1024,
"input": "multimodal"
}
},
"vlm": {
"provider": "string",
"api_key": "string",
"model": "string",
"api_base": "string",
"thinking": false,
"max_concurrent": 100,
"max_retries": 3,
"extra_headers": {},
"stream": false
},
"rerank": {
"provider": "volcengine|openai",
"api_key": "string",
"model": "string",
"api_base": "string",
"threshold": 0.1,
"extra_headers": {}
},
"encryption": {
"enabled": false,
"provider": "local|vault|volcengine_kms",
"local": {
"key_file": "~/.openviking/master.key"
},
"vault": {
"address": "https://vault.example.com:8200",
"token": "string",
"mount_point": "transit",
"key_name": "openviking-root"
},
"volcengine_kms": {
"key_id": "string",
"region": "cn-beijing",
"access_key": "string",
"secret_key": "string"
}
},
"storage": {
"workspace": "string",
"agfs": {
"backend": "local|s3|memory",
"timeout": 10
},
"transaction": {
"lock_timeout": 0.0,
"lock_expire": 300.0
},
"vectordb": {
"backend": "local|remote",
"url": "string",
"project": "string"
}
},
"code": {
"code_summary_mode": "ast"
},
"server": {
"host": "127.0.0.1",
"port": 1933,
"root_api_key": "string",
"cors_origins": ["*"]
}
}Notes:
storage.vectordb.sparse_weightcontrols hybrid (dense + sparse) indexing/search. It only takes effect when you use a hybrid index; set it > 0 to enable sparse signals.
Troubleshooting
API Key Error
Error: Invalid API keyCheck your API key is correct and has the required permissions.
Vector Dimension Mismatch
Error: Vector dimension mismatchEnsure the dimension in config matches the model's output dimension.
VLM Timeout
Error: VLM request timeout- Check network connectivity
- Increase timeout in config
- For intermittent timeouts, increase
vlm.max_retriesmoderately - Try a smaller model
- For bulk ingestion, consider lowering
vlm.max_concurrent
Rate Limiting
Error: Rate limit exceededVolcengine has rate limits. Consider batch processing with delays or upgrading your plan.
- Lower
embedding.max_concurrent/vlm.max_concurrentfirst - Keep a small
max_retriesvalue for occasional429s; set it to0if you prefer fail-fast behavior
Related Documentation
- Volcengine Purchase Guide - API key setup
- API Overview - Client initialization
- Server Deployment - Server configuration
- Context Layers - L0/L1/L2
