API Reference
Two endpoints for discovering embedding models and generating vectors. Both require a Bearer token.
Authentication
Both endpoints require an Authorization header with a Bearer token. Generate keys from the dashboard.
Authorization: Bearer $API_KEY
/v1/models
Returns all available embedding models with their capabilities and limits.
curl https://api.vectors.space/v1/models \
-H "Authorization: Bearer $API_KEY"
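The same call can be sketched in Python with only the standard library. The function names build_models_request and list_models are illustrative, not part of any official client, and the response body is returned undecoded beyond JSON because its envelope is not specified here.

```python
import json
import urllib.request

BASE_URL = "https://api.vectors.space"  # host taken from the curl example


def build_models_request(api_key: str) -> urllib.request.Request:
    """Build the GET /v1/models request with the Bearer token attached."""
    return urllib.request.Request(
        f"{BASE_URL}/v1/models",
        headers={"Authorization": f"Bearer {api_key}"},
        method="GET",
    )


def list_models(api_key: str):
    """Send the request and return the decoded JSON body as-is.

    Assumption: the endpoint answers with JSON. The exact envelope is not
    documented in this section, so nothing is unpacked here.
    """
    with urllib.request.urlopen(build_models_request(api_key)) as resp:
        return json.load(resp)
```

Calling list_models(os.environ["API_KEY"]) would perform the request; building the request separately keeps the header logic easy to inspect without network access.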
Response fields:
id                 string
type               string
provider           string
embedding_dim      number
max_input_tokens   number
max_batch_inputs   number
max_batch_tokens   number
/v1/embeddings
Generate embeddings for one or more texts. Pass a string or an array of strings to input.
curl -X POST https://api.vectors.space/v1/embeddings \
-H "Authorization: Bearer $API_KEY" \
-H "Content-Type: application/json" \
-d '{
"model": "embeddinggemma-300m",
"provider": "llama",
"input": "text to embed",
"content_type": "text",
"output_dimension": 768,
"strategy": {
"type": "fail",
"max_tokens": 2048
}
}'
Request body:
model              string             required
provider           string
input              string | string[]  required
strategy           object
content_type       string
output_dimension   number
Response fields:
data[].embedding               number[]
data[].index                   number
provider                       string
output_dimension               number
strategy                       object
content_type                   string
usage.prompt_tokens            number
usage.total_tokens             number
usage.estimated_prompt_tokens  number
Strategies
The strategy object controls what happens when an input exceeds max_tokens. If max_tokens is omitted, the model's max_input_tokens is used. If set, max_tokens must be <= the model's max_input_tokens.
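The max_tokens rule can be checked client-side before a request is sent. The sketch below is one way to do that; build_strategy is a hypothetical helper, and it assumes model_max_input_tokens was read from the max_input_tokens field of /v1/models. It only knows the three strategy types documented on this page.

```python
from typing import Optional

KNOWN_TYPES = ("fail", "truncate", "chunk")  # types documented on this page


def build_strategy(kind: str, model_max_input_tokens: int,
                   max_tokens: Optional[int] = None) -> dict:
    """Return a strategy object, validating max_tokens against the model limit."""
    if kind not in KNOWN_TYPES:
        raise ValueError(f"unknown strategy type: {kind!r}")
    if max_tokens is None:
        # Leave the field out; the service then falls back to the
        # model's max_input_tokens.
        return {"type": kind}
    if not 0 < max_tokens <= model_max_input_tokens:
        raise ValueError(
            f"max_tokens must be between 1 and {model_max_input_tokens}"
        )
    return {"type": kind, "max_tokens": max_tokens}
```

The returned dict can be placed under the strategy key of the request body.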
fail
Returns an error if any input exceeds max_tokens. Use this for strict pipelines where oversized inputs should never be silently altered.
curl -sS https://api.vectors.space/v1/embeddings \
-H "Authorization: Bearer $API_KEY" \
-H "Content-Type: application/json" \
-d '{
"model": "embeddinggemma-300m",
"provider": "llama",
"input": "very long text...",
"strategy": { "type": "fail", "max_tokens": 2048 }
}'
truncate
Trims the input to fit within max_tokens. Accepts a string or array. Each response item includes truncation metadata.
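A short sketch of how a client might surface truncated inputs. It assumes the truncated, original_chars, and used_chars fields sit directly on each data[] item next to index; the sample response is made up for illustration and is not real service output.

```python
def truncated_items(response: dict) -> list:
    """Return (index, original_chars, used_chars) for every truncated item."""
    report = []
    for item in response.get("data", []):
        if item.get("truncated"):
            report.append(
                (item["index"], item["original_chars"], item["used_chars"])
            )
    return report


# Hypothetical response with illustrative values only.
sample_response = {
    "data": [
        {"index": 0, "embedding": [0.1, 0.2], "truncated": False,
         "original_chars": 120, "used_chars": 120},
        {"index": 1, "embedding": [0.3, 0.4], "truncated": True,
         "original_chars": 9000, "used_chars": 8192},
    ]
}

for index, original, used in truncated_items(sample_response):
    print(f"input {index}: kept {used} of {original} characters")
```

Logging this report is one way to notice when a pipeline is silently losing text to truncation.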
curl -sS https://api.vectors.space/v1/embeddings \
-H "Authorization: Bearer $API_KEY" \
-H "Content-Type: application/json" \
-d '{
"model": "embeddinggemma-300m",
"provider": "llama",
"input": ["first long input", "second long input"],
"strategy": { "type": "truncate", "max_tokens": 2048 }
}'
Response item fields:
truncated       boolean
original_chars  number
used_chars      number
chunk
Splits each input into overlapping chunks. chunk_overlap sets the overlap between adjacent chunks, measured in estimated tokens. Use pooling to control the output shape.
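To build intuition for how max_tokens and chunk_overlap interact, here is a generic sliding-window sketch. It is not the service's implementation: the real boundaries are based on token estimates that this page does not specify. chunk_spans is a hypothetical helper that works on token positions only.

```python
def chunk_spans(n_tokens: int, max_tokens: int, chunk_overlap: int) -> list:
    """Return (start, end) token spans, end exclusive, for a sliding window.

    Each window holds at most max_tokens tokens and shares chunk_overlap
    tokens with the previous one.
    """
    if max_tokens <= 0:
        raise ValueError("max_tokens must be positive")
    if not 0 <= chunk_overlap < max_tokens:
        raise ValueError("chunk_overlap must be >= 0 and < max_tokens")
    stride = max_tokens - chunk_overlap  # how far each window advances
    spans = []
    start = 0
    while start < n_tokens:
        end = min(start + max_tokens, n_tokens)
        spans.append((start, end))
        if end == n_tokens:
            break  # the last window reached the end of the input
        start += stride
    return spans


# A 2500-token input with max_tokens=1024 and chunk_overlap=128.
print(chunk_spans(2500, 1024, 128))
```

Under these assumptions a larger chunk_overlap means a smaller stride, so the same input yields more chunks.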
Strategy parameters:
max_tokens     number
chunk_overlap  number
pooling        "none" | "mean"
With pooling set to "none", the response contains one embedding per chunk with source-mapping fields.
curl -sS https://api.vectors.space/v1/embeddings \
-H "Authorization: Bearer $API_KEY" \
-H "Content-Type: application/json" \
-d '{
"model": "embeddinggemma-300m",
"provider": "llama",
"input": ["first input", "second input"],
"strategy": {
"type": "chunk",
"max_tokens": 1024,
"chunk_overlap": 128,
"pooling": "none"
}
}'
Response item fields:
input_index  number
chunk_index  number
chunk_start  number
chunk_end    number
Output order follows input order, then chunk order. Offsets are rune (Unicode code point) indexes into the original string.
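The offsets make it possible to map each embedding back to its source text. The sketch below groups items by input_index and slices out the chunk text. Python strings index by code point, which lines up with rune offsets. It assumes chunk_end is exclusive, which this page does not state; chunks_by_input and the sample items are illustrative only.

```python
from collections import defaultdict


def chunks_by_input(inputs: list, items: list) -> dict:
    """Map input_index -> list of (chunk_index, chunk_text)."""
    grouped = defaultdict(list)
    for item in items:
        text = inputs[item["input_index"]]
        # Assumption: chunk_start is inclusive and chunk_end is exclusive.
        chunk_text = text[item["chunk_start"]:item["chunk_end"]]
        grouped[item["input_index"]].append((item["chunk_index"], chunk_text))
    return dict(grouped)


# Hypothetical inputs and response items; embeddings omitted for brevity.
sample_inputs = ["h\u00e9llo w\u00f6rld", "abc"]
sample_items = [
    {"input_index": 0, "chunk_index": 0, "chunk_start": 0, "chunk_end": 5},
    {"input_index": 0, "chunk_index": 1, "chunk_start": 6, "chunk_end": 11},
    {"input_index": 1, "chunk_index": 0, "chunk_start": 0, "chunk_end": 3},
]

print(chunks_by_input(sample_inputs, sample_items))
```

In languages whose strings index by byte or UTF-16 unit, the offsets would need converting before slicing.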
With pooling set to "mean", the response contains one embedding per input, computed by averaging all chunk vectors. Each item includes a chunks array with per-chunk metadata.
curl -sS https://api.vectors.space/v1/embeddings \
-H "Authorization: Bearer $API_KEY" \
-H "Content-Type: application/json" \
-d '{
"model": "embeddinggemma-300m",
"provider": "llama",
"input": ["first input", "second input"],
"strategy": {
"type": "chunk",
"max_tokens": 1024,
"chunk_overlap": 128,
"pooling": "mean"
}
}'
Response item fields:
input_index                number
chunk_count                number
chunks[].chunk_index       number
chunks[].chunk_start       number
chunks[].chunk_end         number
chunks[].estimated_tokens  number
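For reference, averaging chunk vectors can be sketched as follows. This assumes a plain unweighted arithmetic mean with no re-normalization; the page only says the vectors are averaged, so the service may differ in detail. mean_pool is a hypothetical helper.

```python
def mean_pool(chunk_vectors: list) -> list:
    """Average equal-length vectors element-wise (unweighted mean)."""
    if not chunk_vectors:
        raise ValueError("need at least one chunk vector")
    dim = len(chunk_vectors[0])
    if any(len(vec) != dim for vec in chunk_vectors):
        raise ValueError("all chunk vectors must have the same dimension")
    count = len(chunk_vectors)
    return [sum(vec[i] for vec in chunk_vectors) / count for i in range(dim)]


# Two 2-dimensional chunk vectors pooled into one.
print(mean_pool([[1.0, 2.0], [3.0, 4.0]]))
```

The same helper could be applied to a pooling "none" response to pool client-side, for example after filtering out some chunks.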