Vector Database API - GDScript SDK
Vector database operations for semantic search, RAG (Retrieval-Augmented Generation), and AI applications.
Note: Vector operations are currently implemented using sqlite-vec but are designed with abstraction in mind to support future vector database providers.
Overview
The Vector API provides a unified interface for working with vector embeddings, enabling you to:
- Store and search vector embeddings
- Perform similarity search
- Build RAG applications
- Create recommendation systems
- Enable semantic search capabilities
Getting Started
var BosBase = preload("res://gdscript-sdk/src/bosbase.gd")
var pb = BosBase.new("http://localhost:8090")
# Authenticate as superuser (vectors require superuser auth)
var auth = await pb.admins().auth_with_password("admin@example.com", "password")
if auth is ClientResponseError:
    push_error("Authentication failed: " + auth.to_string())
    return
Types
VectorEmbedding
Array of numbers representing a vector embedding.
# Vector embedding is an array of floats
var embedding: Array[float] = [0.1, 0.2, 0.3, 0.4]
VectorDocument
A vector document with embedding, metadata, and optional content.
{
"id": String, # Unique identifier (optional, auto-generated if not provided)
"vector": Array[float], # The vector embedding
"metadata": Dictionary, # Optional metadata (key-value pairs)
"content": String # Optional content
}
VectorSearchOptions
Options for vector similarity search.
{
"queryVector": Array[float], # Query vector to search for
"limit": int, # Max results (default: 10, max: 100)
"filter": Dictionary, # Optional metadata filter
"minScore": float, # Minimum similarity score threshold
"maxDistance": float, # Maximum distance threshold
"includeDistance": bool, # Include distance in results
"includeContent": bool # Include content in results
}
VectorSearchResult
Result from a similarity search.
{
"document": Dictionary, # The matching document
"score": float, # Similarity score (0-1, higher is better)
"distance": float # Distance metric (optional)
}
Collection Management
Create Collection
Create a new vector collection with specified dimension and distance metric.
var result = await pb.vectors.create_collection("documents", {
    "dimension": 384,     # Vector dimension (default: 384)
    "distance": "cosine"  # Distance metric: "cosine" (default), "l2", "dot"
})
# Minimal example (uses defaults)
var result2 = await pb.vectors.create_collection("documents")
Parameters:
- name (string): Collection name
- config (dictionary, optional):
  - dimension (int, optional): Vector dimension. Default: 384
  - distance (string, optional): Distance metric. Default: “cosine”. Options: “cosine”, “l2”, “dot”
List Collections
Get all available vector collections.
var collections = await pb.vectors.list_collections()
if collections is ClientResponseError:
    push_error("Failed to list collections: " + collections.to_string())
    return
for collection in collections:
    print("%s: %d vectors" % [collection.name, collection.get("count", 0)])
Response:
Array[{
"name": String,
"count": int, # Optional
"dimension": int # Optional
}]
Update Collection
Update a vector collection configuration (distance metric and options). Note: Collection name and dimension cannot be changed after creation.
await pb.vectors.update_collection("documents", {
    "distance": "l2"
})
# Update with options
await pb.vectors.update_collection("documents", {
    "distance": "inner_product",
    "options": {"customOption": "value"}
})
Parameters:
- name (string): Collection name
- config (dictionary, optional):
  - distance (string, optional): Distance metric to update. Options: “cosine”, “l2”, “inner_product”
  - options (dictionary, optional): Custom collection options
Delete Collection
Delete a vector collection and all its data.
var result = await pb.vectors.delete_collection("documents")
if result is ClientResponseError:
    push_error("Failed to delete collection: " + result.to_string())
⚠️ Warning: This permanently deletes the collection and all vectors in it!
Document Operations
Insert Document
Insert a single vector document.
# With custom ID
var result = await pb.vectors.insert({
    "id": "doc_001",
    "vector": [0.1, 0.2, 0.3, 0.4],
    "metadata": {"category": "tech", "tags": ["AI", "ML"]},
    "content": "Document about machine learning"
}, {"collection": "documents"})
if result is ClientResponseError:
    push_error("Failed to insert document: " + result.to_string())
    return
print("Inserted: ", result.id)
# Without ID (auto-generated)
var result2 = await pb.vectors.insert({
    "vector": [0.5, 0.6, 0.7, 0.8],
    "content": "Another document"
}, {"collection": "documents"})
Response:
{
"id": String, # The document ID
"success": bool
}
Batch Insert
Insert multiple vector documents efficiently.
var result = await pb.vectors.batch_insert({
    "documents": [
        {"vector": [0.1, 0.2, 0.3], "metadata": {"cat": "A"}, "content": "Doc A"},
        {"vector": [0.4, 0.5, 0.6], "metadata": {"cat": "B"}, "content": "Doc B"},
        {"vector": [0.7, 0.8, 0.9], "metadata": {"cat": "C"}, "content": "Doc C"},
    ],
    "skipDuplicates": true  # Skip documents with duplicate IDs
}, {"collection": "documents"})
if result is ClientResponseError:
    push_error("Failed to batch insert: " + result.to_string())
    return
print("Inserted: ", result.insertedCount)
print("Failed: ", result.failedCount)
print("IDs: ", result.ids)
Response:
{
"insertedCount": int, # Number of successfully inserted vectors
"failedCount": int, # Number of failed insertions
"ids": Array[String], # List of inserted document IDs
"errors": Array # Optional error details
}
Get Document
Retrieve a vector document by ID.
var doc = await pb.vectors.get("doc_001", {"collection": "documents"})
if doc is ClientResponseError:
    push_error("Failed to get document: " + doc.to_string())
    return
print("Vector: ", doc.vector)
print("Content: ", doc.content)
print("Metadata: ", doc.metadata)
Update Document
Update an existing vector document.
# Update all fields
await pb.vectors.update("doc_001", {
    "vector": [0.9, 0.8, 0.7, 0.6],
    "metadata": {"updated": true},
    "content": "Updated content"
}, {"collection": "documents"})
# Partial update (only metadata and content)
await pb.vectors.update("doc_001", {
    "metadata": {"category": "updated"},
    "content": "New content"
}, {"collection": "documents"})
Delete Document
Delete a vector document.
var result = await pb.vectors.delete("doc_001", {"collection": "documents"})
if result is ClientResponseError:
    push_error("Failed to delete document: " + result.to_string())
List Documents
List all documents in a collection with pagination.
# Get first page
var result = await pb.vectors.list({
    "page": 1,
    "perPage": 50
}, {"collection": "documents"})
if result is ClientResponseError:
    push_error("Failed to list documents: " + result.to_string())
    return
print("Page %d of %d" % [result.page, result.totalPages])
for item in result.items:
    # Separate the ID from the content so the output stays readable
    print("%s: %s" % [item.id, item.content])
Response:
{
"page": int,
"perPage": int,
"totalItems": int,
"totalPages": int,
"items": Array[VectorDocument]
}
Vector Search
Basic Search
Perform similarity search on vectors.
var results = await pb.vectors.search({
    "queryVector": [0.1, 0.2, 0.3, 0.4],
    "limit": 10
}, {"collection": "documents"})
if results is ClientResponseError:
    push_error("Search failed: " + results.to_string())
    return
for result in results.results:
    print("Score: %.2f - %s" % [result.score, result.document.content])
Advanced Search
var results = await pb.vectors.search({
    "queryVector": [0.1, 0.2, 0.3, 0.4],
    "limit": 20,
    "minScore": 0.7,                # Minimum similarity threshold
    "maxDistance": 0.3,             # Maximum distance threshold
    "includeDistance": true,        # Include distance metric
    "includeContent": true,         # Include full content
    "filter": {"category": "tech"}  # Filter by metadata
}, {"collection": "documents"})
if results is ClientResponseError:
    push_error("Search failed: " + results.to_string())
    return
print("Found %d matches in %dms" % [results.totalMatches, results.queryTime])
for r in results.results:
    print("Score: %.2f, Distance: %.2f" % [r.score, r.distance])
    print("Content: ", r.document.content)
Response:
{
"results": Array[VectorSearchResult],
"totalMatches": int, # Optional
"queryTime": int # Optional (milliseconds)
}
Common Use Cases
Semantic Search
# 1. Generate embeddings for your documents
var documents = [
    {"text": "Introduction to machine learning", "id": "doc1"},
    {"text": "Deep learning fundamentals", "id": "doc2"},
    {"text": "Natural language processing", "id": "doc3"},
]
for doc in documents:
    # Generate embedding using your model
    # (generate_embedding is a function you provide; it is not part of the SDK)
    var embedding = await generate_embedding(doc.text)
    await pb.vectors.insert({
        "id": doc.id,
        "vector": embedding,
        "content": doc.text,
        "metadata": {"type": "article"}
    }, {"collection": "documents"})
# 2. Search
var query_embedding = await generate_embedding("What is AI?")
var results = await pb.vectors.search({
    "queryVector": query_embedding,
    "limit": 5,
    "minScore": 0.7
}, {"collection": "documents"})
if not results is ClientResponseError:
    for r in results.results:
        print("%.2f: %s" % [r.score, r.document.content])
RAG (Retrieval-Augmented Generation)
func retrieve_context(query: String, limit: int = 5) -> Array[String]:
    var query_embedding = await generate_embedding(query)
    var results = await pb.vectors.search({
        "queryVector": query_embedding,
        "limit": limit,
        "minScore": 0.75,
        "includeContent": true
    }, {"collection": "documents"})
    if results is ClientResponseError:
        push_error("Failed to search: " + results.to_string())
        return []
    # Typed to match the declared return type
    var context: Array[String] = []
    for r in results.results:
        context.append(r.document.content)
    return context
# Use with your LLM
var context = await retrieve_context("What are best practices for security?")
# Build prompt with context and generate answer
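The prompt-building step mentioned above could be sketched as follows. `build_prompt` and its prompt wording are hypothetical and not part of the SDK; adapt both to whatever your LLM expects.

```gdscript
# Hypothetical helper (not part of the SDK): joins retrieved passages
# into a single prompt string, numbering each passage.
func build_prompt(question: String, context: Array[String]) -> String:
    var numbered := PackedStringArray()
    for i in range(context.size()):
        numbered.append("[%d] %s" % [i + 1, context[i]])
    return "Answer using only the context below.\n\nContext:\n%s\n\nQuestion: %s" % [
        "\n".join(numbered),
        question,
    ]
```

Pass the array returned by `retrieve_context` as `context`, then send the resulting string to your model.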
Best Practices
Vector Dimensions
Choose the right dimension for your use case:
- OpenAI embeddings: 1536 (text-embedding-3-small) or 3072 (text-embedding-3-large)
- Sentence Transformers: 384-768
  - all-MiniLM-L6-v2: 384
  - all-mpnet-base-v2: 768
- Custom models: Match your model’s output
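A mismatch between the model's output size and the collection's dimension is easy to introduce. A minimal sketch of a client-side check before inserting, assuming a hypothetical helper named `has_dimension` (not part of the SDK):

```gdscript
# Hypothetical helper (not part of the SDK): returns true when the embedding
# has exactly the expected number of numeric components.
func has_dimension(embedding: Array, expected_dimension: int) -> bool:
    if embedding.size() != expected_dimension:
        return false
    for value in embedding:
        # Accept ints as well as floats, since decoded JSON may yield either.
        if typeof(value) != TYPE_FLOAT and typeof(value) != TYPE_INT:
            return false
    return true
```

For a collection created with the default dimension, you might call `has_dimension(embedding, 384)` and skip or log any embedding that fails the check.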
Distance Metrics
| Metric | Best For | Notes |
|---|---|---|
| cosine | Text embeddings | Works well with normalized vectors |
| l2 | General similarity | Euclidean distance |
| dot | Performance | Requires normalized vectors |
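Since the dot metric expects normalized vectors, you may want to scale embeddings to unit length before inserting them. The sketch below is plain GDScript with hypothetical helper names (`normalize`, `dot_product`); neither is part of the SDK.

```gdscript
# Scales a vector to unit length. A zero-length vector is returned unchanged.
func normalize(v: Array[float]) -> Array[float]:
    var sum_sq := 0.0
    for x in v:
        sum_sq += x * x
    var length := sqrt(sum_sq)
    var out: Array[float] = []
    for x in v:
        out.append(x / length if length > 0.0 else x)
    return out

# Dot product over the shared length of both vectors.
func dot_product(a: Array[float], b: Array[float]) -> float:
    var total := 0.0
    for i in range(mini(a.size(), b.size())):
        total += a[i] * b[i]
    return total
```

For unit-length vectors the dot product equals the cosine similarity, which is why normalizing first makes the two metrics rank results the same way.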
Performance Tips
- Use batch insert for multiple vectors
- Set appropriate limits to avoid excessive results
- Use metadata filtering to narrow search space
- Enable indexes (automatic with sqlite-vec)
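To apply the batch-insert tip to a large document set, one option is to split the documents into fixed-size batches first. `chunk` is a hypothetical helper, and the batch size of 100 in the usage note is an assumption rather than a documented limit.

```gdscript
# Hypothetical helper (not part of the SDK): splits items into batches
# of at most batch_size elements. Returns an empty array for batch_size <= 0.
func chunk(items: Array, batch_size: int) -> Array:
    var batches: Array = []
    if batch_size <= 0:
        return batches
    for start in range(0, items.size(), batch_size):
        var end := mini(start + batch_size, items.size())
        batches.append(items.slice(start, end))
    return batches
```

Each batch can then be passed as the `"documents"` value of `pb.vectors.batch_insert`, for example by looping over `chunk(all_documents, 100)`.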
Security
- All vector endpoints require superuser authentication
- Never expose credentials in client-side code
- Use environment variables for sensitive data
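One way to keep credentials out of the script is to read them with Godot's `OS.get_environment`. The variable names `BOSBASE_ADMIN_EMAIL` and `BOSBASE_ADMIN_PASSWORD` below are hypothetical; use whatever names your deployment defines.

```gdscript
# Reads superuser credentials from environment variables instead of hard-coding them.
# The variable names are illustrative assumptions, not something the SDK defines.
func load_credentials() -> Dictionary:
    var email := OS.get_environment("BOSBASE_ADMIN_EMAIL")
    var password := OS.get_environment("BOSBASE_ADMIN_PASSWORD")
    # OS.get_environment returns an empty string when the variable is not set.
    if email.is_empty() or password.is_empty():
        push_error("Missing BOSBASE_ADMIN_EMAIL or BOSBASE_ADMIN_PASSWORD")
        return {}
    return {"email": email, "password": password}
```

The returned values can be passed to `auth_with_password` in place of the literal strings shown in Getting Started.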
Error Handling
var results = await pb.vectors.search({
    "queryVector": [0.1, 0.2, 0.3]
}, {"collection": "documents"})
if results is ClientResponseError:
    match results.status:
        404:
            push_error("Collection not found")
        400:
            push_error("Invalid request: ", results.data)
        _:
            push_error("Error: " + results.to_string())