Choorai

Vector Database

A special-purpose database that converts text, images, etc. into vectors (arrays of numbers), stores them, and performs similarity-based semantic search.

What is this?

Converts text/images into vectors (arrays of numbers) and stores them. Used for similarity search (semantic search).

1. When do I need this?

  • Semantic search - Search documents by meaning, not keywords (searching "puppy" also returns "pet dog" documents)
  • RAG (Retrieval-Augmented Generation) - Provide relevant documents as context to LLMs
  • Recommendation systems - Recommend similar products/content
  • Image search - Reverse image search to find similar images
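All of these use cases rest on the same idea: embeddings are just arrays of numbers, and "closeness in meaning" is measured with a similarity function such as cosine similarity. A minimal sketch in plain Python (the vectors below are made-up toy values, not real embeddings):

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors: 1.0 = same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy 3-dimensional "embeddings" (real models output hundreds of dimensions)
docs = {
    "pet dog": [0.9, 0.8, 0.1],
    "sports car": [0.1, 0.2, 0.9],
}
query = [0.8, 0.9, 0.2]  # pretend this is the embedding of "puppy"

# Rank documents by similarity to the query -- this is semantic search in miniature
ranked = sorted(docs, key=lambda d: cosine_similarity(query, docs[d]), reverse=True)
print(ranked[0])  # "pet dog" ranks above "sports car"
```

A vector database does exactly this ranking, but over millions of vectors using approximate nearest-neighbor indexes instead of a full scan.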

2. Key Services

Pinecone

SaaS, easiest to use

Fully managed vector DB. Simple setup with automatic scaling. Serverless model with pay-as-you-go pricing.

pgvector

PostgreSQL extension

An extension that adds vector search capabilities to existing PostgreSQL. Enables vector search using SQL without a separate service.
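As a rough sketch of the SQL involved (table and column names here are invented for illustration; assumes the extension is available on the server):

```sql
-- Enable the extension (once per database)
CREATE EXTENSION IF NOT EXISTS vector;

-- A table with a 1536-dimension column (text-embedding-3-small's output size)
CREATE TABLE documents (
  id bigserial PRIMARY KEY,
  content text,
  embedding vector(1536)
);

-- Nearest-neighbor search by cosine distance (the <=> operator);
-- the query vector literal is elided here
SELECT content
FROM documents
ORDER BY embedding <=> '[...]'::vector
LIMIT 5;
```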

Qdrant

Open source

High-performance vector DB written in Rust. Can be self-hosted or used as a cloud service. Powerful filtering capabilities.

ChromaDB

For local development

Python-native vector DB. Runs locally out of the box, making it ideal for prototyping and development.

3. Pricing

  • Pinecone - Free tier: 100K vectors (serverless), 1 index
  • pgvector - Included with PostgreSQL (free, no additional cost)
  • Qdrant Cloud - Free tier: 1GB storage
  • ChromaDB - Open source, free (runs locally)

Watch out for embedding API costs

While the vector DB itself is free or inexpensive, the embedding API (e.g., OpenAI) for converting text to vectors incurs separate costs. OpenAI text-embedding-3-small costs approximately $0.02/1M tokens.
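At that rate the cost is easy to estimate up front. For example, embedding 10,000 documents averaging ~500 tokens each (hypothetical numbers):

```python
# text-embedding-3-small pricing: ~$0.02 per 1M tokens
PRICE_PER_MILLION_TOKENS = 0.02

docs = 10_000
tokens_per_doc = 500  # rough average; measure your own corpus
total_tokens = docs * tokens_per_doc  # 5,000,000 tokens

cost = total_tokens / 1_000_000 * PRICE_PER_MILLION_TOKENS
print(f"${cost:.2f}")  # about $0.10 for the whole corpus
```

Note that re-indexing the corpus (e.g. after switching embedding models) incurs the full cost again, while query-time embedding costs scale with traffic.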

4. Connection Examples

FastAPI + pinecone-client (Python)

app/vector_db.py
# app/vector_db.py
from pinecone import Pinecone

pc = Pinecone(api_key="your-api-key")
index = pc.Index("my-index")
app/main.py
# app/main.py
from fastapi import FastAPI
from app.vector_db import index
import openai

app = FastAPI()
openai_client = openai.OpenAI()

def get_embedding(text: str) -> list[float]:
    """Convert text to a vector"""
    response = openai_client.embeddings.create(
        model="text-embedding-3-small",
        input=text
    )
    return response.data[0].embedding

@app.post("/documents")
async def upsert_document(doc_id: str, text: str):
    vector = get_embedding(text)
    index.upsert(vectors=[{
        "id": doc_id,
        "values": vector,
        "metadata": {"text": text}
    }])
    return {"id": doc_id, "status": "indexed"}

@app.get("/search")
async def search(query: str, top_k: int = 5):
    query_vector = get_embedding(query)
    results = index.query(
        vector=query_vector,
        top_k=top_k,
        include_metadata=True
    )
    return [
        {"id": m.id, "score": m.score, "text": m.metadata["text"]}
        for m in results.matches
    ]

Hono + @pinecone-database/pinecone (TypeScript)

src/vector.ts
// src/vector.ts
import { Pinecone } from '@pinecone-database/pinecone';

const pc = new Pinecone({ apiKey: 'your-api-key' });
const index = pc.index('my-index');

export default index;
src/index.ts
// src/index.ts
import { Hono } from 'hono';
import index from './vector';
import OpenAI from 'openai';

const app = new Hono();
const openai = new OpenAI();

async function getEmbedding(text: string): Promise<number[]> {
  const response = await openai.embeddings.create({
    model: 'text-embedding-3-small',
    input: text,
  });
  return response.data[0].embedding;
}

app.post('/documents', async (c) => {
  const { id, text } = await c.req.json();
  const vector = await getEmbedding(text);
  await index.upsert([{
    id,
    values: vector,
    metadata: { text },
  }]);
  return c.json({ id, status: 'indexed' }, 201);
});

app.get('/search', async (c) => {
  const query = c.req.query('q') || '';
  const vector = await getEmbedding(query);
  const results = await index.query({
    vector,
    topK: 5,
    includeMetadata: true,
  });
  return c.json(
    results.matches?.map((m) => ({
      id: m.id,
      score: m.score,
      text: m.metadata?.text,
    })) || []
  );
});

export default app;

Choosing an embedding model

OpenAI text-embedding-3-small offers the best value. If you prefer open-source models, you can generate embeddings locally with sentence-transformers (Python).

Last updated: February 22, 2026 · Version: v0.0.1
