How to Choose Your Vector Database
In our previous posts, we've built RAG pipelines from scratch (see our introduction to RAG and vector databases guide). We treated the "vector database" as a simple box. We'd just dump our embeddings in and pull them out.
But as you move to production, you'll find that this "box" is the most critical component of your RAG system's performance, cost, and scalability.
This post is for you if you're stuck in "analysis paralysis" choosing your database. You've heard of Chroma, Qdrant, Weaviate, pgvector, Pinecone, and Vespa. They all store vectors. How are they different, and how do you choose?
Today, we'll demystify these platforms from an engineer's perspective. This isn't just about features; it's about philosophy, performance, and developer experience (DX) for your specific use case.
The Core Problem: Not All Vector Search is Equal
A vector database has two jobs, and they are often in conflict:
- Write Performance: How fast can it "ingest" (embed and index) millions of documents?
- Read Performance: How fast and accurately can it find the "Top-K" (e.g., top 5) most relevant chunks for a user's query?
A database that's great at ingesting data quickly might be slower at searching. A database that finds the perfect answer might use too much memory.
Your choice depends on what your application needs most.
```mermaid
graph TD
    A["Your Data (1M+ Docs)"] --> B["Vector Database"]
    B -- "Write Path (Ingestion)" --> C["1. How fast?<br/>2. How much RAM/Disk?"]
    D["User Query"] --> B
    B -- "Read Path (Search)" --> E["1. How fast? (latency)<br/>2. How accurate? (recall)"]
    style B fill:#e3f2fd,stroke:#0d47a1
```
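To make "recall" concrete: treat the exact top-5 neighbors from a brute-force scan as ground truth, then recall@5 is the fraction of those five that your (approximate) index actually returns. Here's a minimal sketch with numpy and toy random vectors; the "approximate" result is faked for illustration, since in practice it would come from your database's ANN query.

```python
import numpy as np

def exact_top_k(query: np.ndarray, vectors: np.ndarray, k: int) -> set:
    # Brute-force cosine similarity: the ground truth for recall.
    sims = vectors @ query / (np.linalg.norm(vectors, axis=1) * np.linalg.norm(query))
    return set(np.argsort(-sims)[:k])

def recall_at_k(approx_ids: set, exact_ids: set) -> float:
    # Fraction of the true top-k that the index actually returned.
    return len(approx_ids & exact_ids) / len(exact_ids)

rng = np.random.default_rng(0)
vectors = rng.normal(size=(10_000, 384))  # toy corpus
query = rng.normal(size=384)

truth = exact_top_k(query, vectors, k=5)
# Fake an ANN result that misses one true neighbor (10_000 is not a valid id),
# as if the index traded a little accuracy for speed.
approx = set(sorted(truth)[:4]) | {10_000}
print(f"recall@5 = {recall_at_k(approx, truth):.2f}")  # recall@5 = 0.80
```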
Let's compare the top players, grouped by their core philosophy.
Category 1: The "Easy Start" Library
1. ChromaDB
Chroma is the "SQLite" of vector databases. It's often the first one engineers use, and for good reason.
- Philosophy: "Get started in 30 seconds. No servers, no setup."
- Developer Experience (DX): Fantastic. You `pip install chromadb` and it just works, running in-memory or saving to disk in your project folder.
- Best For: Prototyping, development, and small-to-medium Python-native apps.
The "How": It feels like a Python dictionary
The code for Chroma is simple and intuitive. You just create a "collection".
```python
import chromadb

# 1. Create a client. This one just runs in memory.
client = chromadb.Client()

# 2. Create a "collection"
collection = client.get_or_create_collection(name="my_docs")

# 3. Add documents (Chroma embeds them with its default model)
collection.add(
    documents=["This is a doc about dogs.", "This is a doc about cats."],
    metadatas=[{"source": "doc1"}, {"source": "doc2"}],
    ids=["1", "2"]  # IDs are required
)

# 4. Query it
results = collection.query(
    query_texts=["What is a pet?"],
    n_results=1
)
# results = {'documents': [['This is a doc about cats.']]}
```
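One note on the "saving to disk" mode: the in-memory client above loses everything when the process exits. Chroma's persistent client is a one-line swap (the `./chroma_db` path here is just an example):

```python
# Same API as before, but the collection survives restarts on disk.
client = chromadb.PersistentClient(path="./chroma_db")
```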
Observation: It's fast, simple, and "just works". Its primary "weakness" is that it wasn't built for massive, high-throughput production. When your app needs to handle 1,000 queries per second, you'll outgrow it and need a true server.
Category 2: The "Production-First" Standalone Server
These are true, dedicated database servers built for performance, reliability, and scale.
2. Qdrant
Qdrant (pronounced "Quadrant") is the "Rust-powered" performance beast.
- Philosophy: "Vector search should be fast, reliable, and memory-efficient. Period."
- Developer Experience (DX): Excellent. You run Qdrant as a separate Docker container (the server) and your Python app talks to it (the client). It's built in Rust, giving engineers confidence in its performance.
- Best For: High-throughput apps and especially fast, filtered search (e.g., "find vectors WHERE `color` = 'blue'"; see the filtered-search sketch below).
The "How": A Client-Server Model
Your code explicitly connects to a separate server (e.g., `localhost:6333`).
```mermaid
graph TD
    A["Your Python App (Client)"] -- "HTTP/gRPC" --> B["Qdrant Server<br/>(Docker Container)"]
    B -- "manages" --> C["Vector Index<br/>(on-disk or in-memory)"]
```
```python
from qdrant_client import QdrantClient, models

# Assumed for this sketch: `embeddings` is a list of 384-dim vectors and
# `my_query_embedding` is a single 384-dim vector from your embedding model.

# 1. Create a client that connects to the server
client = QdrantClient(host="localhost", port=6333)

# 2. Create the collection *on the server*
client.recreate_collection(
    collection_name="my_docs",
    vectors_config=models.VectorParams(size=384, distance=models.Distance.COSINE)
)

# 3. Add documents (points)
client.upsert(
    collection_name="my_docs",
    points=[
        models.PointStruct(id=1, vector=embeddings[0], payload={"source": "doc1"}),
        models.PointStruct(id=2, vector=embeddings[1], payload={"source": "doc2"})
    ]
)

# 4. Query it
hits = client.search(
    collection_name="my_docs",
    query_vector=my_query_embedding,
    limit=1
)
```
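Since filtered search is Qdrant's headline strength, it's worth showing. A minimal sketch of the `WHERE color = 'blue'` example from above, assuming points were upserted with a hypothetical `color` field in their payload:

```python
# Search only among points whose payload has color == "blue".
hits = client.search(
    collection_name="my_docs",
    query_vector=my_query_embedding,
    query_filter=models.Filter(
        must=[
            models.FieldCondition(key="color", match=models.MatchValue(value="blue"))
        ]
    ),
    limit=10
)
```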
Observation: It's more setup (you have to run a Docker container), but it's built for production scale and gives you fine-grained control over performance.
3. Weaviate
Weaviate is the most "batteries-included" of this group. It's not just a vector database; it's a platform that can also manage your data and even call LLMs for you.
- Philosophy: "Let's bundle RAG's hardest parts (vector search, data management, and generative models) into one powerful, scalable database."
- Developer Experience (DX): The most high-level. It's a client-server model (like Qdrant) but with many built-in modules for things like auto-embedding (`text2vec-openai`) or RAG (`generative-openai`).
- Best For: Teams that want an "all-in-one" RAG server that handles embedding, hybrid search, and generation in one place.
The "How": A database that is "RAG-aware"
With Weaviate, you can tell the database to do the RAG for you.
```mermaid
graph TD
    A["User Query"] --> B["Weaviate Server"]
    B -- "1. Hybrid Search (Vector + Keyword)" --> C["Data & Vector Index"]
    C --> B
    B -- "2. Generative Module (Optional)" --> D["LLM Call"]
    D --> B
    B --> E["Final Answer"]
```
```python
import weaviate
import weaviate.classes.config as wvc

client = weaviate.connect_to_local()  # Connects to server (e.g., in Docker)

# 1. Define the collection ("schema")
client.collections.create(
    name="MyDocs",
    # This tells Weaviate to auto-embed docs using OpenAI
    vectorizer_config=wvc.Configure.Vectorizer.text2vec_openai(),
    # This enables the "generative" RAG module
    generative_config=wvc.Configure.Generative.openai()
)

# 2. Add documents (Weaviate handles embedding)
collection = client.collections.get("MyDocs")
collection.data.insert_many([
    {"source": "doc1", "content": "This is a document about dogs."},
    {"source": "doc2", "content": "This is a document about cats."}
])

# 3. Query it with RAG!
response = collection.generate.near_text(
    query="What is a pet?",
    single_prompt="Answer this based on the context: {content} -- Question: What is a pet?",
    limit=1
)
# Each hit carries its own generation, e.g.:
# response.objects[0].generated -> "Based on the context, a pet is a cat."
```
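Hybrid search (the "Vector + Keyword" step in the diagram) is likewise a single call in the v4 client. A small sketch; `alpha` weights vector versus keyword (BM25) scores, with 0.5 treating them equally:

```python
# Hybrid search: blends BM25 keyword matching with vector similarity.
results = collection.query.hybrid(
    query="pet",
    alpha=0.5,  # 0 = pure keyword, 1 = pure vector
    limit=3
)
for obj in results.objects:
    print(obj.properties["content"])
```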
Observation: Weaviate is an "opinionated" platform. It wants to manage the entire RAG pipeline for you. This is incredibly powerful but means you are buying into the "Weaviate way" of doing RAG. For building complete RAG systems, see our RAG framework comparison.
Category 3: The "Extend Your Stack" Solution
4. pgvector
pgvector is not a database. It's an extension for PostgreSQL.
- Philosophy: "You already have a production-ready database. Just add vector search to it."
- Developer Experience (DX): Amazing for teams already using Postgres. You just run `CREATE EXTENSION vector;`. Your vector data lives right next to your user data (names, accounts, etc.).
- Best For: Teams heavily invested in PostgreSQL. It's perfect for RAG apps where you need to join vector similarity with traditional SQL `WHERE` clauses (e.g., "find docs matching this vector AND `user_id` = 123"; sketched below).
The "How": It's just SQL
You just add a new column of type `vector`.
```sql
-- 1. Enable the extension
CREATE EXTENSION vector;

-- 2. Create a table with a vector column
CREATE TABLE items (
    id bigserial PRIMARY KEY,
    content text,
    embedding vector(384) -- Must match your model's dimensions
);

-- 3. Insert your data
-- (You generate the 'embedding' values in your Python app first)
INSERT INTO items (content, embedding) VALUES
    ('This is about dogs', '[0.1, 0.2, 0.3, ...]'),
    ('This is about cats', '[0.4, 0.5, 0.6, ...]');

-- 4. Query it
-- (You generate the query vector in your Python app)
SELECT content FROM items
ORDER BY embedding <=> '[0.4, 0.5, 0.6, ...]' -- <=> is the cosine distance operator
LIMIT 1;
-- Output: "This is about cats"
```
Observation: This is the ultimate "low-friction" solution for existing applications. The trade-off is that pgvector is not as performant as a dedicated, specialized engine like Qdrant, but it's often "good enough" and much simpler to manage.
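That "join vector similarity with SQL" point deserves a concrete sketch. Assuming a psycopg (v3) connection and a hypothetical `user_id` column on `items`, one query can filter on normal columns and order by vector distance, and pgvector's HNSW index (version 0.5+) speeds it up:

```python
import psycopg  # psycopg 3

conn = psycopg.connect("dbname=mydb")

# One-time: an approximate (HNSW) index for faster cosine search on large tables.
conn.execute(
    "CREATE INDEX IF NOT EXISTS items_embedding_idx "
    "ON items USING hnsw (embedding vector_cosine_ops);"
)

# `embedding_list` is assumed: your 384-dim query embedding as a Python list.
rows = conn.execute(
    "SELECT content FROM items "
    "WHERE user_id = %s "                  # normal SQL filter
    "ORDER BY embedding <=> %s::vector "   # vector similarity
    "LIMIT 5;",
    (123, str(embedding_list)),
).fetchall()
```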
Category 4: The "Managed & Serverless" Cloud DB
5. Pinecone
Pinecone was the first major "serverless" vector database. It's a fully managed cloud service.
- Philosophy: "Stop managing servers. Just give us your vectors, and we'll give you a super-fast API endpoint. Pay for what you use."
- Developer Experience (DX): The "easiest" production experience. You don't manage Docker, RAM, or CPUs. You just create an "index" on their website and get an API key.
- Best For: Teams that want to go to production fast and are willing to pay for a managed service to handle all the infrastructure and scaling.
The "How": A pure API
Your code just talks to a URL. All the infrastructure is hidden.
```python
from pinecone import Pinecone, ServerlessSpec

pc = Pinecone(api_key="YOUR_API_KEY")
index_name = "my-docs"

# 1. Create an index in the cloud
pc.create_index(
    name=index_name,
    dimension=384,
    metric="cosine",
    spec=ServerlessSpec(cloud="aws", region="us-west-2")
)
index = pc.Index(index_name)

# 2. Upsert data (`embeddings` comes from your embedding model)
index.upsert(
    vectors=[
        {"id": "1", "values": embeddings[0], "metadata": {"source": "doc1"}},
        {"id": "2", "values": embeddings[1], "metadata": {"source": "doc2"}}
    ]
)

# 3. Query it
results = index.query(
    vector=my_query_embedding,
    top_k=1,
    include_metadata=True
)
# results = {'matches': [{'id': '2', 'metadata': {'source': 'doc2'}, ...}]}
```
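Metadata filtering works here too; Pinecone accepts a MongoDB-style `filter` at query time. A small sketch reusing the `source` metadata from the upsert:

```python
# Only consider vectors whose metadata matches the filter.
results = index.query(
    vector=my_query_embedding,
    top_k=1,
    filter={"source": {"$eq": "doc2"}},
    include_metadata=True
)
```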
Observation: This is the fastest path from "idea" to "scalable production app". The trade-off is cost and vendor lock-in. You are paying for the convenience of not managing your own servers.
Category 5: The "Big Data" Search Engine
6. Vespa
Vespa (from Yahoo/Verizon) is the "OG" of this space. It's not just a vector database; it's a complete big data serving engine.
- Philosophy: "Modern search is hybrid search. You need keywords (BM25), vectors, and machine-learned ranking, all in one system that can scale to billions of documents."
- Developer Experience (DX): The most complex, but the most powerful. It's for large-scale search engineers. You define your data and ranking logic in Vespa's own declarative schema files.
- Best For: Massive-scale, mission-critical applications (think Spotify, Amazon). When RAG is not just a "feature" but the entire product.
The "How": Application schemas
You don't just "add vectors". You define a complete search application.
```
schema my_docs {
    document {
        field content type string {
            indexing: summary | index
        }
        field embedding type tensor(x[384]) {
            indexing: attribute | index
            attribute {
                # "angular" is Vespa's cosine-style distance metric
                distance-metric: angular
            }
        }
    }

    # Define how to rank results
    rank-profile default {
        inputs {
            query(query_embedding) tensor(x[384])
        }
        first-phase {
            expression: cosine_similarity(attribute(embedding), query(query_embedding), x)
        }
    }
}
```
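Querying is a separate step: you send YQL with a `nearestNeighbor` clause, plus the query tensor, to Vespa's HTTP search API. A rough sketch using Python's `requests` against a local deployment; the field and profile names match the schema above, but exact details vary by setup:

```python
import requests

# nearestNeighbor retrieves ANN candidates; the rank-profile then scores them.
response = requests.post(
    "http://localhost:8080/search/",
    json={
        "yql": "select * from my_docs where "
               "{targetHits: 5}nearestNeighbor(embedding, query_embedding)",
        # The query tensor declared in the rank-profile's `inputs` block.
        # `my_query_embedding` is assumed: a 384-dim list from your model.
        "input.query(query_embedding)": my_query_embedding,
        "ranking": "default",
    },
)
hits = response.json()["root"].get("children", [])
```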
Observation: Vespa is in a different league. It's what you use when Chroma or Qdrant are too small. It's a massive, scalable system for true "search engineers" who need to fine-tune every aspect of ranking.
The Engineer's Choice: Head-to-Head
| Database | Philosophy | Best For | Developer Experience (DX) |
|---|---|---|---|
| ChromaDB | "The Python Library" | Prototyping & Small Apps | pip install, easy, "SQLite for Vectors" |
| Qdrant | "The Performance Engine" | Speed & Filtering at Scale | Client-Server (Docker), built in Rust |
| Weaviate | "The RAG Platform" | All-in-One RAG & Hybrid Search | Client-Server (Docker), "RAG-aware" |
| pgvector | "The Integrated Extension" | Existing PostgreSQL Users | Just SQL, "low-friction" |
| Pinecone | "The Managed Service" | Fastest Production (No-Ops) | Pure API, "Serverless for Vectors" |
| Vespa | "The Search Engine" | Massive Scale Hybrid Search | Complex schemas, "search engineering" |
How to Choose: Scenarios & Recommendations
Scenario 1: "I'm a solo dev building a quick prototype for a hackathon."
- Choice: ChromaDB.
- Reason: You'll be up and running in 5 minutes. No Docker, no servers, no schemas.
Scenario 2: "I'm on a team with a huge, existing PostgreSQL database. I want to add RAG to our existing user data."
- Choice: pgvector.
- Reason: Stay in your ecosystem. You can add a vector column and write SQL queries that join user data with vector data. It's the simplest, most integrated solution.
Scenario 3: "I'm building a high-performance e-commerce app that needs to find 10 'blue t-shirts' from 50M items, filtered by size='M'."
- Choice: Qdrant.
- Reason: This is a high-speed, high-throughput filtering problem. Qdrant's Rust-based engine and its ability to pre-filter on metadata make it the fastest tool for this job.
Scenario 4: "I'm at a startup. I need to go to production next week, and I don't have a DevOps team to manage a database."
- Choice: Pinecone.
- Reason: This is a business/time problem. Pinecone (or a managed version of Qdrant/Weaviate) is the fastest path to a scalable, production-ready endpoint. You pay for the convenience.
Scenario 5: "I'm building the next Spotify. I need to serve 100M users with a complex, multi-stage ranking system."
- Choice: Vespa.
- Reason: Your problem is massive scale and complex, fine-tuned ranking. You are a "search" company, not just a "RAG app". You need the industrial-strength engine.
Challenge for You
- Use Case: You are building the "AI Support Bot" from our post on prompt engineering. It needs to search a knowledge base of 100,000 technical manuals.
- The "Gotcha": 90% of user queries can be solved by filtering for the user's specific product model (e.g., `model_id: "XPS-13"`) before doing the semantic search for their error message.
- Your Task: Based on this, which of the databases would be the strongest choice, and why?
Key takeaways
- Vector databases solve different problems: Choose based on your primary constraint—speed, scale, integration, or convenience
- ChromaDB excels at prototyping: Use it when you need to get started quickly with zero setup
- Qdrant excels at performance and filtering: Use it when you need fast, filtered vector search at scale
- Weaviate excels at all-in-one RAG: Use it when you want a platform that handles embedding, search, and generation
- pgvector excels at integration: Use it when you're already using PostgreSQL and want vector search alongside your existing data
- Pinecone excels at managed infrastructure: Use it when you want to go to production fast without managing servers
- Vespa excels at massive scale: Use it when you need industrial-strength search for billions of documents
- Choose based on your primary need: Prototyping (Chroma), performance (Qdrant), all-in-one (Weaviate), integration (pgvector), managed (Pinecone), or scale (Vespa)
For more on building production AI systems, check out our AI Bootcamp for Software Engineers.