Vector Databases and Embeddings: The Brain of RAG

Param Harrison
7 min read

In our last posts, we learned how to build a RAG pipeline and how to "chunk" documents for it. Now, we'll explore the magic at its core: embeddings and vector databases.

The fundamental idea: the magic library

Imagine a normal library. Books are organized alphabetically by title or author. If you want to find a book about "cats," you look under "C." This is keyword search.

Now, imagine a magic library. Books are organized by their meaning. A book titled Feline Friends is right next to a book titled The Lion's Roar. A sci-fi novel about Mars is next to a textbook on rocket science.

If you ask this magic librarian for "ways to leave the planet," they can instantly point you to that entire section. This is semantic search.

  • Embeddings are the "magic coordinates" that give every piece of text a physical location in this library based on its meaning.
  • A Vector Database is the magic library itself, a building designed to find the closest coordinates to your question instantly.

Part 1: What are embeddings?

An embedding is a vector (a long list of numbers) that represents the semantic meaning of a piece of data. We use a special "embedding model" to turn our text chunks into these vectors.

Open-source embeddings (local & free)

Models from libraries like sentence-transformers run locally on your machine. They are free, private, and give you full control.

from sentence_transformers import SentenceTransformer

# Load a model that runs locally on your machine
oss_embedding_model = SentenceTransformer('all-MiniLM-L6-v2')

sentence = "A feline rested comfortably on the rug."

oss_embedding = oss_embedding_model.encode(sentence)

print(f"Model: all-MiniLM-L6-v2")
print(f"Embedding Dimensions: {oss_embedding.shape}")
print(f"First 5 values: {oss_embedding[:5]}")

# Output:
# Model: all-MiniLM-L6-v2
# Embedding Dimensions: (384,)
# First 5 values: [ 0.0346 -0.0163  0.0349  0.0526 -0.0215]

Proprietary embeddings (API & paid)

Models from providers like OpenAI are accessed via an API. They are often larger and more powerful, but you pay per use and send your data to a third party.

from openai import OpenAI
llm_client = OpenAI(api_key="...")

def get_openai_embedding(text, model="text-embedding-3-small"):
    # Helper function to call the OpenAI API
    text = text.replace("\n", " ")
    return llm_client.embeddings.create(input=[text], model=model).data[0].embedding

sentence = "A feline rested comfortably on the rug."

openai_embedding = get_openai_embedding(sentence)

print(f"Model: text-embedding-3-small")
print(f"Embedding Dimensions: {len(openai_embedding)}")
print(f"First 5 values: {openai_embedding[:5]}")

# Output:
# Model: text-embedding-3-small
# Embedding Dimensions: 1536
# First 5 values: [-0.0152, -0.0205, 0.0078, -0.0461, 0.0026]

Notice the dimensions (the length of the number list) are different: 384 vs. 1536. This is a key trade-off between model size, cost, and quality.
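
How much does that difference matter in practice? Here's a rough back-of-the-envelope sketch (assuming 1 million chunks stored as raw 32-bit floats; real storage also depends on your database's index and compression):

num_chunks = 1_000_000
bytes_per_float = 4  # float32

for dims in (384, 1536):
    size_gb = num_chunks * dims * bytes_per_float / 1024**3
    print(f"{dims} dims -> ~{size_gb:.2f} GB of raw vectors")

# Output (approximate):
# 384 dims -> ~1.43 GB of raw vectors
# 1536 dims -> ~5.72 GB of raw vectors

Bigger vectors can capture more nuance, but they cost more to store, move, and compare.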

How embeddings capture "meaning"

So, they're lists of numbers. How does that help? We can show that they capture meaning by measuring the "distance" between them: semantically similar sentences produce vectors that are mathematically "close" to each other.

The most common way to measure this is Cosine Similarity.

from sklearn.metrics.pairwise import cosine_similarity
import numpy as np

# 1. Define three sentences
target = "The cat sat on the mat."
similar = "A feline relaxed on the rug."
dissimilar = "The rocket launched into space."

# 2. Embed all three sentences
embeddings = oss_embedding_model.encode([target, similar, dissimilar])

# 3. Reshape for the similarity function
target_emb = embeddings[0].reshape(1, -1)
similar_emb = embeddings[1].reshape(1, -1)
dissimilar_emb = embeddings[2].reshape(1, -1)

# 4. Calculate similarity scores
sim_score = cosine_similarity(target_emb, similar_emb)[0][0]
dissim_score = cosine_similarity(target_emb, dissimilar_emb)[0][0]

print(f"Similarity (cat, feline): {sim_score:.4f}")
print(f"Similarity (cat, rocket): {dissim_score:.4f}")

# Output:
# Similarity (cat, feline): 0.8263
# Similarity (cat, rocket): 0.0816

The high score (0.82) shows the model "understands" that "cat" and "feline" are related. The low score (0.08) shows it knows "cat" and "rocket" are not. This is the simple, powerful math behind RAG's retriever.
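
If you want to see the math behind that score, cosine similarity is just the dot product of two vectors divided by the product of their lengths. A minimal sketch with NumPy, reusing the embeddings array from the snippet above:

import numpy as np

def cosine(a, b):
    # dot(a, b) / (||a|| * ||b||)
    return np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

print(f"Manual (cat, feline): {cosine(embeddings[0], embeddings[1]):.4f}")
print(f"Manual (cat, rocket): {cosine(embeddings[0], embeddings[2]):.4f}")

# These should match the sklearn scores above (about 0.8263 and 0.0816).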

Part 2: What are vector databases?

Now that we have millions of these vector embeddings, we need a special database to store them and search them instantly. This is a Vector Database.

It takes your query (e.g., "What did astronauts do?"), embeds it, and then searches its "magic library" for the document vectors with the closest coordinates.

Let's use ChromaDB, a popular, easy-to-use vector database.

import chromadb
chroma_client = chromadb.Client()

# 1. Create a "collection" (our library shelf)
collection = chroma_client.get_or_create_collection(name="history_facts")

# 2. Add documents (Chroma handles embedding them for us!)
documents = [
    "The Apollo 11 mission successfully landed the first humans on the Moon.",
    "The Hubble Space Telescope has provided some of the most detailed images of distant galaxies.",
    "The Great Wall of China is a series of fortifications stretching over 13,000 miles.",
    "The Roman Colosseum was used for gladiatorial contests and public spectacles."
]

collection.add(
    documents=documents,
    ids=["id_1", "id_2", "id_3", "id_4"] # Every document needs a unique ID
)

# 3. Query the collection
query = "What did astronauts do in space?"

results = collection.query(
    query_texts=[query],
    n_results=1 # Ask for the single best result
)

print(results['documents'])
# Output:
# [['The Apollo 11 mission successfully landed the first humans on the Moon.']]
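
Under the hood, Chroma ranks documents by the distance between your query's vector and each stored vector. If you're curious, you can ask it to return those distances too (a small sketch on the same collection; lower means closer, and the exact numbers depend on the embedding model and distance metric):

results = collection.query(
    query_texts=[query],
    n_results=2,
    include=["documents", "distances"]  # also return the raw distance scores
)

for doc, dist in zip(results['documents'][0], results['distances'][0]):
    print(f"distance={dist:.4f} | {doc}")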

The real power: Vector Search + Metadata Filtering

In a real application, you rarely use semantic search on its own. You combine it with traditional metadata filtering.

This is the difference between asking the magic librarian:

  • "Find me books about adventure." (Semantic search)
  • "Find me books about adventure, but only in the 'Sci-Fi' section, and only those with a 5-star rating." (Hybrid Search)

Vector databases are built for this.

movie_collection = chroma_client.get_or_create_collection(name="movie_reviews")

# 1. Add documents WITH metadata
reviews = [
    "A thrilling journey through space to save humanity.",
    "Two old friends embark on a road trip and rediscover their bond.",
    "In a galaxy far away, a hero rises to fight an evil empire.",
    "A heartwarming tale of a boy and his dog on a cross-country trek."
]

metadata = [
    {'genre': 'Sci-Fi', 'rating': 5},
    {'genre': 'Drama', 'rating': 4},
    {'genre': 'Sci-Fi', 'rating': 5},
    {'genre': 'Family', 'rating': 4}
]

movie_collection.add(
    documents=reviews,
    metadatas=metadata,
    ids=["review_1", "review_2", "review_3", "review_4"]
)

# 2. Define our query
query = "A story about friendship and adventure"

# 3. Run a PURE semantic search
# This will likely return the 'road trip' and 'boy and his dog' reviews.
semantic_results = movie_collection.query(
    query_texts=[query], 
    n_results=2
)
print(f"Semantic Search Results: {semantic_results['documents']}")

# 4. Run a FILTERED search
# This finds things that *sound like* our query, but *only* in the 'Sci-Fi' genre.
filtered_results = movie_collection.query(
    query_texts=[query],
    n_results=2,
    where={"genre": "Sci-Fi"}  # <-- This is the metadata filter!
)
print(f"Filtered Search Results: {filtered_results['documents']}")

# Output:
# Semantic Search Results: [['Two old friends embark on a road trip...', 'A heartwarming tale of a boy and his dog...']]
# Filtered Search Results: [['A thrilling journey through space to save humanity.', 'In a galaxy far away, a hero rises...']]

This is the "superpower" of a production-grade RAG system: combining fuzzy, meaning-based search with precise, rule-based filtering.

Key takeaways

  • Embeddings turn meaning into math: They are the bridge that allows computers to understand and compare the "meaning" of data
  • Model choice is a trade-off: Your choice of embedding model (open-source vs. proprietary) impacts cost, performance, and privacy
  • Vector DBs are the "magic library": They are specialized databases designed to store and search billions of embeddings at high speed
  • Metadata is crucial for production: Real-world RAG systems almost always combine semantic search (what it means) with metadata filtering (what it is)

For more on building production AI systems, check out our AI Engineering Bootcamp.
