Query anonymization for RAG bias mitigation
How to strip names, roles, and demographics from queries before retrieval to reduce RAG bias. The redaction pipeline and the 3 leakage traps to avoid.
Bite-sized insights for building production AI systems. Expert guides, real-world patterns, and practical engineering wisdom.
135 posts in total
Which Python dependency manager should you use for production agent services in 2026? The install speed, lockfile story, and Docker build times compared.
How to build retry logic that handles rate limits, timeouts, and transient failures without burning money. The backoff rules and the 3 errors you must not retry.
How to pick the LLM that grades your LLM. The cost-quality tradeoffs, the calibration check, and why a weaker judge is sometimes the right call.
Why ground truth and relevancy measure different things in RAG evals. When to use each, how to build both datasets, and the 2 metrics that matter most.
How to use Pydantic models to force your RAG planner LLM to return structured steps. The schema, the retry loop, and why plain JSON prompts break in production.
How to test a RAG pipeline for hallucinations systematically. Adversarial prompts, the out-of-scope set, and the metric that catches confabulation.
How to test a RAG pipeline like real software. Unit, integration, and eval tests that catch regressions before they ship. The 3-layer test strategy.
How to fact-check RAG answers with a second LLM pass that verifies every claim against the retrieved context. The prompt, the rejection rule, and the loop.
How LLM-powered query rewriting fixes vague user questions before retrieval. The prompt, the multi-query fan-out, and when rewriting hurts more than helps.
How to filter irrelevant retrieved chunks with a cheap LLM call before the final answer. The prompt, the batch pattern, and the 40 percent noise reduction.
How to pick the right k value for your RAG retriever. The 3-step tuning process, the failure modes of k=3 and k=20, and the sweet spot in between.

Cofounder of AEOsome.com and Chief Mentor at learnwithparam.com with 15+ years building production systems. I've trained 100+ engineers on AI engineering. These programs distill what actually works into structured paths you can follow at your own pace.
Go beyond articles. Build production AI systems with hands-on workshops and our intensive AI Bootcamp.