The "Zero Result" Trap.
You are building a search engine for a clothing store. A user searches: "Blue running shoes." Then they apply a filter: "Under $50."
You know you have cheap blue shoes in the database. But the result comes back: "No products found."
Why? You fell into the Post-Filtering Trap.
1. The Mechanics: How retrieval fails
Most engineers implement RAG filters in the wrong order:
The Wrong Way (Post-Filtering):
Vector Search: You ask the DB for the top 100 "Blue running shoes" based on similarity.
The Result: The AI finds the most popular, high-quality matches. They are all $150 Nikes and Adidas.
The Filter: Your Python code loops through these 100 items and removes anything over $50.
The Empty Set: Since all top 100 items were expensive, you filtered them all out. You are left with nothing.
The user thinks you have no stock, but you do. Your retrieval just didn't look deep enough.
The Right Way (Pre-Filtering): You must tell the Vector Database to apply the filter before or during the search, not after. This is computationally harder (it breaks the HNSW graph traversal), but modern databases (Chroma, Qdrant, Pinecone) have solved this with "Native Filtering."
2. The Code: Moving logic to the DB
Don't filter in Python. Filter in the Query.
Bad (Python Post-Filtering):
# 1. Get top 100 (mostly expensive stuff)
results = collection.query(query_texts=["blue shoes"], n_results=100)
# 2. Filter locally (Risk of returning empty list)
cheap_shoes = [r for r in results if r['price'] < 50]
Good (Native Pre-Filtering):
# 1. Tell the DB to restrict the search space FIRST
results = collection.query(
query_texts=["blue shoes"],
n_results=5,
# This 'where' clause runs at the database level
where={
"price": {"$lt": 50}
}
)
# Result: The DB ignores expensive items and searches ONLY the cheap ones.
# You get 5 perfect matches.
If you are using pgvector (Postgres), this means adding a WHERE clause inside your SQL query alongside the vector <=> operator.
3. THE CEREBRAL GYM: Solution & New Puzzle
Yesterday's solution (The Lost Money)
The puzzle was: Transaction A writes $110. Transaction B writes $50. B overwrites A without knowing it.
The Answer: The Lost Update Problem. This happens when isolation levels are too low (e.g., Read Committed). You fix this by using Atomic Updates (UPDATE accounts SET balance = balance - 50) instead of Read-Modify-Write, or by using SELECT FOR UPDATE to lock the row.
Today's puzzle (API Design) You are building a payment API. A user has a shaky internet connection. They click "Pay $20" three times because the UI didn't load. Your server receives 3 distinct POST requests. You accidentally charge them $60 ($20 x 3).
The Question: What is the specific HTTP Header (starts with 'I') you should require clients to send to ensure that even if they retry the request 10 times, you only process the payment once?
(Reply with the header name!)
4. THE PULSE: Industry Signals
pgvector (The Default Choice) If you are already using Postgres, you probably don't need a separate Vector DB like Pinecone.
pgvectorlets you store embeddings right next to your users and orders. It supports Pre-Filtering natively via standard SQLWHEREclauses.Cursor (The AI Editor) If you haven't switched from VS Code yet, try Cursor. It is a fork of VS Code with AI baked into the core. You can highlight code and press
Cmd+Kto "Refactor this to use Pre-Filtering," and it actually understands your entire codebase context.MTEB Leaderboard How do you know which embedding model to use? Don't guess. The Massive Text Embedding Benchmark (MTEB) ranks every model by task (Retrieval, Clustering, Summarization). Check this before you pick
text-embedding-3-small.
5. THE LATENT SPACE
"Order of operations is everything."
In math, (2 + 3) * 4 is very different from 2 + (3 * 4). In search, "Filter then Search" is very different from "Search then Filter." One gives you the right answer. The other gives you an empty page.
Check your where clauses.
See you tomorrow.
Harsh Kathiriya - Query & Context

