Semantic search — document search by meaning of the query, not by keyword match. Principle: embed query + doc into vectors → cosine similarity → top-k closest docs. Understands synonyms ("car" ≈ "automobile"), conceptual links ("how to fix engine" → docs on motor troubleshooting). Traditional BM25/TF-IDF is keyword-only. Hybrid search: sparse (BM25) + dense (embeddings) + rerank — 2026 best practice.
Below: details, example, related terms, FAQ.
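The embed → cosine similarity → top-k principle can be sketched in a few lines of Python. This is a minimal illustration, not a production setup: the 3-dimensional vectors are hand-made stand-ins for real embeddings (which typically have hundreds of dimensions), and the names `cosine_similarity`, `top_k`, `DOCS`, and `QUERY` are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k(query_vec, doc_vecs, k=2):
    """Return the k (doc_id, score) pairs most similar to the query."""
    scored = [
        (doc_id, cosine_similarity(query_vec, vec))
        for doc_id, vec in doc_vecs.items()
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Hypothetical toy "embeddings"; a real system would get these from a model.
DOCS = {
    "motor troubleshooting": [0.9, 0.1, 0.0],
    "automobile buying guide": [0.6, 0.7, 0.1],
    "sourdough recipe": [0.0, 0.1, 0.9],
}
QUERY = [0.8, 0.2, 0.0]  # stands in for the query "how to fix engine"

results = top_k(QUERY, DOCS, k=2)
for doc_id, score in results:
    print(f"{score:.3f}  {doc_id}")
```

The query shares no keywords with "motor troubleshooting", yet that document ranks first because its vector points in nearly the same direction as the query vector.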
# Hybrid search with Qdrant
curl -X POST http://localhost:6333/collections/docs/points/search/batch \
  -H 'Content-Type: application/json' \
  -d '{
    "searches": [
      {"vector": {"name": "dense", "vector": [...]}, "limit": 50},
      {"vector": {"name": "sparse", "vector": {"indices": [...], "values": [...]}}, "limit": 50}
    ]
  }'

Semantic search does not replace BM25. BM25 is great for exact matches (code, names, rare words). Hybrid (sparse + dense) beats either alone.
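A batch request like the one above returns one result list per search, so the sparse and dense hits still have to be merged. One common way to do that is Reciprocal Rank Fusion (RRF), sketched below. RRF is an assumption here, since the source does not name a fusion method. The function name, the document IDs, and the hit lists are hypothetical, and `k=60` is only the commonly used default constant.

```python
def reciprocal_rank_fusion(ranked_lists, k=60):
    """Merge several ranked lists of doc IDs into one ranking.

    Each document scores 1 / (k + rank) per list it appears in;
    the scores are summed, and a higher total ranks first.
    """
    scores = {}
    for ranking in ranked_lists:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=lambda doc_id: scores[doc_id], reverse=True)


# Hypothetical hit lists, best match first.
sparse_hits = ["doc_a", "doc_b", "doc_c"]  # e.g. from the BM25 / sparse search
dense_hits = ["doc_b", "doc_d", "doc_a"]   # e.g. from the embedding search

fused = reciprocal_rank_fusion([sparse_hits, dense_hits])
print(fused)
```

Documents that appear in both lists rise to the top, which is the intuition behind hybrid beating either retriever alone. A reranker, if used, would then rescore only the head of the fused list.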
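Elasticsearch: mature, strongest at sparse (BM25) search, with approximate kNN vector search since 8.0. Qdrant: dense-first, written in Rust, also stores sparse vectors (as in the example above). For hybrid: Elasticsearch with its built-in kNN search, Qdrant with named dense + sparse vectors, or Weaviate, which supports hybrid queries natively.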
Target: <100 ms for interactive search. An HNSW approximate nearest neighbor (ANN) index avoids a full scan of every vector, so the target stays reachable at >1M docs.