
HNSW: Hierarchical Navigable Small World

In short:

HNSW (Hierarchical Navigable Small World) is a graph-based ANN algorithm and the most popular choice for vector databases. It builds a multi-layer graph: the top layer is sparse, the bottom one dense. Search is a greedy descent from the top layer down to the bottom, giving O(log N) complexity. Used in Qdrant, Pinecone, Weaviate, and pgvector (opt-in). Parameters: M (connections per node, 16-64), ef_construction (build quality), ef (search quality).
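The layered greedy descent can be sketched on toy data. This is a simplified illustration, not the real algorithm: it assumes a static graph, a single entry point, and a pure best-neighbor walk, whereas real HNSW maintains a candidate list of size ef.

```python
# Toy sketch of HNSW search (simplified: static layers, greedy walk only).

def l2(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def greedy_step(layer, vectors, query, entry):
    """Walk one layer: hop to a closer neighbor until no neighbor is closer."""
    current = entry
    improved = True
    while improved:
        improved = False
        for n in layer.get(current, []):
            if l2(vectors[n], query) < l2(vectors[current], query):
                current, improved = n, True
    return current

def hnsw_search(layers, vectors, query):
    """Descend: start on the sparse top layer, refine down to the dense bottom."""
    entry = next(iter(layers[-1]))        # arbitrary entry point in the top layer
    for layer in reversed(layers):        # top -> bottom
        entry = greedy_step(layer, vectors, query, entry)
    return entry

# 1-D toy data: bottom layer is a dense chain, top layer has sparse shortcuts.
vectors = {i: (float(i),) for i in range(8)}
bottom = {i: [j for j in (i - 1, i + 1) if 0 <= j < 8] for i in range(8)}
top = {0: [4], 4: [0]}
print(hnsw_search([bottom, top], vectors, (6.2,)))  # → 6, the nearest node
```

The top layer's long-range shortcut (0 ↔ 4) lets the search skip most of the chain before the dense bottom layer finishes the job, which is the intuition behind the O(log N) claim.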

Below: details, an example, and FAQ.


Details

  • M: 16 default. Higher = better recall, more RAM
  • ef_construction: 100-500. Index build quality — one-time cost
  • ef: search-time param. Higher = better recall, slower. 32-200 typical
  • Memory: 4-10x raw vector size due to graph links. 1M × 1536-dim FP32 (~6 GB raw) → ~30 GB
  • Recall: >95% at ef=100 for most datasets
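The memory bullet is easy to turn into a back-of-envelope calculator. The 5x overhead factor below is an assumed midpoint of the 4-10x range above, not a measured constant:

```python
def hnsw_memory_gb(n_vectors, dim, bytes_per_dim=4, overhead=5.0):
    """Rough RAM estimate: raw FP32 vectors times an assumed 4-10x HNSW overhead."""
    raw_gb = n_vectors * dim * bytes_per_dim / 1e9
    return raw_gb * overhead

print(hnsw_memory_gb(1_000_000, 1536))  # → 30.72, i.e. ~30 GB
```

Swap `overhead=1.0` to see the raw vector payload alone (~6.1 GB for this shape), which is why reducing M or quantizing vectors both shrink the footprint.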

Example

-- pgvector HNSW
CREATE INDEX ON docs USING hnsw (embedding vector_cosine_ops)
WITH (m = 16, ef_construction = 64);

-- Query tuning
SET hnsw.ef_search = 100;  -- runtime param
SELECT * FROM docs ORDER BY embedding <=> query_vec LIMIT 5;

Frequently asked questions

HNSW vs IVF?

HNSW: best recall + speed, but the whole index lives in RAM. IVF: cheaper on RAM (centroids + buckets), lower recall. For huge datasets (>100M): IVF + re-ranking.

Does HNSW work with filtering?

Prefiltering can cut graph connectivity, so the greedy search dead-ends. Quality vector DBs (Qdrant, Weaviate) implement filter-aware HNSW. pgvector added index filtering in 2024.
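The difference between post-filtering and filter-aware search is easy to see with a brute-force stand-in for the index (a hypothetical helper, not any real DB API): filtering a top-k result after the search can return fewer than k rows, while applying the predicate during the search always yields k matches.

```python
def top_k(vectors, query, k, predicate=None):
    """Exact k-NN stand-in for an index; predicate mimics a filter-aware scan."""
    candidates = [(abs(v - query), i) for i, v in enumerate(vectors)
                  if predicate is None or predicate(i)]
    return [i for _, i in sorted(candidates)[:k]]

vectors = [float(i) for i in range(100)]
is_even = lambda i: i % 2 == 0

# Post-filtering: search first, filter after -> can come up short of k.
post = [i for i in top_k(vectors, 10.0, k=5) if is_even(i)]
# Filter-aware: predicate checked during the search -> always k matches.
aware = top_k(vectors, 10.0, k=5, predicate=is_even)
print(len(post), len(aware))  # → 3 5
```

In a real HNSW index the filter-aware variant is harder than it looks here, because skipping non-matching nodes during the graph walk can strand the search in a disconnected region, which is exactly the connectivity problem described above.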

Is DiskANN an alternative?

DiskANN is an SSD-based ANN index: roughly 10× cheaper on memory, 2-3× slower. Built for billion-scale workloads; supported by Milvus and MyScale.