Bloom filter — probabilistic data structure для проверки "element is probably in set" или "definitely not in set". Использует bit array + multiple hash functions. Memory-efficient (килобайты для millions items), но false positives возможны (no false negatives). Применяется: Cassandra (skip disk reads для missing keys), CDN (dedupe), DB indices, Chrome URL safety checks.
Ниже: подробности, пример, смежные термины, FAQ.
// Redis RedisBloom
> BF.RESERVE users 0.01 100000
> BF.ADD users alice
> BF.EXISTS users alice → 1 (probably)
> BF.EXISTS users bob → 0 (definitely not)Pre-filter перед expensive operations: check before DB query, before CDN miss. Если >95% запросов "not in set" — saves huge load.
Tradeoff: lower FPR = более crupнный filter. 1% FPR = 10 bits/element. 0.1% = 14 bits/element. Выбирайте по downstream cost.
HLL — count unique elements (cardinality), не membership. Разные задачи.