Zvec Logo

Inverted File Index (IVF)

A scalable, partition-based, memory-efficient index designed for massive, naturally clustered datasets — at the expense of slower indexing and tuning.

How It Works

IVF operates by partitioning the entire vector space into clusters. The number of clusters is controlled by the parameter n_list (short for "number of lists").

Indexing Phase ⚙️

  1. Clustering: The algorithm first applies a clustering algorithm, creating n_list clusters. Each cluster is represented by its centroid — a central point that best represents all vectors assigned to that cluster.

  2. Assignment: Each vector in the dataset is assigned to the cluster whose centroid is closest to it. The vector is then stored in an inverted list (also called a "bucket") associated with that centroid. The index essentially becomes a mapping from centroids to their associated vectors.

IVF indexing

Query Phase 🔍

  1. Centroid Selection: When a query vector arrives, the system first computes distances between the query and all n_list centroids to identify the n_probe nearest centroids (n_probe is a parameter that controls how many centroids to consider).

  2. Local Search: Instead of scanning the entire dataset of N vectors, the search is restricted to only the vectors stored in the n_probe selected buckets. A brute-force (or refined) search is then performed within those buckets to find the nearest neighbors.

IVF Query

When to Use an IVF Index

  • ✅ Your vector dataset exhibits natural clustering or locality structure
  • ✅ You're working with very large datasets and memory efficiency is critical
  • ✅ You need to tune parameters to achieve optimal performance

Best Practice:

  • IVF works best on datasets with inherent clustering structure.
  • For even greater memory efficiency and scalability, combine it with Product Quantization (PQ).
  • The performance of IVF is highly sensitive to parameters like n_list. As such, IVF is best suited for practitioners who can systematically experiment with, validate, and optimize these settings for their specific data distribution and latency-recall requirements.

Advantages

  1. Compatibility — Often used as a base layer in composite indexes (e.g., IVF-PQ) for further optimization
  2. Scalability — Query time scales approximately as O(N / n_list × n_probe), making it highly efficient for large N when n_list is large and n_probe is small
  3. Memory Efficiency — Stores vectors in compact inverted lists with minimal overhead from centroids and list pointers, typically uses significantly less memory than graph-based methods like HNSW

Trade-offs

  1. ⚠️ Indexing Overhead — Building the index requires clustering, which can be computationally intensive and slower to build than indexes like HNSW
  2. ⚠️ Parameter Sensitivity — Accuracy and latency are highly dependent on the choice of n_list and n_probe

Key Parameters

Index-Time Parameters

ParameterDescriptionTuning Guidance
metric_typeSimilarity metric used to compare vectorsChoose based on how your embeddings were trained
n_listNumber of clusters (inverted lists) — the vector space is partitioned into this many clusters during indexing• Start with n_listN\sqrt{N}, where N is the number of vectors
• Larger n_list → ✨ finer partitioning, smaller buckets, faster search — but ⚠️ higher indexing cost and more centroids to manage
• Smaller n_list → ✨ faster index construction — but ⚠️ larger buckets and slower search
n_itersCentroid refinement iterations — number of passes to optimize cluster centroids during indexing• Higher n_iters
✨ better clustering quality
⚠️ longer index build time
quantize_typeVector quantization method to apply
Defaults to no quantization
See Quantization for more details

Query-Time Parameters

ParameterDescriptionTuning Guidance
n_probeNumber of clusters to search at query time — the system retrieves candidates only from these nearest clusters.• Higher n_probe
✨ higher recall
⚠️ slower queries
radiusDistance (similarity) threshold for range-based filtering — only documents satisfying the threshold are returnedExample:
• With inner product MetricType.IP, set radius=0.6 to keep only results with score > 0.6
✅ Use when: You want to filter out low-quality matches
🚫 Skip when: You want all top-k results, regardless of quality
is_linearForces a brute-force linear search instead of using the index🐌 Very slow for large datasets!
✅ Only use for: Debugging, tiny collections, or verifying index accuracy

On this page