Flat Index
The Baseline for Similarity Search
How It Works
Performs an exact (brute-force) similarity search by comparing the query vector against every vector in the dataset.
When to Use a Flat Index
- ✅ Small datasets
- ✅ Prototyping and experimentation
- ✅ Evaluation baselines
- ✅ Scenarios where 100% recall is non-negotiable
Best Practice: Start with Flat Index during development and testing — it's your reliability anchor. Once you validate your approach, consider approximate indexes (like HNSW) for production-scale performance. Use the Flat index when working with tiny datasets (e.g., under 300k vectors) where correctness outweighs speed.
Advantages
- ✨ Perfect Recall Guarantee — Finds true nearest neighbors
- ✨ Zero Configuration — Simple setup with no tuning required
- ✨ Instant Indexing — Build time is virtually immediate
Limitations
⚠️ Search latency grows linearly with dataset size — making it impractical for large-scale workloads.
Vector Index
Specialized data structures that accelerate similarity search — trading a small amount of recall for massive gains in speed and scalability.
HNSW Index
A graph-based index and the go-to choice for low-latency approximate nearest neighbor search — delivering state-of-the-art speed and recall at the cost of higher memory usage.