Define a Collection Schema
A collection schema, defined with CollectionSchema, describes the structure that every document inserted into the collection must conform to.
The schema in Zvec is dynamic: you can add or remove scalar fields and vectors at any time without rebuilding the collection.
CollectionSchema has three parts:
- name: An identifier for the collection.
- fields: A list of scalar fields.
- vectors: A list of vector fields.
The name is a human-readable identifier for your collection. It is used internally for reference and logging.
Scalar fields store non-vector (i.e., structured) data — such as strings, numbers, booleans, or arrays.
Each field is defined using FieldSchema with the following properties:
- name: A unique string identifier for the field within the collection.
- data_type: The type of data stored, e.g., STRING, INT64, or array types like ARRAY_STRING.
- nullable (optional): Whether the field is allowed to have no value (defaults to False).
- index_param (optional): Enables fast filtering by creating an inverted index via InvertIndexParam.
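To make the nullable rule and the per-field type concrete, here is a minimal sketch in plain Python. It does not use Zvec's classes: `FieldSpec` and `check_document` are hypothetical stand-ins that only mirror the properties listed above, and the mapping from type names to Python checks is an assumption made for illustration.

```python
from dataclasses import dataclass

# Hypothetical stand-in for illustration; this is NOT Zvec's FieldSchema.
@dataclass
class FieldSpec:
    name: str
    data_type: str          # e.g. "STRING", "INT64", "ARRAY_STRING"
    nullable: bool = False  # mirrors the documented default

# Assumed mapping from type names to Python-level checks (illustration only).
_CHECKS = {
    "STRING": lambda v: isinstance(v, str),
    "INT64": lambda v: isinstance(v, int) and not isinstance(v, bool),
    "ARRAY_STRING": lambda v: isinstance(v, list)
    and all(isinstance(x, str) for x in v),
}

def check_document(fields, doc):
    """Return a list of problems found when matching doc against fields."""
    problems = []
    names = [f.name for f in fields]
    if len(names) != len(set(names)):
        problems.append("field names must be unique")
    for f in fields:
        value = doc.get(f.name)
        if value is None:
            # A missing value is only acceptable for nullable fields.
            if not f.nullable:
                problems.append(f"{f.name}: value required (nullable=False)")
        elif not _CHECKS[f.data_type](value):
            problems.append(f"{f.name}: expected {f.data_type}")
    return problems

fields = [
    FieldSpec("title", "STRING"),
    FieldSpec("year", "INT64", nullable=True),
    FieldSpec("tags", "ARRAY_STRING", nullable=True),
]

print(check_document(fields, {"title": "Intro", "year": 2020}))
print(check_document(fields, {"year": 2020}))
```

In this sketch the first document passes because the only non-nullable field, title, is present, while the second is rejected for omitting it.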
A vector is defined using VectorSchema with the following properties:
- name: A unique string identifier for the vector within the collection.
- data_type: The numeric format of the vector.
  - Dense vectors: VECTOR_FP32, VECTOR_FP16, etc.
  - Sparse vectors: SPARSE_VECTOR_FP32, SPARSE_VECTOR_FP16.
- dimension: Required for dense vectors; the number of dimensions.
- index_param: Configures the vector index type and similarity metric.
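The difference between dense and sparse vectors can be sketched in plain Python. `VectorSpec` and `check_vector` are hypothetical stand-ins, not Zvec's VectorSchema, and representing a sparse vector as a dict from index to weight is an assumption made for illustration. Only the type names mentioned above are covered.

```python
from dataclasses import dataclass
from typing import Optional

# Only the type names mentioned in the text; the real list may be longer.
DENSE_TYPES = {"VECTOR_FP32", "VECTOR_FP16"}
SPARSE_TYPES = {"SPARSE_VECTOR_FP32", "SPARSE_VECTOR_FP16"}

# Hypothetical stand-in for illustration; this is NOT Zvec's VectorSchema.
@dataclass
class VectorSpec:
    name: str
    data_type: str
    dimension: Optional[int] = None

    def __post_init__(self):
        # Dense vectors have a fixed length, so the dimension is mandatory.
        if self.data_type in DENSE_TYPES and not self.dimension:
            raise ValueError(f"{self.name}: dense vectors need a dimension")

def check_vector(spec, value):
    """Return True if value has the shape that spec describes."""
    if spec.data_type in DENSE_TYPES:
        return isinstance(value, list) and len(value) == spec.dimension
    if spec.data_type in SPARSE_TYPES:
        # Assumed sparse layout: {non-negative index: weight}.
        return isinstance(value, dict) and all(
            isinstance(i, int) and i >= 0 for i in value
        )
    return False

dense = VectorSpec("embedding", "VECTOR_FP32", dimension=4)
sparse = VectorSpec("keywords", "SPARSE_VECTOR_FP32")

print(check_vector(dense, [0.1, 0.2, 0.3, 0.4]))
print(check_vector(sparse, {3: 0.5, 1024: 1.2}))
```

A dense value must carry exactly `dimension` entries, whereas a sparse value lists only its non-zero positions, which is why no dimension is needed for it in this sketch.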
Choosing Vector Index Type
The index_param allows you to configure the appropriate indexing strategy:
- metric_type: COSINE, L2, or IP (inner product). Make sure the metric matches the one your embeddings were trained with.
- quantize_type (optional): Compresses vectors to reduce index size and speed up search, at the cost of slightly lower recall.
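Why the metric choice matters can be seen from a small worked example. The sketch below uses textbook definitions of the three metrics in plain Python; it is not Zvec's internal implementation, and the helper names and sample vectors are made up for illustration.

```python
import math

def inner_product(a, b):
    return sum(x * y for x, y in zip(a, b))

def l2_distance(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def cosine_similarity(a, b):
    norm_a = math.sqrt(inner_product(a, a))
    norm_b = math.sqrt(inner_product(b, b))
    return inner_product(a, b) / (norm_a * norm_b)

def normalize(a):
    norm = math.sqrt(inner_product(a, a))
    return [x / norm for x in a]

query = [1.0, 0.0]
aligned = [2.0, 0.0]      # same direction as the query, small magnitude
large = [10.0, 10.0]      # 45 degrees off the query, large magnitude

for name, vec in (("aligned", aligned), ("large", large)):
    print(
        name,
        "cosine:", cosine_similarity(query, vec),
        "ip:", inner_product(query, vec),
        "l2:", l2_distance(query, vec),
    )
```

With these vectors, cosine similarity and L2 distance both favor `aligned`, while the inner product favors `large` because it rewards magnitude. Once the vectors are normalized to unit length, the inner product equals the cosine similarity, so the two metrics rank results identically.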