Collection Class
zvec.create_and_open
create_and_open(
path: str, schema: CollectionSchema, option: Optional[CollectionOption] = None
) -> Collection
Create a new collection and open it for use.
If a collection already exists at the given path, it may raise an error depending on the underlying implementation.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| path | str | Path or name of the collection to create. | required |
| schema | CollectionSchema | Schema defining the structure of the collection. | required |
| option | CollectionOption | Configuration options for opening the collection. Defaults to a default-constructed CollectionOption. | None |
Returns:
| Name | Type | Description |
|---|---|---|
| Collection | Collection | An opened collection instance ready for operations. |
Examples:
>>> import zvec
>>> schema = zvec.CollectionSchema(
... name="my_collection",
... fields=[zvec.FieldSchema("id", zvec.DataType.INT64, nullable=True)]
... )
>>> coll = zvec.create_and_open("./my_collection", schema)
zvec.open
open(path: str, option: CollectionOption = CollectionOption()) -> Collection
Open an existing collection from disk.
The collection must have been previously created with create_and_open.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| path | str | Path or name of the existing collection. | required |
| option | CollectionOption | Configuration options for opening the collection. Defaults to a default-constructed CollectionOption. | CollectionOption() |
Returns:
| Name | Type | Description |
|---|---|---|
| Collection | Collection | An opened collection instance. |
Examples:
>>> import zvec
>>> coll = zvec.open("./my_collection")
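Because create_and_open may raise when a collection already exists and open requires one that was created earlier, callers often want an "open if present, otherwise create" step. The following is a minimal sketch of that pattern, not part of zvec: open_or_create is a hypothetical helper, and it assumes that a collection exists exactly when its path exists on the filesystem. The two factory functions are passed in as arguments so the helper has no hard dependency on the library.

```python
import os
from typing import Any, Callable


def open_or_create(
    path: str,
    schema: Any,
    create_fn: Callable[[str, Any], Any],
    open_fn: Callable[[str], Any],
) -> Any:
    """Open the collection at `path` if present, otherwise create it.

    Assumption: "the collection exists" is approximated by "the path exists".
    `create_fn` and `open_fn` are intended to be zvec.create_and_open and
    zvec.open respectively.
    """
    if os.path.exists(path):
        # Something is already stored at this path, so reuse it.
        return open_fn(path)
    # Nothing there yet: create a fresh collection with the given schema.
    return create_fn(path, schema)
```

With the library installed, a call would look like open_or_create("./my_collection", schema, zvec.create_and_open, zvec.open).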
zvec.model.collection.Collection
Collection(obj: _Collection)
Represents an opened collection in Zvec.
A Collection provides methods for data definition (DDL), data manipulation (DML),
and querying (DQL). It is obtained via create_and_open() or open().
This class is not meant to be instantiated directly; use factory functions instead.
Methods:
| Name | Description |
|---|---|
| destroy | Permanently delete the collection from disk. |
| flush | Force all pending writes to disk. |
| create_index | Create an index on a field. |
| drop_index | Remove the index from a field. |
| optimize | Optimize the collection (e.g., merge segments, rebuild index). |
| add_column | Add a new column to the collection. |
| drop_column | Remove a column from the collection. |
| alter_column | Rename a column or update its schema. |
| insert | Insert new documents into the collection. |
| upsert | Insert new documents or update existing ones by ID. |
| update | Update existing documents by ID. |
| delete | Delete documents by ID. |
| delete_by_filter | Delete documents matching a filter expression. |
| fetch | Retrieve documents by ID. |
| query | Perform vector similarity search with optional filtering and re-ranking. |
Attributes:
| Name | Type | Description |
|---|---|---|
| path | str | The filesystem path of the collection. |
| option | CollectionOption | The options used to open the collection. |
| schema | CollectionSchema | The schema defining the structure of the collection. |
| stats | CollectionStats | Runtime statistics about the collection (e.g., doc count, size). |
Attributes
path
property
path: str
str: The filesystem path of the collection.
schema
property
schema: CollectionSchema
CollectionSchema: The schema defining the structure of the collection.
stats
property
stats: CollectionStats
CollectionStats: Runtime statistics about the collection (e.g., doc count, size).
Functions
destroy
destroy() -> None
Permanently delete the collection from disk.
Warning
This operation is irreversible. All data will be lost.
flush
flush() -> None
Force all pending writes to disk.
Ensures durability of recent inserts/updates.
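Since flush is what makes recent writes durable, it can be convenient to guarantee that it runs at the end of a batch of work. Below is a small sketch of one way to do that with a context manager. The name flushed is hypothetical and not part of zvec; the only collection method it relies on is flush() as documented above.

```python
from contextlib import contextmanager
from typing import Any, Iterator


@contextmanager
def flushed(collection: Any) -> Iterator[Any]:
    """Yield `collection` and call its flush() when the block exits.

    Hypothetical helper: works with any object that exposes flush().
    """
    try:
        yield collection
    finally:
        # Runs on normal exit and when the block raises, so writes made
        # before a failure are still pushed to disk.
        collection.flush()
```

Usage would be along the lines of `with flushed(coll) as c:` followed by the inserts or updates that should be persisted together.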
create_index
create_index(
field_name: str,
index_param: Union[HnswIndexParam, IVFIndexParam, FlatIndexParam, InvertIndexParam],
option: IndexOption = IndexOption(),
) -> None
Create an index on a field.
Vector index types (HNSW, IVF, FLAT) can only be applied to vector fields.
Inverted index (InvertIndexParam) is for scalar fields.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| field_name | str | Name of the field to index. | required |
| index_param | Union[HnswIndexParam, IVFIndexParam, FlatIndexParam, InvertIndexParam] | Index configuration. | required |
| option | Optional[IndexOption] | Index creation options. Defaults to IndexOption(). | IndexOption() |
Raises:
| Type | Description |
|---|---|
| ValueError | If a vector index is applied to a non-vector field. |
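The pairing rule above (HNSW, IVF and FLAT for vector fields, inverted index for scalar fields) can be checked before calling create_index. The sketch below illustrates the rule only: the string labels and the function check_index_kind are hypothetical, and rejecting an inverted index on a vector field is an assumption, since the reference documents only the vector-index-on-non-vector-field error.

```python
# Hypothetical labels for this sketch; zvec itself takes parameter objects
# (HnswIndexParam, IVFIndexParam, FlatIndexParam, InvertIndexParam).
VECTOR_INDEX_KINDS = frozenset({"hnsw", "ivf", "flat"})
SCALAR_INDEX_KINDS = frozenset({"invert"})


def check_index_kind(index_kind: str, field_is_vector: bool) -> None:
    """Raise ValueError when the index kind does not fit the field kind."""
    kind = index_kind.lower()
    if kind in VECTOR_INDEX_KINDS:
        if not field_is_vector:
            # Documented case: vector index on a non-vector field.
            raise ValueError(f"{kind} is a vector index but the field is not a vector field")
        return
    if kind in SCALAR_INDEX_KINDS:
        if field_is_vector:
            # Assumption: the reference only says the inverted index
            # "is for scalar fields"; this sketch treats misuse as an error.
            raise ValueError(f"{kind} is a scalar index but the field is a vector field")
        return
    raise ValueError(f"unknown index kind: {index_kind!r}")
```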
drop_index
drop_index(field_name: str) -> None
Remove the index from a field.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| field_name | str | Name of the indexed field. | required |
optimize
optimize(option: OptimizeOption = OptimizeOption()) -> None
Optimize the collection (e.g., merge segments, rebuild index).
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| option | Optional[OptimizeOption] | Optimization options. Defaults to OptimizeOption(). | OptimizeOption() |
add_column
add_column(
field_schema: FieldSchema, expression: str = "", option: AddColumnOption = AddColumnOption()
) -> None
Add a new column to the collection.
The column is populated using the provided expression (e.g., SQL-like formula).
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| field_schema | FieldSchema | Schema definition for the new column. | required |
| expression | str | Expression to compute values for existing documents. | '' |
| option | Optional[AddColumnOption] | Options for the operation. Defaults to AddColumnOption(). | AddColumnOption() |
drop_column
drop_column(field_name: str) -> None
Remove a column from the collection.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
|
str
|
Name of the column to drop. |
required |
alter_column
alter_column(
old_name: str,
new_name: Optional[str] = None,
field_schema: Optional[FieldSchema] = None,
option: AlterColumnOption = AlterColumnOption(),
) -> None
Rename a column or update its schema.
This method supports the following atomic operations:
- Rename only (when field_schema is None).
- Modify schema only (when new_name is None or an empty string).
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| old_name | str | The current name of the column to be altered. | required |
| new_name | Optional[str] | The new name for the column. If provided and non-empty, the column will be renamed. If None or empty, the column is not renamed. | None |
| field_schema | Optional[FieldSchema] | The new schema definition. If provided, the column's type, dimension, or other properties will be updated. If None, the schema is left unchanged. | None |
| option | AlterColumnOption | Options controlling the alteration behavior. Defaults to AlterColumnOption(). | AlterColumnOption() |
Limitation: This operation supports only scalar numeric columns, such as:
- DOUBLE, FLOAT
- INT32, INT64, UINT32, UINT64
Note
- Schema modification may trigger data migration or index rebuild.
Examples:
>>> from zvec import DataType, FieldSchema
>>> # Rename column only (alter_column returns None)
>>> collection.alter_column(old_name="id", new_name="doc_id")
>>> # Modify schema only
>>> new_schema = FieldSchema(name="doc_id", dtype=DataType.INT64)
>>> collection.alter_column("id", field_schema=new_schema)
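How the two optional arguments select the operation can be summarized in a few lines. This is a sketch of the decision logic as described above, not zvec's implementation: alter_mode is a hypothetical function, the combined rename-and-modify case is not described in this reference, and raising when neither argument is given is a choice made for the sketch.

```python
from typing import Any, Optional


def alter_mode(new_name: Optional[str] = None, field_schema: Optional[Any] = None) -> str:
    """Classify an alter_column call by the arguments it would receive."""
    renaming = bool(new_name)  # None and "" both mean "no rename"
    changing_schema = field_schema is not None
    if renaming and not changing_schema:
        return "rename"
    if changing_schema and not renaming:
        return "modify_schema"
    if renaming and changing_schema:
        # Not covered by the reference text; labeled separately here.
        return "rename_and_modify"
    # Sketch-only behavior: reject a call that would change nothing.
    raise ValueError("pass new_name, field_schema, or both")
```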
insert
Insert new documents into the collection.
Documents must have unique IDs and conform to the schema.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| | Union[Doc, list[Doc]] | One or more documents to insert. | required |
Returns:
| Type | Description |
|---|---|
| Union[Status, list[Status]] | If a single Doc was given, returns its Status; if a list was given, returns a list of Status objects. |
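Because the return value is a single Status for a single Doc and a list for a list, code that loads data in chunks has to normalize the result. The sketch below shows one way to do that. insert_in_batches and its batch_size parameter are hypothetical; the only library call used is collection.insert with a list of documents, passed positionally.

```python
from typing import Any, List, Sequence


def insert_in_batches(collection: Any, docs: Sequence[Any], batch_size: int = 1000) -> List[Any]:
    """Insert `docs` in chunks and return one flat list of statuses."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    statuses: List[Any] = []
    for start in range(0, len(docs), batch_size):
        batch = list(docs[start:start + batch_size])
        result = collection.insert(batch)
        # A list input is documented to yield a list of Status objects;
        # wrapping a lone Status keeps the helper robust either way.
        statuses.extend(result if isinstance(result, list) else [result])
    return statuses
```

The returned list lines up with the input order, so each status can be matched to its document by position.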
upsert
Insert new documents or update existing ones by ID.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| | Union[Doc, list[Doc]] | Documents to upsert. | required |
Returns:
| Type | Description |
|---|---|
| Union[Status, list[Status]] | If a single Doc was given, returns its Status; if a list was given, returns a list of Status objects. |
update
Update existing documents by ID.
Only specified fields are updated; others remain unchanged.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| | Union[Doc, list[Doc]] | Documents containing updated fields. | required |
Returns:
| Type | Description |
|---|---|
| Union[Status, list[Status]] | If a single Doc was given, returns its Status; if a list was given, returns a list of Status objects. |
delete
Delete documents by ID.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| | Union[str, list[str]] | One or more document IDs to delete. | required |
Returns:
| Type | Description |
|---|---|
| Union[Status, list[Status]] | If a single ID was given, returns its Status; if a list was given, returns a list of Status objects. |
delete_by_filter
delete_by_filter(filter: str) -> None
Delete documents matching a filter expression.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| filter | str | Boolean filter expression. | required |
fetch
Retrieve documents by ID.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| | Union[str, list[str]] | Document IDs to fetch. | required |
Returns:
| Type | Description |
|---|---|
| dict[str, Doc] | Mapping from ID to document. Missing IDs are omitted. |
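Since missing IDs are silently omitted from the result, callers that need to know which documents were not found have to compare the request against the returned mapping. A minimal sketch follows; missing_ids is a hypothetical helper that works on any dict and does not call zvec itself.

```python
from typing import Any, Dict, List, Sequence


def missing_ids(requested: Sequence[str], fetched: Dict[str, Any]) -> List[str]:
    """Return the requested IDs absent from a fetch() result, in request order."""
    return [doc_id for doc_id in requested if doc_id not in fetched]
```

For example, after `docs = collection.fetch(ids)`, calling `missing_ids(ids, docs)` lists the IDs that had no matching document.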
query
query(
vectors: Optional[Union[VectorQuery, list[VectorQuery]]] = None,
*,
topk: int = 10,
filter: Optional[str] = None,
include_vector: bool = False,
output_fields: Optional[list[str]] = None,
reranker: Optional[ReRanker] = None
) -> list[Doc]
Perform vector similarity search with optional filtering and re-ranking.
At least one VectorQuery must be provided.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| vectors | Optional[Union[VectorQuery, list[VectorQuery]]] | One or more vector queries. Defaults to None. | None |
| topk | int | Number of nearest neighbors to return. Defaults to 10. | 10 |
| filter | Optional[str] | Boolean expression to pre-filter candidates. Defaults to None. | None |
| include_vector | bool | Whether to include vector data in results. Defaults to False. | False |
| output_fields | Optional[list[str]] | Scalar fields to include. If None, all fields are returned. Defaults to None. | None |
| reranker | Optional[ReRanker] | Re-ranker to refine results. Defaults to None. | None |
Returns:
| Type | Description |
|---|---|
| list[Doc] | Top-k matching documents, sorted by relevance score. |
Examples:
>>> from zvec import VectorQuery
>>> results = collection.query(
... vectors=VectorQuery("embedding", vector=[0.1, 0.2]),
... topk=5,
... filter="category == 'tech'",
... output_fields=["title", "url"]
... )
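The vectors argument accepts either a single VectorQuery or a list, and at least one query must be provided even though the parameter defaults to None. A caller-side check can make that requirement explicit before the search runs. The sketch below is illustrative only: normalize_vector_queries is a hypothetical helper, it treats the query objects as opaque values, and raising ValueError is this sketch's choice rather than documented zvec behavior.

```python
from typing import Any, List, Optional, Union


def normalize_vector_queries(vectors: Optional[Union[Any, List[Any]]]) -> List[Any]:
    """Return the vector queries as a non-empty list, or raise ValueError."""
    if vectors is None:
        raise ValueError("at least one vector query must be provided")
    # Accept both the single-object and the list form of the argument.
    items = list(vectors) if isinstance(vectors, list) else [vectors]
    if not items:
        raise ValueError("at least one vector query must be provided")
    return items
```

The result can then be passed on as `collection.query(vectors=normalize_vector_queries(q), topk=5)`.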