Collection Class

zvec.create_and_open

create_and_open(
    path: str, schema: CollectionSchema, option: Optional[CollectionOption] = None
) -> Collection

Create a new collection and open it for use.

If a collection already exists at the given path, it may raise an error depending on the underlying implementation.

Parameters:

Name	Type	Description	Default
`path`	`str`	Path or name of the collection to create.	required
`schema`	`CollectionSchema`	Schema defining the structure of the collection.	required
`option`	`CollectionOption`	Configuration options for opening the collection. Defaults to a default-constructed `CollectionOption()` if not provided.	`None`

Returns:

Name	Type	Description
`Collection`	`Collection`	An opened collection instance ready for operations.

Examples:

>>> import zvec
>>> schema = zvec.CollectionSchema(
...     name="my_collection",
...     fields=[zvec.FieldSchema("id", zvec.DataType.INT64, nullable=True)]
... )
>>> coll = zvec.create_and_open("./my_collection", schema)

zvec.open

open(path: str, option: CollectionOption = CollectionOption()) -> Collection

Open an existing collection from disk.

The collection must have been previously created with create_and_open.

Parameters:

Name	Type	Description	Default
`path`	`str`	Path or name of the existing collection.	required
`option`	`CollectionOption`	Configuration options for opening the collection. Defaults to a default-constructed `CollectionOption()` if not provided.	`CollectionOption()`

Returns:

Name	Type	Description
`Collection`	`Collection`	An opened collection instance.

Examples:

>>> import zvec
>>> coll = zvec.open("./my_collection")
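
Since `create_and_open` may raise when a collection already exists and `open` expects one that was created earlier, an open-or-create helper can be convenient. The sketch below is hypothetical and not part of zvec: `open_or_create` assumes that a collection is stored at a filesystem path, so an existing path is taken to mean an existing collection. The two factory functions are passed in as arguments so the helper does not depend on how zvec is imported.

```python
import os
from typing import Any, Callable


def open_or_create(
    path: str,
    schema: Any,
    *,
    open_fn: Callable[[str], Any],
    create_fn: Callable[[str, Any], Any],
) -> Any:
    """Open the collection at `path`, creating it first if nothing is there.

    Hypothetical helper: assumes a collection lives at a filesystem path,
    so an existing path is treated as an existing collection.
    """
    if os.path.exists(path):
        return open_fn(path)
    return create_fn(path, schema)
```

With zvec imported, a call might look like `open_or_create("./my_collection", schema, open_fn=zvec.open, create_fn=zvec.create_and_open)`.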

zvec.model.collection.Collection

Collection(obj: _Collection)

Represents an opened collection in Zvec.

A Collection provides methods for data definition (DDL), data manipulation (DML), and querying (DQL). It is obtained via create_and_open() or open().

This class is not meant to be instantiated directly; use factory functions instead.

Methods:

Name	Description
`destroy`	Permanently delete the collection from disk.
`flush`	Force all pending writes to disk.
`create_index`	Create an index on a field.
`drop_index`	Remove the index from a field.
`optimize`	Optimize the collection (e.g., merge segments, rebuild index).
`add_column`	Add a new column to the collection.
`drop_column`	Remove a column from the collection.
`alter_column`	Rename a column, update its schema, or both.
`insert`	Insert new documents into the collection.
`upsert`	Insert new documents or update existing ones by ID.
`update`	Update existing documents by ID.
`delete`	Delete documents by ID.
`delete_by_filter`	Delete documents matching a filter expression.
`fetch`	Retrieve documents by ID.
`query`	Perform vector similarity search with optional filtering and re-ranking.

Attributes:

Name	Type	Description
`path`	`str`	The filesystem path of the collection.
`option`	`CollectionOption`	The options used to open the collection.
`schema`	`CollectionSchema`	The schema defining the structure of the collection.
`stats`	`CollectionStats`	Runtime statistics about the collection (e.g., doc count, size).

Attributes

path `property`

path: str

The filesystem path of the collection.

option `property`

option: CollectionOption

The options used to open the collection.

schema `property`

schema: CollectionSchema

The schema defining the structure of the collection.

stats `property`

stats: CollectionStats

Runtime statistics about the collection (e.g., doc count, size).

Functions

destroy

destroy() -> None

Permanently delete the collection from disk.

Warning

This operation is irreversible. All data will be lost.

flush

flush() -> None

Force all pending writes to disk.

Ensures durability of recent inserts/updates.
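
A common pattern is to write documents in batches and flush once at the end. The following is a minimal sketch of that pattern; `insert_in_batches` and its `batch_size` parameter are hypothetical names, and the helper relies only on `insert` returning a list of Status objects for a list input and on `flush` taking no arguments.

```python
from typing import Any, List, Sequence


def insert_in_batches(collection: Any, docs: Sequence[Any], batch_size: int = 100) -> List[Any]:
    """Insert `docs` in fixed-size batches, then flush once at the end.

    Returns the per-document Status objects in input order.
    """
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    statuses: List[Any] = []
    for start in range(0, len(docs), batch_size):
        batch = list(docs[start:start + batch_size])
        # insert() returns a list of Status objects when given a list.
        statuses.extend(collection.insert(batch))
    # Force pending writes to disk once all batches are submitted.
    collection.flush()
    return statuses
```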

create_index

create_index(
    field_name: str,
    index_param: Union[HnswIndexParam, IVFIndexParam, FlatIndexParam, InvertIndexParam],
    option: IndexOption = IndexOption(),
) -> None

Create an index on a field.

Vector index types (HNSW, IVF, FLAT) can only be applied to vector fields. Inverted index (InvertIndexParam) is for scalar fields.

Parameters:

Name	Type	Description	Default
`field_name`	`str`	Name of the field to index.	required
`index_param`	`Union[HnswIndexParam, IVFIndexParam, FlatIndexParam, InvertIndexParam]`	Index configuration.	required
`option`	`Optional[IndexOption]`	Index creation options. Defaults to `IndexOption()`.	`IndexOption()`

Raises:

Type	Description
`ValueError`	If a vector index is applied to a non-vector field.
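
Because a mismatched index type raises `ValueError`, callers that index fields dynamically may want to catch it. This is a small sketch under that assumption; `try_create_index` is a hypothetical wrapper, and `index_param` is expected to be one of the parameter objects listed above, constructed by the caller.

```python
from typing import Any


def try_create_index(collection: Any, field_name: str, index_param: Any) -> bool:
    """Create an index; return False if the request is rejected with ValueError."""
    try:
        collection.create_index(field_name, index_param)
    except ValueError as exc:
        # For example, a vector index applied to a non-vector field.
        print(f"could not index {field_name!r}: {exc}")
        return False
    return True
```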

drop_index

drop_index(field_name: str) -> None

Remove the index from a field.

Parameters:

Name	Type	Description	Default
`field_name`	`str`	Name of the indexed field.	required

optimize

optimize(option: OptimizeOption = OptimizeOption()) -> None

Optimize the collection (e.g., merge segments, rebuild index).

Parameters:

Name	Type	Description	Default
`option`	`Optional[OptimizeOption]`	Optimization options. Defaults to `OptimizeOption()`.	`OptimizeOption()`

add_column

add_column(
    field_schema: FieldSchema, expression: str = "", option: AddColumnOption = AddColumnOption()
) -> None

Add a new column to the collection.

The column is populated using the provided expression (e.g., SQL-like formula).

Parameters:

Name	Type	Description	Default
`field_schema`	`FieldSchema`	Schema definition for the new column.	required
`expression`	`str`	Expression to compute values for existing documents.	`''`
`option`	`Optional[AddColumnOption]`	Options for the operation. Defaults to `AddColumnOption()`.	`AddColumnOption()`

drop_column

drop_column(field_name: str) -> None

Remove a column from the collection.

Parameters:

Name	Type	Description	Default
`field_name`	`str`	Name of the column to drop.	required

alter_column

alter_column(
    old_name: str,
    new_name: Optional[str] = None,
    field_schema: Optional[FieldSchema] = None,
    option: AlterColumnOption = AlterColumnOption(),
) -> None

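Rename a column, update its schema, or both.

This method supports three atomic operations:

- Rename only (when field_schema is None).
- Modify schema only (when new_name is None or an empty string).
- Rename and modify schema together (when both new_name and field_schema are provided).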

Parameters:

Name	Type	Description	Default
`old_name`	`str`	The current name of the column to be altered.	required
`new_name`	`Optional[str]`	The new name for the column. - If provided and non-empty, the column will be renamed. - If `None` or empty string, no rename occurs.	`None`
`field_schema`	`Optional[FieldSchema]`	The new schema definition. - If provided, the column's type, dimension, or other properties will be updated. - If `None`, only renaming (if requested) is performed.	`None`
`option`	`AlterColumnOption`	Options controlling the alteration behavior. Defaults to `AlterColumnOption()`.	`AlterColumnOption()`

Limitation: This operation supports only scalar numeric columns, such as DOUBLE, FLOAT, INT32, INT64, UINT32, and UINT64.

Note

Schema modification may trigger data migration or index rebuild.

Examples:

>>> # Rename column only
>>> collection.alter_column(old_name="id", new_name="doc_id")

>>> # Modify schema only
>>> from zvec import DataType, FieldSchema
>>> new_schema = FieldSchema(name="doc_id", dtype=DataType.INT64)
>>> collection.alter_column("id", field_schema=new_schema)
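
The way `new_name` and `field_schema` select the operation can be sketched as a small classifier. This is illustrative only: `alter_column_mode` is a hypothetical helper, not a zvec function, and the "no change requested" label for a call with neither argument is an assumption about how to describe that case, not documented library behavior.

```python
from typing import Any, Optional


def alter_column_mode(new_name: Optional[str] = None, field_schema: Optional[Any] = None) -> str:
    """Classify an alter_column call by which arguments are set."""
    rename = bool(new_name)            # None and "" both mean "no rename"
    modify = field_schema is not None  # any schema object requests a change
    if rename and modify:
        return "rename and modify schema"
    if rename:
        return "rename only"
    if modify:
        return "modify schema only"
    return "no change requested"
```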

insert

insert(docs: Union[Doc, list[Doc]]) -> Union[Status, list[Status]]

Insert new documents into the collection.

Documents must have unique IDs and conform to the schema.

Parameters:

Name	Type	Description	Default
`docs`	`Union[Doc, list[Doc]]`	One or more documents to insert.	required

Returns:

Type	Description
`Union[Status, list[Status]]`	If a single Doc was given, returns its Status; if a list was given, returns a list of Status objects.
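
Since the return shape depends on the input shape, code that handles both cases may prefer to normalize the result first. A minimal sketch, with `as_status_list` as a hypothetical helper name; it makes no assumption about the attributes of a Status object.

```python
from typing import Any, List


def as_status_list(result: Any) -> List[Any]:
    """Return write results as a list, whether one Status or a list came back."""
    return list(result) if isinstance(result, list) else [result]
```

The same normalization would apply to the results of `upsert`, `update`, and `delete`, which share this return type.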

upsert

upsert(docs: Union[Doc, list[Doc]]) -> Union[Status, list[Status]]

Insert new documents or update existing ones by ID.

Parameters:

Name	Type	Description	Default
`docs`	`Union[Doc, list[Doc]]`	Documents to upsert.	required

Returns:

Type	Description
`Union[Status, list[Status]]`	If a single Doc was given, returns its Status; if a list was given, returns a list of Status objects.

update

update(docs: Union[Doc, list[Doc]]) -> Union[Status, list[Status]]

Update existing documents by ID.

Only specified fields are updated; others remain unchanged.

Parameters:

Name	Type	Description	Default
`docs`	`Union[Doc, list[Doc]]`	Documents containing updated fields.	required

Returns:

Type	Description
`Union[Status, list[Status]]`	If a single Doc was given, returns its Status; if a list was given, returns a list of Status objects.

delete

delete(ids: Union[str, list[str]]) -> Union[Status, list[Status]]

Delete documents by ID.

Parameters:

Name	Type	Description	Default
`ids`	`Union[str, list[str]]`	One or more document IDs to delete.	required

Returns:

Type	Description
`Union[Status, list[Status]]`	If a single id was given, returns its Status; if a list was given, returns a list of Status objects.

delete_by_filter

delete_by_filter(filter: str) -> None

Delete documents matching a filter expression.

Parameters:

Name	Type	Description	Default
`filter`	`str`	Boolean expression (e.g., `"age > 30"`).	required

fetch

fetch(ids: Union[str, list[str]]) -> dict[str, Doc]

Retrieve documents by ID.

Parameters:

Name	Type	Description	Default
`ids`	`Union[str, list[str]]`	Document IDs to fetch.	required

Returns:

Type	Description
`dict[str, Doc]`	Mapping from ID to document. Missing IDs are omitted.
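
Because missing IDs are silently omitted from the mapping, a caller that needs to know which IDs were not found has to compare against the request. A short sketch, with `fetch_with_missing` as a hypothetical helper name:

```python
from typing import Any, Dict, List, Tuple


def fetch_with_missing(collection: Any, ids: List[str]) -> Tuple[Dict[str, Any], List[str]]:
    """Fetch documents and also report which requested IDs were absent."""
    found = collection.fetch(ids)
    # IDs that do not appear as keys were not present in the collection.
    missing = [doc_id for doc_id in ids if doc_id not in found]
    return found, missing
```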

query

query(
    vectors: Optional[Union[VectorQuery, list[VectorQuery]]] = None,
    *,
    topk: int = 10,
    filter: Optional[str] = None,
    include_vector: bool = False,
    output_fields: Optional[list[str]] = None,
    reranker: Optional[ReRanker] = None
) -> list[Doc]

Perform vector similarity search with optional filtering and re-ranking.

At least one VectorQuery must be provided.

Parameters:

Name	Type	Description	Default
`vectors`	`Optional[Union[VectorQuery, list[VectorQuery]]]`	One or more vector queries. Defaults to None.	`None`
`topk`	`int`	Number of nearest neighbors to return. Defaults to 10.	`10`
`filter`	`Optional[str]`	Boolean expression to pre-filter candidates. Defaults to None.	`None`
`include_vector`	`bool`	Whether to include vector data in results. Defaults to False.	`False`
`output_fields`	`Optional[list[str]]`	Scalar fields to include. If None, all fields are returned. Defaults to None.	`None`
`reranker`	`Optional[ReRanker]`	Re-ranker to refine results. Defaults to None.	`None`

Returns:

Type	Description
`list[Doc]`	Top-k matching documents, sorted by relevance score.

Examples:

>>> from zvec import VectorQuery
>>> results = collection.query(
...     vectors=VectorQuery("embedding", vector=[0.1, 0.2]),
...     topk=5,
...     filter="category == 'tech'",
...     output_fields=["title", "url"]
... )
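
Although `vectors` defaults to `None` in the signature, at least one VectorQuery is required. A thin wrapper can fail early with a clear message; this is a sketch only, `checked_query` is a hypothetical name, and how the library itself reacts to a missing query is not specified here.

```python
from typing import Any, List


def checked_query(collection: Any, vectors: Any = None, **kwargs: Any) -> List[Any]:
    """Call collection.query() only when at least one vector query is given."""
    if vectors is None or (isinstance(vectors, list) and not vectors):
        raise ValueError("at least one VectorQuery must be provided")
    # Remaining keyword arguments (topk, filter, ...) are passed through unchanged.
    return collection.query(vectors=vectors, **kwargs)
```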
