Collection Class

zvec.create_and_open

create_and_open(
    path: str, schema: CollectionSchema, option: Optional[CollectionOption] = None
) -> Collection

Create a new collection and open it for use.

If a collection already exists at the given path, it may raise an error depending on the underlying implementation.

Parameters:

Name Type Description Default

path

str

Path or name of the collection to create.

required

schema

CollectionSchema

Schema defining the structure of the collection.

required

option

CollectionOption

Configuration options for opening the collection. Defaults to a default-constructed CollectionOption() if not provided.

None

Returns:

Name Type Description
Collection Collection

An opened collection instance ready for operations.

Examples:

>>> import zvec
>>> schema = zvec.CollectionSchema(
...     name="my_collection",
...     fields=[zvec.FieldSchema("id", zvec.DataType.INT64, nullable=True)]
... )
>>> coll = zvec.create_and_open("./my_collection", schema)

zvec.open

Open an existing collection from disk.

The collection must have been previously created with create_and_open.

Parameters:

Name Type Description Default

path

str

Path or name of the existing collection.

required

option

CollectionOption

Configuration options for opening the collection. Defaults to a default-constructed CollectionOption() if not provided.

CollectionOption()

Returns:

Name Type Description
Collection Collection

An opened collection instance.

Examples:

>>> import zvec
>>> coll = zvec.open("./my_collection")
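Because create_and_open may raise when a collection already exists and open requires one that was created earlier, callers often want a single "open or create" entry point. The following is a minimal sketch of that pattern, not part of the zvec API: the helper name open_or_create is hypothetical, the two factory functions are passed in as arguments, and it assumes that an existing collection shows up as a filesystem entry at the given path.

```python
import os
from typing import Any, Callable


def open_or_create(
    path: str,
    schema: Any,
    open_fn: Callable[[str], Any],
    create_fn: Callable[[str, Any], Any],
) -> Any:
    """Open the collection at `path` if something exists there, else create it.

    `open_fn` and `create_fn` stand in for zvec.open and zvec.create_and_open.
    They are parameters so the sketch does not depend on how either function
    reports errors.
    """
    # Assumption: an existing collection is visible as a filesystem entry.
    if os.path.exists(path):
        return open_fn(path)
    return create_fn(path, schema)
```

With zvec imported, a call would look like open_or_create("./my_collection", schema, zvec.open, zvec.create_and_open).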

zvec.model.collection.Collection

Collection(obj: _Collection)

Represents an opened collection in Zvec.

A Collection provides methods for data definition (DDL), data manipulation (DML), and querying (DQL). It is obtained via create_and_open() or open().

This class is not meant to be instantiated directly; use factory functions instead.

Methods:

Name Description
destroy

Permanently delete the collection from disk.

flush

Force all pending writes to disk.

create_index

Create an index on a field.

drop_index

Remove the index from a field.

optimize

Optimize the collection (e.g., merge segments, rebuild index).

add_column

Add a new column to the collection.

drop_column

Remove a column from the collection.

alter_column

Rename a column or update its schema.

insert

Insert new documents into the collection.

upsert

Insert new documents or update existing ones by ID.

update

Update existing documents by ID.

delete

Delete documents by ID.

delete_by_filter

Delete documents matching a filter expression.

fetch

Retrieve documents by ID.

query

Perform vector similarity search with optional filtering and re-ranking.

Attributes:

Name Type Description
path str

str: The filesystem path of the collection.

option CollectionOption

CollectionOption: The options used to open the collection.

schema CollectionSchema

CollectionSchema: The schema defining the structure of the collection.

stats CollectionStats

CollectionStats: Runtime statistics about the collection (e.g., doc count, size).

Attributes

path property

path: str

str: The filesystem path of the collection.

option property

option: CollectionOption

CollectionOption: The options used to open the collection.

schema property

schema: CollectionSchema

CollectionSchema: The schema defining the structure of the collection.

stats property

stats: CollectionStats

CollectionStats: Runtime statistics about the collection (e.g., doc count, size).

Functions

destroy

destroy() -> None

Permanently delete the collection from disk.

Warning

This operation is irreversible. All data will be lost.

flush

flush() -> None

Force all pending writes to disk.

Ensures durability of recent inserts/updates.
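One common way to combine insert and flush is to write documents in batches and flush once at the end. The sketch below illustrates that pattern under stated assumptions: the helper name insert_in_batches and the default batch size of 1000 are arbitrary choices for illustration, and the collection argument is any object exposing the insert and flush methods documented on this page.

```python
from typing import Any, Iterable, List


def insert_in_batches(
    collection: Any, docs: Iterable[Any], batch_size: int = 1000
) -> List[Any]:
    """Insert `docs` in fixed-size batches, then flush once.

    Returns the per-document statuses in input order. Relies on insert()
    returning a list of statuses when it is given a list of documents.
    """
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")

    statuses: List[Any] = []
    batch: List[Any] = []
    for doc in docs:
        batch.append(doc)
        if len(batch) == batch_size:
            statuses.extend(collection.insert(batch))
            batch = []
    if batch:
        # Write the final, possibly smaller, batch.
        statuses.extend(collection.insert(batch))

    # A single flush after all batches forces the pending writes to disk.
    collection.flush()
    return statuses
```

Flushing once per load rather than once per batch is a design choice of this sketch; the reference does not state the cost of flush.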

create_index

Create an index on a field.

Vector index types (HNSW, IVF, FLAT) can only be applied to vector fields. The inverted index (InvertIndexParam) is for scalar fields.

Parameters:

Name Type Description Default
field_name
str

Name of the field to index.

required
index_param
Union[HnswIndexParam, IVFIndexParam, FlatIndexParam, InvertIndexParam]

Index configuration.

required
option
Optional[IndexOption]

Index creation options. Defaults to IndexOption().

IndexOption()

Raises:

Type Description
ValueError

If a vector index is applied to a non-vector field.
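The pairing rule between index types and field types can be checked before calling create_index. The following is an illustrative sketch only: the function name check_index_kind, the string labels for index kinds, and the field_is_vector flag are hypothetical and do not come from the zvec API.

```python
# Hypothetical labels for the index families named in this reference.
VECTOR_INDEX_KINDS = {"hnsw", "ivf", "flat"}
SCALAR_INDEX_KINDS = {"invert"}


def check_index_kind(index_kind: str, field_is_vector: bool) -> None:
    """Raise ValueError if the index kind does not fit the field kind."""
    kind = index_kind.lower()
    if kind in VECTOR_INDEX_KINDS:
        if not field_is_vector:
            # Mirrors the documented ValueError for this case.
            raise ValueError(f"{index_kind} is a vector index; field is not a vector")
    elif kind in SCALAR_INDEX_KINDS:
        if field_is_vector:
            # The reference does not say what the library does here;
            # rejecting it is this sketch's own choice.
            raise ValueError(f"{index_kind} is a scalar index; field is a vector")
    else:
        raise ValueError(f"unknown index kind: {index_kind}")
```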

drop_index

drop_index(field_name: str) -> None

Remove the index from a field.

Parameters:

Name Type Description Default
field_name
str

Name of the indexed field.

required

optimize

optimize(option: OptimizeOption = OptimizeOption()) -> None

Optimize the collection (e.g., merge segments, rebuild index).

Parameters:

Name Type Description Default
option
Optional[OptimizeOption]

Optimization options. Defaults to OptimizeOption().

OptimizeOption()

add_column

add_column(
    field_schema: FieldSchema, expression: str = "", option: AddColumnOption = AddColumnOption()
) -> None

Add a new column to the collection.

The column is populated using the provided expression (e.g., SQL-like formula).

Parameters:

Name Type Description Default
field_schema
FieldSchema

Schema definition for the new column.

required
expression
str

Expression to compute values for existing documents.

''
option
Optional[AddColumnOption]

Options for the operation. Defaults to AddColumnOption().

AddColumnOption()

drop_column

drop_column(field_name: str) -> None

Remove a column from the collection.

Parameters:

Name Type Description Default
field_name
str

Name of the column to drop.

required

alter_column

alter_column(
    old_name: str,
    new_name: Optional[str] = None,
    field_schema: Optional[FieldSchema] = None,
    option: AlterColumnOption = AlterColumnOption(),
) -> None

Rename a column or update its schema.

This method supports the following atomic operations:
  1. Rename only (when field_schema is None).
  2. Modify schema only (when new_name is None or empty string).

Parameters:

Name Type Description Default
old_name
str

The current name of the column to be altered.

required
new_name
Optional[str]

The new name for the column. If provided and non-empty, the column is renamed. If None or an empty string, no rename occurs.

None
field_schema
Optional[FieldSchema]

The new schema definition. If provided, the column's type, dimension, or other properties are updated. If None, only renaming (if requested) is performed.

None
option
AlterColumnOption

Options controlling the alteration behavior. Defaults to AlterColumnOption().

AlterColumnOption()

Limitation: This operation only supports scalar numeric columns, such as DOUBLE, FLOAT, INT32, INT64, UINT32, and UINT64.

Note
  • Schema modification may trigger data migration or index rebuild.

Examples:

>>> from zvec import DataType, FieldSchema
>>> # Rename column only
>>> collection.alter_column(old_name="id", new_name="doc_id")
>>> # Modify schema only (the column is now named "doc_id")
>>> new_schema = FieldSchema(name="doc_id", dtype=DataType.INT64)
>>> collection.alter_column("doc_id", field_schema=new_schema)
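The way new_name and field_schema select an operation can be written out as a small decision function. This is a sketch of the rules in the parameter descriptions, not library code: the function name alter_mode and its return labels are hypothetical, and the "rename_and_modify" and "no_change" branches are inferences, since the reference does not spell out what happens when both or neither argument is given.

```python
from typing import Any, Optional


def alter_mode(new_name: Optional[str] = None, field_schema: Optional[Any] = None) -> str:
    """Classify an alter_column call by which arguments are set."""
    # None and the empty string both mean "no rename".
    rename = bool(new_name)
    modify = field_schema is not None

    if rename and modify:
        return "rename_and_modify"  # inferred, not stated in the reference
    if rename:
        return "rename_only"
    if modify:
        return "modify_only"
    return "no_change"  # inferred, not stated in the reference
```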

insert

insert(docs: Union[Doc, list[Doc]]) -> Union[Status, list[Status]]

Insert new documents into the collection.

Documents must have unique IDs and conform to the schema.

Parameters:

Name Type Description Default
docs
Union[Doc, list[Doc]]

One or more documents to insert.

required

Returns:

Type Description
Union[Status, list[Status]]

Union[Status, list[Status]]: If a single Doc was given, returns its Status; if a list was given, returns a list of Status objects.
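Since the return value is either one Status or a list of them, depending on the input, calling code may find it convenient to normalize the result to a list first. A minimal sketch follows; the helper name as_status_list is hypothetical, and it assumes only that a single Status is never itself a list. The same shape applies to any method on this page that returns Union[Status, list[Status]].

```python
from typing import List, TypeVar, Union

T = TypeVar("T")


def as_status_list(result: Union[T, List[T]]) -> List[T]:
    """Return `result` unchanged if it is a list, else wrap it in one."""
    return result if isinstance(result, list) else [result]
```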

upsert

upsert(docs: Union[Doc, list[Doc]]) -> Union[Status, list[Status]]

Insert new documents or update existing ones by ID.

Parameters:

Name Type Description Default
docs
Union[Doc, list[Doc]]

Documents to upsert.

required

Returns:

Type Description
Union[Status, list[Status]]

Union[Status, list[Status]]: If a single Doc was given, returns its Status; if a list was given, returns a list of Status objects.

update

update(docs: Union[Doc, list[Doc]]) -> Union[Status, list[Status]]

Update existing documents by ID.

Only specified fields are updated; others remain unchanged.

Parameters:

Name Type Description Default
docs
Union[Doc, list[Doc]]

Documents containing updated fields.

required

Returns:

Type Description
Union[Status, list[Status]]

Union[Status, list[Status]]: If a single Doc was given, returns its Status; if a list was given, returns a list of Status objects.

delete

delete(ids: Union[str, list[str]]) -> Union[Status, list[Status]]

Delete documents by ID.

Parameters:

Name Type Description Default
ids
Union[str, list[str]]

One or more document IDs to delete.

required

Returns:

Type Description
Union[Status, list[Status]]

Union[Status, list[Status]]: If a single id was given, returns its Status; if a list was given, returns a list of Status objects.

delete_by_filter

delete_by_filter(filter: str) -> None

Delete documents matching a filter expression.

Parameters:

Name Type Description Default
filter
str

Boolean expression (e.g., "age > 30").

required

fetch

fetch(ids: Union[str, list[str]]) -> dict[str, Doc]

Retrieve documents by ID.

Parameters:

Name Type Description Default
ids
Union[str, list[str]]

Document IDs to fetch.

required

Returns:

Type Description
dict[str, Doc]

dict[str, Doc]: Mapping from ID to document. Missing IDs are omitted.
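Because missing IDs are silently omitted from the result, a caller that needs to know which documents were not found has to compare the request against the returned mapping. A minimal sketch, with a hypothetical helper name missing_ids:

```python
from typing import Dict, Iterable, List, TypeVar

D = TypeVar("D")


def missing_ids(requested: Iterable[str], fetched: Dict[str, D]) -> List[str]:
    """Return requested IDs absent from `fetched`, deduplicated, in request order."""
    # dict.fromkeys keeps first-seen order while dropping duplicates.
    return [doc_id for doc_id in dict.fromkeys(requested) if doc_id not in fetched]
```

Typical use would be ids = ["a", "b"]; docs = collection.fetch(ids); not_found = missing_ids(ids, docs).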

query

query(
    vectors: Optional[Union[VectorQuery, list[VectorQuery]]] = None,
    *,
    topk: int = 10,
    filter: Optional[str] = None,
    include_vector: bool = False,
    output_fields: Optional[list[str]] = None,
    reranker: Optional[ReRanker] = None
) -> list[Doc]

Perform vector similarity search with optional filtering and re-ranking.

At least one VectorQuery must be provided.

Parameters:

Name Type Description Default
vectors
Optional[Union[VectorQuery, list[VectorQuery]]]

One or more vector queries. Defaults to None.

None
topk
int

Number of nearest neighbors to return. Defaults to 10.

10
filter
Optional[str]

Boolean expression to pre-filter candidates. Defaults to None.

None
include_vector
bool

Whether to include vector data in results. Defaults to False.

False
output_fields
Optional[list[str]]

Scalar fields to include. If None, all fields are returned. Defaults to None.

None
reranker
Optional[ReRanker]

Re-ranker to refine results. Defaults to None.

None

Returns:

Type Description
list[Doc]

list[Doc]: Top-k matching documents, sorted by relevance score.

Examples:

>>> from zvec import VectorQuery
>>> results = collection.query(
...     vectors=VectorQuery("embedding", vector=[0.1, 0.2]),
...     topk=5,
...     filter="category == 'tech'",
...     output_fields=["title", "url"]
... )
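To make the semantics of filter and topk concrete, the sketch below runs a brute-force version of "pre-filter, then return the top-k most similar" over plain Python dicts. It is a conceptual illustration only and not how zvec executes a query: the cosine metric, the "embedding" key, the predicate callable (standing in for a filter expression), and the function names are all assumptions made for this example.

```python
import math
from typing import Any, Callable, Dict, List, Optional, Sequence

PlainDoc = Dict[str, Any]  # a plain dict, not zvec's Doc class


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def brute_force_query(
    docs: List[PlainDoc],
    query: Sequence[float],
    topk: int = 10,
    predicate: Optional[Callable[[PlainDoc], bool]] = None,
) -> List[PlainDoc]:
    """Filter candidates first, then return the `topk` most similar docs."""
    # Pre-filter: documents that fail the predicate are never scored.
    candidates = [d for d in docs if predicate is None or predicate(d)]
    # Rank by similarity, highest first, and keep the top-k.
    ranked = sorted(candidates, key=lambda d: cosine(d["embedding"], query), reverse=True)
    return ranked[:topk]
```

Note that with pre-filtering, a document outside the filter is excluded even when it is closer to the query than every remaining candidate.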