Extension

zvec.extension

Modules:

Name	Description
`embedding`
`rerank`

Classes:

Name	Description
`DenseEmbeddingFunction`	Abstract base class for dense vector embedding functions.
`QwenEmbeddingFunction`	Dense embedding function using Qwen (DashScope) Text Embedding API.
`QwenReRanker`	Re-ranker using Qwen (DashScope) LLM-based re-ranking API.
`ReRanker`	Abstract base class for re-ranking search results.
`RrfReRanker`	Re-ranker using Reciprocal Rank Fusion (RRF).
`WeightedReRanker`	Re-ranker that combines scores from multiple vector fields using weights.

Classes

DenseEmbeddingFunction

DenseEmbeddingFunction(dimension: int, data_type: DataType = VECTOR_FP32)

Bases: ABC

Abstract base class for dense vector embedding functions.

Dense embedding functions map text to fixed-length real-valued vectors. Subclasses must implement the embed() method.

Parameters:

Name	Type	Description	Default
`dimension`	`int`	Dimensionality of the output embedding vector.	required
`data_type`	`DataType`	Numeric type of the embedding. Defaults to `DataType.VECTOR_FP32`.	`VECTOR_FP32`

Note

This class is callable: embedding_func("text") is equivalent to embedding_func.embed("text").

Methods:

Name	Description
`embed`	Generate a dense embedding vector for the input text.

Attributes:

Name	Type	Description
`dimension`	`int`	The expected dimensionality of the embedding vector.
`data_type`	`DataType`	The numeric data type of the embedding (e.g., VECTOR_FP32).

Attributes

dimension `property`

dimension: int

int: The expected dimensionality of the embedding vector.

data_type `property`

data_type: DataType

DataType: The numeric data type of the embedding (e.g., VECTOR_FP32).

Functions

embed `abstractmethod`

embed(text: str) -> list[Union[int, float]]

Generate a dense embedding vector for the input text.

Parameters:

Name	Type	Description	Default
`text`	`str`	Input text to embed.	required

Returns:

Type	Description
`list[Union[int, float]]`	A list of numbers representing the embedding. Length must equal `self.dimension`.
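The contract described above (an abstract `embed()`, a fixed `dimension`, and a callable instance) can be sketched without the library. The snippet below is a minimal illustration only: `DenseEmbeddingSketch` is a local stand-in written for this example, not zvec's own class, it omits `data_type`, and `HashEmbedding` is a hypothetical toy subclass with no semantic meaning.

```python
from abc import ABC, abstractmethod
import hashlib


class DenseEmbeddingSketch(ABC):
    """Local stand-in mirroring the documented interface; not zvec's class."""

    def __init__(self, dimension: int):
        self._dimension = dimension

    @property
    def dimension(self) -> int:
        return self._dimension

    @abstractmethod
    def embed(self, text: str) -> list[float]:
        """Return a vector whose length equals self.dimension."""

    def __call__(self, text: str) -> list[float]:
        # Calling the object delegates to embed(), as the note describes.
        return self.embed(text)


class HashEmbedding(DenseEmbeddingSketch):
    """Toy deterministic embedding: SHA-256 digest bytes scaled to [0, 1]."""

    def embed(self, text: str) -> list[float]:
        digest = hashlib.sha256(text.encode("utf-8")).digest()
        # Repeat the 32-byte digest as needed, then cut to the requested length.
        repeats = -(-self.dimension // len(digest))  # ceiling division
        raw = (digest * repeats)[: self.dimension]
        return [byte / 255.0 for byte in raw]


embedding_func = HashEmbedding(dimension=8)
vector = embedding_func("hello")  # same result as embedding_func.embed("hello")
```

A real subclass would replace the hashing with a model call, but the length check against `dimension` is the part of the contract that callers rely on.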

QwenEmbeddingFunction

QwenEmbeddingFunction(
    dimension: int, model: str = "text-embedding-v4", api_key: Optional[str] = None
)

Bases: DenseEmbeddingFunction

Dense embedding function using Qwen (DashScope) Text Embedding API.

This implementation uses the DashScope service to generate embeddings via Qwen's text embedding models (e.g., text-embedding-v4).

Parameters:

Name	Type	Description	Default
`dimension`	`int`	Desired embedding dimension (e.g., 1024).	required
`model`	`str`	DashScope embedding model name. Defaults to `"text-embedding-v4"`.	`'text-embedding-v4'`
`api_key`	`Optional[str]`	DashScope API key. If not provided, it is read from the `DASHSCOPE_API_KEY` environment variable.	`None`

Raises:

Type	Description
`ValueError`	If the API key is missing or the input text is invalid.

Note

Requires the dashscope Python package. Embedding results are cached using functools.lru_cache (maxsize=10).

Methods:

Name	Description
`embed`	Generate embedding for a given text using Qwen (via DashScope).

Attributes:

Name	Type	Description
`dimension`	`int`	The expected dimensionality of the embedding vector.
`data_type`	`DataType`	The numeric data type of the embedding (e.g., VECTOR_FP32).
`model`	`str`	The DashScope embedding model name in use.

Attributes

dimension `property`

dimension: int

int: The expected dimensionality of the embedding vector.

data_type `property`

data_type: DataType

DataType: The numeric data type of the embedding (e.g., VECTOR_FP32).

model `property`

model: str

str: The DashScope embedding model name in use.

Functions

embed `cached`

embed(text: str) -> list[Union[int, float]]

Generate embedding for a given text using Qwen (via DashScope).

Parameters:

Name	Type	Description	Default
`text`	`str`	Input text to embed. Must be a non-empty, valid string.	required

Returns:

Type	Description
`list[Union[int, float]]`	The dense embedding vector.

Raises:

Type	Description
`ValueError`	If the input is invalid or the API response is malformed.
`RuntimeError`	If a network or internal error occurs during the API call.
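The caching note for this class mentions `functools.lru_cache` with `maxsize=10`. The sketch below shows what that cache size implies in practice: repeated texts are served locally, and the least recently used entry is evicted once more than ten distinct texts have been embedded. `fake_remote_embed` is a hypothetical stand-in for the DashScope request (no network is involved), and returning a tuple is a choice made for this example so that callers cannot mutate a cached value.

```python
from functools import lru_cache

calls = []  # records every text that reaches the simulated remote service


def fake_remote_embed(text: str) -> tuple[float, ...]:
    """Hypothetical stand-in for the remote embedding request."""
    calls.append(text)
    return (float(len(text)), 0.0)


@lru_cache(maxsize=10)
def cached_embed(text: str) -> tuple[float, ...]:
    # Reject invalid input before doing any remote work.
    if not isinstance(text, str) or not text:
        raise ValueError("text must be a non-empty string")
    return fake_remote_embed(text)


cached_embed("alpha")
cached_embed("alpha")  # cache hit: no second remote call
for i in range(10):
    cached_embed(f"filler-{i}")  # ten new keys push "alpha" out of the cache
cached_embed("alpha")  # cache miss again, so the remote call is repeated

info = cached_embed.cache_info()
```

With a cache this small, workloads that cycle through more than ten distinct texts will see few hits, so batch jobs may want their own caching layer.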

QwenReRanker

QwenReRanker(
    query: Optional[str] = None,
    topn: int = 10,
    rerank_field: Optional[str] = None,
    model: str = "gte-rerank-v2",
    api_key: Optional[str] = None,
)

Bases: ReRanker

Re-ranker using Qwen (DashScope) LLM-based re-ranking API.

This re-ranker sends documents to the DashScope TextReRank service for cross-encoder style re-ranking based on semantic relevance to the query.

Parameters:

Name	Type	Description	Default
`query`	`Optional[str]`	Query text for semantic re-ranking. Required: the signature defaults to `None`, but a `ValueError` is raised if it is missing.	`None`
`topn`	`int`	Number of top documents to return. Defaults to 10.	`10`
`rerank_field`	`Optional[str]`	Field name containing the document text to re-rank. Required: the signature defaults to `None`, but a `ValueError` is raised if it is missing.	`None`
`model`	`str`	DashScope re-ranking model name. Defaults to `"gte-rerank-v2"`.	`'gte-rerank-v2'`
`api_key`	`Optional[str]`	DashScope API key. If not provided, it is read from the `DASHSCOPE_API_KEY` environment variable.	`None`

Raises:

Type	Description
`ValueError`	If `query` is missing, `rerank_field` is missing, or the API key is not provided.

Note

Requires the dashscope Python package. Documents without content in rerank_field are skipped.

Methods:

Name	Description
`rerank`	Re-rank documents using Qwen's TextReRank API.

Attributes:

Name	Type	Description
`topn`	`int`	Number of top documents to return after re-ranking.
`query`	`str`	Query text used for re-ranking.
`rerank_field`	`Optional[str]`	Field name used as re-ranking input.
`model`	`str`	DashScope re-ranking model name.

Attributes

topn `property`

topn: int

int: Number of top documents to return after re-ranking.

query `property`

query: str

str: Query text used for re-ranking.

rerank_field `property`

rerank_field: Optional[str]

Optional[str]: Field name used as re-ranking input.

model `property`

model: str

str: DashScope re-ranking model name.

Functions

rerank

rerank(query_results: dict[str, list[Doc]]) -> list[Doc]

Re-rank documents using Qwen's TextReRank API.

Parameters:

Name	Type	Description	Default
`query_results`	`dict[str, list[Doc]]`	Results from vector search.	required

Returns:

Type	Description
`list[Doc]`	Re-ranked documents with relevance scores from Qwen.

Raises:

Type	Description
`ValueError`	If the API call fails or no valid documents are found.
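The constructor rules for `QwenReRanker` (both `query` and `rerank_field` are required despite defaulting to `None`, and the API key falls back to `DASHSCOPE_API_KEY`) can be summarized as a small validation routine. This is a sketch of the documented behavior, not the library's code: `resolve_reranker_settings` and its `env` parameter are hypothetical names introduced here, and the error messages are illustrative.

```python
import os
from typing import Mapping, Optional


def resolve_reranker_settings(
    query: Optional[str],
    rerank_field: Optional[str],
    api_key: Optional[str] = None,
    env: Optional[Mapping[str, str]] = None,
) -> dict:
    """Validate arguments the way the parameter table describes (assumed order)."""
    # env is injectable so the fallback can be exercised without real credentials.
    env = os.environ if env is None else env
    if not query:
        raise ValueError("query is required")
    if not rerank_field:
        raise ValueError("rerank_field is required")
    # An explicit key wins; otherwise fall back to the environment variable.
    key = api_key or env.get("DASHSCOPE_API_KEY")
    if not key:
        raise ValueError("no API key given and DASHSCOPE_API_KEY is not set")
    return {"query": query, "rerank_field": rerank_field, "api_key": key}


settings = resolve_reranker_settings(
    "vector databases", "body", env={"DASHSCOPE_API_KEY": "sk-demo"}
)
```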

ReRanker

ReRanker(query: Optional[str] = None, topn: int = 10, rerank_field: Optional[str] = None)

Bases: ABC

Abstract base class for re-ranking search results.

Re-rankers refine the output of one or more vector queries by applying a secondary scoring strategy. They are used in the query() method of Collection via the reranker parameter.

Parameters:

Name	Type	Description	Default
`query`	`Optional[str]`	Query text used for re-ranking. Required for LLM-based re-rankers. Defaults to None.	`None`
`topn`	`int`	Number of top documents to return after re-ranking. Defaults to 10.	`10`
`rerank_field`	`Optional[str]`	Field name used as input for re-ranking (e.g., document title or body). Defaults to None.	`None`

Note

Subclasses must implement the rerank() method.

Methods:

Name	Description
`rerank`	Re-rank documents from one or more vector queries.

Attributes:

Name	Type	Description
`topn`	`int`	Number of top documents to return after re-ranking.
`query`	`str`	Query text used for re-ranking.
`rerank_field`	`Optional[str]`	Field name used as re-ranking input.

Attributes

topn `property`

topn: int

int: Number of top documents to return after re-ranking.

query `property`

query: str

str: Query text used for re-ranking.

rerank_field `property`

rerank_field: Optional[str]

Optional[str]: Field name used as re-ranking input.

Functions

rerank `abstractmethod`

rerank(query_results: dict[str, list[Doc]]) -> list[Doc]

Re-rank documents from one or more vector queries.

Parameters:

Name	Type	Description	Default
`query_results`	`dict[str, list[Doc]]`	Mapping from vector field name to list of retrieved documents (sorted by relevance).	required

Returns:

Type	Description
`list[Doc]`	Re-ranked list of documents (length ≤ `topn`), with updated `score` fields.
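To make the `rerank()` contract concrete, here is a minimal sketch of a custom re-ranker that keeps each document's best score across fields and returns at most `topn` results. Everything in it is a local stand-in: `Doc` here is a hypothetical two-field dataclass (the real `Doc` type and its `id` attribute are not described on this page), `ReRankerSketch` only mirrors the documented constructor, and the code assumes that a higher score means a better match.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import Optional


@dataclass
class Doc:
    """Hypothetical minimal document used only for this sketch."""
    id: str
    score: float


class ReRankerSketch(ABC):
    """Local stand-in mirroring the documented constructor; not zvec's class."""

    def __init__(self, query: Optional[str] = None, topn: int = 10,
                 rerank_field: Optional[str] = None):
        self.query = query
        self.topn = topn
        self.rerank_field = rerank_field

    @abstractmethod
    def rerank(self, query_results: dict[str, list[Doc]]) -> list[Doc]:
        """Return at most topn documents with updated scores."""


class MaxScoreReRanker(ReRankerSketch):
    """Keeps each document's highest score across all vector fields."""

    def rerank(self, query_results: dict[str, list[Doc]]) -> list[Doc]:
        best: dict[str, float] = {}
        for docs in query_results.values():
            for doc in docs:
                best[doc.id] = max(doc.score, best.get(doc.id, doc.score))
        # Sort by score, best first, then truncate to topn.
        ranked = sorted(best.items(), key=lambda item: item[1], reverse=True)
        return [Doc(id=doc_id, score=score) for doc_id, score in ranked[: self.topn]]


results = {
    "title_vec": [Doc("a", 0.9), Doc("b", 0.4)],
    "body_vec": [Doc("b", 0.7), Doc("c", 0.5)],
}
top = MaxScoreReRanker(topn=2).rerank(results)
```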

RrfReRanker

RrfReRanker(
    query: Optional[str] = None,
    topn: int = 10,
    rerank_field: Optional[str] = None,
    rank_constant: int = 60,
)

Bases: ReRanker

Re-ranker using Reciprocal Rank Fusion (RRF).

RRF combines results from multiple queries without requiring relevance scores. It assigns higher weight to documents that appear early in multiple result lists.

The RRF score for a document at rank r is: 1 / (k + r + 1), where k is the rank constant.

Parameters:

Name	Type	Description	Default
`query`	`Optional[str]`	Ignored by RRF. Defaults to None.	`None`
`topn`	`int`	Number of top documents to return. Defaults to 10.	`10`
`rerank_field`	`Optional[str]`	Ignored by RRF. Defaults to None.	`None`
`rank_constant`	`int`	Smoothing constant `k` in RRF formula. Larger values reduce the impact of early ranks. Defaults to 60.	`60`

Methods:

Name	Description
`rerank`	Apply Reciprocal Rank Fusion to combine multiple query results.

Attributes:

Name	Type	Description
`topn`	`int`	Number of top documents to return after re-ranking.
`query`	`str`	Query text used for re-ranking.
`rerank_field`	`Optional[str]`	Field name used as re-ranking input.

Attributes

topn `property`

topn: int

int: Number of top documents to return after re-ranking.

query `property`

query: str

str: Query text used for re-ranking.

rerank_field `property`

rerank_field: Optional[str]

Optional[str]: Field name used as re-ranking input.

Functions

rerank

rerank(query_results: dict[str, list[Doc]]) -> list[Doc]

Apply Reciprocal Rank Fusion to combine multiple query results.

Parameters:

Name	Type	Description	Default
`query_results`	`dict[str, list[Doc]]`	Results from one or more vector queries.	required

Returns:

Type	Description
`list[Doc]`	Re-ranked documents with RRF scores in the `score` field.
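The formula `1 / (k + r + 1)` from the class description can be worked through on plain lists of document IDs. The sketch below rests on two assumptions that this page does not state outright: ranks `r` start at 0 (which the `+ 1` suggests), and a document's contributions are summed across result lists, as in standard Reciprocal Rank Fusion. `rrf_scores` and `rrf_top` are hypothetical helper names, and documents are represented by bare IDs rather than `Doc` objects.

```python
from collections import defaultdict


def rrf_scores(ranked_lists: dict[str, list[str]],
               rank_constant: int = 60) -> dict[str, float]:
    """Sum 1 / (k + r + 1) over every list in which a document appears."""
    scores: dict[str, float] = defaultdict(float)
    for doc_ids in ranked_lists.values():
        for rank, doc_id in enumerate(doc_ids):  # rank starts at 0 (assumption)
            scores[doc_id] += 1.0 / (rank_constant + rank + 1)
    return dict(scores)


def rrf_top(ranked_lists: dict[str, list[str]], topn: int = 10,
            rank_constant: int = 60) -> list[tuple[str, float]]:
    """Return the topn (doc_id, score) pairs, highest fused score first."""
    scores = rrf_scores(ranked_lists, rank_constant)
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)[:topn]


lists = {
    "title_vec": ["a", "b", "c"],
    "body_vec": ["a", "c", "d"],
}
fused = rrf_top(lists, topn=3)
```

In this example "a" leads both lists and scores 2/61, while "c" overtakes "b" because it appears in both lists even though "b" ranks higher in one of them.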

WeightedReRanker

WeightedReRanker(
    query: Optional[str] = None,
    topn: int = 10,
    rerank_field: Optional[str] = None,
    metric: MetricType = L2,
    weights: Optional[dict[str, float]] = None,
)

Bases: ReRanker

Re-ranker that combines scores from multiple vector fields using weights.

Each vector field's relevance score is normalized based on its metric type, then scaled by a user-provided weight. Final scores are summed across fields.

Parameters:

Name	Type	Description	Default
`query`	`Optional[str]`	Ignored. Defaults to None.	`None`
`topn`	`int`	Number of top documents to return. Defaults to 10.	`10`
`rerank_field`	`Optional[str]`	Ignored. Defaults to None.	`None`
`metric`	`MetricType`	Distance metric used for score normalization. Defaults to `MetricType.L2`.	`L2`
`weights`	`Optional[dict[str, float]]`	Weight per vector field. Fields not listed use weight 1.0. Defaults to None.	`None`

Note

Supported metrics: L2, IP, COSINE. Scores are normalized to [0, 1].

Methods:

Name	Description
`rerank`	Combine scores from multiple vector fields using weighted sum.

Attributes:

Name	Type	Description
`topn`	`int`	Number of top documents to return after re-ranking.
`query`	`str`	Query text used for re-ranking.
`rerank_field`	`Optional[str]`	Field name used as re-ranking input.
`weights`	`dict[str, float]`	Weight mapping for vector fields.
`metric`	`MetricType`	Distance metric used for score normalization.

Attributes

topn `property`

topn: int

int: Number of top documents to return after re-ranking.

query `property`

query: str

str: Query text used for re-ranking.

rerank_field `property`

rerank_field: Optional[str]

Optional[str]: Field name used as re-ranking input.

weights `property`

weights: dict[str, float]

dict[str, float]: Weight mapping for vector fields.

metric `property`

metric: MetricType

MetricType: Distance metric used for score normalization.

Functions

rerank

rerank(query_results: dict[str, list[Doc]]) -> list[Doc]

Combine scores from multiple vector fields using weighted sum.

Parameters:

Name	Type	Description	Default
`query_results`	`dict[str, list[Doc]]`	Results per vector field.	required

Returns:

Type	Description
`list[Doc]`	Re-ranked documents with combined scores in the `score` field.
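The combination step (scale each field's normalized score by its weight, then sum per document) can be sketched as follows. The metric-specific normalization is deliberately left out because this page does not give its formulas, so the input is assumed to be already normalized to [0, 1]. `weighted_sum` is a hypothetical helper, and documents are represented as ID-to-score mappings rather than `Doc` objects.

```python
from collections import defaultdict
from typing import Optional


def weighted_sum(
    normalized: dict[str, dict[str, float]],
    weights: Optional[dict[str, float]] = None,
    topn: int = 10,
) -> list[tuple[str, float]]:
    """Combine per-field scores (already in [0, 1]) into one ranking."""
    weights = weights or {}
    combined: dict[str, float] = defaultdict(float)
    for field, doc_scores in normalized.items():
        weight = weights.get(field, 1.0)  # fields not listed use weight 1.0
        for doc_id, score in doc_scores.items():
            combined[doc_id] += weight * score
    # Highest combined score first, truncated to topn.
    return sorted(combined.items(), key=lambda item: item[1], reverse=True)[:topn]


normalized = {
    "title_vec": {"a": 0.9, "b": 0.2},
    "body_vec": {"a": 0.1, "b": 0.8, "c": 0.5},
}
ranked = weighted_sum(normalized, weights={"title_vec": 2.0}, topn=2)
```

Here "a" scores 2.0 × 0.9 + 1.0 × 0.1 = 1.9 and "b" scores 2.0 × 0.2 + 1.0 × 0.8 = 1.2, so doubling the title weight keeps "a" ahead despite its weak body match.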
