Insert Documents
Use the insert() method to add one or more new documents (Doc) to a collection.
Performance Tip:
New vectors are initially buffered for fast ingestion. For optimal search performance, call optimize() after inserting a large batch of documents.
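As a rough sketch of the tip above, a large load can be split into chunks and followed by a single optimize() call. The helper name insert_in_chunks, the chunk size, and the RecordingCollection stand-in are hypothetical and exist only so the example runs without a real collection. It assumes insert() accepts a list and returns one status per document, and that optimize() takes no arguments.

```python
class RecordingCollection:
    """Hypothetical stand-in for a real collection; it only records calls."""

    def __init__(self):
        self.calls = []

    def insert(self, docs):
        self.calls.append(("insert", len(docs)))
        return [{"code": 0} for _ in docs]  # one placeholder status per doc

    def optimize(self):
        self.calls.append(("optimize", None))


def insert_in_chunks(collection, docs, chunk_size=1000):
    """Insert docs chunk by chunk, then call optimize() once at the end."""
    statuses = []
    for start in range(0, len(docs), chunk_size):
        statuses.extend(collection.insert(docs[start:start + chunk_size]))
    collection.optimize()  # run once, after the whole load
    return statuses


demo = RecordingCollection()
demo_statuses = insert_in_chunks(demo, list(range(5)), chunk_size=2)
print(demo.calls)
```

The chunk size here is arbitrary; a suitable value depends on document size and available memory.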
Document Doc
Each Doc passed to insert() must:
- Have a unique id (not already present in the collection)
- Provide data that matches the collection's schema:
  - Scalar fields go in the fields dictionary (field names as keys)
  - Vector embeddings go in the vectors dictionary (vector names as keys)
  - You can omit nullable scalar fields if a document doesn't have a value for them
If a document with the same id already exists in the collection, the insertion will fail for that document.
To overwrite existing documents or insert without checking, use upsert() instead.
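The requirements above can be checked on the client side before calling insert(). The following is a minimal sketch, not part of the library: SCHEMA is a hand-written, hypothetical description of a collection, and check_payload is an illustrative helper. It assumes that non-nullable scalar fields must be present and that a dense vector's length must equal its declared dimension.

```python
from typing import Any

# Hypothetical description of the collection schema (not a zvec object).
SCHEMA = {
    "fields": {"text": {"nullable": False}, "note": {"nullable": True}},
    "vectors": {"text_embedding": {"dimension": 4}},
}


def check_payload(vectors: dict[str, Any], fields: dict[str, Any],
                  schema: dict = SCHEMA) -> list[str]:
    """Return a list of problems; an empty list means nothing was flagged."""
    problems = []
    for name in fields:
        if name not in schema["fields"]:
            problems.append(f"unknown scalar field: {name}")
    for name, spec in schema["fields"].items():
        if not spec["nullable"] and name not in fields:
            problems.append(f"missing required scalar field: {name}")
    for name, value in vectors.items():
        spec = schema["vectors"].get(name)
        if spec is None:
            problems.append(f"unknown vector: {name}")
        elif len(value) != spec["dimension"]:
            problems.append(
                f"{name}: expected {spec['dimension']} values, got {len(value)}"
            )
    return problems


print(check_payload({"text_embedding": [0.1, 0.2, 0.3, 0.4]}, {"text": "hi"}))
```

A check like this cannot detect an id that already exists in the collection; that case is still reported by insert() itself.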
Insert a Single Document
Assume you already have a collection with the following schema:
- A scalar field: text (string)
- A dense vector embedding: text_embedding (4-dimensional FP32 vector)
The 4-dimensional vector is for demonstration only; real-world embeddings are usually much larger.
You've also opened the collection and have a collection object ready.
Now, insert a document like this:
import zvec
# Create a document
doc = zvec.Doc(
    id="text_1",  # ← must be unique
    vectors={
        "text_embedding": [0.1, 0.2, 0.3, 0.4],  # ← must match the vector name
        # ↑ list of floats; list length = dimension (4)
    },
    fields={
        "text": "This is a sample text.",  # ← must match the scalar field name
    },
)
# Insert the document
result = collection.insert(doc)
print(result)  # {"code": 0} means success
The insert() method returns a Status object for single-document insertion.
- {"code": 0} indicates success.
- Non-zero codes indicate failure.
Successfully inserted documents are immediately available for querying 🚀.
Insert a Batch of Documents
To insert multiple documents at once, pass a list of Doc objects to insert().
Each Doc is processed independently, and the method returns a list of Status objects — one per document.
import zvec
result = collection.insert(
    [
        zvec.Doc(
            id="text_1",
            vectors={"text_embedding": [0.1, 0.2, 0.3, 0.4]},
            fields={"text": "This is a sample text."},
        ),
        zvec.Doc(
            id="text_2",
            vectors={"text_embedding": [0.4, 0.3, 0.2, 0.1]},
            fields={"text": "This is another sample text."},
        ),
        zvec.Doc(
            id="text_3",
            vectors={"text_embedding": [-0.1, -0.2, -0.3, -0.4]},
            fields={"text": "One more sample text."},
        ),
    ]
)
print(result)  # [{"code":0}, {"code":0}, {"code":0}]
A failure in one document (e.g., duplicate id) does not stop the others from being inserted.
🔍 Always check each Status in the result list.
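One way to act on this advice is to pair each document id with its status and collect the failures. This is a sketch under two assumptions that the text above does not confirm: that each Status exposes an integer code attribute, and that statuses come back in the same order as the input list. FakeStatus is a hypothetical stand-in used only to keep the example runnable.

```python
from dataclasses import dataclass


@dataclass
class FakeStatus:
    """Hypothetical stand-in for the library's Status object."""
    code: int


def failed_ids(doc_ids, statuses):
    """Return the ids whose status code is non-zero (assumes matching order)."""
    if len(doc_ids) != len(statuses):
        raise ValueError("expected exactly one status per document")
    return [doc_id for doc_id, status in zip(doc_ids, statuses)
            if status.code != 0]


# Simulated result in which the second document was rejected.
demo_result = [FakeStatus(0), FakeStatus(1), FakeStatus(0)]
print(failed_ids(["text_1", "text_2", "text_3"], demo_result))
```

The returned ids can then be logged, retried, or passed to upsert() if overwriting is acceptable.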
Insert Documents with Sparse Vectors
Assume your collection includes a sparse vector named sparse_embedding.
Insert a document with a sparse vector like this:
import zvec
result = collection.insert(
    zvec.Doc(
        id="text_1",
        vectors={
            "sparse_embedding": {
                42: 1.25,  # ← dimension 42 has weight 1.25
                1337: 0.8,  # ← dimension 1337 has weight 0.8
                2999: 0.63,  # ← dimension 2999 has weight 0.63
            }
        },
    )
)
print(result)  # {"code":0}
A sparse vector is represented as a dictionary dict[int, float].
There is no fixed dimension size — only non-zero dimensions need to be included.
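If a sparse embedding starts out as a mostly-zero list, it can be converted to the dict[int, float] form described above by keeping only the non-zero entries. The helper name to_sparse is illustrative and not part of the library.

```python
def to_sparse(dense: list[float]) -> dict[int, float]:
    """Keep only non-zero entries, keyed by their position in the list."""
    return {index: float(value) for index, value in enumerate(dense)
            if value != 0}


print(to_sparse([0.0, 1.25, 0.0, 0.8]))  # {1: 1.25, 3: 0.8}
```

The resulting dictionary can be passed as the value for a sparse vector name in the vectors dictionary.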
Insert Documents with Multiple Fields and Vectors
Real-world applications often require collections with multiple scalar fields and vector embeddings. In this example, assume your collection includes the following schema:
- Scalar fields:
  - book_title (string)
  - category (array of strings)
  - publish_year (32-bit integer)
- Vector embeddings:
  - dense_embedding: a 768-dimensional dense vector
  - sparse_embedding: a sparse vector
Insert a document with multiple fields and vectors like this:
import zvec
# Create a document
doc = zvec.Doc(
    id="book_1",
    vectors={
        "dense_embedding": [0.1 for _ in range(768)],  # ← use real embedding in practice
        "sparse_embedding": {42: 1.25, 1337: 0.8, 1999: 0.64},  # ← use real embedding in practice
    },
    fields={
        "book_title": "Gone with the Wind",  # ← string
        "category": ["Romance", "Classic Literature"],  # ← array of strings
        "publish_year": 1936,  # ← integer
    },
)
# Insert the document
result = collection.insert(doc)
print(result)  # {"code": 0} means success