Upsert
upsert() is an idempotent insert-or-update operation.
What upsert does
upsert(keys, vectors) ensures that each key maps to the given vector:
- Key is new: The vector is inserted (same as add()).
- Key exists, vector unchanged: No operation (skip).
- Key exists, vector changed: The old vector is removed and the new vector is inserted.
Because an unchanged vector is skipped, repeating a call with the same inputs leaves the index in the same state as making that call once.
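The three cases above can be illustrated with a toy model built on a plain dict. This is only a sketch of the described semantics, not the library's implementation: `toy_upsert` and the returned status strings are hypothetical names introduced here, and vectors are stored as `bytes` so they compare by value.

```python
def toy_upsert(store, key, vector):
    """Apply insert-or-update semantics to a dict and report which case applied."""
    new_value = bytes(vector)  # immutable copy, compared by value
    if key not in store:
        # Key is new: insert.
        store[key] = new_value
        return "inserted"
    if store[key] == new_value:
        # Key exists, vector unchanged: nothing to do.
        return "skipped"
    # Key exists, vector changed: remove the old vector, insert the new one.
    del store[key]
    store[key] = new_value
    return "replaced"


store = {}
print(toy_upsert(store, 1, [255, 128, 64, 32]))  # inserted
print(toy_upsert(store, 1, [255, 128, 64, 32]))  # skipped
print(toy_upsert(store, 1, [0, 0, 0, 0]))        # replaced
print(len(store))                                # 1
```

Replaying the same sequence of calls against `store` would report only "skipped" or "replaced" and leave the final contents unchanged, which is the property that makes repeated syncs safe.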
Single upsert
import numpy as np
from iscc_usearch import NphdIndex
index = NphdIndex(max_dim=256)
vec = np.array([255, 128, 64, 32], dtype=np.uint8)
index.upsert(1, vec)
# Calling again with same data is a no-op
index.upsert(1, vec)
print(len(index)) # 1
# Update with different vector
vec_new = np.array([0, 0, 0, 0], dtype=np.uint8)
index.upsert(1, vec_new)
print(index.get(1))  # uint8 array with values [0, 0, 0, 0]
Batch upsert
Batch upsert works with uniform-length vectors:
keys = [1, 2, 3]
vectors = np.array(
    [
        [255, 128, 64, 32],
        [0, 0, 0, 0],
        [1, 2, 3, 4],
    ],
    dtype=np.uint8,
)
index.upsert(keys, vectors)
Variable-length batch upsert
Batch upsert() requires all vectors to have the same length because it normalizes inputs to a
2D array internally. For variable-length vectors, call upsert() one at a time:
variable_keys = [10, 11, 12]
variable_vecs = [
    np.array([255, 128], dtype=np.uint8),  # 16-bit
    np.array([255, 128, 64, 32], dtype=np.uint8),  # 32-bit
    np.array([1, 2, 3, 4, 5, 6, 7, 8], dtype=np.uint8),  # 64-bit
]
# Upsert each vector individually, since the lengths differ
for key, vec in zip(variable_keys, variable_vecs):
    index.upsert(key, vec)
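When many vectors share a few distinct lengths, one possible alternative to the per-key loop is to bucket the inputs by length first, so that each bucket is uniform. The sketch below shows only the grouping step in plain Python. `group_by_length` is a hypothetical helper, not part of iscc_usearch, and whether grouped batches are faster than single calls is an assumption that would need measuring.

```python
from collections import defaultdict


def group_by_length(keys, vectors):
    """Bucket (key, vector) pairs by vector length, keeping input order per bucket."""
    if len(keys) != len(vectors):
        raise ValueError("keys and vectors must have the same number of entries")
    buckets = defaultdict(lambda: ([], []))
    for key, vec in zip(keys, vectors):
        bucket_keys, bucket_vecs = buckets[len(vec)]
        bucket_keys.append(key)
        bucket_vecs.append(vec)
    return dict(buckets)


groups = group_by_length(
    [10, 11, 12, 13],
    [b"\xff\x80", b"\xff\x80\x40\x20", b"\x01\x02", b"\x01\x02\x03\x04"],
)
for length, (group_keys, group_vecs) in groups.items():
    print(length, group_keys)
# 2 [10, 12]
# 4 [11, 13]
```

Each bucket's vectors could then be stacked into a 2D uint8 array and passed with its keys to upsert(keys, vectors) as a uniform batch. Grouping changes the order in which keys are processed, so if the same key appears with vectors of different lengths, the vector that ends up stored may differ from what the sequential loop would produce.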
When to use upsert vs add
| Scenario | Use |
|---|---|
| Keys are guaranteed unique | add() |
| Keys may repeat (idempotent sync) | upsert() |
| Bulk initial load | add() |
| Incremental updates | upsert() |
Note
upsert() requires explicit keys. Auto-generated keys (keys=None) are not supported.
Note
upsert() is available on NphdIndex only. ShardedNphdIndex does not support upsert()
because it uses an append-only design without remove().