Upsert

upsert() is an idempotent insert-or-update operation.

What upsert does

upsert(keys, vectors) ensures that each key maps to the given vector:

Key is new: The vector is inserted (same as add()).
Key exists, vector unchanged: No operation (skip).
Key exists, vector changed: The old vector is removed and the new vector is inserted.

upsert() is idempotent: calling it several times with the same inputs leaves the index in the same state as calling it once.
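The three cases above can be modelled with a plain dictionary. This is only a sketch of the documented behaviour, not the NphdIndex implementation: `model_upsert`, `store`, and the returned status strings are hypothetical names introduced for illustration, and vectors are represented as `bytes` to keep the example dependency-free.

```python
# Hypothetical stand-in for an index: a dict mapping key -> vector bytes.
# It models the documented cases only; it is NOT how NphdIndex works internally.


def model_upsert(store: dict, key: int, vector: bytes) -> str:
    """Apply one upsert to the model and report which case was taken."""
    if key not in store:
        store[key] = vector  # Key is new: insert
        return "inserted"
    if store[key] == vector:
        return "skipped"  # Key exists, vector unchanged: no operation
    del store[key]  # Key exists, vector changed: remove the old vector...
    store[key] = vector  # ...then insert the new one
    return "replaced"


store = {}
outcomes = [
    model_upsert(store, 1, bytes([255, 128, 64, 32])),  # first call
    model_upsert(store, 1, bytes([255, 128, 64, 32])),  # same data again
    model_upsert(store, 1, bytes([0, 0, 0, 0])),  # different vector
]
print(outcomes)  # ['inserted', 'skipped', 'replaced']
print(len(store))  # 1
```

Repeating any of these calls with the same arguments leaves `store` unchanged, which is the idempotence property described above.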

Single upsert

import numpy as np
from iscc_usearch import NphdIndex

index = NphdIndex(max_dim=256)

vec = np.array([255, 128, 64, 32], dtype=np.uint8)
index.upsert(1, vec)

# Calling again with same data is a no-op
index.upsert(1, vec)
print(len(index))  # 1

# Update with different vector
vec_new = np.array([0, 0, 0, 0], dtype=np.uint8)
index.upsert(1, vec_new)
print(index.get(1))  # array([0, 0, 0, 0], dtype=uint8)

Batch upsert

To upsert several keys in one call, pass a list of keys and a 2D array of vectors. All vectors in the batch must have the same length:

keys = [1, 2, 3]
vectors = np.array(
    [
        [255, 128, 64, 32],
        [0, 0, 0, 0],
        [1, 2, 3, 4],
    ],
    dtype=np.uint8,
)

index.upsert(keys, vectors)

Variable-length batch upsert

Batch upsert() requires all vectors to have the same length because it normalizes inputs to a 2D array internally. For variable-length vectors, call upsert() once per key:

variable_keys = [10, 11, 12]
variable_vecs = [
    np.array([255, 128], dtype=np.uint8),  # 16-bit
    np.array([255, 128, 64, 32], dtype=np.uint8),  # 32-bit
    np.array([1, 2, 3, 4, 5, 6, 7, 8], dtype=np.uint8),  # 64-bit
]

for key, vec in zip(variable_keys, variable_vecs):
    index.upsert(key, vec)
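An alternative to one call per key is to group the vectors by length and issue one batch call per group. The sketch below is hypothetical: `upsert_grouped_by_length` is not part of the library, and the `upsert` argument is any callable with the `(keys, vectors)` shape described above. A recording function stands in for a real index so the example runs without dependencies; with a real index you would likely stack each group into a 2D array first, and whether grouping is faster than per-key calls is an assumption to measure, not something stated here.

```python
from collections import defaultdict


def upsert_grouped_by_length(upsert, keys, vectors):
    """Group (key, vector) pairs by vector length; call `upsert` once per group.

    `upsert` is any callable taking (keys, vectors). This helper is a
    hypothetical illustration, not a library function.
    """
    groups = defaultdict(lambda: ([], []))
    for key, vec in zip(keys, vectors):
        group_keys, group_vecs = groups[len(vec)]
        group_keys.append(key)
        group_vecs.append(vec)

    # One batch call per distinct length, shortest vectors first.
    for length in sorted(groups):
        group_keys, group_vecs = groups[length]
        upsert(group_keys, group_vecs)

    # Report how many keys fell into each length bucket.
    return {length: len(ks) for length, (ks, _) in groups.items()}


# Stand-in for index.upsert that only records what it was called with.
calls = []


def record(batch_keys, batch_vecs):
    calls.append((list(batch_keys), list(batch_vecs)))


sizes = upsert_grouped_by_length(
    record,
    [10, 11, 12, 13],
    [
        bytes([255, 128]),  # 16-bit
        bytes([255, 128, 64, 32]),  # 32-bit
        bytes([1, 2]),  # 16-bit
        bytes([1, 2, 3, 4, 5, 6, 7, 8]),  # 64-bit
    ],
)
print([batch_keys for batch_keys, _ in calls])  # [[10, 12], [11], [13]]
print(sizes)  # {2: 2, 4: 1, 8: 1}
```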

When to use upsert vs add

Scenario	Use
Keys are guaranteed unique	`add()`
Keys may repeat (idempotent sync)	`upsert()`
Bulk initial load	`add()`
Incremental updates	`upsert()`

Note

upsert() requires explicit keys. Auto-generated keys (keys=None) are not supported.

Note

upsert() is available on NphdIndex only. ShardedNphdIndex does not support upsert() because it uses an append-only design without remove().