128-bit UUID keys
Use ShardedIndex128 or ShardedNphdIndex128 when 64-bit integer keys are not enough — for
example, when your identifiers are UUIDs, 128-bit hashes, or structured multi-part keys.
When to use 128-bit keys
Switch from ShardedIndex / ShardedNphdIndex to their 128-bit variants when:
- Your key space exceeds 64 bits.
- Your identifiers are natively 128-bit (UUIDs, MD5 hashes, etc.).
- You need to pack multiple fields into a single key (e.g., two 8-byte values).
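For the natively 128-bit case, the standard library already yields 16-byte values for UUIDs and MD5 digests. The sketch below shows one way to obtain them; the helper names key_from_uuid and key_from_md5 are hypothetical and not part of the index API.

```python
import hashlib
import uuid


def key_from_uuid(value: uuid.UUID) -> bytes:
    """Return the 16 raw bytes of a UUID."""
    return value.bytes


def key_from_md5(payload: bytes) -> bytes:
    """Return the 16-byte MD5 digest of payload, used here only as an identifier."""
    return hashlib.md5(payload).digest()


uuid_key = key_from_uuid(uuid.UUID("12345678-1234-5678-1234-567812345678"))
md5_key = key_from_md5(b"example document")

print(uuid_key.hex())               # 12345678123456781234567812345678
print(len(uuid_key), len(md5_key))  # 16 16
```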
Key format
128-bit keys are represented as:
- Single key: bytes of length 16
- Batch keys: NumPy array with dtype='V16' (void 16-byte elements)
import numpy as np
# Single key — 16 bytes
key = b"\x00\x01\x02\x03\x04\x05\x06\x07\x08\x09\x0a\x0b\x0c\x0d\x0e\x0f"
# Batch keys — V16 array
keys = np.array([key, b"\xff" * 16], dtype="V16")
Create a 128-bit sharded index
Add vectors
import numpy as np
# Single add
key = b"\x00" * 16
vector = np.random.randint(0, 256, size=32, dtype=np.uint8)
index.add(key, vector)
# Batch add — V16 array
keys = np.array([b"\x00" * 15 + bytes([i]) for i in range(100)], dtype="V16")
vectors = np.random.randint(0, 256, size=(100, 32), dtype=np.uint8)
index.add(keys, vectors)
# Batch add — list of bytes (also accepted)
keys_list = [b"\x00" * 15 + bytes([i]) for i in range(100, 200)]
vectors2 = np.random.randint(0, 256, size=(100, 32), dtype=np.uint8)
index.add(keys_list, vectors2)
Note
Auto-key generation (keys=None) is not supported for 128-bit indexes. All keys must be
provided explicitly.
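Because keys must be supplied explicitly, callers that would otherwise rely on auto-generated keys need their own scheme. One possible approach, sketched below, widens a running integer counter to 16 bytes. The function name sequential_keys is hypothetical; this is not something the library provides.

```python
def sequential_keys(start: int, count: int) -> list[bytes]:
    """Return `count` consecutive 16-byte big-endian keys starting at `start`."""
    return [i.to_bytes(16, "big") for i in range(start, start + count)]


keys_list = sequential_keys(0, 3)
for k in keys_list:
    print(k.hex())
```

The resulting list has the list-of-bytes shape that the batch form of add() accepts.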
Skip-if-exists with add_once()
add_once() adds vectors only when their keys do not already exist. Existing keys are silently
skipped (first-write-wins). It works with single keys, V16 arrays, and list[bytes]:
key = b"\x00" * 16
vec = np.random.randint(0, 256, size=32, dtype=np.uint8)
# First add succeeds
index.add_once(key, vec)
# Second add is silently skipped
result = index.add_once(key, vec)
assert result is None # key already existed
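To make the first-write-wins behaviour concrete, here is a toy model built on a plain dict. It only mirrors the semantics described above and says nothing about how the index is implemented or what it returns; add_once_model is a hypothetical name.

```python
def add_once_model(store: dict, key: bytes, value: str) -> None:
    """Insert value under key only if key is absent (first-write-wins)."""
    store.setdefault(key, value)


store = {}
model_key = b"\x00" * 16
add_once_model(store, model_key, "first")
add_once_model(store, model_key, "second")  # skipped: key already present

print(store[model_key])  # first
```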
Search
Search works the same as with 64-bit indexes. Results contain V16 keys:
query = np.random.randint(0, 256, size=32, dtype=np.uint8)
matches = index.search(query, count=10)
for key_bytes, dist in zip(matches.keys, matches.distances):
    print(f"Key {bytes(key_bytes).hex()}: distance = {dist:.4f}")
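If the keys were originally UUIDs, each 16-byte result can be turned back into a uuid.UUID. The sketch below uses literal bytes in place of real search results so that it runs without an index; result_keys is a stand-in name.

```python
import uuid

# Stand-in for the 16-byte keys a search might return.
result_keys = [
    bytes.fromhex("12345678123456781234567812345678"),
    b"\x00" * 15 + b"\x01",
]

# uuid.UUID(bytes=...) requires exactly 16 bytes per key.
decoded = [uuid.UUID(bytes=raw) for raw in result_keys]

for item in decoded:
    print(item)
```

With real results, each element would first be converted with bytes(key_bytes), as in the loop above.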
Retrieve by key
# Single get
vector = index.get(b"\x00" * 16)
# Batch get
keys = np.array([b"\x00" * 16, b"\xff" * 16], dtype="V16")
vectors = index.get(keys)
# Contains
print(index.contains(b"\x00" * 16)) # True or False
print(b"\x00" * 16 in index) # Same thing
Structured key packing
A common pattern is packing two 8-byte values into a 16-byte key. Use struct for this:
import struct
part_a = 0x0123456789ABCDEF # first 8 bytes
part_b = 42 # second 8 bytes
# Pack as big-endian: part_a (8B) + part_b (8B)
key = struct.pack(">QQ", part_a, part_b)
# Add to index
index.add(key, vector)
# Later, unpack from search results
a, b = struct.unpack(">QQ", key)
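A short round-trip check of this packing scheme, using only struct. The helper names pack_key and unpack_key are hypothetical. Big-endian packing has the side effect that byte-wise comparison of keys matches comparison of the (part_a, part_b) tuples, which the last line demonstrates.

```python
import struct

KEY_FORMAT = ">QQ"  # two unsigned 64-bit integers, big-endian, 16 bytes total


def pack_key(part_a: int, part_b: int) -> bytes:
    """Pack two values in the range 0..2**64-1 into a 16-byte key."""
    return struct.pack(KEY_FORMAT, part_a, part_b)


def unpack_key(key: bytes) -> tuple[int, int]:
    """Split a 16-byte key back into its two 64-bit parts."""
    return struct.unpack(KEY_FORMAT, key)


packed = pack_key(0x0123456789ABCDEF, 42)
print(packed.hex())  # 0123456789abcdef000000000000002a

# Round trip recovers both parts.
assert unpack_key(packed) == (0x0123456789ABCDEF, 42)

# Byte order follows tuple order: (1, 5) sorts before (2, 0).
assert pack_key(1, 5) < pack_key(2, 0)
```

Values outside the unsigned 64-bit range make struct.pack raise struct.error, so negative or oversized parts need to be mapped into range first.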
Save and reopen
# Save
index.save()
# Reopen — auto-detects existing shards and uuid key kind
index = ShardedIndex128(path="./my_index_128")
# or
index = ShardedNphdIndex128(path="./my_nphd_128")
Validation rules
128-bit indexes enforce strict key validation:
| Operation | Validation |
|---|---|
| add(key, vec) | key must be bytes of length 16 |
| add(keys, vecs) | keys: np.ndarray with dtype='V16' or list[bytes] (len 16) |
| add_once(key, vec) / add_once(keys, vecs) | Same rules as add() |
| get(key) / contains(key) / count(key) | key must be bytes of length 16 |
| get(keys) / contains(keys) / count(keys) | keys must be V16 array or Sequence[bytes(16)] |
Passing the wrong key type or length raises ValueError immediately rather than producing
silently incorrect results.
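The single-key rule in the table can be approximated in a few lines. This is an illustrative sketch of the documented behaviour, not the library's actual validation code; check_key128 is a hypothetical name.

```python
def check_key128(key: object) -> bytes:
    """Return key unchanged if it is bytes of length 16, else raise ValueError."""
    if not isinstance(key, bytes):
        raise ValueError(f"key must be bytes, got {type(key).__name__}")
    if len(key) != 16:
        raise ValueError(f"key must be 16 bytes long, got {len(key)}")
    return key


check_key128(b"\x00" * 16)  # accepted

# A short key, a str, and an int are all rejected.
for bad in (b"\x00" * 8, "0" * 16, 42):
    try:
        check_key128(bad)
    except ValueError as exc:
        print(f"rejected: {exc}")
```

A pre-check like this can be useful at an application boundary, so that malformed identifiers are reported before they reach the index.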
Limitations
- No auto-keys: keys=None raises ValueError. All keys must be explicit.
- copy() / clear() raise NotImplementedError (same as standard sharded indexes).