API Reference

Auto-generated documentation for all public classes in iscc-usearch.

NphdIndex

Single-file index for variable-length binary bit-vectors with NPHD metric.

NphdIndex

NphdIndex(max_dim=256, **kwargs)

Bases: Index

Fast approximate nearest neighbor search for variable-length binary bit-vectors.

Supports Normalized Prefix Hamming Distance (NPHD) metric and packed binary vectors as np.uint8 arrays of variable length. Vector keys must be integers.
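The exact NPHD formula is not spelled out on this page; as a rough mental model, here is a pure-Python sketch that assumes NPHD is the Hamming distance over the shared prefix of two packed bit-vectors, normalized by the prefix length in bits. Treat the function name and the empty-prefix behavior as illustrative assumptions, not the library's definition.

```python
def nphd(a: bytes, b: bytes) -> float:
    """Normalized Prefix Hamming Distance (illustrative sketch).

    Assumes NPHD is the Hamming distance over the shared prefix of two
    packed bit-vectors, normalized by the prefix length in bits.
    """
    n = min(len(a), len(b))  # shared prefix length in bytes
    if n == 0:
        return 1.0  # no shared prefix: treated as maximally distant (assumption)
    diff_bits = sum(bin(x ^ y).count("1") for x, y in zip(a[:n], b[:n]))
    return diff_bits / (n * 8)
```

For example, nphd(b"\xff\x0f", b"\xff") compares only the first byte and yields 0.0, while nphd(b"\x00", b"\xff") yields 1.0.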

CONCURRENCY: Single-process only. The underlying .usearch files have no file locking or multi-process coordination. Running multiple processes against the same index may corrupt data. Use a single process with async/await for concurrent connections.

UPSERT: Batch upsert requires uniform-length vectors. For variable-length batch upsert, call upsert() individually for each vector: for k, v in zip(keys, vecs): idx.upsert(k, v)

Create a new NPHD index.

dirty property

dirty

Number of unsaved key mutations since last save/load/view/reset.

Counts individual keys added or removed. Useful for implementing caller-driven flush policies (e.g. "save every N writes").

Returns:

Type Description

Count of key mutations since last persistence operation.
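A caller-driven "save every N writes" policy built on the dirty counter can be sketched as follows. StubIndex is a hypothetical stand-in that only mirrors the documented dirty/save contract (each key added counts as one mutation; save resets the counter); it is not the real NphdIndex.

```python
class StubIndex:
    """Hypothetical stand-in mirroring the documented dirty/save contract."""
    def __init__(self):
        self.dirty = 0   # unsaved key mutations
        self.saves = 0
    def add(self, key, vector):
        self.dirty += 1  # each key added counts as one mutation
    def save(self):
        self.saves += 1
        self.dirty = 0   # save resets the dirty counter

def flush_every(index, items, every=1000):
    """Caller-driven flush policy: save after every `every` key mutations."""
    for key, vec in items:
        index.add(key, vec)
        if index.dirty >= every:
            index.save()
    if index.dirty:
        index.save()  # final flush for the tail

idx = StubIndex()
flush_every(idx, [(i, b"\x00") for i in range(7)], every=3)
# 7 adds with a threshold of 3 -> saves at items 3 and 6, plus one final flush
```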

add

add(keys, vectors, **kwargs)

Add variable-length binary vectors to the index.

Parameters:

Name Type Description Default
keys

Integer key(s) or None for auto-generation

required
vectors

Single vector, 2D array of uniform vectors, or list of variable-length vectors

required
kwargs

Additional arguments passed to parent Index.add()

{}

Returns:

Type Description

Array of keys for added vectors

remove

remove(keys, **kwargs)

Remove vectors by key(s).

Only counts keys that actually exist toward the dirty counter.

Parameters:

Name Type Description Default
keys

Integer key(s) to remove

required
kwargs

Additional arguments passed to parent Index.remove()

{}

get

get(keys, dtype=None)

Retrieve unpadded variable-length vectors by key(s).

Parameters:

Name Type Description Default
keys

Integer key(s) to lookup

required
dtype

Optional data type (defaults to index dtype)

None

Returns:

Type Description

Unpadded vector(s) or None for missing keys

search

search(
    vectors,
    count=10,
    radius=math.inf,
    *,
    threads=0,
    exact=False,
    log=False,
    progress=None,
)

Search for nearest neighbors of query vector(s).

Parameters:

Name Type Description Default
vectors

Single vector or batch of variable-length vectors to query

required
count

Maximum number of nearest neighbors to return per query

10
radius

Maximum distance for results

inf
threads

Number of threads (0 = auto)

0
exact

Perform exact search

False
log

Enable progress logging

False
progress

Optional progress callback

None

Returns:

Type Description

Matches for single query or BatchMatches for batch queries

Raises:

Type Description
ValueError

If count < 1

save

save(*args, **kwargs)

Save index to file or buffer and reset dirty counter.

Accepts the same arguments as usearch Index.save().

Returns:

Type Description

Serialized buffer when saving to memory, None when saving to file.

load

load(path_or_buffer=None, progress=None)

Load index from file or buffer and restore max_dim from saved ndim.

Parameters:

Name Type Description Default
path_or_buffer

Path or buffer to load from (defaults to self.path)

None
progress

Optional progress callback

None

view

view(path_or_buffer=None, progress=None)

Memory-map index from file or buffer and restore max_dim from saved ndim.

Parameters:

Name Type Description Default
path_or_buffer

Path or buffer to view from (defaults to self.path)

None
progress

Optional progress callback

None

reset

reset()

Reset the index to empty state and clear dirty counter.

copy

copy()

Create a copy of this index.

Returns:

Type Description

New NphdIndex with same configuration and data

restore staticmethod

restore(path_or_buffer, view=False, **kwargs)

Restore a NphdIndex from a saved file or buffer.

Parameters:

Name Type Description Default
path_or_buffer

Path or buffer to restore from

required
view

If True, memory-map the index instead of loading

False
kwargs

Additional arguments passed to NphdIndex constructor

{}

Returns:

Type Description

Restored NphdIndex or None if file is invalid

ShardedNphdIndex

Multi-shard index combining automatic sharding with NPHD support for variable-length vectors.

ShardedNphdIndex

ShardedNphdIndex(
    *,
    max_dim: int | None = None,
    path: str | PathLike,
    shard_size: int = DEFAULT_SHARD_SIZE,
    connectivity: int | None = None,
    expansion_add: int | None = None,
    expansion_search: int | None = None,
    **kwargs: Any,
)

Bases: ShardedIndex

Sharded index for variable-length binary bit-vectors with NPHD metric.

Combines ShardedIndex's automatic sharding with NphdIndex's support for variable-length vectors and Normalized Prefix Hamming Distance metric.

CONCURRENCY: Single-process only. No file locking. Use async/await within a single process for concurrent access.

Initialize a sharded NPHD index.

Parameters:

Name Type Description Default
max_dim int | None

Maximum bits per vector (auto-detected from existing shards if omitted)

None
path str | PathLike

Directory path for shard storage (required)

required
shard_size int

Size limit in bytes before rotating shards (default 1GB)

DEFAULT_SHARD_SIZE
connectivity int | None

HNSW connectivity parameter (M)

None
expansion_add int | None

Search depth on insertions (efConstruction)

None
expansion_search int | None

Search depth on queries (ef)

None

max_dim property

max_dim: int

Maximum number of bits per vector.

max_bytes property

max_bytes: int

Maximum number of bytes per vector.

vectors property

vectors: ShardedNphdIndexedVectors

Lazy iterator over all unpadded vectors across all shards.

Returns a ShardedNphdIndexedVectors object that supports:

- Iteration: for vec in idx.vectors
- Length: len(idx.vectors)
- Indexing: idx.vectors[0], idx.vectors[-1]
- Slicing: idx.vectors[:10]
- Numpy conversion: np.asarray(idx.vectors) (requires uniform vector lengths)

Vectors are returned unpadded (variable-length), consistent with the get() API. This is a live view: it reflects the current state at iteration time.

Returns:

Type Description
ShardedNphdIndexedVectors

ShardedNphdIndexedVectors iterator

add

add(
    keys: int | None | Any,
    vectors: NDArray[Any] | Sequence[NDArray[Any]],
    *,
    copy: bool = True,
    threads: int = 0,
    log: str | bool = False,
    progress: Callable[[int, int], bool] | None = None,
) -> int | NDArray[np.uint64]

Add variable-length binary vectors to the index.

Pads vectors before adding to ensure consistent storage across shards.

Parameters:

Name Type Description Default
keys int | None | Any

Integer key(s) or None for auto-generation

required
vectors NDArray[Any] | Sequence[NDArray[Any]]

Single vector or batch of variable-length vectors to add

required
copy bool

Whether to copy vectors into index

True
threads int

Number of threads (0 = auto)

0
log str | bool

Enable progress logging

False
progress Callable[[int, int], bool] | None

Progress callback

None

Returns:

Type Description
int | NDArray[uint64]

Key(s) for added vectors
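The padding step mentioned above can be sketched in pure Python: variable-length packed vectors are zero-extended to a single fixed width so they share one storage layout. The function name and error message are illustrative, not the library's internals.

```python
def pad_batch(vectors, max_bytes):
    """Zero-pad packed binary vectors to one fixed width so variable-length
    inputs share a uniform storage layout across shards (sketch)."""
    padded = []
    for vec in vectors:
        if len(vec) > max_bytes:
            raise ValueError(f"{len(vec)} bytes exceeds max_bytes={max_bytes}")
        padded.append(bytes(vec) + b"\x00" * (max_bytes - len(vec)))
    return padded
```

For instance, pad_batch([b"\xff", b"\xff\xff"], 2) pads the 1-byte vector to b"\xff\x00" and leaves the 2-byte vector unchanged.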

upsert

upsert(
    keys: Any,
    vectors: NDArray[Any] | Sequence[NDArray[Any]],
    **kwargs: Any,
) -> int | NDArray

Insert or update variable-length vectors by key.

Handles ragged/mixed-length vectors that np.asarray cannot stack.

Parameters:

Name Type Description Default
keys Any

Key(s) — None not accepted

required
vectors NDArray[Any] | Sequence[NDArray[Any]]

Single vector or batch of variable-length vectors

required

Returns:

Type Description
int | NDArray

Key(s) for stored vectors

Raises:

Type Description
ValueError

If keys is None or multi=True

RuntimeError

If index is read-only

add_once

add_once(
    keys: int | Any,
    vectors: NDArray[Any] | Sequence[NDArray[Any]],
    **kwargs: Any,
) -> int | NDArray | None

Add variable-length vectors, skipping keys that already exist.

First-write-wins: existing keys kept unchanged. Batch duplicates deduplicated (first occurrence kept).

Not atomic under concurrent writes — caller must serialize if needed.

Parameters:

Name Type Description Default
keys int | Any

Integer key(s) — None not accepted

required
vectors NDArray[Any] | Sequence[NDArray[Any]]

Single vector or batch of variable-length vectors

required
kwargs Any

Additional arguments passed to add()

{}

Returns:

Type Description
int | NDArray | None

Key(s) added, empty array if all skipped, None if single key skipped

Raises:

Type Description
ValueError

If keys is None or keys/vectors length mismatch

search

search(
    vectors: NDArray[Any],
    count: int = 10,
    *,
    radius: float = float("inf"),
    threads: int = 0,
    exact: bool = False,
    log: str | bool = False,
    progress: Callable[[int, int], bool] | None = None,
) -> Matches | BatchMatches

Search for nearest neighbors of query vector(s).

Pads query vectors before searching to match stored format.

Parameters:

Name Type Description Default
vectors NDArray[Any]

Query vector or batch of variable-length vectors to query

required
count int

Maximum number of nearest neighbors to return per query

10
radius float

Maximum distance for results

float('inf')
threads int

Number of threads (0 = auto)

0
exact bool

Perform exact search

False
log str | bool

Enable progress logging

False
progress Callable[[int, int], bool] | None

Progress callback

None

Returns:

Type Description
Matches | BatchMatches

Matches for single query, BatchMatches for batch

get

get(
    keys: int | Any, dtype: Any = None
) -> NDArray[Any] | list | None

Retrieve unpadded variable-length vectors by key(s) from any shard.

Parameters:

Name Type Description Default
keys int | Any

Integer key(s) to lookup

required
dtype Any

Optional data type for returned vectors

None

Returns:

Type Description
NDArray[Any] | list | None

Unpadded vector(s) or None for missing keys

__repr__

__repr__() -> str

Return string representation of the sharded NPHD index.

ShardedIndex

Generic sharded index for any metric. Use ShardedNphdIndex for NPHD workloads.

ShardedIndex

ShardedIndex(
    *,
    ndim: int | None = None,
    metric: MetricKind | Any = MetricKind.Cos,
    dtype: ScalarKind | str | None = None,
    connectivity: int | None = None,
    expansion_add: int | None = None,
    expansion_search: int | None = None,
    multi: bool = False,
    path: str | PathLike,
    shard_size: int = DEFAULT_SHARD_SIZE,
    bloom_filter: bool = True,
    read_only: bool = False,
)

Sharded vector index with full CRUD support.

Wraps usearch Index/Indexes to provide automatic sharding when the active shard exceeds the configured size limit. Finished shards are memory-mapped (view mode) for efficient read-only access, while the active shard is fully loaded (load mode) for read-write operations.

CRUD semantics (requires multi=False for remove/upsert):

- add(): append to active shard
- remove(): lazy-delete from active shard, tombstone in view shards
- upsert(): remove + add (last-write-wins for batch duplicates)
- compact(): rebuild view shards excluding tombstoned/duplicate entries

CONCURRENCY: Single-process only. No file locking. Use async/await within a single process for concurrent access.

Parameters:

Name Type Description Default
ndim int | None

Number of vector dimensions (auto-detected from existing shards if omitted)

None
metric MetricKind | Any

Distance metric (MetricKind or CompiledMetric)

Cos
dtype ScalarKind | str | None

Scalar type for vectors (ScalarKind)

None
connectivity int | None

HNSW connectivity parameter (M)

None
expansion_add int | None

Search depth on insertions (efConstruction)

None
expansion_search int | None

Search depth on queries (ef)

None
multi bool

Allow multiple vectors per key

False
path str | PathLike

Directory path for shard storage (required)

required
shard_size int

Size limit in bytes before rotating shards (default 1GB)

DEFAULT_SHARD_SIZE
bloom_filter bool

Enable bloom filter for fast non-existent key rejection

True
read_only bool

Open all shards in view mode (memory-mapped, read-only). Raises ValueError if no existing shards are found. Write operations raise RuntimeError.

False

Initialize a sharded index.

read_only property

read_only: bool

Whether the index is in read-only mode.

size property

size: int

Total number of logical vectors (approximate with cross-shard duplicates).

Exact when no cross-shard duplicates exist (the common case). May slightly overcount after upsert+rotation creates temporary duplicates. Compaction makes it exact. Tombstones are subtracted 1:1.

ndim property

ndim: int

Vector dimensionality.

dtype property

dtype: ScalarKind

Scalar type for vectors.

metric property

metric: MetricKind | Any

Distance metric.

metric_kind property

metric_kind: MetricKind

Distance metric kind.

connectivity property

connectivity: int

HNSW connectivity parameter.

expansion_add property writable

expansion_add: int

Expansion parameter for additions.

expansion_search property writable

expansion_search: int

Expansion parameter for searches.

multi property

multi: bool

Whether multiple vectors per key are allowed.

path property

path: Path

Directory path for shard storage.

shard_count property

shard_count: int

Number of shard files.

memory_usage property

memory_usage: int

Estimated memory usage across all shards.

serialized_length property

serialized_length: int

Serialized length of active shard.

capacity property

capacity: int

Capacity of active shard.

dirty property

dirty: int

Number of unsaved key mutations since last save/reset.

Counts individual keys added or removed. Useful for implementing caller-driven flush policies (e.g. "save every N writes").

Always returns 0 for read-only indexes.

Returns:

Type Description
int

Count of key mutations since last persistence operation.

tombstone_count property

tombstone_count: int

Number of pending tombstones (view shard deletions awaiting compaction).

keys property

keys: ShardedIndexedKeys

Lazy iterator over all keys across all shards.

Returns a ShardedIndexedKeys object that supports:

- Iteration: for key in idx.keys
- Length: len(idx.keys)
- Indexing: idx.keys[0], idx.keys[-1]
- Slicing: idx.keys[:10]
- Numpy conversion: np.asarray(idx.keys)

This is a live view: it reflects the current state at iteration time.

Returns:

Type Description
ShardedIndexedKeys

ShardedIndexedKeys iterator

vectors property

vectors: ShardedIndexedVectors

Lazy iterator over all vectors across all shards.

Returns a ShardedIndexedVectors object that supports:

- Iteration: for vec in idx.vectors
- Length: len(idx.vectors)
- Indexing: idx.vectors[0], idx.vectors[-1]
- Slicing: idx.vectors[:10]
- Numpy conversion: np.asarray(idx.vectors)

This is a live view: it reflects the current state at iteration time.

Note: Unlike usearch Index.vectors which returns an np.ndarray immediately, this returns a lazy iterator appropriate for larger-than-RAM indexes.

Returns:

Type Description
ShardedIndexedVectors

ShardedIndexedVectors iterator

add

add(
    keys: int | None | Any,
    vectors: NDArray[Any],
    *,
    copy: bool = True,
    threads: int = 0,
    log: str | bool = False,
    progress: Callable[[int, int], bool] | None = None,
) -> int | NDArray[np.uint64]

Add vectors to the active shard, rotating if size exceeded.

Parameters:

Name Type Description Default
keys int | None | Any

Integer key(s) or None for auto-generation

required
vectors NDArray[Any]

Vector or batch of vectors to add

required
copy bool

Whether to copy vectors into index

True
threads int

Number of threads (0 = auto)

0
log str | bool

Enable progress logging

False
progress Callable[[int, int], bool] | None

Progress callback

None

Returns:

Type Description
int | NDArray[uint64]

Key(s) for added vectors
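The "rotating if size exceeded" behavior can be modeled with a toy class: once the active shard would grow past shard_size bytes, it is frozen and a fresh active shard is started. All names here are illustrative; the real index measures serialized shard size and memory-maps frozen shards rather than keeping Python lists.

```python
class ShardRotator:
    """Toy model of size-based shard rotation."""
    def __init__(self, shard_size):
        self.shard_size = shard_size
        self.frozen = []       # finished shards (memory-mapped in the real index)
        self.active = []       # (key, vector) pairs in the active shard
        self.active_bytes = 0
    def add(self, key, vec):
        if self.active and self.active_bytes + len(vec) > self.shard_size:
            self.frozen.append(self.active)  # rotate: freeze the full shard
            self.active, self.active_bytes = [], 0
        self.active.append((key, vec))
        self.active_bytes += len(vec)

r = ShardRotator(shard_size=10)
for k in range(4):
    r.add(k, b"\x00" * 4)  # 4 bytes each; the third add triggers a rotation
```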

add_once

add_once(
    keys: int | Any, vectors: NDArray[Any], **kwargs: Any
) -> int | NDArray | None

Add vectors, silently skipping keys that already exist.

First-write-wins: if a key is already in the index, its vector is kept unchanged. Duplicate keys within a single batch are deduplicated — only the first occurrence is added.

Not atomic under concurrent writes — caller must serialize if needed (see ShardedIndex concurrency model).

Parameters:

Name Type Description Default
keys int | Any

Integer key(s) — None not accepted

required
vectors NDArray[Any]

Vector or batch of vectors to add

required
kwargs Any

Additional arguments passed to add()

{}

Returns:

Type Description
int | NDArray | None

Key(s) added, empty array if all skipped, None if single key skipped

Raises:

Type Description
ValueError

If keys is None or keys/vectors length mismatch

RuntimeError

If index is read-only
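The first-write-wins and batch-deduplication semantics described above reduce to a simple model when the index is viewed as a key-to-vector mapping. This sketch uses a plain dict as a stand-in; only the skip/dedup logic mirrors the documented behavior.

```python
def add_once_sketch(store, keys, vectors):
    """First-write-wins: existing keys are kept unchanged; duplicates
    within the batch keep only their first occurrence (illustrative model)."""
    if len(keys) != len(vectors):
        raise ValueError("keys/vectors length mismatch")
    added, seen = [], set()
    for k, v in zip(keys, vectors):
        if k in store or k in seen:
            continue  # skip silently: the already-stored vector wins
        store[k] = v
        seen.add(k)
        added.append(k)
    return added

store = {1: b"\xaa"}
added = add_once_sketch(store, [1, 2, 2, 3], [b"\xbb", b"\xcc", b"\xdd", b"\xee"])
# key 1 already exists and the second 2 is a batch duplicate,
# so only keys 2 and 3 are added
```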

upsert

upsert(
    keys: Any, vectors: NDArray[Any], **kwargs: Any
) -> int | NDArray

Insert or update vectors by key.

For new keys: adds to active shard. For existing keys: removes old entry, adds new entry to active shard.

Parameters:

Name Type Description Default
keys Any

Key(s) — None not accepted

required
vectors NDArray[Any]

Vector(s) to store

required

Returns:

Type Description
int | NDArray

Key(s) for stored vectors

Raises:

Type Description
ValueError

If keys is None or multi=True

RuntimeError

If index is read-only

search

search(
    vectors: NDArray[Any],
    count: int = 10,
    *,
    radius: float = float("inf"),
    threads: int = 0,
    exact: bool = False,
    log: str | bool = False,
    progress: Callable[[int, int], bool] | None = None,
) -> Matches | BatchMatches

Search across all shards, merging and sorting results.

Active shard results suppress view shard versions of the same key. Tombstoned keys are filtered from view shard results. Search is approximate for cross-view-shard duplicates — compaction resolves those.

Parameters:

Name Type Description Default
vectors NDArray[Any]

Query vector or batch of vectors

required
count int

Maximum number of results per query

10
radius float

Maximum distance for results

float('inf')
threads int

Number of threads (0 = auto)

0
exact bool

Perform exact search

False
log str | bool

Enable progress logging

False
progress Callable[[int, int], bool] | None

Progress callback

None

Returns:

Type Description
Matches | BatchMatches

Matches for single query, BatchMatches for batch

Raises:

Type Description
ValueError

If count < 1
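The merge step described above (active shard suppresses view shard versions of the same key, tombstones are filtered) can be sketched on plain (key, distance) lists. This is a simplified model: shard lists are assumed ordered newest-first, so the first occurrence of a key wins.

```python
def merge_shard_matches(per_shard, count=10, tombstones=frozenset()):
    """Merge (key, distance) candidates from shards ordered newest-first:
    the newest shard's entry for a key suppresses older versions, and
    tombstoned keys are dropped (illustrative model)."""
    best = {}
    for matches in per_shard:  # newest (active) shard first
        for key, dist in matches:
            if key in tombstones or key in best:
                continue       # deleted, or an older version of a seen key
            best[key] = dist
    return sorted(best.items(), key=lambda kv: kv[1])[:count]

active = [(1, 0.2), (2, 0.5)]
view = [(1, 0.1), (3, 0.3)]    # key 1 here is a stale older version
merged = merge_shard_matches([active, view], tombstones={3})
# key 1 keeps the active-shard distance, key 3 is tombstoned
```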

get

get(
    keys: int | Any, dtype: Any = None
) -> NDArray[Any] | list | None

Retrieve vectors by key from any shard.

Parameters:

Name Type Description Default
keys int | Any

Integer key(s) to lookup

required
dtype Any

Optional data type for returned vectors

None

Returns:

Type Description
NDArray[Any] | list | None

Vector(s) or None for missing keys

contains

contains(keys: int | Any) -> bool | NDArray[np.bool_]

Check if keys exist in any shard.

When bloom_filter=True (default), uses bloom filter to quickly reject non-existent keys.

Parameters:

Name Type Description Default
keys int | Any

Integer key(s) to check

required

Returns:

Type Description
bool | NDArray[bool_]

Boolean or array of booleans
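The bloom-filter fast path reduces to this decision: a definite "no" from the filter skips the per-shard lookups entirely, while a "maybe" falls through to them. A Python set stands in for the filter and the shards here, which makes the filter exact; a real bloom filter can also say "maybe" for absent keys.

```python
def contains_sketch(bloom, shards, key):
    """Bloom-filter fast path for membership checks (illustrative)."""
    if bloom is not None and key not in bloom:
        return False  # bloom filters have no false negatives
    return any(key in shard for shard in shards)

bloom = {1, 2, 7}      # stand-in: a set behaves like an exact filter
shards = [{1, 2}, {7}]
```

With bloom_filter disabled (bloom is None), every check falls through to the shard lookups.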

__contains__

__contains__(keys: int | Any) -> bool | NDArray[np.bool_]

Support 'in' operator.

count

count(keys: int | Any) -> int | NDArray[np.uint64]

Count occurrences of keys across all shards (sum aggregation).

Parameters:

Name Type Description Default
keys int | Any

Integer key(s) to count

required

Returns:

Type Description
int | NDArray[uint64]

Count or array of counts

save

save(
    path_or_buffer: str | PathLike | None = None,
    progress: Callable[[int, int], bool] | None = None,
) -> None

Save active shard and bloom filter to disk.

ShardedIndex manages its own file layout — pass no arguments to save to the directory configured at construction time.

Parameters:

Name Type Description Default
path_or_buffer str | PathLike | None

Must be None. Raises TypeError if a path is provided.

None
progress Callable[[int, int], bool] | None

Progress callback

None

Raises:

Type Description
TypeError

If path_or_buffer is not None.

RuntimeError

If index is read-only.

rebuild_bloom

rebuild_bloom(
    save: bool = True, log_progress: bool = True
) -> int

Rebuild bloom filter from all existing keys.

Use this to populate the bloom filter for an existing index that was created without bloom filter support, or to repair a corrupted filter.

Processes keys shard-by-shard in batches for efficiency.

Parameters:

Name Type Description Default
save bool

Whether to save the bloom filter to disk after rebuilding

True
log_progress bool

Whether to log progress per shard

True

Returns:

Type Description
int

Number of keys added to the bloom filter

Raises:

Type Description
RuntimeError

If index is read-only

compact

compact() -> int

Rebuild view shards excluding tombstoned and cross-shard duplicate entries.

Processes shards newest-to-oldest. Keys already seen in newer shards (or the active shard) are dropped from older shards. Returns number of entries removed.

Three-phase approach: (1) collect data while shards are accessible, (2) release all mmap references, (3) execute file operations. This ordering prevents Windows PermissionError from locked memory-mapped files.

Raises:

Type Description
RuntimeError

If index is read-only

metadata staticmethod

metadata(path: str | PathLike) -> dict | None

Extract metadata from a sharded index directory.

Parameters:

Name Type Description Default
path str | PathLike

Directory containing shard files

required

Returns:

Type Description
dict | None

Metadata dict or None if invalid

__len__

__len__() -> int

Total number of vectors across all shards.

remove

remove(keys: Any, *, compact: bool = False) -> None

Remove vectors by key.

Active shard entries: USearch remove() (lazy deletion). View shard entries: tombstoned (suppressed on read, cleaned on compact()). Keys that exist only in the active shard are NOT tombstoned.

Parameters:

Name Type Description Default
keys Any

Key or sequence of keys to remove

required
compact bool

If True, call USearch isolate() on active shard entries

False

Raises:

Type Description
RuntimeError

If index is read-only

ValueError

If multi=True

__delitem__

__delitem__(keys: Any) -> None

Remove vectors by key (del index[key]).

rename

rename(*args: Any, **kwargs: Any) -> None

Not supported.

join

join(*args: Any, **kwargs: Any) -> None

Not supported for sharded indexes.

cluster

cluster(*args: Any, **kwargs: Any) -> None

Not supported for sharded indexes.

pairwise_distance

pairwise_distance(*args: Any, **kwargs: Any) -> None

Not supported for sharded indexes.

copy

copy() -> None

Not supported - too complex with multiple shards.

clear

clear() -> None

Not supported - would need to handle multiple files.

reset

reset() -> None

Release all resources and reset to empty-but-usable state.

Releases view shards, active shard, bloom filter, and tombstones in memory. Does not delete files on disk. After reset, the index is empty and ready for new add() calls with the same configuration.

Raises:

Type Description
RuntimeError

If index is read-only

__repr__

__repr__() -> str

Return string representation of the sharded index.

ShardedIndex128

Sharded index with 128-bit UUID keys. Uses bytes(16) for single keys and np.dtype('V16') arrays for batches.

ShardedIndex128

ShardedIndex128(*, path: str | PathLike, **kwargs: Any)

Bases: _UuidKeyMixin, ShardedIndex

Sharded vector index with 128-bit UUID keys.

Uses usearch's key_kind="uuid" for 128-bit composite keys represented as bytes(16) for single keys and np.dtype('V16') arrays for batches.

Auto-generation of keys (keys=None) is not supported — all keys must be provided explicitly.

Initialize a 128-bit sharded index.

Parameters:

Name Type Description Default
path str | PathLike

Directory path for shard storage

required
kwargs Any

Passed to ShardedIndex (key_kind is absorbed if present)

{}
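Producing the bytes(16) single-key form from Python's uuid module is straightforward, since uuid.UUID.bytes is exactly 16 bytes. The helper name below is illustrative.

```python
import uuid

def to_key(u: uuid.UUID) -> bytes:
    """Encode a UUID as the bytes(16) single-key form that the
    128-bit indexes expect (uuid.UUID.bytes is big-endian, 16 bytes)."""
    return u.bytes

u = uuid.uuid4()
key = to_key(u)           # 16 raw bytes, suitable as a single key
```

For batches, the documentation above calls for np.dtype('V16') arrays, e.g. building one from a list of such 16-byte values.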

ShardedNphdIndex128

Sharded NPHD index with 128-bit UUID keys for variable-length vectors.

ShardedNphdIndex128

ShardedNphdIndex128(*, path: str | PathLike, **kwargs: Any)

Bases: _UuidKeyMixin, ShardedNphdIndex

Sharded NPHD index with 128-bit UUID keys.

Combines ShardedNphdIndex's variable-length vector support with 128-bit composite keys represented as bytes(16) for single keys and np.dtype('V16') arrays for batches.

Auto-generation of keys (keys=None) is not supported — all keys must be provided explicitly.

Initialize a 128-bit sharded NPHD index.

Parameters:

Name Type Description Default
path str | PathLike

Directory path for shard storage

required
kwargs Any

Passed to ShardedNphdIndex (key_kind is absorbed if present)

{}

__repr__

__repr__() -> str

Return string representation of the sharded NPHD 128-bit index.

ScalableBloomFilter

Scalable bloom filter for efficient probabilistic key existence checks.

ScalableBloomFilter

ScalableBloomFilter(
    initial_capacity: int = 10000000,
    fpr: float = 0.01,
    growth_factor: float = 2.0,
)

Scalable bloom filter that grows automatically as elements are added.

Chains multiple fixed-size bloom filters to support unlimited growth while maintaining the target false positive rate. Each new filter has progressively tighter FPR to keep the overall rate bounded.

Parameters:

Name Type Description Default
initial_capacity int

Initial number of elements before first growth

10000000
fpr float

Target false positive rate (0.0-1.0)

0.01
growth_factor float

Capacity multiplier for each new filter

2.0

Initialize a scalable bloom filter.
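The "progressively tighter FPR" scheme can be checked numerically: if filter i in the chain targets fpr0 * ratio**i, the compound false-positive rate stays below fpr0 / (1 - ratio). The tightening ratio of 0.5 below is the textbook choice for scalable bloom filters; the ratio this library actually uses is an assumption.

```python
def chained_fpr(fpr0=0.01, ratio=0.5, filters=10):
    """Compound false-positive rate of a chain of bloom filters whose
    per-filter rates tighten geometrically: fpr_i = fpr0 * ratio**i.
    The chain's total stays below fpr0 / (1 - ratio)."""
    p_no_fp = 1.0
    for i in range(filters):
        p_no_fp *= 1.0 - fpr0 * ratio ** i  # probability no filter fires
    return 1.0 - p_no_fp
```

With the defaults, ten chained filters stay under the 0.02 bound (0.01 / (1 - 0.5)) while supporting 2**10 times the initial capacity at growth_factor=2.0.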

count property

count: int

Approximate number of elements added.

current_capacity property

current_capacity: int

Total capacity across all filters.

filter_count property

filter_count: int

Number of bloom filters in the chain.

add

add(key: int | bytes) -> None

Add a single key to the bloom filter.

Parameters:

Name Type Description Default
key int | bytes

Integer or bytes key to add

required

add_batch

add_batch(keys: Sequence[int] | Sequence[bytes]) -> None

Add multiple keys to the bloom filter efficiently.

Uses native batch operations and handles capacity growth properly.

Parameters:

Name Type Description Default
keys Sequence[int] | Sequence[bytes]

Sequence of integer or bytes keys to add

required

contains

contains(key: int | bytes) -> bool

Check if a key might be in the filter.

Parameters:

Name Type Description Default
key int | bytes

Integer or bytes key to check

required

Returns:

Type Description
bool

False if definitely not present, True if possibly present

contains_batch

contains_batch(
    keys: Sequence[int] | Sequence[bytes],
) -> list[bool]

Check if multiple keys might be in the filter.

Uses native Rust batch operations for throughput. Each filter in the chain is checked via a single batch call, and results are OR-combined.

Parameters:

Name Type Description Default
keys Sequence[int] | Sequence[bytes]

Sequence of integer or bytes keys to check

required

Returns:

Type Description
list[bool]

List of booleans (False=definitely not, True=possibly present)

clear

clear() -> None

Clear all filters and reset to initial state.

save

save(path: str | Path) -> None

Save bloom filter to disk atomically via temp file + rename.

Parameters:

Name Type Description Default
path str | Path

File path to save to

required

load classmethod

load(path: str | Path) -> ScalableBloomFilter

Load bloom filter from disk.

Parameters:

Name Type Description Default
path str | Path

File path to load from

required

Returns:

Type Description
ScalableBloomFilter

Restored ScalableBloomFilter

Raises:

Type Description
ValueError

If file format is invalid

__len__

__len__() -> int

Return approximate number of elements.

__contains__

__contains__(key: int | bytes) -> bool

Support 'in' operator.

__repr__

__repr__() -> str

Return string representation.

timer

Context manager for timing operations with loguru integration.

timer

timer(message: str, log_start=False, level='DEBUG')

Context manager for timing code blocks and logging elapsed duration.

Logs a message with the elapsed time on exit using loguru.

Parameters:

Name Type Description Default
message str

Description of the operation being timed.

required
log_start

If True, log a "started" message on entry.

False
level

Log level for messages (default: "DEBUG").

'DEBUG'
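A minimal stand-in for this context manager, using print instead of loguru, looks like the following. Yielding a stats dict is an addition for illustration; the documented timer is not stated to yield anything.

```python
import time
from contextlib import contextmanager

@contextmanager
def timer_sketch(message, log_start=False, level="DEBUG"):
    """Stand-in for timer(): logs elapsed wall time on exit via print."""
    if log_start:
        print(f"[{level}] {message} started")
    stats = {"elapsed": None}   # illustrative addition, not in the real API
    start = time.perf_counter()
    try:
        yield stats
    finally:
        stats["elapsed"] = time.perf_counter() - start
        print(f"[{level}] {message} took {stats['elapsed']:.3f}s")

with timer_sketch("index build") as t:
    total = sum(range(100_000))  # the timed workload
```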

__enter__

__enter__()

Start the timer.

__exit__

__exit__(exc_type, exc_value, traceback)

Stop the timer and log elapsed duration.