API Reference

Auto-generated from source code docstrings.

Index Protocol

The IsccIndexProtocol defines the interface that every index backend implements. The CLI, the REST API, and library code all go through this protocol, whichever backend is in use.

IsccIndexProtocol

Bases: Protocol

Protocol for ISCC index backends.

All methods are synchronous. Backends are free to use threading, connection pools, etc. internally.

This protocol defines the core operations that all ISCC index implementations must support:
  • Index lifecycle: create, get, list, delete
  • Asset operations: add, get, search
  • Resource cleanup: close

All index implementations should handle the protocol's exception contract:
  • ValueError: Invalid parameters or validation failures
  • FileExistsError: Attempting to create an existing index
  • FileNotFoundError: Attempting to access a non-existent index
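The lifecycle operations and the exception contract can be illustrated with a toy in-memory backend. This is only a sketch: `IndexInfo` and `MemoryBackend` are hypothetical stand-ins invented for this example, not classes from the library, and the name pattern is the one quoted under create_index.

```python
import re
from dataclasses import dataclass

# Name pattern as quoted in this reference.
NAME_PATTERN = re.compile(r"^[a-z][a-z0-9]*$")


@dataclass
class IndexInfo:
    """Hypothetical stand-in for IsccIndex (name, assets, size)."""

    name: str
    assets: int = 0
    size: int = 0


class MemoryBackend:
    """Toy backend that keeps index metadata in a dict."""

    def __init__(self):
        self._indexes = {}

    def list_indexes(self):
        return list(self._indexes.values())

    def create_index(self, index):
        if not NAME_PATTERN.fullmatch(index.name):
            raise ValueError(f"invalid index name: {index.name!r}")
        if index.name in self._indexes:
            raise FileExistsError(f"index already exists: {index.name}")
        # The assets and size fields of the input are ignored.
        created = IndexInfo(name=index.name)
        self._indexes[index.name] = created
        return created

    def get_index(self, name):
        if name not in self._indexes:
            raise FileNotFoundError(f"no such index: {name}")
        return self._indexes[name]

    def delete_index(self, name):
        if name not in self._indexes:
            raise FileNotFoundError(f"no such index: {name}")
        del self._indexes[name]
```

Callers can then rely on the three exception types alone, without knowing which backend raised them.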

list_indexes

list_indexes()

List all available indexes with metadata.

Scans the backend storage and returns metadata for all existing indexes. The metadata includes index name, asset count, and storage size.

Returns:

Type Description

List of IsccIndex objects with name, assets, and size

create_index

create_index(index)

Create a new named index.

Initializes a new index with the specified name. The index starts empty with 0 assets. If the backend requires initialization (creating directories, database tables, etc.), this method handles it.

Parameters:

Name Type Description Default
index

IsccIndex with name (assets and size fields are ignored)

required

Returns:

Type Description

Created IsccIndex with initial metadata (assets=0, size=0)

Raises:

Type Description
ValueError

If name is invalid (doesn't match pattern ^[a-z][a-z0-9]*$)

FileExistsError

If index with this name already exists

get_index

get_index(name)

Get index metadata by name.

Retrieves current metadata for the specified index, including the number of assets and storage size. This is useful for monitoring index growth and health.

Parameters:

Name Type Description Default
name

Index name (must match pattern ^[a-z][a-z0-9]*$)

required

Returns:

Type Description

IsccIndex with current metadata

Raises:

Type Description
FileNotFoundError

If index doesn't exist

delete_index

delete_index(name)

Delete an index and all its data.

Permanently removes the index and all associated data. This operation cannot be undone. Implementations should clean up all resources (files, database tables, etc.).

Parameters:

Name Type Description Default
name

Index name

required

Raises:

Type Description
FileNotFoundError

If index doesn't exist

add_assets

add_assets(index_name, assets)

Add assets to index.

Adds multiple ISCC assets to the specified index. Each asset contains an ISCC-ID and ISCC-UNITs for similarity indexing. Assets with missing ISCC-IDs will have them auto-generated.

Implementations should:
  • Store asset metadata for later retrieval
  • Index ISCC-UNITs by type for similarity search
  • Handle duplicates gracefully (update vs create)
  • Return status for each asset

Parameters:

Name Type Description Default
index_name

Target index name

required
assets

List of IsccEntry objects to add

required

Returns:

Type Description

List of IsccAddResult with status for each asset

Raises:

Type Description
FileNotFoundError

If index doesn't exist

ValueError

If assets contain invalid ISCC codes
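One way to read "handle duplicates gracefully" is an upsert keyed by ISCC-ID that reports one status per asset. The sketch below assumes assets are plain dicts and invents the `AddResult` class and the status labels "created" and "updated"; the real IsccAddResult may look different.

```python
from dataclasses import dataclass


@dataclass
class AddResult:
    """Hypothetical stand-in for IsccAddResult."""

    iscc_id: str
    status: str  # "created" or "updated" (labels assumed for this sketch)


def add_assets(store, assets):
    """Upsert assets keyed by ISCC-ID and return one result per asset."""
    results = []
    for asset in assets:
        iscc_id = asset["iscc_id"]
        # An existing key means the asset is replaced rather than added.
        status = "updated" if iscc_id in store else "created"
        store[iscc_id] = asset
        results.append(AddResult(iscc_id=iscc_id, status=status))
    return results
```

Returning results in input order lets the caller match each status to the asset it submitted.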

get_asset

get_asset(index_name, iscc_id)

Get a specific asset by ISCC-ID.

Retrieves the full asset details for a given ISCC-ID from the specified index. This is useful for fetching complete asset metadata after performing a search, which returns only ISCC-IDs and scores.

Parameters:

Name Type Description Default
index_name

Target index name

required
iscc_id

ISCC-ID of the asset to retrieve

required

Returns:

Type Description

IsccEntry with all stored metadata

Raises:

Type Description
FileNotFoundError

If index doesn't exist or asset not found

ValueError

If ISCC-ID format is invalid

search_assets

search_assets(index_name, query, limit=100)

Search for similar assets in index.

Performs similarity search using the query asset's ISCC-UNITs. Results are aggregated across all unit types and returned sorted by relevance (highest scores first).

The returned IsccSearchResult includes:
  • query: The original query asset (may have auto-generated iscc_id)
  • global_matches: List of IsccGlobalMatch objects with scores and per-unit breakdowns

Parameters:

Name Type Description Default
index_name

Target index name

required
query

IsccQuery to search for (either iscc_code or units required)

required
limit

Maximum number of results to return (default: 100)

100

Returns:

Type Description

IsccSearchResult with query and list of matches

Raises:

Type Description
FileNotFoundError

If index doesn't exist

ValueError

If query asset is invalid
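The aggregation step can be sketched in pure Python. The scoring here is an assumption made for illustration: each unit type is compared by bitwise Hamming similarity, and the per-unit scores are averaged into one global score. The reference does not state which aggregation the backends actually use. Unit bodies are assumed to be non-empty byte strings of equal length per type.

```python
def hamming_similarity(a, b):
    """Fraction of matching bits between two equal-length byte strings."""
    differing = sum(bin(x ^ y).count("1") for x, y in zip(a, b))
    return 1.0 - differing / (len(a) * 8)


def search(assets, query_units, limit=100):
    """Rank stored assets by mean per-unit similarity to the query.

    assets maps an ID to {unit_type: body_bytes};
    query_units is {unit_type: body_bytes}.
    """
    matches = []
    for iscc_id, units in assets.items():
        # Compare only the unit types present on both sides.
        per_unit = {
            unit_type: hamming_similarity(body, units[unit_type])
            for unit_type, body in query_units.items()
            if unit_type in units
        }
        if per_unit:
            score = sum(per_unit.values()) / len(per_unit)
            matches.append({"iscc_id": iscc_id, "score": score, "units": per_unit})
    # Highest scores first, then apply the result limit.
    matches.sort(key=lambda match: match["score"], reverse=True)
    return matches[:limit]
```

Keeping the per-unit scores next to the global score mirrors the "per-unit breakdowns" described for IsccGlobalMatch.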

close

close()

Close connections and cleanup resources.

Should be called when the backend is no longer needed. Implementations should clean up resources like database connections, file handles, and memory caches. This method should be idempotent (safe to call multiple times).

After calling close(), the index instance should not be used for further operations.
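An idempotent close() is usually implemented with a guard flag. In the sketch below, `ClosableBackend` is a hypothetical class, and raising RuntimeError on use after close is an assumption; the protocol only says the instance should not be used afterwards.

```python
from contextlib import closing


class ClosableBackend:
    """Hypothetical backend showing an idempotent close()."""

    def __init__(self):
        self._closed = False
        self._handles = [object()]  # placeholder for connections or file handles

    def close(self):
        if self._closed:
            return  # a second call is a no-op
        self._handles.clear()
        self._closed = True

    def list_indexes(self):
        if self._closed:
            raise RuntimeError("backend is closed")
        return []


# contextlib.closing calls close() on exit, even if the body raises.
with closing(ClosableBackend()) as backend:
    backend.list_indexes()
```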

Configuration

Server deployment settings managed through environment variables with the ISCC_SEARCH_ prefix. See the Configuration Reference for the full list of variables.

SearchOptions

Bases: BaseSettings

Application options for ISCC-Search.

Options can be configured via:
  • Environment variables (prefixed with ISCC_SEARCH_)
  • .env file in the working directory
  • Direct instantiation with parameters
  • Runtime override using the override() method

cors_origins_list property

cors_origins_list

Split comma-separated CORS origins string into a list.

Returns:

Type Description

List of allowed origin strings

override

override(update=None)

Return an updated and validated deep copy of the current options instance.

Parameters:

Name Type Description Default
update

Dictionary of field names and values to override.

None

Returns:

Type Description

New SearchOptions instance with updated and validated fields.
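The copy-and-validate behavior of override() and the splitting done by cors_origins_list can be imitated with a standard-library dataclass. This is a sketch, not the BaseSettings implementation: the `Options` class, the `cors_origins` field name, and the `max_results` field are assumptions made for the example.

```python
import copy
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class Options:
    """Hypothetical stand-in for SearchOptions."""

    cors_origins: str = "*"  # assumed name of the comma-separated setting
    max_results: int = 100   # hypothetical field used to show validation

    def __post_init__(self):
        # Runs on every construction, including copies made by override().
        if self.max_results <= 0:
            raise ValueError("max_results must be positive")

    @property
    def cors_origins_list(self):
        return [o.strip() for o in self.cors_origins.split(",") if o.strip()]

    def override(self, update=None):
        update = update or {}
        known = {f.name for f in fields(self)}
        unknown = set(update) - known
        if unknown:
            raise ValueError(f"unknown option(s): {sorted(unknown)}")
        values = {name: copy.deepcopy(getattr(self, name)) for name in known}
        values.update(update)
        return type(self)(**values)  # the original instance is left untouched
```

Because override() returns a new instance, the original options remain valid for code that already holds a reference to them.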

Models

Types and convenience classes for handling ISCC codes, units, and items.

models

Types and convenience classes for handling ISCCs

Terms and Definitions
  • ISCC - Any ISCC-CODE, ISCC-UNIT, or ISCC-ID
  • ISCC-HEADER - Self-describing 2-byte header for V1 components (3 bytes for future versions). The first 12 bits encode MainType, SubType, and Version. Additional bits encode Length for variable-length ISCCs.
  • ISCC-BODY - Actual payload of an ISCC, similarity preserving compact binary code, hash or timestamp
  • ISCC-DIGEST - Binary representation of complete ISCC (ISCC-HEADER + ISCC-BODY).
  • ISCC-SEQUENCE - Binary sequence of ISCC-DIGESTS
  • ISCC-UNIT - ISCC-HEADER + ISCC-BODY where the ISCC-BODY is calculated from a single algorithm
  • ISCC-CODE - ISCC-HEADER + ISCC-BODY where the ISCC-BODY is a sequence of multiple ISCC-UNIT BODYs
    • DATA and INSTANCE are the minimum required mandatory ISCC-UNITS for a valid ISCC-CODE
  • ISCC-ID - Globally unique digital asset identifier (ISCC-HEADER + 52-bit timestamp + 12-bit hub_id)
  • SIMPRINT - Headerless base64 encoded similarity hash that describes a content segment (granular feature)
  • UNIT-TYPE: Identifier for ISCC-UNIT types that can be indexed together with meaningful similarity search
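The 2-byte ISCC-HEADER layout can be sketched as four 4-bit fields followed by the body. This assumes every field fits in a single nibble, which is what the 2-byte, 12-bit description above suggests for V1 components; longer header encodings are not handled, and `decode_header` is a name chosen for this example.

```python
def decode_header(digest):
    """Split a digest with a 2-byte header into its fields and body.

    Assumes each header field occupies exactly one 4-bit nibble.
    """
    if len(digest) < 2:
        raise ValueError("digest is shorter than a 2-byte header")
    maintype = digest[0] >> 4    # bits 0-3
    subtype = digest[0] & 0x0F   # bits 4-7
    version = digest[1] >> 4     # bits 8-11
    length = digest[1] & 0x0F    # bits 12-15
    return maintype, subtype, version, length, digest[2:]
```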

IsccBase

IsccBase(iscc)

Base class for ISCC objects providing common properties and methods.

Handles conversion between different ISCC representations (string, bytes) and provides access to ISCC components (header, body, fields).

Initialize ISCC object from string or binary representation.

Parameters:

Name Type Description Default
iscc

ISCC in canonical string format (with or without "ISCC:" prefix) or binary digest

required

Raises:

Type Description
TypeError

If iscc is not str or bytes

body property
body

Extract ISCC-BODY bytes (payload without header).

Returns:

Type Description

ISCC-BODY as raw bytes

fields cached property
fields

Decode ISCC header into structured fields.

Returns:

Type Description

IsccTuple with MainType, SubType, Version, Length, and Body

iscc_type cached property
iscc_type

Get human-readable ISCC type identifier.

Returns:

Type Description

ISCC type string in format "MAINTYPE-SUBTYPE-VERSION" (e.g., "CONTENT-TEXT-V1")

__str__
__str__()

Get canonical ISCC string representation.

Returns:

Type Description

ISCC in canonical format with "ISCC:" prefix and base32 encoding
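The canonical form can be approximated with the standard library. The sketch assumes the RFC 4648 base32 alphabet with padding stripped; `encode_canonical` and `decode_canonical` are names invented here, and the decoder accepts input with or without the "ISCC:" prefix, as the constructor description states. It needs Python 3.9 or later for str.removeprefix.

```python
import base64

PREFIX = "ISCC:"


def encode_canonical(digest):
    """Encode an ISCC-DIGEST as a prefixed, unpadded base32 string."""
    return PREFIX + base64.b32encode(digest).decode("ascii").rstrip("=")


def decode_canonical(iscc):
    """Decode a canonical string (prefix optional) back to digest bytes."""
    code = iscc.removeprefix(PREFIX)
    # base32 input must be a multiple of 8 characters, so restore the padding.
    padding = "=" * (-len(code) % 8)
    return base64.b32decode(code + padding, casefold=True)
```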

__len__
__len__()

Get ISCC-BODY bit-length.

Returns:

Type Description

Number of bits in ISCC-BODY (64, 128, 192, or 256)

__bytes__
__bytes__()

Get binary ISCC-DIGEST representation.

Returns:

Type Description

Complete ISCC-DIGEST as bytes (ISCC-HEADER + ISCC-BODY)

IsccID

IsccID(iscc)

Bases: IsccBase

ISCC-ID: Globally unique digital asset identifier.

Combines ISCC-HEADER with 52-bit timestamp and 12-bit server-id for unique identification of digital assets across distributed systems.
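The 64-bit body layout can be sketched as simple bit packing. Two assumptions are made here that the reference does not confirm: the timestamp occupies the high 52 bits with the 12-bit identifier in the low bits, and the body is serialized big-endian. The function names are hypothetical.

```python
TIMESTAMP_BITS = 52
ID_BITS = 12


def pack_body(timestamp, server_id):
    """Pack a 52-bit timestamp and a 12-bit identifier into 8 bytes."""
    if not 0 <= timestamp < (1 << TIMESTAMP_BITS):
        raise ValueError("timestamp does not fit in 52 bits")
    if not 0 <= server_id < (1 << ID_BITS):
        raise ValueError("server_id does not fit in 12 bits")
    return ((timestamp << ID_BITS) | server_id).to_bytes(8, "big")


def unpack_body(body):
    """Recover (timestamp, server_id) from an 8-byte body."""
    value = int.from_bytes(body, "big")
    return value >> ID_BITS, value & ((1 << ID_BITS) - 1)
```

With this ordering, bodies sort by timestamp first when compared as bytes or as integers.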

realm_id property
realm_id

Extract REALM-ID from ISCC-ID header.

Returns:

Type Description

Realm identifier (0 for REALM_0, 1 for REALM_1)

__int__ cached
__int__()

Convert ISCC-ID to integer representation.

WARNING: Integer representation does not include ISCC-HEADER information. Use as a 64-bit integer database ID and keep track of the REALM-ID for reconstruction.

Returns:

Type Description

Integer representation of the ISCC-ID body (64 bits, ISCC-HEADER excluded)

from_int classmethod
from_int(iscc_id, realm_id)

Construct ISCC-ID from integer and realm identifier.

Parameters:

Name Type Description Default
iscc_id

Integer representation of ISCC-ID body (8 bytes)

required
realm_id

Realm identifier for ISCC-HEADER SubType (0 or 1)

required

Returns:

Type Description

New IsccID instance
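Because the integer form drops the header, a database row needs both the integer and the REALM-ID to rebuild the identifier. The round trip below is a sketch that assumes big-endian byte order; `body_to_int` and `int_to_body` are hypothetical helpers, not library functions.

```python
def body_to_int(body):
    """Interpret an 8-byte ISCC-ID body as an unsigned 64-bit integer."""
    if len(body) != 8:
        raise ValueError("ISCC-ID body must be exactly 8 bytes")
    return int.from_bytes(body, "big")


def int_to_body(value):
    """Convert the integer back to the 8-byte body."""
    return value.to_bytes(8, "big")


# Store the realm next to the integer so the header can be rebuilt later.
example_body = bytes.fromhex("0001020304050607")
row = {"realm_id": 0, "id": body_to_int(example_body)}
restored_body = int_to_body(row["id"])
```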

from_body classmethod
from_body(body, realm_id)

Construct ISCC-ID from body bytes and realm identifier.

Parameters:

Name Type Description Default
body

ISCC-ID body bytes (8 bytes)

required
realm_id

Realm identifier for ISCC-HEADER SubType (0 or 1)

required

Returns:

Type Description

New IsccID instance

random classmethod
random()

Create a new random ISCC-ID.

Uses REALM-ID 0 for non-authoritative ISCC-IDs with current timestamp and random server ID.

Returns:

Type Description

New IsccID instance with random identifier

IsccUnit

IsccUnit(iscc)

Bases: IsccBase

ISCC-UNIT: Single-algorithm ISCC component.

An ISCC-UNIT combines ISCC-HEADER with ISCC-BODY calculated from a single algorithm. Multiple ISCC-UNITs can be combined to form an ISCC-CODE.

unit_type property
unit_type

Get ISCC-UNIT type identifier.

Returns:

Type Description

ISCC type string (alias for iscc_type property)

__array__
__array__(dtype=np.uint8, copy=None)

Return numpy array from ISCC-BODY bytes.

Parameters:

Name Type Description Default
dtype

NumPy dtype for the array

uint8
copy

If True, always copy. If False, never copy (view only). If None, copy only if needed.

None

Returns:

Type Description

NumPy array representation of ISCC-BODY

IsccCode

IsccCode(iscc)

Bases: IsccBase

ISCC-CODE: Composite ISCC combining multiple ISCC-UNITs.

An ISCC-CODE combines multiple ISCC-UNIT bodies into a single identifier. Minimum requirement: DATA and INSTANCE units. Can include META, SEMANTIC, and CONTENT units depending on the content type.

units cached property
units

Decompose ISCC-CODE into constituent ISCC-UNITs.

Parses the ISCC-CODE body and reconstructs individual ISCC-UNITs with their headers. Handles both standard codes and WIDE subtype codes.

Returns:

Type Description

List of IsccUnit objects contained in this ISCC-CODE

IsccItemDict

Bases: TypedDict

Dictionary representation of an ISCC item with ID, code, and units as strings.

IsccItem

Bases: Struct

Minimal ISCC container for efficient indexing.

Stores only binary representations (id and units). String representations and derived values are computed on-demand (no caching for memory efficiency).

Parameters:

Name Type Description Default
id_data

ISCC-ID digest (10 bytes: 2-byte header + 8-byte body)

required
units_data

Sequence of ISCC-UNIT digests

required
iscc_id property
iscc_id

ISCC-ID as canonical string.

iscc_code property
iscc_code

ISCC-CODE computed from units (wide format).

units property
units

ISCC-UNITs as list of canonical strings.

dict property
dict

Convert IsccItem to dictionary representation.

Returns:

Type Description

Dictionary with iscc_id, iscc_code, and units as canonical strings

json property
json

Serialize IsccItem to JSON bytes.

Returns:

Type Description

JSON-encoded representation of IsccItem dictionary

new classmethod
new(iscc_id, iscc_code=None, units=None)

Create a new IsccItem from ISCC-ID and either ISCC-CODE or units.

Parameters:

Name Type Description Default
iscc_id

ISCC-ID as string or binary digest

required
iscc_code

Optional ISCC-CODE as string or binary digest

None
units

Optional list of ISCC-UNITs as strings or binary digests

None

Returns:

Type Description

New IsccItem instance

Raises:

Type Description
ValueError

If neither iscc_code nor units is provided

from_dict classmethod
from_dict(data)

Create IsccItem from dictionary, generating random ISCC-ID if missing.

Parameters:

Name Type Description Default
data

Dictionary with optional iscc_id and either iscc_code or units

required

Returns:

Type Description

New IsccItem instance

Raises:

Type Description
ValueError

If neither iscc_code nor units is provided
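The validation rule shared by new() and from_dict() can be sketched on plain dictionaries. `normalize_item` is a hypothetical helper, and the generated identifier is only a random placeholder string, not a valid ISCC-ID.

```python
import secrets


def normalize_item(data):
    """Validate an item dict and fill in a missing identifier."""
    if not data.get("iscc_code") and not data.get("units"):
        raise ValueError("either iscc_code or units is required")
    item = dict(data)  # shallow copy so the caller's dict is not modified
    if not item.get("iscc_id"):
        # Placeholder only: a real implementation would create an ISCC-ID here.
        item["iscc_id"] = "placeholder-" + secrets.token_hex(8)
    return item
```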

new_iscc_id

new_iscc_id()

Generate a new random ISCC-ID digest.

Creates a 10-byte ISCC-ID using current timestamp (52 bits) and random server ID (12 bits). Uses REALM-0 for non-authoritative identifiers.

Returns:

Type Description

Complete ISCC-ID digest (2-byte header + 8-byte body)

split_iscc_sequence

split_iscc_sequence(data)

Split a sequence of concatenated ISCC-DIGESTS.

Parameters:

Name Type Description Default
data

Concatenated ISCC-DIGESTS (variable-length)

required

Returns:

Type Description

List of individual ISCC-DIGEST bytes
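Splitting a variable-length sequence means reading each header to learn how long the following body is. The sketch below covers a restricted case under stated assumptions: the sequence contains only ISCC-UNITs with 2-byte headers, and the low nibble of the second header byte is a length field L meaning a body of 32 × (L + 1) bits. Other ISCC types and longer headers are not handled, and `split_units` is a hypothetical name.

```python
def split_units(data):
    """Split concatenated ISCC-UNIT digests (2-byte headers assumed)."""
    digests = []
    offset = 0
    while offset < len(data):
        if len(data) - offset < 2:
            raise ValueError("truncated header at end of sequence")
        length_field = data[offset + 1] & 0x0F
        body_size = (length_field + 1) * 4  # 32 bits per length step
        end = offset + 2 + body_size
        if end > len(data):
            raise ValueError("truncated body at end of sequence")
        digests.append(data[offset:end])
        offset = end
    return digests
```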