API Reference¶
Auto-generated from source code docstrings.
Index Protocol¶
The IsccIndexProtocol defines the interface that all index backends implement. CLI, REST API,
and library code all use this protocol regardless of the backend.
IsccIndexProtocol ¶
Bases: Protocol
Protocol for ISCC index backends.
All methods are synchronous. Backends are free to use threading, connection pools, etc. internally.
This protocol defines the core operations that all ISCC index implementations must support:
- Index lifecycle: create, get, list, delete
- Asset operations: add, get, search
- Resource cleanup: close
All index implementations should handle the protocol's exception contract:
- ValueError: Invalid parameters or validation failures
- FileExistsError: Attempting to create an existing index
- FileNotFoundError: Attempting to access a non-existent index
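The exception contract above can be illustrated with a minimal sketch. `InMemoryIndex`, `ensure_index`, and the plain-dict metadata are hypothetical names introduced for illustration (the protocol itself exchanges IsccIndex objects); only the exception types and the name pattern are taken from this reference.

```python
import re

# Name pattern quoted in this reference for index names.
NAME_PATTERN = re.compile(r"[a-z][a-z0-9]*")


class InMemoryIndex:
    """Hypothetical backend that models only the index lifecycle methods."""

    def __init__(self):
        self._indexes = {}  # index name -> list of stored assets

    def create_index(self, name):
        # fullmatch anchors both ends, so a trailing newline is rejected too.
        if not NAME_PATTERN.fullmatch(name):
            raise ValueError(f"invalid index name: {name!r}")
        if name in self._indexes:
            raise FileExistsError(f"index already exists: {name!r}")
        self._indexes[name] = []
        return {"name": name, "assets": 0, "size": 0}

    def get_index(self, name):
        if name not in self._indexes:
            raise FileNotFoundError(f"index not found: {name!r}")
        # Storage size is not modelled in this sketch and is always 0.
        return {"name": name, "assets": len(self._indexes[name]), "size": 0}

    def list_indexes(self):
        return [self.get_index(name) for name in self._indexes]

    def delete_index(self, name):
        if name not in self._indexes:
            raise FileNotFoundError(f"index not found: {name!r}")
        del self._indexes[name]


def ensure_index(backend, name):
    """Create the index if needed, treating "already exists" as success."""
    try:
        return backend.create_index(name)
    except FileExistsError:
        return backend.get_index(name)
```

Because callers see the same three exception types from every backend, a helper such as `ensure_index` can be written once and reused for any implementation.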
list_indexes ¶
List all available indexes with metadata.
Scans the backend storage and returns metadata for all existing indexes. The metadata includes index name, asset count, and storage size.
Returns:
| Type | Description |
|---|---|
| | List of IsccIndex objects with name, assets, and size |
create_index ¶
Create a new named index.
Initializes a new index with the specified name. The index starts empty with 0 assets. If the backend requires initialization (creating directories, database tables, etc.), this method handles it.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| index | | IsccIndex with name (assets and size fields are ignored) | required |
Returns:
| Type | Description |
|---|---|
| | Created IsccIndex with initial metadata (assets=0, size=0) |
Raises:
| Type | Description |
|---|---|
| ValueError | If name is invalid (doesn't match pattern ^[a-z][a-z0-9]*$) |
| FileExistsError | If index with this name already exists |
get_index ¶
Get index metadata by name.
Retrieves current metadata for the specified index, including the number of assets and storage size. This is useful for monitoring index growth and health.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| name | | Index name (must match pattern ^[a-z][a-z0-9]*$) | required |
Returns:
| Type | Description |
|---|---|
| | IsccIndex with current metadata |
Raises:
| Type | Description |
|---|---|
| FileNotFoundError | If index doesn't exist |
delete_index ¶
Delete an index and all its data.
Permanently removes the index and all associated data. This operation cannot be undone. Implementations should clean up all resources (files, database tables, etc.).
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| name | | Index name | required |
Raises:
| Type | Description |
|---|---|
| FileNotFoundError | If index doesn't exist |
add_assets ¶
Add assets to index.
Adds multiple ISCC assets to the specified index. Each asset contains an ISCC-ID and ISCC-UNITs for similarity indexing. Assets with missing ISCC-IDs will have them auto-generated.
Implementations should:
- Store asset metadata for later retrieval
- Index ISCC-UNITs by type for similarity search
- Handle duplicates gracefully (update vs create)
- Return status for each asset
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| index_name | | Target index name | required |
| assets | | List of IsccEntry objects to add | required |
Returns:
| Type | Description |
|---|---|
| | List of IsccAddResult with status for each asset |
Raises:
| Type | Description |
|---|---|
| FileNotFoundError | If index doesn't exist |
| ValueError | If assets contain invalid ISCC codes |
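As a rough illustration of the "update vs create" handling and the per-asset status that add_assets is expected to return, the sketch below upserts into a plain dict. The dict-based assets, the status strings "created" and "updated", and the placeholder IDs are assumptions made for this example; they are not the actual fields of IsccEntry or IsccAddResult.

```python
def add_assets(store, assets):
    """Upsert assets into `store` (a dict keyed by ISCC-ID), one status per asset."""
    results = []
    for asset in assets:
        iscc_id = asset["iscc_id"]
        # A known ID means the stored asset is replaced rather than duplicated.
        status = "updated" if iscc_id in store else "created"
        store[iscc_id] = asset
        results.append({"iscc_id": iscc_id, "status": status})
    return results


store = {}
# "id-1" and "id-2" are placeholders, not valid ISCC-IDs.
first = add_assets(store, [{"iscc_id": "id-1"}, {"iscc_id": "id-2"}])
second = add_assets(store, [{"iscc_id": "id-1"}])
```

Returning one result per input asset, in input order, lets callers match statuses to the assets they submitted without a second lookup.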
get_asset ¶
Get a specific asset by ISCC-ID.
Retrieves the full asset details for a given ISCC-ID from the specified index. This is useful for fetching complete asset metadata after performing a search, which returns only ISCC-IDs and scores.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| index_name | | Target index name | required |
| iscc_id | | ISCC-ID of the asset to retrieve | required |
Returns:
| Type | Description |
|---|---|
| | IsccEntry with all stored metadata |
Raises:
| Type | Description |
|---|---|
| FileNotFoundError | If index doesn't exist or asset not found |
| ValueError | If ISCC-ID format is invalid |
search_assets ¶
Search for similar assets in index.
Performs similarity search using the query asset's ISCC-UNITs. Results are aggregated across all unit types and returned sorted by relevance (highest scores first).
The returned IsccSearchResult includes:
- query: The original query asset (may have auto-generated iscc_id)
- global_matches: List of IsccGlobalMatch objects with scores and per-unit breakdowns
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| index_name | | Target index name | required |
| query | | IsccQuery to search for (either iscc_code or units required) | required |
| limit | | Maximum number of results to return (default: 100) | 100 |
Returns:
| Type | Description |
|---|---|
| | IsccSearchResult with query and list of matches |
Raises:
| Type | Description |
|---|---|
| FileNotFoundError | If index doesn't exist |
| ValueError | If query asset is invalid |
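The aggregation and ordering described for search_assets could look roughly like the sketch below. It assumes each unit type yields a mapping of ISCC-ID to score and combines them with a plain mean. The reference does not state the actual aggregation formula, so the mean, the function name `aggregate_matches`, the unit labels, and the dict-based result shape are all illustrative assumptions.

```python
from collections import defaultdict


def aggregate_matches(per_unit_scores, limit=100):
    """Merge {unit_type: {iscc_id: score}} into one ranked list of matches."""
    by_asset = defaultdict(dict)
    for unit_type, scores in per_unit_scores.items():
        for iscc_id, score in scores.items():
            by_asset[iscc_id][unit_type] = score

    matches = [
        {
            "iscc_id": iscc_id,
            # ASSUMPTION: the overall score is the mean of the per-unit scores.
            "score": sum(units.values()) / len(units),
            "units": units,  # per-unit breakdown kept alongside the total
        }
        for iscc_id, units in by_asset.items()
    ]
    # Highest scores first, then cap the result list at `limit`.
    matches.sort(key=lambda match: match["score"], reverse=True)
    return matches[:limit]


ranked = aggregate_matches(
    {"text": {"a": 1.0, "b": 0.25}, "data": {"a": 0.5}},  # placeholder labels and IDs
    limit=10,
)
```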
close ¶
Close connections and cleanup resources.
Should be called when the backend is no longer needed. Implementations should clean up resources like database connections, file handles, and memory caches. This method should be idempotent (safe to call multiple times).
After calling close(), the index instance should not be used for further operations.
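A minimal sketch of an idempotent close(), using a hypothetical `ClosableBackend` class. Raising RuntimeError on use after close is an assumption of this example; the reference only says the instance should not be used afterwards.

```python
from contextlib import closing


class ClosableBackend:
    """Hypothetical backend that models only resource cleanup."""

    def __init__(self):
        self._closed = False
        self.cleanup_calls = 0  # counts how often real cleanup work ran

    def close(self):
        if self._closed:
            return  # already closed: repeated calls are a no-op
        self._closed = True
        self.cleanup_calls += 1  # stands in for releasing connections and files

    def list_indexes(self):
        if self._closed:
            # ASSUMPTION: this sketch fails loudly when used after close().
            raise RuntimeError("backend is closed")
        return []


# contextlib.closing calls close() on exit, even if the body raises.
with closing(ClosableBackend()) as backend:
    backend.list_indexes()

backend.close()  # safe: the second call does nothing
```

Making close() idempotent means cleanup code in `finally` blocks and context managers can call it without first checking whether another code path already did.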
Configuration¶
Server deployment settings managed through environment variables with the ISCC_SEARCH_ prefix.
See the Configuration Reference for the full list of variables.
SearchOptions ¶
Bases: BaseSettings
Application options for ISCC-Search.
Options can be configured via:
- Environment variables (prefixed with ISCC_SEARCH_)
- .env file in the working directory
- Direct instantiation with parameters
- Runtime override using the override() method
cors_origins_list property ¶
Split comma-separated CORS origins string into a list.
Returns:
| Type | Description |
|---|---|
| | List of allowed origin strings |
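The splitting performed by cors_origins_list can be sketched as a small standalone function. Stripping surrounding whitespace and dropping empty entries are assumptions of this example, as are the function name `split_origins` and the example origins. The reference only states that a comma-separated string becomes a list.

```python
def split_origins(raw: str) -> list[str]:
    """Split a comma-separated origins string into a list of origins."""
    # ASSUMPTION: whitespace around entries is ignored and empty entries are dropped.
    return [origin.strip() for origin in raw.split(",") if origin.strip()]


origins = split_origins("https://app.example.com, https://admin.example.com")
```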
override ¶
Return an updated and validated deep copy of the current options instance.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| update | | Dictionary of field names and values to override. | None |
Returns:
| Type | Description |
|---|---|
| | New SearchOptions instance with updated and validated fields. |
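The "updated and validated deep copy" behaviour can be approximated with a standard-library dataclass. This is a stand-in, not the SearchOptions implementation: the `Options` class, its `host` and `port` fields, and the port check are hypothetical and exist only to show that the original instance is left untouched and that validation runs again on the copy.

```python
import copy
from dataclasses import dataclass, fields


@dataclass
class Options:
    """Hypothetical settings class with two illustrative fields."""

    host: str = "localhost"
    port: int = 8000

    def __post_init__(self):
        # Validation runs on every construction, including overrides.
        if not 0 < self.port < 65536:
            raise ValueError(f"invalid port: {self.port}")

    def override(self, update=None):
        """Return a new validated instance; `self` is never modified."""
        current = {f.name: getattr(self, f.name) for f in fields(self)}
        values = copy.deepcopy(current)
        values.update(update or {})
        return type(self)(**values)  # constructing re-runs __post_init__


base = Options()
changed = base.override({"port": 9000})
```

Returning a fresh instance instead of mutating in place keeps a shared options object safe to pass around while individual callers apply local overrides.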
Models¶
Types and convenience classes for handling ISCC codes, units, and items.
models ¶
Types and convenience classes for handling ISCCs¶
Terms and Definitions¶
- ISCC - Any ISCC-CODE, ISCC-UNIT, or ISCC-ID
- ISCC-HEADER - Self-describing 2-byte header for V1 components (3 bytes for future versions). The first 12 bits encode MainType, SubType, and Version. Additional bits encode Length for variable-length ISCCs.
- ISCC-BODY - Actual payload of an ISCC: a similarity-preserving compact binary code, a hash, or a timestamp
- ISCC-DIGEST - Binary representation of complete ISCC (ISCC-HEADER + ISCC-BODY).
- ISCC-SEQUENCE - Binary sequence of ISCC-DIGESTS
- ISCC-UNIT - ISCC-HEADER + ISCC-BODY where the ISCC-BODY is calculated from a single algorithm
- ISCC-CODE - ISCC-HEADER + ISCC-BODY where the ISCC-BODY is a sequence of multiple ISCC-UNIT BODYs
- DATA and INSTANCE are the mandatory ISCC-UNITs, the minimum required for a valid ISCC-CODE
- ISCC-ID - Globally unique digital asset identifier (ISCC-HEADER + 52-bit timestamp + 12-bit hub_id)
- SIMPRINT - Headerless base64 encoded similarity hash that describes a content segment (granular feature)
- UNIT-TYPE - Identifier for ISCC-UNIT types that can be indexed together with meaningful similarity search
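The ISCC-ID body layout listed above (52-bit timestamp plus 12-bit hub_id, 64 bits in total) can be sketched with plain bit arithmetic. The field order (timestamp in the high bits), the big-endian byte order, and the function names are assumptions of this example, not statements about the library's encoding.

```python
TIMESTAMP_BITS = 52
HUB_ID_BITS = 12


def pack_id_body(timestamp: int, hub_id: int) -> bytes:
    """Pack a timestamp and hub_id into an 8-byte ISCC-ID body."""
    if not 0 <= timestamp < (1 << TIMESTAMP_BITS):
        raise ValueError("timestamp does not fit into 52 bits")
    if not 0 <= hub_id < (1 << HUB_ID_BITS):
        raise ValueError("hub_id does not fit into 12 bits")
    # ASSUMPTION: timestamp occupies the high 52 bits, hub_id the low 12 bits.
    value = (timestamp << HUB_ID_BITS) | hub_id
    return value.to_bytes(8, "big")


def unpack_id_body(body: bytes) -> tuple[int, int]:
    """Recover (timestamp, hub_id) from an 8-byte ISCC-ID body."""
    if len(body) != 8:
        raise ValueError("ISCC-ID body must be exactly 8 bytes")
    value = int.from_bytes(body, "big")
    return value >> HUB_ID_BITS, value & ((1 << HUB_ID_BITS) - 1)


body = pack_id_body(timestamp=1_700_000_000_000_000, hub_id=7)
```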
IsccBase ¶
Base class for ISCC objects providing common properties and methods.
Handles conversion between different ISCC representations (string, bytes) and provides access to ISCC components (header, body, fields).
Initialize ISCC object from string or binary representation.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| iscc | | ISCC in canonical string format (with or without "ISCC:" prefix) or binary digest | required |
Raises:
| Type | Description |
|---|---|
| TypeError | If iscc is not str or bytes |
body property ¶
Extract ISCC-BODY bytes (payload without header).
Returns:
| Type | Description |
|---|---|
| | ISCC-BODY as raw bytes |
fields cached property ¶
Decode ISCC header into structured fields.
Returns:
| Type | Description |
|---|---|
| | IsccTuple with MainType, SubType, Version, Length, and Body |
iscc_type cached property ¶
Get human-readable ISCC type identifier.
Returns:
| Type | Description |
|---|---|
| | ISCC type string in format "MAINTYPE-SUBTYPE-VERSION" (e.g., "CONTENT-TEXT-V1") |
__str__ ¶
Get canonical ISCC string representation.
Returns:
| Type | Description |
|---|---|
| | ISCC in canonical format with "ISCC:" prefix and base32 encoding |
__len__ ¶
Get ISCC-BODY bit-length.
Returns:
| Type | Description |
|---|---|
| | Number of bits in ISCC-BODY (64, 128, 192, or 256) |
__bytes__ ¶
Get binary ISCC-DIGEST representation.
Returns:
| Type | Description |
|---|---|
| | Complete ISCC-DIGEST as bytes (ISCC-HEADER + ISCC-BODY) |
IsccID ¶
Bases: IsccBase
ISCC-ID: Globally unique digital asset identifier.
Combines ISCC-HEADER with 52-bit timestamp and 12-bit server-id for unique identification of digital assets across distributed systems.
realm_id property ¶
Extract REALM-ID from ISCC-ID header.
Returns:
| Type | Description |
|---|---|
| | Realm identifier (0 for REALM_0, 1 for REALM_1) |
__int__ cached ¶
Convert ISCC-ID to integer representation.
WARNING: Integer representation does not include ISCC-HEADER information. Use as a 64-bit integer database ID and keep track of the REALM-ID for reconstruction.
Returns:
| Type | Description |
|---|---|
| | Integer representation of the ISCC-ID body (64 bits, without the ISCC-HEADER) |
from_int classmethod ¶
Construct ISCC-ID from integer and realm identifier.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| iscc_id | | Integer representation of ISCC-ID body (8 bytes) | required |
| realm_id | | Realm identifier for ISCC-HEADER SubType (0 or 1) | required |
Returns:
| Type | Description |
|---|---|
| | New IsccID instance |
from_body classmethod ¶
Construct ISCC-ID from body bytes and realm identifier.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| body | | ISCC-ID body bytes (8 bytes) | required |
| realm_id | | Realm identifier for ISCC-HEADER SubType (0 or 1) | required |
Returns:
| Type | Description |
|---|---|
| | New IsccID instance |
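Since the integer form carries no header, a stored ID needs the realm kept next to it so that from_int or from_body can rebuild the ISCC-ID later. The sketch below shows that bookkeeping with plain bytes. The helper names, the big-endian byte order, and the placeholder header bytes are assumptions; the sketch does not construct or parse a real ISCC-HEADER.

```python
HEADER_LEN = 2  # bytes: header size given in this reference for V1 components
BODY_LEN = 8    # bytes: 64-bit ISCC-ID body


def split_id_digest(digest: bytes) -> tuple[bytes, bytes]:
    """Split a 10-byte ISCC-ID digest into (header, body)."""
    if len(digest) != HEADER_LEN + BODY_LEN:
        raise ValueError("ISCC-ID digest must be exactly 10 bytes")
    return digest[:HEADER_LEN], digest[HEADER_LEN:]


def to_record(body: bytes, realm_id: int) -> tuple[int, int]:
    """Reduce an ISCC-ID to (realm_id, integer), e.g. for a database row."""
    if len(body) != BODY_LEN:
        raise ValueError("ISCC-ID body must be exactly 8 bytes")
    # ASSUMPTION: the body is interpreted as a big-endian unsigned integer.
    return realm_id, int.from_bytes(body, "big")


def from_record(record: tuple[int, int]) -> tuple[bytes, int]:
    """Recover (body, realm_id), the inputs a from_body-style constructor needs."""
    realm_id, value = record
    return value.to_bytes(BODY_LEN, "big"), realm_id


# b"\xaa\xbb" is a placeholder, not a valid ISCC-HEADER.
header, body = split_id_digest(b"\xaa\xbb" + bytes(range(8)))
record = to_record(body, realm_id=0)
```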
random classmethod ¶
Create a new random ISCC-ID.
Uses REALM-ID 0 for non-authoritative ISCC-IDs with current timestamp and random server ID.
Returns:
| Type | Description |
|---|---|
| | New IsccID instance with random identifier |
IsccUnit ¶
Bases: IsccBase
ISCC-UNIT: Single-algorithm ISCC component.
An ISCC-UNIT combines ISCC-HEADER with ISCC-BODY calculated from a single algorithm. Multiple ISCC-UNITs can be combined to form an ISCC-CODE.
unit_type property ¶
Get ISCC-UNIT type identifier.
Returns:
| Type | Description |
|---|---|
| | ISCC type string (alias for iscc_type property) |
__array__ ¶
Return numpy array from ISCC-BODY bytes.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| dtype | | NumPy dtype for the array | uint8 |
| copy | | If True, always copy. If False, never copy (view only). If None, copy only if needed. | None |
Returns:
| Type | Description |
|---|---|
| | NumPy array representation of ISCC-BODY |
IsccCode ¶
Bases: IsccBase
ISCC-CODE: Composite ISCC combining multiple ISCC-UNITs.
An ISCC-CODE combines multiple ISCC-UNIT bodies into a single identifier. Minimum requirement: DATA and INSTANCE units. Can include META, SEMANTIC, and CONTENT units depending on the content type.
units cached property ¶
Decompose ISCC-CODE into constituent ISCC-UNITs.
Parses the ISCC-CODE body and reconstructs individual ISCC-UNITs with their headers. Handles both standard codes and WIDE subtype codes.
Returns:
| Type | Description |
|---|---|
| | List of IsccUnit objects contained in this ISCC-CODE |
IsccItemDict ¶
Bases: TypedDict
Dictionary representation of an ISCC item with ID, code, and units as strings.
IsccItem ¶
Bases: Struct
Minimal ISCC container for efficient indexing.
Stores only binary representations (id and units). String representations and derived values are computed on-demand (no caching for memory efficiency).
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| id_data | | ISCC-ID digest (10 bytes: 2-byte header + 8-byte body) | required |
| units_data | | Sequence of ISCC-UNIT digests | required |
dict property ¶
Convert IsccItem to dictionary representation.
Returns:
| Type | Description |
|---|---|
| | Dictionary with iscc_id, iscc_code, and units as canonical strings |
json property ¶
Serialize IsccItem to JSON bytes.
Returns:
| Type | Description |
|---|---|
| | JSON-encoded representation of IsccItem dictionary |
new classmethod ¶
Create a new IsccItem from ISCC-ID and either ISCC-CODE or units.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| iscc_id | | ISCC-ID as string or binary digest | required |
| iscc_code | | Optional ISCC-CODE as string or binary digest | None |
| units | | Optional list of ISCC-UNITs as strings or binary digests | None |
Returns:
| Type | Description |
|---|---|
| | New IsccItem instance |
Raises:
| Type | Description |
|---|---|
| ValueError | If neither iscc_code nor units is provided |
from_dict classmethod ¶
Create IsccItem from dictionary, generating random ISCC-ID if missing.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| data | | Dictionary with optional iscc_id and either iscc_code or units | required |
Returns:
| Type | Description |
|---|---|
| | New IsccItem instance |
Raises:
| Type | Description |
|---|---|
| ValueError | If neither iscc_code nor units is provided |
new_iscc_id ¶
Generate a new random ISCC-ID digest.
Creates a 10-byte ISCC-ID using current timestamp (52 bits) and random server ID (12 bits). Uses REALM-0 for non-authoritative identifiers.
Returns:
| Type | Description |
|---|---|
| | Complete ISCC-ID digest (2-byte header + 8-byte body) |
split_iscc_sequence ¶
Split a sequence of concatenated ISCC-DIGESTS.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| data | | Concatenated ISCC-DIGESTS (variable-length) | required |
Returns:
| Type | Description |
|---|---|
| | List of individual ISCC-DIGEST bytes |