
iscc-search


BETA

This project is under active development. The API is not yet stable and may change without notice.

Similarity search engine for ISCC content codes.

iscc-search indexes ISCC codes and finds similar digital content. You provide ISCC codes (content fingerprints defined by ISO 24138), and the engine returns ranked matches based on content similarity.

The project supports multiple storage backends, from in-memory indexes for testing to HNSW-accelerated persistent stores for production. You can use it as a Python library, a command-line tool, or a REST API server.

iscc-search vs iscc-usearch

iscc-search (this project) is the search engine - CLI, REST API, and index management. iscc-usearch is a patched fork of the usearch vector search library that provides the NPHD metric and low-level vector indexes. iscc-search uses iscc-usearch internally as one of its backends. Most users only need iscc-search.

Key capabilities

  • Variable-length code matching - compares ISCC codes of different lengths using normalized prefix Hamming distance
  • Multiple backends - in-memory (memory://), LMDB-backed (lmdb://), and HNSW-accelerated (usearch://)
  • REST API - FastAPI server with OpenAPI documentation, health checks, and optional authentication
  • CLI - manage indexes, add assets, and search from the terminal
  • Protocol-based abstraction - all backends implement IsccIndexProtocol, so you can swap storage without changing application code
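To illustrate variable-length matching, here is a minimal sketch of a normalized prefix Hamming distance. It assumes the metric compares only the bits both codes share (the shorter code's length) and divides the number of differing bits by that length. The function name `nphd` and the sample byte strings are hypothetical; the actual implementation in iscc-usearch may differ in detail.

```python
def nphd(a: bytes, b: bytes) -> float:
    """Sketch: Hamming distance over the common prefix, normalized to 0..1."""
    n = min(len(a), len(b))  # length of the shared prefix in bytes
    if n == 0:
        raise ValueError("both codes need at least one byte")
    # zip() stops at the shorter input, so only the common prefix is compared
    differing = sum(bin(x ^ y).count("1") for x, y in zip(a, b))
    return differing / (n * 8)


# Illustrative data only: a 64-bit code and a 128-bit code whose
# first 64 bits differ from the short code in exactly one bit.
short_code = bytes.fromhex("ff00ff00ff00ff00")
long_code = bytes.fromhex("ff00ff00ff00ff01" + "aa" * 8)

distance = nphd(short_code, long_code)
print(distance)  # 1 differing bit out of 64 compared bits
```

Normalizing by the compared length keeps distances from short and long codes on the same 0 to 1 scale, which is what makes ranking across mixed lengths possible.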

Quick start

Install the package with either uv or pip (one of the two is enough):

uv add iscc-search
pip install iscc-search

Use the Python API to create an index, add an asset, and search:

import os

os.environ["ISCC_SEARCH_INDEX_URI"] = "memory://"

from iscc_search.options import get_index
from iscc_search.schema import IsccEntry, IsccIndex, IsccQuery

# Create index backend
index = get_index()
index.create_index(IsccIndex(name="demo"))

# Add an asset with an ISCC-CODE
asset = IsccEntry(iscc_code="ISCC:KACYPXW445FTYNJ3CYSXHAFJMA2HUWULUNRFE3BLHRSCXYH2M5AEGQY")
index.add_assets("demo", [asset])

# Search for similar content
query = IsccQuery(iscc_code="ISCC:KACYPXW445FTYNJ3CYSXHAFJMA2HUWULUNRFE3BLHRSCXYH2M5AEGQY")
results = index.search_assets("demo", query)

for match in results.global_matches:
    print(f"{match.iscc_id}  score={match.score}")

index.close()
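The quick start uses memory://, but the same calls are meant to work against any backend because they all implement IsccIndexProtocol. The sketch below illustrates that idea with Python's `typing.Protocol`. Every name in it (`SimilarityIndex`, `ToyMemoryIndex`, `best_match`) is hypothetical and much simpler than the real protocol; it only shows why application code does not need to know which backend it is talking to.

```python
from __future__ import annotations

from typing import Protocol


class SimilarityIndex(Protocol):
    """Hypothetical stand-in for a backend protocol (not IsccIndexProtocol)."""

    def add(self, key: str, code: bytes) -> None: ...

    def search(self, code: bytes, limit: int = 10) -> list[tuple[str, float]]: ...


def _similarity(a: bytes, b: bytes) -> float:
    """Share of equal bits over the common prefix of two codes."""
    n = min(len(a), len(b))
    if n == 0:
        return 0.0
    differing = sum(bin(x ^ y).count("1") for x, y in zip(a, b))
    return 1.0 - differing / (n * 8)


class ToyMemoryIndex:
    """Toy in-memory backend; it satisfies SimilarityIndex structurally."""

    def __init__(self) -> None:
        self._codes: dict[str, bytes] = {}

    def add(self, key: str, code: bytes) -> None:
        self._codes[key] = code

    def search(self, code: bytes, limit: int = 10) -> list[tuple[str, float]]:
        scored = [(key, _similarity(code, stored)) for key, stored in self._codes.items()]
        scored.sort(key=lambda item: item[1], reverse=True)  # best match first
        return scored[:limit]


def best_match(index: SimilarityIndex, code: bytes) -> str | None:
    """Application code depends only on the protocol, not on a concrete backend."""
    hits = index.search(code, limit=1)
    return hits[0][0] if hits else None


index = ToyMemoryIndex()
index.add("a", bytes.fromhex("ff00ff00"))
index.add("b", bytes.fromhex("00ff00ff"))
winner = best_match(index, bytes.fromhex("ff00ff01"))
print(winner)
```

Any other class with matching `add` and `search` methods could be passed to `best_match` unchanged, which is the property the protocol-based design is after.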

Documentation

  • Tutorials - Learn the basics

    Hands-on guide from installation to your first search.

  • How-to Guides - Solve specific problems

    Backend configuration, CLI usage, REST API, and deployment.

  • Explanation - Understand the concepts

    How ISCC works, system architecture, and similarity search internals.

  • Reference - Look up details

    API reference, configuration options, and agent documentation.

Source code on GitHub