Use the CLI

The iscc-search CLI manages indexes, adds assets, searches for similar content, and starts the REST API server. All data commands operate on the active index unless you override it with --index.

Manage indexes

Register a local index:

iscc-search index add myindex --local

Register a local index with a custom data path:

iscc-search index add myindex --local --path /data/iscc

Register a remote index pointing to a running iscc-search server:

iscc-search index add production --remote https://search.example.com

Register a remote index with an API key:

iscc-search index add production --remote https://search.example.com --api-key your-secret

List all configured indexes:

iscc-search index list

Switch the active index:

iscc-search index use production

Remove an index from configuration:

iscc-search index remove staging

Remove an index and delete its local data:

iscc-search index remove old-local --delete-data

Index HuggingFace datasets

The ISCC Foundation publishes ready-to-index datasets on the HuggingFace Hub. Browse them:

iscc-search datasets

The listing defaults to the iscc organization. Switch to another namespace with --author, cap the output with --limit, or emit JSON for scripting:

iscc-search datasets --author myorg --limit 50
iscc-search datasets --json

Index a dataset directly — parquet files are streamed from the Hub and cached under your HuggingFace cache directory:

iscc-search hub iscc/iscc-flickr30k

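When no index is registered yet, hub auto-creates a local one named after the dataset (for example, iscc/iscc-flickr30k becomes flickr30k). Use --limit to stop early while experimenting, --index to target a different index, or --split and --batch-size to choose a split and batch size: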

iscc-search hub iscc/iscc-book-covers --limit 10000
iscc-search hub iscc/iscc-flickr30k --index production
iscc-search hub iscc/iscc-flickr30k --split train --batch-size 1000

Original row fields (title, caption, ISBN, image URLs, …) are preserved as opaque metadata on each asset; binary columns such as image thumbnails are skipped.
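
The naming of an auto-created index can be approximated in a script, for instance to predict which index a hub run will create. The sketch below is a guess based only on the single example iscc/iscc-flickr30k and flickr30k: it assumes the namespace and a leading "iscc-" prefix are dropped. The function name derive_index_name is hypothetical and not part of the CLI.

```shell
# Hypothetical helper: guess the auto-created index name for a Hub dataset.
# Assumption: the rule is "drop the namespace, then drop a leading iscc-".
derive_index_name() {
  name="${1##*/}"                # strip everything up to the last "/"
  printf '%s\n' "${name#iscc-}"  # strip a leading "iscc-" if present
}

derive_index_name iscc/iscc-flickr30k   # prints: flickr30k
```

If the real rule differs, iscc-search index list shows the name that was actually created.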

Add assets

Add assets from a directory of JSON files:

iscc-search add /path/to/assets/

The command looks for *.iscc.json files first, then falls back to *.json. Each file must contain at least an iscc_code or iscc field.
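
To preview which files a directory would contribute, the lookup rule above can be mimicked in plain shell. This is a rough sketch, not the CLI's own code: it assumes the fallback is decided per directory and is not recursive, and the field check is a crude text match rather than real JSON parsing. The helper names list_asset_files and has_iscc_field are hypothetical.

```shell
# Hypothetical helper: list the files the rule above would select.
# Prefers *.iscc.json and falls back to *.json only if none exist.
list_asset_files() {
  dir="${1%/}"
  set -- "$dir"/*.iscc.json
  [ -e "$1" ] || set -- "$dir"/*.json
  [ -e "$1" ] || return 0        # nothing matched at all
  printf '%s\n' "$@"
}

# Hypothetical helper: crude check for an "iscc_code" or "iscc" key.
has_iscc_field() {
  grep -Eq '"(iscc_code|iscc)"[[:space:]]*:' "$1"
}

# Demo with a throwaway directory and a minimal asset file.
demo_dir="$(mktemp -d)"
printf '{"iscc_code": "ISCC:KECYCMZIOY36XXGZ7S6QJQ2AEEXPOVEHZYPK6GMSFLU3WF54UPZMTPY"}\n' \
  > "$demo_dir/a.iscc.json"
printf '{"note": "not selected while *.iscc.json files exist"}\n' \
  > "$demo_dir/other.json"

list_asset_files "$demo_dir" | while IFS= read -r f; do
  if has_iscc_field "$f"; then echo "ok: $f"; else echo "missing ISCC: $f"; fi
done
```

Only a.iscc.json is reported here, because other.json is considered only when no *.iscc.json files are present.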

Add assets with a glob pattern:

iscc-search add /data/corpus/*.iscc.json

Add a single file:

iscc-search add asset.iscc.json

Set the batch size and truncate simprints to a fixed number of bits:

iscc-search add --batch-size 500 --simprint-bits 128 /data/assets/

Target a specific index instead of the active one:

iscc-search add --index production /data/assets/

Retrieve assets

Fetch full asset details by ISCC-ID:

iscc-search get ISCC:MAIGIIFJRDGEQQAA

Target a specific index:

iscc-search get ISCC:MAIGIIFJRDGEQQAA --index production

Search for similar assets

Search by ISCC-CODE:

iscc-search search ISCC:KECYCMZIOY36XXGZ7S6QJQ2AEEXPOVEHZYPK6GMSFLU3WF54UPZMTPY

Limit the number of results:

iscc-search search ISCC:KECYCMZIOY36XXGZ7S6QJQ2AEEXPOVEHZYPK6GMSFLU3WF54UPZMTPY --limit 10

Search a specific index:

iscc-search search ISCC:KECYCMZIOY36XXGZ7S6QJQ2AEEXPOVEHZYPK6GMSFLU3WF54UPZMTPY --index production
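
When many codes need checking, the search command can be wrapped in a loop. The sketch below reads one ISCC-CODE per line from a text file and runs a search for each, using only the search subcommand and --limit flag shown above. The function name, the file layout, and the ISCC_SEARCH_BIN variable are assumptions of this example, not features of the CLI.

```shell
# Assumption: ISCC_SEARCH_BIN lets the binary be swapped out (e.g. for a dry run).
ISCC_SEARCH_BIN="${ISCC_SEARCH_BIN:-iscc-search}"

# Hypothetical wrapper: one search per ISCC-CODE listed in a text file.
# Usage: search_codes_from_file codes.txt [limit]
search_codes_from_file() {
  codes_file="$1"
  limit="${2:-10}"
  while IFS= read -r code; do
    case "$code" in
      ISCC:*)
        # Redirect stdin so the command cannot consume the rest of the file.
        "$ISCC_SEARCH_BIN" search "$code" --limit "$limit" < /dev/null
        ;;
      *) ;;  # skip blank lines, comments, and anything not starting with ISCC:
    esac
  done < "$codes_file"
}
```

Setting ISCC_SEARCH_BIN=echo before calling the function prints the commands instead of running them, which is a cheap way to check the input file first.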

Start the server

Start in production mode:

iscc-search serve

Start in development mode with auto-reload:

iscc-search serve --dev

Use a custom host and port:

iscc-search serve --host 127.0.0.1 --port 9000

Check the version

iscc-search version