Image API reference

`POST /v1/ingest/image/{tenant}/{record}` accepts the raw image bytes. The algorithm is selected with the `?algorithm=` query parameter and defaults to `multi`.
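As a rough illustration of how the path and query string fit together, the sketch below builds an ingest URL in Python. The host comes from the curl examples on this page. The helper name `build_ingest_url` and its validation of the algorithm name are assumptions made for the example, not part of the API.

```python
from urllib.parse import quote, urlencode

BASE_URL = "https://ucfp.dev"  # host as used in the curl examples on this page
ALGORITHMS = {"multi", "phash", "dhash", "ahash", "semantic"}


def build_ingest_url(tenant, record, algorithm="multi", model_id=None):
    """Hypothetical helper: return the ingest URL for one image record."""
    if algorithm not in ALGORITHMS:
        raise ValueError(f"unknown algorithm: {algorithm!r}")
    params = {"algorithm": algorithm}
    if model_id is not None:
        # model_id is only meaningful for the semantic algorithm
        params["model_id"] = model_id
    tenant_part = quote(str(tenant), safe="")
    record_part = quote(str(record), safe="")
    return f"{BASE_URL}/v1/ingest/image/{tenant_part}/{record_part}?{urlencode(params)}"
```

Percent-encoding the path segments keeps a record identifier containing `/` or `?` from changing the route.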

Algorithm matrix

`algorithm`	Output	Use for
`multi`	weighted blend of pHash + dHash + aHash + global + block	best general-purpose default; survives crops, rotations, watermark overlays
`phash`	64-bit perceptual hash (DCT-based)	near-duplicate detection, robust to scaling and mild colour shift
`dhash`	64-bit difference hash	very fast, robust to gamma changes, brittle to crops
`ahash`	64-bit average hash	fastest, weakest; debugging or low-stakes dedup
`semantic`	dense embedding (FP32 vector)	content-similarity search (CLIP-class models)

Request

curl -sS -X POST 'https://ucfp.dev/v1/ingest/image/17/01HZX…?algorithm=multi' \
  -H 'Authorization: Bearer ucfp_…' \
  -H 'Content-Type: image/jpeg' \
  --data-binary @photo.jpg

Or, with parameters as a multipart upload:

curl -sS -X POST 'https://ucfp.dev/v1/ingest/image/17/01HZX…?algorithm=multi' \
  -H 'Authorization: Bearer ucfp_…' \
  -F 'image=@photo.jpg' \
  -F 'preprocess={"max_input_bytes":10485760,"max_dimension":2048,"min_dimension":32};type=application/json' \
  -F 'multi_config={"phash_weight":0.5,"dhash_weight":0.3,"ahash_weight":0.2};type=application/json'

`PreprocessConfigDto`

Applies to every image algorithm. Validates and resizes before hashing.

Field	Default	Effect
`max_input_bytes`	`10485760` (10 MiB)	Reject larger uploads with `413`.
`max_dimension`	`2048`	Downscale so the longest edge equals this value when the input is larger. Saves CPU at the cost of fine detail.
`min_dimension`	`32`	Reject inputs whose shortest edge is smaller, with `422`.
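The three fields can be read as a small decision procedure. The sketch below is one plausible interpretation, not the service's implementation. It assumes the size check runs first, the `min_dimension` check runs on the original dimensions, and downscaling preserves aspect ratio. The names `PreprocessConfig` and `plan_preprocess` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PreprocessConfig:
    max_input_bytes: int = 10_485_760  # 10 MiB
    max_dimension: int = 2048
    min_dimension: int = 32


def plan_preprocess(n_bytes, width, height, cfg=PreprocessConfig()):
    """Return (status, size): status is 200, 413 or 422; size is the
    (width, height) that would be hashed, or None when rejected."""
    if n_bytes > cfg.max_input_bytes:
        return 413, None  # upload too large
    if min(width, height) < cfg.min_dimension:
        return 422, None  # shortest edge too small
    longest = max(width, height)
    if longest > cfg.max_dimension:
        # Assumption: scale both edges by the same factor
        scale = cfg.max_dimension / longest
        width = max(1, round(width * scale))
        height = max(1, round(height * scale))
    return 200, (width, height)
```

With the defaults, a 4096×2048 input would be hashed at 2048×1024, while an 800×600 input passes through unchanged.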

`MultiHashConfigDto` (multi only)

Controls the blended digest. Weights are normalised before hashing, so only their ratios matter.

Field	Default	Effect
`phash_weight`	`0.4`	Contribution of pHash.
`dhash_weight`	`0.3`	Contribution of dHash.
`ahash_weight`	`0.1`	Contribution of aHash.
`global_weight`	`0.1`	Contribution of the global colour-histogram component.
`block_weight`	`0.1`	Contribution of the per-block descriptor.
`block_distance_threshold`	`12`	Hamming threshold below which two block-descriptors count as a match.

Set any weight to 0 to disable that component.
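To make "only their ratios matter" concrete, here is a minimal sketch of weight normalisation. It assumes omitted fields fall back to the defaults in the table and that negative weights are rejected. Neither behaviour is confirmed by this page, and `normalise_weights` is a hypothetical name.

```python
DEFAULT_WEIGHTS = {
    "phash_weight": 0.4,
    "dhash_weight": 0.3,
    "ahash_weight": 0.1,
    "global_weight": 0.1,
    "block_weight": 0.1,
}


def normalise_weights(overrides=None):
    """Merge overrides into the defaults and scale the result to sum to 1."""
    weights = {**DEFAULT_WEIGHTS, **(overrides or {})}
    if any(w < 0 for w in weights.values()):
        raise ValueError("weights must be non-negative")
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("at least one weight must be positive")
    # A weight of 0 stays 0, which disables that component
    return {name: w / total for name, w in weights.items()}
```

Under these assumptions, multiplying every weight by the same factor produces the same normalised result, so `{"phash_weight": 5, "dhash_weight": 3, ...}` and `{"phash_weight": 0.5, "dhash_weight": 0.3, ...}` are equivalent when the remaining weights are scaled likewise.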

Per-algorithm notes

`multi` (default)

Blends pHash, dHash, aHash, a 64-bin colour histogram (global), and a 4×4 block descriptor. Survives rotation, cropping, and watermark overlays better than any single hash. Largest output (~136 bytes), but still cheap to compute.

`phash`

DCT-based. Output: 8 bytes. Reliable down to ~5 % rescale. Standard recommendation for "find near-identical photos". Feature: `image-perceptual`.
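For readers unfamiliar with DCT-based hashing, the sketch below shows one common formulation in pure Python: take a 2-D DCT of a small greyscale image, keep the low-frequency 8×8 block, and threshold each coefficient against the block's median. The service's exact variant (input size, threshold rule, bit order) is not documented here, so treat every such detail as an assumption. The input is assumed to be an already resized square greyscale grid, for example 32×32.

```python
import math
from statistics import median


def dct_1d(values):
    """Unnormalised DCT-II of a sequence of numbers."""
    n = len(values)
    return [
        sum(v * math.cos(math.pi * (i + 0.5) * k / n) for i, v in enumerate(values))
        for k in range(n)
    ]


def phash(pixels, hash_size=8):
    """pixels: square 2-D list of greyscale values. Returns hash_size**2 bits as an int."""
    rows = [dct_1d(row) for row in pixels]        # DCT along each row
    coeffs = [dct_1d(col) for col in zip(*rows)]  # then along each column
    # Keep the low-frequency corner of the coefficient grid
    low = [coeffs[u][v] for u in range(hash_size) for v in range(hash_size)]
    threshold = median(low)
    bits = 0
    for c in low:
        bits = (bits << 1) | int(c > threshold)
    return bits


def hamming(a, b):
    """Number of differing bits between two hashes."""
    return bin(a ^ b).count("1")
```

Two images are usually treated as near-duplicates when the Hamming distance between their hashes falls under a chosen threshold. Because each bit depends only on a coefficient's rank relative to the median, uniformly scaling pixel brightness leaves the hash unchanged.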

`dhash`

Difference hash. Output: 8 bytes. Faster than pHash (no DCT) and more robust to gamma changes; fails on tight crops. Feature: `image-perceptual`.
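A minimal sketch of a difference hash, assuming the textbook layout of 8 rows of 9 greyscale values with each bit recording whether a pixel is brighter than its right-hand neighbour. The grid size and bit order used by the service are not stated on this page.

```python
def dhash(pixels):
    """pixels: 8 rows of 9 greyscale values. Returns a 64-bit int."""
    bits = 0
    for row in pixels:
        # 9 values per row give 8 left/right comparisons
        for left, right in zip(row, row[1:]):
            bits = (bits << 1) | int(left > right)
    return bits
```

Only the ordering of neighbouring pixels is recorded, which is why a monotonic brightness transform such as a gamma curve leaves the hash unchanged, while a crop that shifts the grid changes many bits at once.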

`ahash`

Average hash. Output: 8 bytes. Computed on a resized greyscale image by thresholding each pixel against the mean. Easy to reason about, but the weakest signal. Feature: `image-perceptual`.
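The description above maps almost directly to code. This sketch assumes an 8×8 greyscale grid and most-significant-bit-first packing; both are illustrative choices rather than documented behaviour.

```python
def ahash(pixels):
    """pixels: 8x8 grid of greyscale values. Returns a 64-bit int."""
    flat = [v for row in pixels for v in row]
    mean = sum(flat) / len(flat)
    bits = 0
    for v in flat:
        # One bit per pixel: 1 if brighter than the mean
        bits = (bits << 1) | int(v > mean)
    return bits
```

For a grid whose top four rows are black and bottom four rows are white, the first 32 bits are 0 and the last 32 are 1.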

`semantic`

Runs a CLIP-class image encoder. Returns an FP32 vector (typically 384-d or 512-d). Required for "find images that look like this image" search across diverse content. Specify `model_id` (e.g. `clip-vit-b32`); see `/healthz` for the list of loaded models. Feature: `image-semantic`.

curl -sS -X POST 'https://ucfp.dev/v1/ingest/image/17/01HZX…?algorithm=semantic&model_id=clip-vit-b32' \
  -H 'Authorization: Bearer ucfp_…' \
  -H 'Content-Type: image/png' \
  --data-binary @photo.png
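This page does not say how two embeddings are compared. Cosine similarity is a common choice for CLIP-class vectors, so the sketch below assumes it purely for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)
```

Vectors produced by different models are not comparable, so a comparison is only meaningful when both embeddings share the same `model_id` and dimension.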

Response

{
  "tenant_id": 17,
  "record_id": "01HZX…",
  "modality": "image",
  "algorithm": "imgfprint-multi-v1",
  "format_version": 1,
  "config_hash": "0x4f01c882b1ea93de",
  "fingerprint_bytes": 136,
  "has_embedding": false,
  "embedding_dim": null,
  "model_id": null
}

For `semantic`, `has_embedding` is `true` and `embedding_dim` is set.
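A client might decode the response into a typed record. The sketch below mirrors the fields of the sample response. The class name `IngestResponse`, the choice to ignore unknown keys, and the consistency check are assumptions made for the example.

```python
import json
from dataclasses import dataclass, fields
from typing import Optional


@dataclass(frozen=True)
class IngestResponse:
    tenant_id: int
    record_id: str
    modality: str
    algorithm: str
    format_version: int
    config_hash: str
    fingerprint_bytes: int
    has_embedding: bool
    embedding_dim: Optional[int]
    model_id: Optional[str]


def parse_ingest_response(body):
    """Decode a JSON response body, ignoring keys this sketch does not know."""
    data = json.loads(body)
    resp = IngestResponse(**{f.name: data[f.name] for f in fields(IngestResponse)})
    if resp.has_embedding and resp.embedding_dim is None:
        raise ValueError("has_embedding is true but embedding_dim is missing")
    return resp
```

Selecting fields by name, rather than passing the whole payload to the constructor, keeps the parser working if the server later adds fields.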

Supported input formats

JPEG, PNG, WebP, GIF, BMP, TIFF. Animated content is fingerprinted from its first frame only. To fingerprint a video, submit frames individually or wait for the video modality, which is not part of v1.
