CLIP Image Database

A searchable image database that uses SigLIP 2 (a CLIP-style model) embeddings and sqlite-vec for efficient similarity search.

Note: This code was generated with AI assistance.

Features

Image Indexing: Scan directories and extract CLIP embeddings from images
Text Search: Search images using natural language queries
Image Search: Find similar images using a reference image
Combined Search: Combine text and image queries with weighted blending
Interactive Mode: Load model once and run multiple queries
HTML Gallery: Search results rendered as an HTML page with image previews and links that open files directly

Note: The HTML search results use localexplorer: protocol links for opening files and folders. To use these links, you'll need a browser extension like Local Explorer for Chrome/Edge or similar extensions for other browsers. Without the extension, you can still view images and copy file paths manually.

Requirements

Python 3.8+
CUDA-capable GPU (recommended) or CPU
See requirements.txt for Python dependencies

Installation

Clone this repository:

git clone <repository-url>
cd CLIP-database

Install dependencies:

cd core
pip install -r requirements.txt

Install the sqlite-vec extension (if not already installed):

pip install sqlite-vec

(Optional) Copy config.json.example to ../config.json (repo root) and edit paths if needed.

Usage

Indexing Images

Scan a directory and build the image database:

cd core
python image_database.py scan /path/to/images --db "/path/to/database.db"

Options:

--batch-size: Number of images to process before committing to DB (default: 75)
--inference-batch-size: Batch size for model inference (default: 16, higher = faster but more VRAM)
--profile: Show performance profiling information
--limit: Limit number of images to process (for testing)

Example with all options:

cd core
python image_database.py scan /path/to/images --batch-size 75 --inference-batch-size 16 --profile --limit 100
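
The two batch options work at different levels: --batch-size controls how often results are committed, while --inference-batch-size controls how many images go through the model at once. The sketch below shows one way such nested batching could be structured. It is an illustration only, not the code in image_database.py; chunked, index_paths, embed_batch, and commit are hypothetical names.

```python
from itertools import islice


def chunked(items, size):
    """Yield successive lists of at most `size` items."""
    it = iter(items)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk


def index_paths(paths, embed_batch, commit,
                batch_size=75, inference_batch_size=16, limit=None):
    """Embed paths in small model batches, commit in larger groups.

    embed_batch(list_of_paths) -> list of embeddings (hypothetical callback)
    commit(list_of_(path, embedding)) -> None       (hypothetical callback)
    """
    if limit is not None:
        paths = islice(paths, limit)  # mirrors the --limit option
    total = 0
    for commit_group in chunked(paths, batch_size):
        rows = []
        for model_batch in chunked(commit_group, inference_batch_size):
            rows.extend(zip(model_batch, embed_batch(model_batch)))
        commit(rows)  # one database commit per outer group
        total += len(rows)
    return total
```

Under this layout, a larger --inference-batch-size raises peak VRAM use, while a larger --batch-size only changes how much work is lost if indexing is interrupted between commits.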

Searching Images

Text Search

cd core
python image_database.py search "a red car" -k 20

To search a specific database, pass --db:

python image_database.py search "a red car" --db "/path/to/database.db" -k 20

Image Search

cd core
python image_database.py search /path/to/image.jpg --image -k 20

Combined Search

cd core
python image_database.py search "sunset" --query2 /path/to/image.jpg --weights 0.7 0.3 -k 20
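
One plausible reading of the weighted blend requested by --weights 0.7 0.3 is: normalize each query embedding, take the weighted sum, and renormalize. The sketch below assumes that interpretation; the actual blending in image_database.py may differ, and normalize and blend are hypothetical helper names.

```python
import math


def normalize(vec):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vec]


def blend(emb_a, emb_b, weight_a=0.7, weight_b=0.3):
    """Weighted sum of two unit-normalized embeddings, renormalized.

    Assumption: both embeddings live in the same space, as they do for a
    joint text-image model.
    """
    a, b = normalize(emb_a), normalize(emb_b)
    mixed = [weight_a * x + weight_b * y for x, y in zip(a, b)]
    return normalize(mixed)
```

Normalizing before mixing keeps the weights meaningful even when the two inputs have different magnitudes.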

Negative Prompts

cd core
python image_database.py search "nature" --negative "buildings" -k 20
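
A negative prompt can be thought of as a penalty on images that resemble the unwanted concept. The sketch below ranks images by similarity to the positive query minus a scaled similarity to the negative one. This scoring rule and the penalty value are assumptions made for illustration; the tool may instead adjust the query embedding before searching. The names cosine and rank_with_negative are hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity of two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rank_with_negative(images, positive, negative, penalty=0.5, k=20):
    """Return up to k paths, best first.

    images: dict mapping path -> embedding
    score = cos(image, positive) - penalty * cos(image, negative)
    """
    scored = [
        (cosine(vec, positive) - penalty * cosine(vec, negative), path)
        for path, vec in images.items()
    ]
    scored.sort(reverse=True)
    return [path for _, path in scored[:k]]
```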

Interactive Mode

cd core
python image_database.py search --interactive

In interactive mode:

Enter text queries directly
Use image:/path/to/image.jpg for image queries
Combine queries with +: image:/path/to/img.jpg + sunset
Use negative prompts with -: beautiful landscape - people or sunset - image:/path/to/unwanted.jpg
Change result count with k:20
Type quit or exit to end the session
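
The interactive syntax above can be illustrated with a small parser. This is a sketch of how such input might be interpreted, not the parser used by image_database.py. It assumes that + and - act as operators only when surrounded by spaces (so hyphens inside file paths are left alone) and that k:20 is entered on its own line; ParsedInput and parse_line are hypothetical names.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

Term = Tuple[str, str]  # ("text" or "image", value)


@dataclass
class ParsedInput:
    should_quit: bool = False
    k: Optional[int] = None
    positive: List[Term] = field(default_factory=list)
    negative: List[Term] = field(default_factory=list)


def _term(raw: str) -> Term:
    """Classify one query term as an image path or plain text."""
    raw = raw.strip()
    if raw.startswith("image:"):
        return ("image", raw[len("image:"):].strip())
    return ("text", raw)


def parse_line(line: str) -> ParsedInput:
    line = line.strip()
    if line.lower() in ("quit", "exit"):
        return ParsedInput(should_quit=True)
    if line.lower().startswith("k:") and line[2:].strip().isdigit():
        return ParsedInput(k=int(line[2:]))
    # Everything after the first " - " is treated as the negative part.
    positive_part, sep, negative_part = line.partition(" - ")
    result = ParsedInput()
    result.positive = [_term(p) for p in positive_part.split(" + ") if p.strip()]
    if sep:
        result.negative = [_term(n) for n in negative_part.split(" + ") if n.strip()]
    return result
```

For example, parse_line("sunset - image:/path/to/unwanted.jpg") yields one positive text term and one negative image term.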

Model

This project uses SigLIP 2 SO400M from Google, which provides:

1152-dimensional embeddings
Strong text-image alignment
Efficient inference

The model is downloaded automatically from Hugging Face on first use. Pass --model-cache to specify a custom cache directory.

Database Schema

The SQLite database contains:

images table: Image metadata (file path, last modified, hash)
vec0 virtual table: Vector embeddings (using sqlite-vec)
image_embeddings table: Mapping between images and embeddings
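
To make the layout concrete, here is a rough sketch of the two ordinary tables using only Python's built-in sqlite3 module. The column names are assumptions based on the descriptions above, not the project's actual schema, and the vec0 virtual table is left out because it requires the sqlite-vec extension to be loaded. The needs_reindex helper is hypothetical and shows one way the stored modification time and hash could be used to skip unchanged files on a rescan.

```python
import sqlite3

# Assumed column names; the real schema in image_database.py may differ.
SCHEMA = """
CREATE TABLE IF NOT EXISTS images (
    id            INTEGER PRIMARY KEY,
    file_path     TEXT NOT NULL UNIQUE,
    last_modified REAL NOT NULL,
    file_hash     TEXT NOT NULL
);
CREATE TABLE IF NOT EXISTS image_embeddings (
    image_id     INTEGER NOT NULL REFERENCES images(id),
    embedding_id INTEGER NOT NULL  -- row id in the vec0 virtual table
);
"""


def needs_reindex(conn, file_path, last_modified, file_hash):
    """Return True if the file is new or its metadata has changed."""
    row = conn.execute(
        "SELECT last_modified, file_hash FROM images WHERE file_path = ?",
        (file_path,),
    ).fetchone()
    return row is None or row[0] != last_modified or row[1] != file_hash
```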

Performance Tips

Tune --inference-batch-size to fit your GPU: larger values run faster but use more VRAM
Enable --profile to identify bottlenecks
The database uses WAL mode for better concurrent access
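
For reference, WAL mode is switched on in SQLite with a single pragma. The snippet below is a minimal sketch using Python's built-in sqlite3 module; open_database is a hypothetical helper, not a function from this project.

```python
import sqlite3


def open_database(path):
    """Open a SQLite database file and request write-ahead logging.

    Returns the connection and the journal mode SQLite reports back.
    """
    conn = sqlite3.connect(path)
    mode = conn.execute("PRAGMA journal_mode=WAL").fetchone()[0]
    return conn, mode
```

With WAL enabled, readers do not block the writer and the writer does not block readers, which helps when searching while an indexing run is still in progress. The setting is stored in the database file, so it persists across connections.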

License

MIT License - see LICENSE file for details.

This project uses:

SigLIP 2 - Apache 2.0 License
sqlite-vec - MIT/Apache 2.0 dual license
SQLite - Public Domain
