Multi-Modal Semantic Image Search Engine

Open-source semantic image search supporting text→image, image→image, and text+image→image queries on the Myntra Fashion Product Dataset.

Python
CLIP
FAISS
Multimodal Embeddings
Streamlit

Problem

Fashion catalogs need flexible search beyond text or image alone: shoppers search by photo, by description, or by both.

Approach

Multimodal encoders map images and text into a joint embedding space, and a FAISS index over the catalog vectors provides sub-second retrieval across query modalities.
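
The retrieval step can be illustrated with a minimal sketch. It assumes embeddings are unit-normalized so that inner product equals cosine similarity, and it uses a brute-force pure-Python scan as a stand-in for the FAISS index. The function names, item ids, and 3-dimensional vectors are hypothetical and made up for illustration; they are not taken from the project.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length so inner product equals cosine similarity."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vec]


def search(index, query, k=3):
    """Return the k (item_id, score) pairs with the highest inner product.

    `index` maps item ids to unit-normalized vectors. This exhaustive scan
    stands in for a vector index; it is not how FAISS works internally.
    """
    q = l2_normalize(query)
    scored = [
        (item_id, sum(a * b for a, b in zip(vec, q)))
        for item_id, vec in index.items()
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Toy vectors standing in for encoder outputs (assumed values).
catalog = {
    "red_dress": l2_normalize([0.9, 0.1, 0.0]),
    "blue_jeans": l2_normalize([0.0, 0.2, 0.9]),
    "red_scarf": l2_normalize([0.8, 0.3, 0.1]),
}

# A query embedding from either modality goes through the same search call.
query_embedding = [1.0, 0.0, 0.0]
results = search(catalog, query_embedding, k=2)
```

Because text and image queries land in the same space, a single index and a single search function can serve every query modality; only the encoder that produces the query vector changes.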

Architecture

Catalog → image + text encoders → unified vector index → multi-modal query API → Streamlit demo UI.

Results

Real-time text-to-image, image-to-image, and text+image retrieval on the Myntra Fashion dataset.

Lessons learned

Late fusion of modalities at query time is simpler than joint training and competitive with it on many retrieval tasks.
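
One common way to do late fusion at query time is a weighted average of the normalized text and image query embeddings, renormalized before the index lookup. The sketch below assumes that scheme; the source does not state which fusion rule the project uses, and `fuse_queries` and `text_weight` are hypothetical names.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vec]


def fuse_queries(text_vec, image_vec, text_weight=0.5):
    """Combine a text and an image query embedding into one query vector.

    Assumed fusion rule: weighted average of the two unit vectors,
    renormalized so the result can be searched like any single-modality query.
    """
    if not 0.0 <= text_weight <= 1.0:
        raise ValueError("text_weight must be in [0, 1]")
    if len(text_vec) != len(image_vec):
        raise ValueError("embeddings must have the same dimension")
    t = l2_normalize(text_vec)
    i = l2_normalize(image_vec)
    mixed = [text_weight * a + (1.0 - text_weight) * b for a, b in zip(t, i)]
    # Raises ValueError if the two vectors cancel out exactly.
    return l2_normalize(mixed)


# Toy 2-d embeddings (assumed values) for a text+image query.
fused = fuse_queries([1.0, 0.0], [0.0, 1.0], text_weight=0.5)
text_only = fuse_queries([1.0, 0.0], [0.0, 1.0], text_weight=1.0)
```

Since the fusion happens on the query side only, the catalog index does not need to be rebuilt, and the weight can be tuned per query without retraining any encoder.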