Semantic Search Benchmarking via Synthetic Data Generation
A framework that uses Gemini AI to generate synthetic Q&A datasets for benchmarking semantic search models across diverse domains, removing the need to curate evaluation sets by hand.
Problem
Evaluating semantic search quality requires domain-specific (query, relevant document) pairs, which are expensive and slow to curate by hand.
Approach
Use Gemini AI to generate synthetic (query, answer, source document) triples from a given corpus, then evaluate embedding models on the resulting benchmark with standard retrieval metrics.
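The generation step can be sketched roughly as follows. This is a minimal illustration, not the framework's actual code: the `Triple` type, the `PROMPT` wording, and the `build_triples` helper are all hypothetical, and the LLM call is abstracted behind a `generate` callable so that no particular Gemini client API is assumed. A stub generator stands in for the real model.

```python
import json
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Triple:
    """One synthetic benchmark item (hypothetical structure)."""
    query: str
    answer: str
    doc_id: str  # the source document the query was generated from


# Hypothetical prompt; the real framework's prompt is not given in the text.
PROMPT = (
    "Read the passage and write one question it answers, plus the answer.\n"
    'Reply with JSON only: {{"query": "...", "answer": "..."}}\n\n'
    "Passage:\n{passage}"
)


def build_triples(corpus: Dict[str, str],
                  generate: Callable[[str], str]) -> List[Triple]:
    """Ask the model for one (query, answer) pair per document.

    `generate` is any function mapping a prompt string to the model's raw
    text reply. Replies that are not valid JSON, or that lack a non-empty
    query and answer, are skipped rather than repaired.
    """
    triples: List[Triple] = []
    for doc_id, passage in corpus.items():
        raw = generate(PROMPT.format(passage=passage))
        try:
            parsed = json.loads(raw)
            query = parsed["query"].strip()
            answer = parsed["answer"].strip()
        except (json.JSONDecodeError, KeyError, TypeError, AttributeError):
            continue  # malformed reply: drop this document
        if query and answer:
            triples.append(Triple(query=query, answer=answer, doc_id=doc_id))
    return triples


def stub_generate(prompt: str) -> str:
    """Stand-in for the LLM call, used only to exercise the pipeline."""
    passage = prompt.split("Passage:\n", 1)[1]
    return json.dumps({"query": "What does the passage state?",
                       "answer": passage})


if __name__ == "__main__":
    corpus = {"d1": "Solar flares release magnetic energy."}
    for triple in build_triples(corpus, stub_generate):
        print(triple)
```

Keeping the model behind a plain callable means the same loop can be run against a stub in tests and against a real client in production. Because each query is tied to the document it was generated from, `doc_id` can serve as the relevance label during evaluation.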
Architecture
Corpus → Gemini AI query/answer generation → embedding model evaluation pipeline → retrieval metrics (MRR, Recall@K, NDCG).
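For reference, the three metrics named above can be computed as in the sketch below. It assumes binary relevance (a document is either relevant to a query or not), which matches the case where each synthetic query has a known source document. The function names and the `runs`/`qrels` dictionary layout are illustrative choices, not part of the framework.

```python
import math
from typing import Dict, List, Sequence, Set


def reciprocal_rank(ranked: Sequence[str], relevant: Set[str]) -> float:
    """1 / rank of the first relevant document, or 0.0 if none is retrieved."""
    for rank, doc_id in enumerate(ranked, start=1):
        if doc_id in relevant:
            return 1.0 / rank
    return 0.0


def recall_at_k(ranked: Sequence[str], relevant: Set[str], k: int) -> float:
    """Fraction of the relevant documents that appear in the top k results."""
    if not relevant:
        return 0.0
    hits = sum(1 for doc_id in ranked[:k] if doc_id in relevant)
    return hits / len(relevant)


def ndcg_at_k(ranked: Sequence[str], relevant: Set[str], k: int) -> float:
    """Binary-relevance NDCG: DCG of the ranking divided by the ideal DCG."""
    dcg = sum(
        1.0 / math.log2(rank + 1)
        for rank, doc_id in enumerate(ranked[:k], start=1)
        if doc_id in relevant
    )
    ideal_hits = min(len(relevant), k)
    idcg = sum(1.0 / math.log2(rank + 1) for rank in range(1, ideal_hits + 1))
    return dcg / idcg if idcg > 0 else 0.0


def evaluate(runs: Dict[str, List[str]],
             qrels: Dict[str, Set[str]],
             k: int = 10) -> Dict[str, float]:
    """Average each metric over all queries in `qrels`.

    `runs` maps query id -> ranked doc ids returned by one embedding model;
    `qrels` maps query id -> set of relevant doc ids. A query missing from
    `runs` is scored as an empty ranking.
    """
    if not qrels:
        return {"mrr": 0.0, f"recall@{k}": 0.0, f"ndcg@{k}": 0.0}
    n = len(qrels)
    rr = rec = ndcg = 0.0
    for qid, relevant in qrels.items():
        ranked = runs.get(qid, [])
        rr += reciprocal_rank(ranked, relevant)
        rec += recall_at_k(ranked, relevant, k)
        ndcg += ndcg_at_k(ranked, relevant, k)
    return {"mrr": rr / n, f"recall@{k}": rec / n, f"ndcg@{k}": ndcg / n}


if __name__ == "__main__":
    qrels = {"q1": {"d1"}, "q2": {"d2"}}
    runs = {"q1": ["d3", "d1", "d2"], "q2": ["d2", "d1", "d3"]}
    print(evaluate(runs, qrels, k=3))
```

Running `evaluate` once per embedding model on the same `qrels` gives directly comparable scores. With a single relevant document per query, Recall@K reduces to a hit rate.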
Results
The framework generates domain-specific benchmarks and compares multiple semantic search models against them.
Lessons learned
Synthetic data from capable LLMs is a fast, scalable alternative to building evaluation sets by hand, and it is especially effective for domain-specific retrieval benchmarking.