Upload a car image and compare predictions from three models side by side.
Custom ViT — fine-tuned
CLIP — zero-shot
GPT-4o — vision LLM
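
A minimal sketch of how a side-by-side comparison like this might be wired up. It is not the app's actual code: the helper names `compare_predictions` and `format_side_by_side`, the `(label, confidence)` return shape, and the stub predictors are all hypothetical, and the stub outputs are placeholders rather than real model results.

```python
from typing import Callable, Dict, Tuple

# Assumed shape of one model's answer: (label, confidence in [0, 1]).
Prediction = Tuple[str, float]
# Assumed predictor interface: raw image bytes in, one Prediction out.
Predictor = Callable[[bytes], Prediction]


def compare_predictions(
    image: bytes, predictors: Dict[str, Predictor]
) -> Dict[str, Prediction]:
    """Run every predictor on the same image and collect results by model name."""
    results: Dict[str, Prediction] = {}
    for name, predict in predictors.items():
        try:
            results[name] = predict(image)
        except Exception as exc:
            # One failing model should not hide the other columns.
            results[name] = (f"error: {exc}", 0.0)
    return results


def format_side_by_side(results: Dict[str, Prediction]) -> str:
    """Render one aligned line per model: name | label (confidence)."""
    width = max(len(name) for name in results)
    return "\n".join(
        f"{name.ljust(width)} | {label} ({conf:.0%})"
        for name, (label, conf) in results.items()
    )


if __name__ == "__main__":
    # Hypothetical stand-ins for the three models; real ones would load
    # a fine-tuned ViT, a CLIP zero-shot classifier, and a GPT-4o client.
    stubs: Dict[str, Predictor] = {
        "Custom ViT": lambda img: ("placeholder label", 0.0),
        "CLIP": lambda img: ("placeholder label", 0.0),
        "GPT-4o": lambda img: ("placeholder label", 0.0),
    }
    print(format_side_by_side(compare_predictions(b"", stubs)))
```

Keeping each model behind the same callable interface means the comparison loop does not need to know whether a prediction came from a local classifier or a remote vision LLM.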