Explainable-Vision-Language-Model: Generate a video visualizing how a model attends to an image while generating text.