Computer Vision Toolbox Model for Moondream Vision Language Model

Moondream is a small-footprint vision language model with image captioning capability.
Updated 25 Nov 2025
The Moondream 2 model is a lightweight vision language model (Vision-LLM) capable of image captioning. Because of its small size, it runs efficiently on most local workstations.
MATLAB Release Compatibility
Created with R2026a
Compatible with R2026a
Platform Compatibility
Windows, macOS (Apple Silicon), macOS (Intel), Linux