🔥 Sa2VA: Marrying SAM2 with LLaVA for Dense Grounded Understanding of Images and Videos

computer-vision large-language-models mllm
1 open issue needing help. Last updated: Sep 16, 2025

Open Issues Need Help

QwenVL Based model (about 2 months ago)
Labels: enhancement, good first issue

Python