dipta007/VCInspector-7B
Image-Text-to-Text
Transformers
Safetensors
dipta007/ActivityNet-FG-It
English
qwen2_5_vl
image-to-text
multimodal
video-caption-evaluation
reference-free
factual-analysis
vision-language
conversational
text-generation-inference
arxiv: 2509.16538
License: apache-2.0
main
Commit History
Create README.md
d063cdb
verified
dipta007
committed on
26 days ago
update files
f966cc7
verified
dipta007
committed on
Dec 26, 2025
initial commit
48f760a
verified
dipta007
committed on
Dec 26, 2025