Cactooz/DeepMMAudio
Task: Video-Text-to-Text
Datasets: Loie/VGGSound, CLAPv2/Clotho, cvssp/WavCaps
Tags: video-to-audio
License: mit
Branch: main
Repository size: 6.34 GB
Contributors: 1
History: 5 commits
Latest commit: "Add base MMAudio model retrain checkpoint" by Cactooz (f6f7b5e, verified), 10 days ago
Files (name | size | last commit | updated):
.gitattributes | 1.52 kB | initial commit | 11 days ago
README.md | 213 Bytes | Update datasets used | 11 days ago
base-model_checkpoint_full_448b_ckpt_300000.pth | 3.15 GB (xet) | Add base MMAudio model retrain checkpoint | 10 days ago
depth-model_checkpoint_full_448b_ckpt_300000.pth | 3.19 GB (xet) | Add DeepMMAudio model checkpoint | 10 days ago