ByteDance
/

BindWeave

Model card Files Files and versions

lizhaoyang commited on 11 days ago

Commit

814246b

·

1 Parent(s): d4bf1d3

Update README

Files changed (1) hide show

README.md +2 -1

README.md CHANGED Viewed

@@ -48,7 +48,8 @@ license: apache-2.0
 ## 📖 Overview
 BindWeave is a unified subject-consistent video generation framework for single- and multi-subject prompts, built on an MLLM-DiT architecture that couples a pretrained multimodal large language model with a diffusion transformer.
-It achieves cross-modal integration via entity grounding and representation alignment, leveraging the MLLM to parse complex prompts and produce subject-aware hidden states that condition the DiT for high-fidelity generation.
 ###  OpenS2V-Eval Performance 🏆
 BindWeave achieves a solid score of 57.61 on the [OpenS2V-Eval](https://huggingface.co/spaces/BestWishYsh/OpenS2V-Eval) benchmark, highlighting its robust capabilities across multiple evaluation dimensions and demonstrating competitive performance against several leading open-source and commercial systems.

 ## 📖 Overview
 BindWeave is a unified subject-consistent video generation framework for single- and multi-subject prompts, built on an MLLM-DiT architecture that couples a pretrained multimodal large language model with a diffusion transformer.
+It achieves cross-modal integration via entity grounding and representation alignment, leveraging the MLLM to parse complex prompts and produce subject-aware hidden states that condition the DiT for high-fidelity generation. For more details or tutorials refer to [ByteDance/BindWeave](https://github.com/bytedance/BindWeave)
 ###  OpenS2V-Eval Performance 🏆
 BindWeave achieves a solid score of 57.61 on the [OpenS2V-Eval](https://huggingface.co/spaces/BestWishYsh/OpenS2V-Eval) benchmark, highlighting its robust capabilities across multiple evaluation dimensions and demonstrating competitive performance against several leading open-source and commercial systems.