Safetensors
lizhaoyang committed on
Commit
814246b
·
1 Parent(s): d4bf1d3

Update README

Files changed (1)
  1. README.md +2 -1
README.md CHANGED
@@ -48,7 +48,8 @@ license: apache-2.0
48
 
49
  ## 📖 Overview
50
  BindWeave is a unified subject-consistent video generation framework for single- and multi-subject prompts, built on an MLLM-DiT architecture that couples a pretrained multimodal large language model with a diffusion transformer.
51
- It achieves cross-modal integration via entity grounding and representation alignment, leveraging the MLLM to parse complex prompts and produce subject-aware hidden states that condition the DiT for high-fidelity generation.
 
52
 
53
  ### OpenS2V-Eval Performance 🏆
54
  BindWeave achieves a solid score of 57.61 on the [OpenS2V-Eval](https://huggingface.co/spaces/BestWishYsh/OpenS2V-Eval) benchmark, highlighting its robust capabilities across multiple evaluation dimensions and demonstrating competitive performance against several leading open-source and commercial systems.
 
48
 
49
  ## 📖 Overview
50
  BindWeave is a unified subject-consistent video generation framework for single- and multi-subject prompts, built on an MLLM-DiT architecture that couples a pretrained multimodal large language model with a diffusion transformer.
51
+ It achieves cross-modal integration via entity grounding and representation alignment, leveraging the MLLM to parse complex prompts and produce subject-aware hidden states that condition the DiT for high-fidelity generation. For more details or tutorials refer to [ByteDance/BindWeave](https://github.com/bytedance/BindWeave)
52
+
53
 
54
  ### OpenS2V-Eval Performance 🏆
55
  BindWeave achieves a solid score of 57.61 on the [OpenS2V-Eval](https://huggingface.co/spaces/BestWishYsh/OpenS2V-Eval) benchmark, highlighting its robust capabilities across multiple evaluation dimensions and demonstrating competitive performance against several leading open-source and commercial systems.