Submitted by Zhuoran Zhao: Composing Concepts from Images and Videos via Concept-prompt Binding (MMLab@HKUST)