**Output**: The generated videos will be saved in the `wanx/outputs/` directory.

## 🔧 Training Process

### Step 1: Prompt Preprocessing

Before training, you need to preprocess the text prompts to generate embeddings.

#### CogVideoX Preprocessing

```bash
cd utils
python process_prompts_cogvideox.py \
    --input_file your_prompts.txt \
    --output_dir ../cogvideox/prompts \
    --model_path ../cogvideox/CogVideoX-5b \
    --batch_size 32 \
    --save_separate
```

**Argument Descriptions**:

- `--input_file`: A `.txt` file containing the prompts, one per line.
- `--output_dir`: Directory in which to save the output embeddings.
- `--model_path`: Path to the CogVideoX model.
- `--batch_size`: Batch size used for encoding.
- `--save_separate`: Save each prompt's embedding as a separate file.
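
Under the hood, this step amounts to running each prompt through the model's text encoder once and caching the result, so the encoder never needs to be loaded during distillation itself. Below is a minimal sketch of that idea, assuming CogVideoX's bundled T5 encoder and a hypothetical `prompt_<i>.pt` naming scheme; the actual `process_prompts_cogvideox.py` may differ in details.

```python
import os

import torch
from transformers import AutoTokenizer, T5EncoderModel

model_path = "../cogvideox/CogVideoX-5b"  # --model_path
output_dir = "../cogvideox/prompts"       # --output_dir
batch_size = 32                           # --batch_size

# CogVideoX ships its T5 tokenizer and encoder as subfolders of the model repo.
tokenizer = AutoTokenizer.from_pretrained(model_path, subfolder="tokenizer")
encoder = T5EncoderModel.from_pretrained(
    model_path, subfolder="text_encoder", torch_dtype=torch.bfloat16
).eval().to("cuda")

with open("your_prompts.txt") as f:       # --input_file
    prompts = [line.strip() for line in f if line.strip()]

os.makedirs(output_dir, exist_ok=True)
with torch.no_grad():
    for start in range(0, len(prompts), batch_size):
        batch = prompts[start : start + batch_size]
        tokens = tokenizer(
            batch,
            padding="max_length",
            max_length=226,  # CogVideoX's maximum text sequence length
            truncation=True,
            return_tensors="pt",
        ).to("cuda")
        embeds = encoder(tokens.input_ids).last_hidden_state  # [B, 226, 4096]
        # --save_separate: one file per prompt (naming scheme is hypothetical)
        for j, emb in enumerate(embeds):
            torch.save(emb.float().cpu(),
                       os.path.join(output_dir, f"prompt_{start + j}.pt"))
```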

#### WanX Preprocessing

```bash
cd utils
python process_prompts_wanx.py
```

This script will automatically process the prompts in `utils/all_dimension_aug_wanx.txt` and generate the corresponding embeddings.

### Step 2: Start Training

#### CogVideoX Training

```bash
cd cogvideox
bash train_tdm_1.sh
```

**Core Training Parameters**:

```bash
# If not training with 8 GPUs, you must modify CUDA_VISIBLE_DEVICES and the
# num_processes value in train/config.yaml to match.
#
# --pretrained_model_name_or_path    path to the base model
# --mixed_precision bf16             mixed precision for reduced memory usage
# --train_batch_size                 training batch size
# --gradient_accumulation_steps      number of gradient accumulation steps
# --learning_rate                    learning rate for the student model
# --learning_rate_fake               learning rate for the fake model
# --lambda_reg                       regularization weight
# --k_step                           target number of steps for distillation
# --cfg                              classifier-free guidance scale
# --eta                              eta parameter for DDIM
# --use_sparsity                     enable sparse attention
# --rank, --lora_alpha               LoRA configuration
# --max_train_steps                  maximum number of training steps
# --checkpointing_steps              interval for saving checkpoints
# --gradient_checkpointing           use gradient checkpointing to save memory
# --enable_slicing, --enable_tiling  VAE memory optimizations
CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7 accelerate launch \
    --config_file train/config.yaml \
    train/train_cogvideo_tdm.py \
    --pretrained_model_name_or_path CogVideoX-5b \
    --mixed_precision bf16 \
    --train_batch_size 5 \
    --gradient_accumulation_steps 4 \
    --learning_rate 1e-4 \
    --learning_rate_g 1e-4 \
    --learning_rate_fake 5e-4 \
    --lambda_reg 0.5 \
    --k_step 8 \
    --cfg 3.5 \
    --eta 0.9 \
    --use_sparsity true \
    --rank 64 \
    --lora_alpha 64 \
    --max_train_steps 300 \
    --checkpointing_steps 15 \
    --gradient_checkpointing \
    --enable_slicing \
    --enable_tiling
```
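
With the defaults above, the effective batch size is `train_batch_size × gradient_accumulation_steps × num_GPUs` = 5 × 4 × 8 = 160 samples per optimizer step (assuming `--train_batch_size` is per device, as is typical for `accelerate`-launched scripts). If you train on fewer GPUs, increasing `--gradient_accumulation_steps` keeps the effective batch size comparable.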

#### WanX Training

```bash
cd wanx
bash train_wanx_tdm.sh
```
 
## 📊 Project Structure