Update README.md
**Output**: The generated videos will be saved in the `wanx/outputs/` directory.

## 🔧 Training Process

### Step 1: Prompt Preprocessing

Before training, you need to preprocess the text prompts to generate embeddings.

#### CogVideoX Preprocessing
```bash
cd utils
python process_prompts_cogvideox.py \
    --input_file your_prompts.txt \
    --output_dir ../cogvideox/prompts \
    --model_path ../cogvideox/CogVideoX-5b \
    --batch_size 32 \
    --save_separate
```
**Argument Descriptions**:

- `--input_file`: A `.txt` file containing prompts, with one prompt per line (see the example below).
- `--output_dir`: The directory to save the output embeddings.
- `--model_path`: Path to the CogVideoX model.
- `--batch_size`: The batch size for processing.
- `--save_separate`: Whether to save each embedding as a separate file.
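
For example, the prompts file is plain text with one prompt per line. The sketch below creates a small `your_prompts.txt`; the prompt texts are placeholders for illustration only, not part of the repository.

```bash
# Create a small prompts file (one prompt per line); the prompts themselves
# are illustrative placeholders.
cat > your_prompts.txt << 'EOF'
A corgi running along a beach at sunset, cinematic lighting.
A timelapse of clouds rolling over a mountain ridge.
A robot watering plants in a greenhouse, shallow depth of field.
EOF
```

Pass this file via `--input_file` as in the command above; with `--save_separate` set, each prompt should end up as its own embedding file under `../cogvideox/prompts/`.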
#### WanX Preprocessing

```bash
cd utils
python process_prompts_wanx.py
```
This script will automatically process the prompts in `utils/all_dimension_aug_wanx.txt` and generate the corresponding embeddings.
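
To preprocess your own prompts for WanX instead, one simple option is to swap in your own file before running the script. This is only a sketch: it assumes the script reads `utils/all_dimension_aug_wanx.txt` exactly as described above and takes no command-line arguments.

```bash
# Back up the default prompt list, substitute your own prompts (one per line),
# then run the preprocessing script as usual. File names are examples.
cd utils
cp all_dimension_aug_wanx.txt all_dimension_aug_wanx.txt.bak
cp /path/to/your_prompts.txt all_dimension_aug_wanx.txt
python process_prompts_wanx.py
```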
### Step 2: Start Training

#### CogVideoX Training
```bash
cd cogvideox
bash train_tdm_1.sh
```
**Core Training Parameters**:
```bash
# If you are not training with 8 GPUs, you must modify CUDA_VISIBLE_DEVICES below
# and the num_processes setting in train/config.yaml to match.
CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7 accelerate launch \
    --config_file train/config.yaml \
    train/train_cogvideo_tdm.py \
    --pretrained_model_name_or_path CogVideoX-5b \
    --mixed_precision bf16 \
    --train_batch_size 5 \
    --gradient_accumulation_steps 4 \
    --learning_rate 1e-4 \
    --learning_rate_g 1e-4 \
    --learning_rate_fake 5e-4 \
    --lambda_reg 0.5 \
    --k_step 8 \
    --cfg 3.5 \
    --eta 0.9 \
    --use_sparsity true \
    --rank 64 \
    --lora_alpha 64 \
    --max_train_steps 300 \
    --checkpointing_steps 15 \
    --gradient_checkpointing \
    --enable_slicing \
    --enable_tiling
```

**Argument Descriptions**:

- `--pretrained_model_name_or_path`: Path to the base model.
- `--mixed_precision bf16`: Use mixed precision to reduce memory usage.
- `--train_batch_size`: Training batch size.
- `--gradient_accumulation_steps`: Number of gradient accumulation steps.
- `--learning_rate`: Learning rate for the student model.
- `--learning_rate_fake`: Learning rate for the fake model.
- `--lambda_reg`: Regularization weight.
- `--k_step`: Target number of steps for distillation.
- `--cfg`: Classifier-Free Guidance scale.
- `--eta`: ETA parameter for DDIM.
- `--use_sparsity`: Enable sparse attention.
- `--rank`, `--lora_alpha`: LoRA configuration.
- `--max_train_steps`: Maximum number of training steps.
- `--checkpointing_steps`: Interval (in steps) for saving checkpoints.
- `--gradient_checkpointing`: Use gradient checkpointing to save memory.
- `--enable_slicing`, `--enable_tiling`: VAE memory optimizations.
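
With the settings above, and assuming `--train_batch_size` is a per-GPU value (the usual convention for `accelerate`-based training scripts), the effective batch size is `train_batch_size × gradient_accumulation_steps × num_GPUs = 5 × 4 × 8 = 160`. If you train on fewer GPUs, shrink `CUDA_VISIBLE_DEVICES`, update `num_processes` in `train/config.yaml` to match, and optionally raise `--gradient_accumulation_steps` to keep the effective batch size constant. A minimal sketch for a 4-GPU machine, assuming the config stores the worker count under a `num_processes:` key (the field referenced in the comment above):

```bash
# Check and update the process count in the accelerate config so it matches the
# number of devices you expose via CUDA_VISIBLE_DEVICES (4 in this example).
grep -n "num_processes" train/config.yaml
sed -i 's/^num_processes:.*/num_processes: 4/' train/config.yaml
```

Then launch the same command as above with `CUDA_VISIBLE_DEVICES=0,1,2,3` and, for an unchanged effective batch size, `--gradient_accumulation_steps 8` (5 × 8 × 4 = 160).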
#### WanX Training

```bash
cd wanx
bash train_wanx_tdm.sh
```
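
Unlike the CogVideoX section, the launch parameters for WanX are not listed in this README and presumably live inside `train_wanx_tdm.sh`, so it may be worth checking them against your hardware before the first run. A small sketch; the search patterns are assumptions carried over from the CogVideoX setup above, not documented contents of the script:

```bash
# Inspect the WanX launch script for GPU / process settings before running it.
cd wanx
grep -n -E "CUDA_VISIBLE_DEVICES|num_processes|accelerate launch" train_wanx_tdm.sh
```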
## 📊 Project Structure