2024/08/18 04:09:04 - mmengine - DEBUG - An `DeepSpeedStrategy` instance is built from registry, and its implementation can be found in xtuner.engine._strategy.deepspeed 2024/08/18 04:09:04 - mmengine - INFO - ------------------------------------------------------------ System environment: sys.platform: linux Python: 3.10.13 (main, Sep 11 2023, 13:44:35) [GCC 11.2.0] CUDA available: True MUSA available: False numpy_random_seed: 546015089 GPU 0: NVIDIA A100-SXM4-80GB CUDA_HOME: /usr/local/cuda NVCC: Cuda compilation tools, release 12.2, V12.2.140 GCC: gcc (Ubuntu 9.4.0-1ubuntu1~20.04.2) 9.4.0 PyTorch: 2.3.1+cu121 PyTorch compiling details: PyTorch built with: - GCC 9.3 - C++ Version: 201703 - Intel(R) oneAPI Math Kernel Library Version 2022.2-Product Build 20220804 for Intel(R) 64 architecture applications - Intel(R) MKL-DNN v3.3.6 (Git Hash 86e6af5974177e513fd3fee58425e1063e7f1361) - OpenMP 201511 (a.k.a. OpenMP 4.5) - LAPACK is enabled (usually provided by MKL) - NNPACK is enabled - CPU capability usage: AVX512 - CUDA Runtime 12.1 - NVCC architecture flags: -gencode;arch=compute_50,code=sm_50;-gencode;arch=compute_60,code=sm_60;-gencode;arch=compute_70,code=sm_70;-gencode;arch=compute_75,code=sm_75;-gencode;arch=compute_80,code=sm_80;-gencode;arch=compute_86,code=sm_86;-gencode;arch=compute_90,code=sm_90 - CuDNN 8.9.2 - Magma 2.6.1 - Build settings: BLAS_INFO=mkl, BUILD_TYPE=Release, CUDA_VERSION=12.1, CUDNN_VERSION=8.9.2, CXX_COMPILER=/opt/rh/devtoolset-9/root/usr/bin/c++, CXX_FLAGS= -D_GLIBCXX_USE_CXX11_ABI=0 -fabi-version=11 -fvisibility-inlines-hidden -DUSE_PTHREADPOOL -DNDEBUG -DUSE_KINETO -DLIBKINETO_NOROCTRACER -DUSE_FBGEMM -DUSE_QNNPACK -DUSE_PYTORCH_QNNPACK -DUSE_XNNPACK -DSYMBOLICATE_MOBILE_DEBUG_HANDLE -O2 -fPIC -Wall -Wextra -Werror=return-type -Werror=non-virtual-dtor -Werror=bool-operation -Wnarrowing -Wno-missing-field-initializers -Wno-type-limits -Wno-array-bounds -Wno-unknown-pragmas -Wno-unused-parameter -Wno-unused-function -Wno-unused-result 
-Wno-strict-overflow -Wno-strict-aliasing -Wno-stringop-overflow -Wsuggest-override -Wno-psabi -Wno-error=pedantic -Wno-error=old-style-cast -Wno-missing-braces -fdiagnostics-color=always -faligned-new -Wno-unused-but-set-variable -Wno-maybe-uninitialized -fno-math-errno -fno-trapping-math -Werror=format -Wno-stringop-overflow, LAPACK_INFO=mkl, PERF_WITH_AVX=1, PERF_WITH_AVX2=1, PERF_WITH_AVX512=1, TORCH_VERSION=2.3.1, USE_CUDA=ON, USE_CUDNN=ON, USE_CUSPARSELT=1, USE_EXCEPTION_PTR=1, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_GLOO=ON, USE_MKL=ON, USE_MKLDNN=ON, USE_MPI=OFF, USE_NCCL=1, USE_NNPACK=ON, USE_OPENMP=ON, USE_ROCM=OFF, USE_ROCM_KERNEL_ASSERT=OFF, TorchVision: 0.18.1+cu121 OpenCV: 4.9.0 MMEngine: 0.10.3 Runtime environment: launcher: none randomness: {'seed': None, 'deterministic': False} cudnn_benchmark: False mp_cfg: {'mp_start_method': 'fork', 'opencv_num_threads': 0} dist_cfg: {'backend': 'nccl'} seed: None deterministic: False Distributed launcher: none Distributed training: False GPU number: 1 ------------------------------------------------------------ 2024/08/18 04:09:05 - mmengine - INFO - Config: accumulative_counts = 64 batch_size = 2 betas = ( 0.9, 0.999, ) custom_hooks = [ dict( tokenizer=dict( pretrained_model_name_or_path='/root/models/InternVL2_2B', trust_remote_code=True, type='transformers.AutoTokenizer.from_pretrained'), type='xtuner.engine.hooks.DatasetInfoHook'), ] data_path = '/root/data/screenshot_od/updated_layout_ocr_multi_teach_filtered.json' data_root = '/root/data/' dataloader_num_workers = 4 default_hooks = dict( checkpoint=dict( by_epoch=False, interval=1000, max_keep_ckpts=-1, save_optimizer=False, type='mmengine.hooks.CheckpointHook'), logger=dict( interval=10, log_metric_by_epoch=False, type='mmengine.hooks.LoggerHook'), param_scheduler=dict(type='mmengine.hooks.ParamSchedulerHook'), sampler_seed=dict(type='mmengine.hooks.DistSamplerSeedHook'), timer=dict(type='mmengine.hooks.IterTimerHook')) env_cfg = dict( cudnn_benchmark=False, 
dist_cfg=dict(backend='nccl'), mp_cfg=dict(mp_start_method='fork', opencv_num_threads=0)) image_folder = '/root/data/extracted_images' launcher = 'none' llava_dataset = dict( data_paths= '/root/data/screenshot_od/updated_layout_ocr_multi_teach_filtered.json', image_folders='/root/data/extracted_images', max_length=8192, model_path='/root/models/InternVL2_2B', template='xtuner.utils.PROMPT_TEMPLATE.internlm2_chat', type='xtuner.dataset.InternVL_V1_5_Dataset') load_from = None log_level = 'DEBUG' log_processor = dict(by_epoch=False) lr = 2e-05 max_epochs = 4 max_length = 8192 max_norm = 1 model = dict( freeze_llm=True, freeze_visual_encoder=True, llm_lora=dict( lora_alpha=256, lora_dropout=0.05, r=128, target_modules=None, task_type='CAUSAL_LM', type='peft.LoraConfig'), model_path='/root/models/InternVL2_2B', quantization_llm=True, quantization_vit=False, type='xtuner.model.InternVL_V1_5') optim_type = 'torch.optim.AdamW' optim_wrapper = dict( optimizer=dict( betas=( 0.9, 0.999, ), lr=2e-05, type='torch.optim.AdamW', weight_decay=0.05), type='DeepSpeedOptimWrapper') param_scheduler = [ dict( begin=0, by_epoch=True, convert_to_iter_based=True, end=0.12, start_factor=1e-05, type='mmengine.optim.LinearLR'), dict( begin=0.12, by_epoch=True, convert_to_iter_based=True, end=4, eta_min=0.0, type='mmengine.optim.CosineAnnealingLR'), ] path = '/root/models/InternVL2_2B' prompt_template = 'xtuner.utils.PROMPT_TEMPLATE.internlm2_chat' randomness = dict(deterministic=False, seed=None) resume = False runner_type = 'FlexibleRunner' save_steps = 1000 save_total_limit = -1 strategy = dict( config=dict( bf16=dict(enabled=True), fp16=dict(enabled=False, initial_scale_power=16), gradient_accumulation_steps='auto', gradient_clipping='auto', train_micro_batch_size_per_gpu='auto', zero_allow_untested_optimizer=True, zero_force_ds_cpu_optimizer=False, zero_optimization=dict(overlap_comm=True, stage=2)), exclude_frozen_parameters=True, gradient_accumulation_steps=64, gradient_clipping=1, 
sequence_parallel_size=1, train_micro_batch_size_per_gpu=2, type='xtuner.engine.DeepSpeedStrategy') tokenizer = dict( pretrained_model_name_or_path='/root/models/InternVL2_2B', trust_remote_code=True, type='transformers.AutoTokenizer.from_pretrained') train_cfg = dict(max_epochs=4, type='xtuner.engine.runner.TrainLoop') train_dataloader = dict( batch_size=2, collate_fn=dict(type='xtuner.dataset.collate_fns.default_collate_fn'), dataset=dict( data_paths= '/root/data/screenshot_od/updated_layout_ocr_multi_teach_filtered.json', image_folders='/root/data/extracted_images', max_length=8192, model_path='/root/models/InternVL2_2B', template='xtuner.utils.PROMPT_TEMPLATE.internlm2_chat', type='xtuner.dataset.InternVL_V1_5_Dataset'), num_workers=4, sampler=dict( length_property='modality_length', per_device_batch_size=128, type='xtuner.dataset.samplers.LengthGroupedSampler')) visualizer = dict( type='mmengine.visualization.Visualizer', vis_backends=[ dict(type='mmengine.visualization.TensorboardVisBackend'), ]) warmup_ratio = 0.03 weight_decay = 0.05 work_dir = '/root/wangqun/work_dirs/internvl_ft_run_9_filter' 2024/08/18 04:09:05 - mmengine - DEBUG - An `TensorboardVisBackend` instance is built from registry, and its implementation can be found in mmengine.visualization.vis_backend 2024/08/18 04:09:05 - mmengine - DEBUG - An `Visualizer` instance is built from registry, and its implementation can be found in mmengine.visualization.visualizer 2024/08/18 04:09:05 - mmengine - DEBUG - Attribute `_env_initialized` is not defined in or `._env_initialized is False, `_init_env` will be called and ._env_initialized will be set to True 2024/08/18 04:09:08 - mmengine - DEBUG - Get class `RuntimeInfoHook` from "hook" registry in "mmengine" 2024/08/18 04:09:08 - mmengine - DEBUG - An `RuntimeInfoHook` instance is built from registry, and its implementation can be found in mmengine.hooks.runtime_info_hook 2024/08/18 04:09:08 - mmengine - DEBUG - An `IterTimerHook` instance is built 
from registry, and its implementation can be found in mmengine.hooks.iter_timer_hook 2024/08/18 04:09:08 - mmengine - DEBUG - An `DistSamplerSeedHook` instance is built from registry, and its implementation can be found in mmengine.hooks.sampler_seed_hook 2024/08/18 04:09:08 - mmengine - DEBUG - An `LoggerHook` instance is built from registry, and its implementation can be found in mmengine.hooks.logger_hook 2024/08/18 04:09:08 - mmengine - DEBUG - An `ParamSchedulerHook` instance is built from registry, and its implementation can be found in mmengine.hooks.param_scheduler_hook 2024/08/18 04:09:08 - mmengine - DEBUG - An `CheckpointHook` instance is built from registry, and its implementation can be found in mmengine.hooks.checkpoint_hook 2024/08/18 04:09:08 - mmengine - WARNING - Failed to search registry with scope "mmengine" in the "builder" registry tree. As a workaround, the current "builder" registry in "xtuner" is used to build instance. This may cause unexpected failure when running the built modules. Please check whether "mmengine" is a correct scope, or whether the registry is initialized. 
2024/08/18 04:09:09 - mmengine - DEBUG - An `from_pretrained` instance is built from registry, and its implementation can be found in transformers.models.auto.tokenization_auto
2024/08/18 04:09:09 - mmengine - DEBUG - An `DatasetInfoHook` instance is built from registry, and its implementation can be found in xtuner.engine.hooks.dataset_info_hook
2024/08/18 04:09:09 - mmengine - INFO - Hooks will be executed in the following order:
before_run:
(VERY_HIGH   ) RuntimeInfoHook
(BELOW_NORMAL) LoggerHook
 --------------------
before_train:
(VERY_HIGH   ) RuntimeInfoHook
(NORMAL      ) IterTimerHook
(NORMAL      ) DatasetInfoHook
(VERY_LOW    ) CheckpointHook
 --------------------
before_train_epoch:
(VERY_HIGH   ) RuntimeInfoHook
(NORMAL      ) IterTimerHook
(NORMAL      ) DistSamplerSeedHook
 --------------------
before_train_iter:
(VERY_HIGH   ) RuntimeInfoHook
(NORMAL      ) IterTimerHook
 --------------------
after_train_iter:
(VERY_HIGH   ) RuntimeInfoHook
(NORMAL      ) IterTimerHook
(BELOW_NORMAL) LoggerHook
(LOW         ) ParamSchedulerHook
(VERY_LOW    ) CheckpointHook
 --------------------
after_train_epoch:
(NORMAL      ) IterTimerHook
(LOW         ) ParamSchedulerHook
(VERY_LOW    ) CheckpointHook
 --------------------
before_val:
(VERY_HIGH   ) RuntimeInfoHook
(NORMAL      ) DatasetInfoHook
 --------------------
before_val_epoch:
(NORMAL      ) IterTimerHook
 --------------------
before_val_iter:
(NORMAL      ) IterTimerHook
 --------------------
after_val_iter:
(NORMAL      ) IterTimerHook
(BELOW_NORMAL) LoggerHook
 --------------------
after_val_epoch:
(VERY_HIGH   ) RuntimeInfoHook
(NORMAL      ) IterTimerHook
(BELOW_NORMAL) LoggerHook
(LOW         ) ParamSchedulerHook
(VERY_LOW    ) CheckpointHook
 --------------------
after_val:
(VERY_HIGH   ) RuntimeInfoHook
 --------------------
after_train:
(VERY_HIGH   ) RuntimeInfoHook
(VERY_LOW    ) CheckpointHook
 --------------------
before_test:
(VERY_HIGH   ) RuntimeInfoHook
(NORMAL      ) DatasetInfoHook
 --------------------
before_test_epoch:
(NORMAL      ) IterTimerHook
 --------------------
before_test_iter:
(NORMAL      ) IterTimerHook
 --------------------
after_test_iter:
(NORMAL      ) IterTimerHook
(BELOW_NORMAL) LoggerHook
 --------------------
after_test_epoch:
(VERY_HIGH   ) RuntimeInfoHook
(NORMAL      ) IterTimerHook
(BELOW_NORMAL) LoggerHook
 --------------------
after_test:
(VERY_HIGH   ) RuntimeInfoHook
 --------------------
after_run:
(BELOW_NORMAL) LoggerHook
 --------------------
2024/08/18 04:09:09 - mmengine - DEBUG - An `FlexibleRunner` instance is built from registry, its implementation can be found inmmengine.runner._flexible_runner
2024/08/18 04:09:09 - mmengine - INFO - Starting to loading data and calc length
2024/08/18 04:09:09 - mmengine - INFO - =======Starting to process /root/data/screenshot_od/updated_layout_ocr_multi_teach_filtered.json =======
2024/08/18 04:09:12 - mmengine - INFO - =======total 4562 samples of /root/data/screenshot_od/updated_layout_ocr_multi_teach_filtered.json=======
2024/08/18 04:09:12 - mmengine - INFO - end loading data and calc length
2024/08/18 04:09:12 - mmengine - INFO - =======total 4562 samples=======
2024/08/18 04:09:12 - mmengine - DEBUG - An `InternVL_V1_5_Dataset` instance is built from registry, and its implementation can be found in xtuner.dataset.internvl_dataset
2024/08/18 04:09:13 - mmengine - INFO - LengthGroupedSampler is used.
2024/08/18 04:09:13 - mmengine - INFO - LengthGroupedSampler construction is complete, and the selected attribute is modality_length
2024/08/18 04:09:13 - mmengine - DEBUG - An `LengthGroupedSampler` instance is built from registry, and its implementation can be found in xtuner.dataset.samplers.length_grouped
2024/08/18 04:09:13 - mmengine - WARNING - Dataset InternVL_V1_5_Dataset has no metainfo. ``dataset_meta`` in visualizer will be None.
2024/08/18 04:09:13 - mmengine - DEBUG - An `TrainLoop` instance is built from registry, and its implementation can be found in xtuner.engine.runner.loops
2024/08/18 04:09:13 - mmengine - INFO - Start to load InternVL_V1_5 model.
2024/08/18 04:09:13 - mmengine - DEBUG - Get class `BaseDataPreprocessor` from "model" registry in "mmengine" 2024/08/18 04:09:13 - mmengine - DEBUG - An `BaseDataPreprocessor` instance is built from registry, and its implementation can be found in mmengine.model.base_model.data_preprocessor 2024/08/18 04:10:36 - mmengine - DEBUG - An `LoraConfig` instance is built from registry, and its implementation can be found in peft.tuners.lora.config 2024/08/18 04:10:41 - mmengine - INFO - InternVL_V1_5( (data_preprocessor): BaseDataPreprocessor() (model): InternVLChatModel( (vision_model): InternVisionModel( (embeddings): InternVisionEmbeddings( (patch_embedding): Conv2d(3, 1024, kernel_size=(14, 14), stride=(14, 14)) ) (encoder): InternVisionEncoder( (layers): ModuleList( (0-23): 24 x InternVisionEncoderLayer( (attn): InternAttention( (qkv): Linear(in_features=1024, out_features=3072, bias=True) (attn_drop): Dropout(p=0.0, inplace=False) (proj_drop): Dropout(p=0.0, inplace=False) (proj): Linear(in_features=1024, out_features=1024, bias=True) ) (mlp): InternMLP( (act): GELUActivation() (fc1): Linear(in_features=1024, out_features=4096, bias=True) (fc2): Linear(in_features=4096, out_features=1024, bias=True) ) (norm1): LayerNorm((1024,), eps=1e-06, elementwise_affine=True) (norm2): LayerNorm((1024,), eps=1e-06, elementwise_affine=True) (drop_path1): Identity() (drop_path2): Identity() ) ) ) ) (language_model): PeftModelForCausalLM( (base_model): LoraModel( (model): InternLM2ForCausalLM( (model): InternLM2Model( (tok_embeddings): Embedding(92553, 2048, padding_idx=2) (layers): ModuleList( (0-23): 24 x InternLM2DecoderLayer( (attention): InternLM2Attention( (wqkv): lora.Linear( (base_layer): Linear4bit(in_features=2048, out_features=4096, bias=False) (lora_dropout): ModuleDict( (default): Dropout(p=0.05, inplace=False) ) (lora_A): ModuleDict( (default): Linear(in_features=2048, out_features=128, bias=False) ) (lora_B): ModuleDict( (default): Linear(in_features=128, 
out_features=4096, bias=False) ) (lora_embedding_A): ParameterDict() (lora_embedding_B): ParameterDict() ) (wo): lora.Linear( (base_layer): Linear4bit(in_features=2048, out_features=2048, bias=False) (lora_dropout): ModuleDict( (default): Dropout(p=0.05, inplace=False) ) (lora_A): ModuleDict( (default): Linear(in_features=2048, out_features=128, bias=False) ) (lora_B): ModuleDict( (default): Linear(in_features=128, out_features=2048, bias=False) ) (lora_embedding_A): ParameterDict() (lora_embedding_B): ParameterDict() ) (rotary_emb): InternLM2DynamicNTKScalingRotaryEmbedding() ) (feed_forward): InternLM2MLP( (w1): lora.Linear( (base_layer): Linear4bit(in_features=2048, out_features=8192, bias=False) (lora_dropout): ModuleDict( (default): Dropout(p=0.05, inplace=False) ) (lora_A): ModuleDict( (default): Linear(in_features=2048, out_features=128, bias=False) ) (lora_B): ModuleDict( (default): Linear(in_features=128, out_features=8192, bias=False) ) (lora_embedding_A): ParameterDict() (lora_embedding_B): ParameterDict() ) (w3): lora.Linear( (base_layer): Linear4bit(in_features=2048, out_features=8192, bias=False) (lora_dropout): ModuleDict( (default): Dropout(p=0.05, inplace=False) ) (lora_A): ModuleDict( (default): Linear(in_features=2048, out_features=128, bias=False) ) (lora_B): ModuleDict( (default): Linear(in_features=128, out_features=8192, bias=False) ) (lora_embedding_A): ParameterDict() (lora_embedding_B): ParameterDict() ) (w2): lora.Linear( (base_layer): Linear4bit(in_features=8192, out_features=2048, bias=False) (lora_dropout): ModuleDict( (default): Dropout(p=0.05, inplace=False) ) (lora_A): ModuleDict( (default): Linear(in_features=8192, out_features=128, bias=False) ) (lora_B): ModuleDict( (default): Linear(in_features=128, out_features=2048, bias=False) ) (lora_embedding_A): ParameterDict() (lora_embedding_B): ParameterDict() ) (act_fn): SiLU() ) (attention_norm): InternLM2RMSNorm() (ffn_norm): InternLM2RMSNorm() ) ) (norm): InternLM2RMSNorm() ) 
(output): lora.Linear( (base_layer): Linear4bit(in_features=2048, out_features=92553, bias=False) (lora_dropout): ModuleDict( (default): Dropout(p=0.05, inplace=False) ) (lora_A): ModuleDict( (default): Linear(in_features=2048, out_features=128, bias=False) ) (lora_B): ModuleDict( (default): Linear(in_features=128, out_features=92553, bias=False) ) (lora_embedding_A): ParameterDict() (lora_embedding_B): ParameterDict() ) ) ) ) (mlp1): Sequential( (0): LayerNorm((4096,), eps=1e-05, elementwise_affine=True) (1): Linear(in_features=4096, out_features=2048, bias=True) (2): GELU(approximate='none') (3): Linear(in_features=2048, out_features=2048, bias=True) ) ) ) 2024/08/18 04:10:41 - mmengine - INFO - InternVL_V1_5 construction is complete 2024/08/18 04:10:41 - mmengine - DEBUG - An `InternVL_V1_5` instance is built from registry, and its implementation can be found in xtuner.model.internvl 2024/08/18 04:10:41 - mmengine - DEBUG - Get class `DefaultOptimWrapperConstructor` from "optimizer wrapper constructor" registry in "mmengine" 2024/08/18 04:10:41 - mmengine - DEBUG - An `DefaultOptimWrapperConstructor` instance is built from registry, and its implementation can be found in mmengine.optim.optimizer.default_constructor 2024/08/18 04:10:41 - mmengine - DEBUG - An `AdamW` instance is built from registry, and its implementation can be found in torch.optim.adamw 2024/08/18 04:10:41 - mmengine - DEBUG - Get class `DeepSpeedOptimWrapper` from "optim_wrapper" registry in "mmengine" 2024/08/18 04:10:41 - mmengine - DEBUG - An `DeepSpeedOptimWrapper` instance is built from registry, and its implementation can be found in mmengine._strategy.deepspeed 2024/08/18 04:10:44 - mmengine - DEBUG - The `end` of is not set. Use the max epochs/iters of train loop as default. 2024/08/18 04:10:44 - mmengine - DEBUG - The `end` of is not set. Use the max epochs/iters of train loop as default. 
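Note on the module printout above: the size of the LoRA adapters can be tallied directly from the printed `lora_A` / `lora_B` shapes. The following is a minimal sketch, assuming rank-128 adapters without bias on the five per-layer projections (`wqkv`, `wo`, `w1`, `w3`, `w2`) in each of the 24 decoder layers plus the `output` head, exactly as printed. The helper name `lora_params` is illustrative and not part of xtuner or peft, and the tally covers only the adapter weights, not any other module that may be trainable.

```python
# Tally LoRA adapter weights from the shapes shown in the model printout.
# `lora_params` is a hypothetical helper, not an xtuner/peft function.

def lora_params(in_features: int, out_features: int, r: int = 128) -> int:
    """Weights in one adapter: lora_A is (r x in), lora_B is (out x r), no bias."""
    return r * in_features + r * out_features

# (in_features, out_features) of each adapted Linear4bit inside one decoder layer.
PER_LAYER = {
    "wqkv": (2048, 4096),
    "wo": (2048, 2048),
    "w1": (2048, 8192),
    "w3": (2048, 8192),
    "w2": (8192, 2048),
}
NUM_LAYERS = 24                # "(0-23): 24 x InternLM2DecoderLayer"
OUTPUT_HEAD = (2048, 92553)    # "(output): lora.Linear(...)"

per_layer_total = sum(lora_params(i, o) for i, o in PER_LAYER.values())
adapter_total = NUM_LAYERS * per_layer_total + lora_params(*OUTPUT_HEAD)

print(f"adapter weights per decoder layer: {per_layer_total:,}")
print(f"adapter weights in total:          {adapter_total:,}")
```

Under these assumptions the adapters hold roughly 138M weights, most of them in the feed-forward projections.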
2024/08/18 04:10:44 - mmengine - INFO - Num train samples 4562
2024/08/18 04:10:44 - mmengine - INFO - train example:
2024/08/18 04:10:45 - mmengine - INFO - <|im_start|> system
You are an AI assistant whose name is InternLM (书生·浦语).<|im_end|><|im_start|>user
请从这张聊天截图中提取结构化信息<|im_end|><|im_start|> assistant
{
    "dialog_name": "wxid_ljj408gtxlw512",
    "conversation": [
        {
            "timestamp": "2024-06-17 23:51:00",
            "speaker": "wxid_ljj408gtxlw512",
            "content": "对方还不是你的朋友",
            "image": "",
            "transfer": [],
            "file": []
        },
        {
            "timestamp": "2024-06-17 23:51:00",
            "speaker": "wxid_ljj408gtxlw512",
            "content": "",
            "image": "",
            "transfer": "transfer",
            "file": ""
        },
        {
            "timestamp": "2024-06-17 23:51:00",
            "speaker": "wxid_ljj408gtxlw512",
            "content": "",
            "image": "",
            "transfer": "transfer",
            "file": ""
        },
        {
            "timestamp": "2024-06-17 23:51:00",
            "speaker": "wxid_ljj408gtxlw512",
            "content": "",
            "image": "",
            "transfer": "transfer",
            "file": ""
        }
    ]
}<|im_end|>
2024/08/18 04:10:45 - mmengine - WARNING - "FileClient" will be deprecated in future. Please use io functions in https://mmengine.readthedocs.io/en/latest/api/fileio.html#file-io
2024/08/18 04:10:45 - mmengine - WARNING - "HardDiskBackend" is the alias of "LocalBackend" and the former will be deprecated in future.
2024/08/18 04:10:45 - mmengine - INFO - Checkpoints will be saved to /root/wangqun/work_dirs/internvl_ft_run_9_filter.
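Note on the schedule: the iteration total (9216) and the warmup learning rates reported in the `Iter(train)` entries of this log can be reproduced from the config values. The sketch below is one reading that matches the logged numbers, not a statement of how the sampler or scheduler is implemented. It assumes that each epoch is padded up to a multiple of the sampler's `per_device_batch_size=128`, that the `end=0.12` epoch warmup boundary is truncated to a whole iteration, and that the linear warmup advances over one step fewer than its iteration span. The function name `warmup_lr` is illustrative.

```python
import math

# Values copied from the config dump in this log.
num_samples = 4562
batch_size = 2
mega_batch = 128          # sampler per_device_batch_size
max_epochs = 4
base_lr = 2e-05
start_factor = 1e-05
warmup_end_epoch = 0.12

# Assumption: every epoch is padded to a whole number of 128-sample groups.
padded_samples = math.ceil(num_samples / mega_batch) * mega_batch
iters_per_epoch = padded_samples // batch_size
total_iters = iters_per_epoch * max_epochs

# Assumption: the epoch-based warmup boundary is truncated to an integer.
warmup_iters = int(warmup_end_epoch * iters_per_epoch)


def warmup_lr(it: int) -> float:
    """Closed-form linear warmup for 1-based iteration `it` (illustrative)."""
    steps = warmup_iters - 1              # assumption: one step fewer than the span
    frac = min(it - 1, steps) / steps     # clamp once warmup has finished
    return base_lr * (start_factor + (1 - start_factor) * frac)


print("iters per epoch:", iters_per_epoch)
print("total iters:    ", total_iters)
print("warmup iters:   ", warmup_iters)
for it in (10, 20, 270):
    print(f"lr at iter {it}: {warmup_lr(it):.4e}")
```

With `batch_size = 2` and `accumulative_counts = 64`, each optimizer update would cover 128 samples, which is the same size as the sampler's length group.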
2024/08/18 04:11:54 - mmengine - INFO - Iter(train) [ 10/9216] lr: 6.5474e-07 eta: 17:50:17 time: 6.9756 data_time: 0.0144 memory: 22610 loss: 0.2850
2024/08/18 04:12:21 - mmengine - INFO - Iter(train) [ 20/9216] lr: 1.3820e-06 eta: 12:20:02 time: 2.6814 data_time: 0.0173 memory: 17423 loss: 0.2930
2024/08/18 04:12:47 - mmengine - INFO - Iter(train) [ 30/9216] lr: 2.1093e-06 eta: 10:22:27 time: 2.5402 data_time: 0.0147 memory: 16461 loss: 0.3648
2024/08/18 04:13:14 - mmengine - INFO - Iter(train) [ 40/9216] lr: 2.8365e-06 eta: 9:31:27 time: 2.7493 data_time: 0.0143 memory: 16188 loss: 0.2915
2024/08/18 04:13:47 - mmengine - INFO - Iter(train) [ 50/9216] lr: 3.5638e-06 eta: 9:15:59 time: 3.2507 data_time: 0.0147 memory: 16247 loss: 0.2793
2024/08/18 04:14:24 - mmengine - INFO - Iter(train) [ 60/9216] lr: 4.2911e-06 eta: 9:17:52 time: 3.7375 data_time: 0.0145 memory: 16045 loss: 0.3050
2024/08/18 04:15:05 - mmengine - INFO - Iter(train) [ 70/9216] lr: 5.0183e-06 eta: 9:26:28 time: 4.0790 data_time: 0.0145 memory: 16159 loss: 0.3289
2024/08/18 04:15:48 - mmengine - INFO - Iter(train) [ 80/9216] lr: 5.7456e-06 eta: 9:37:20 time: 4.3197 data_time: 0.0145 memory: 16138 loss: 0.2859
2024/08/18 04:16:33 - mmengine - INFO - Iter(train) [ 90/9216] lr: 6.4729e-06 eta: 9:48:34 time: 4.4935 data_time: 0.0146 memory: 16169 loss: 0.3435
2024/08/18 04:17:18 - mmengine - INFO - Iter(train) [ 100/9216] lr: 7.2001e-06 eta: 9:58:12 time: 4.5458 data_time: 0.0143 memory: 15997 loss: 0.3320
2024/08/18 04:18:05 - mmengine - INFO - Iter(train) [ 110/9216] lr: 7.9274e-06 eta: 10:07:12 time: 4.6375 data_time: 0.0144 memory: 15931 loss: 0.3482
2024/08/18 04:18:52 - mmengine - INFO - Iter(train) [ 120/9216] lr: 8.6547e-06 eta: 10:15:36 time: 4.7182 data_time: 0.0144 memory: 16024 loss: 0.3717
2024/08/18 04:19:38 - mmengine - INFO - Iter(train) [ 130/9216] lr: 9.3819e-06 eta: 10:21:49 time: 4.6531 data_time: 0.0144 memory: 15905 loss: 0.3732
2024/08/18 04:20:25 - mmengine - INFO - Iter(train) [ 140/9216] lr: 1.0109e-05 eta: 10:27:11 time: 4.6656 data_time: 0.0145 memory: 16112 loss: 0.3444
2024/08/18 04:21:12 - mmengine - INFO - Iter(train) [ 150/9216] lr: 1.0836e-05 eta: 10:31:39 time: 4.6598 data_time: 0.0146 memory: 15791 loss: 0.3489
2024/08/18 04:21:59 - mmengine - INFO - Iter(train) [ 160/9216] lr: 1.1564e-05 eta: 10:36:33 time: 4.7730 data_time: 0.0149 memory: 15969 loss: 0.3573
2024/08/18 04:22:47 - mmengine - INFO - Iter(train) [ 170/9216] lr: 1.2291e-05 eta: 10:40:54 time: 4.7875 data_time: 0.0147 memory: 15810 loss: 0.3566
2024/08/18 04:23:35 - mmengine - INFO - Iter(train) [ 180/9216] lr: 1.3018e-05 eta: 10:44:19 time: 4.7426 data_time: 0.0148 memory: 15758 loss: 0.4069
2024/08/18 04:24:22 - mmengine - INFO - Iter(train) [ 190/9216] lr: 1.3746e-05 eta: 10:46:48 time: 4.6831 data_time: 0.0145 memory: 15724 loss: 0.4203
2024/08/18 04:25:08 - mmengine - INFO - Iter(train) [ 200/9216] lr: 1.4473e-05 eta: 10:48:44 time: 4.6528 data_time: 0.0144 memory: 15777 loss: 0.3773
2024/08/18 04:25:54 - mmengine - INFO - Iter(train) [ 210/9216] lr: 1.5200e-05 eta: 10:50:10 time: 4.6180 data_time: 0.0144 memory: 15689 loss: 0.3462
2024/08/18 04:26:41 - mmengine - INFO - Iter(train) [ 220/9216] lr: 1.5927e-05 eta: 10:51:28 time: 4.6281 data_time: 0.0145 memory: 15796 loss: 0.3856
2024/08/18 04:27:27 - mmengine - INFO - Iter(train) [ 230/9216] lr: 1.6655e-05 eta: 10:52:47 time: 4.6588 data_time: 0.0144 memory: 15755 loss: 0.4274
2024/08/18 04:28:13 - mmengine - INFO - Iter(train) [ 240/9216] lr: 1.7382e-05 eta: 10:53:45 time: 4.6294 data_time: 0.0147 memory: 15654 loss: 0.4223
2024/08/18 04:28:59 - mmengine - INFO - Iter(train) [ 250/9216] lr: 1.8109e-05 eta: 10:54:12 time: 4.5689 data_time: 0.0145 memory: 15559 loss: 0.4469
2024/08/18 04:29:44 - mmengine - INFO - Iter(train) [ 260/9216] lr: 1.8836e-05 eta: 10:54:15 time: 4.5129 data_time: 0.0149 memory: 15533 loss: 0.4657
2024/08/18 04:30:31 - mmengine - INFO - Iter(train) [ 270/9216] lr: 1.9564e-05 eta: 10:55:22 time: 4.7166 data_time: 0.0145 memory: 15710 loss: 0.4091
2024/08/18 04:31:16 - mmengine - INFO - Iter(train) [ 280/9216] lr: 2.0000e-05 eta: 10:54:50 time: 4.4326 data_time: 0.0149 memory: 15460 loss: 0.3527
2024/08/18 04:32:00 - mmengine - INFO - Iter(train) [ 290/9216] lr: 2.0000e-05 eta: 10:54:15 time: 4.4283 data_time: 0.0144 memory: 15457 loss: 0.4037
2024/08/18 04:32:44 - mmengine - INFO - Iter(train) [ 300/9216] lr: 2.0000e-05 eta: 10:53:38 time: 4.4206 data_time: 0.0144 memory: 15332 loss: 0.4215
2024/08/18 04:33:28 - mmengine - INFO - Iter(train) [ 310/9216] lr: 1.9999e-05 eta: 10:52:56 time: 4.4072 data_time: 0.0146 memory: 15336 loss: 0.4561
2024/08/18 04:34:12 - mmengine - INFO - Iter(train) [ 320/9216] lr: 1.9999e-05 eta: 10:52:10 time: 4.3913 data_time: 0.0145 memory: 15408 loss: 0.4662
2024/08/18 04:34:56 - mmengine - INFO - Iter(train) [ 330/9216] lr: 1.9998e-05 eta: 10:51:28 time: 4.4046 data_time: 0.0150 memory: 15287 loss: 0.4962
2024/08/18 04:35:40 - mmengine - INFO - Iter(train) [ 340/9216] lr: 1.9998e-05 eta: 10:50:25 time: 4.3270 data_time: 0.0143 memory: 15276 loss: 0.5024
2024/08/18 04:36:21 - mmengine - INFO - Iter(train) [ 350/9216] lr: 1.9997e-05 eta: 10:48:48 time: 4.1855 data_time: 0.0144 memory: 15049 loss: 0.4411
2024/08/18 04:37:02 - mmengine - INFO - Iter(train) [ 360/9216] lr: 1.9996e-05 eta: 10:46:54 time: 4.1058 data_time: 0.0137 memory: 14834 loss: 0.4156
2024/08/18 04:37:43 - mmengine - INFO - Iter(train) [ 370/9216] lr: 1.9995e-05 eta: 10:44:40 time: 4.0051 data_time: 0.0135 memory: 14813 loss: 0.3812
2024/08/18 04:38:21 - mmengine - INFO - Iter(train) [ 380/9216] lr: 1.9993e-05 eta: 10:41:51 time: 3.8358 data_time: 0.0130 memory: 14626 loss: 0.2543
2024/08/18 04:38:57 - mmengine - INFO - Iter(train) [ 390/9216] lr: 1.9992e-05 eta: 10:38:27 time: 3.6502 data_time: 0.0135 memory: 14251 loss: 0.3295
2024/08/18 04:39:31 - mmengine - INFO - Iter(train) [ 400/9216] lr: 1.9991e-05 eta: 10:34:18 time: 3.4057 data_time: 0.0124 memory: 13949 loss: 0.2180
2024/08/18 04:40:04 - mmengine - INFO - Iter(train) [ 410/9216] lr: 1.9989e-05 eta: 10:29:51 time: 3.2744 data_time: 0.0118 memory: 13803 loss: 0.2269
2024/08/18 04:40:36 - mmengine - INFO - Iter(train) [ 420/9216] lr: 1.9987e-05 eta: 10:25:18 time: 3.1941 data_time: 0.0121 memory: 13565 loss: 0.2022
2024/08/18 04:41:07 - mmengine - INFO - Iter(train) [ 430/9216] lr: 1.9986e-05 eta: 10:20:41 time: 3.1167 data_time: 0.0124 memory: 13344 loss: 0.2041
2024/08/18 04:41:38 - mmengine - INFO - Iter(train) [ 440/9216] lr: 1.9984e-05 eta: 10:16:04 time: 3.0652 data_time: 0.0121 memory: 13500 loss: 0.2418
2024/08/18 04:42:08 - mmengine - INFO - Iter(train) [ 450/9216] lr: 1.9982e-05 eta: 10:11:32 time: 3.0292 data_time: 0.0122 memory: 13235 loss: 0.2728
2024/08/18 04:42:38 - mmengine - INFO - Iter(train) [ 460/9216] lr: 1.9979e-05 eta: 10:06:55 time: 2.9504 data_time: 0.0121 memory: 13174 loss: 0.3095
2024/08/18 04:43:07 - mmengine - INFO - Iter(train) [ 470/9216] lr: 1.9977e-05 eta: 10:02:18 time: 2.8964 data_time: 0.0120 memory: 13071 loss: 0.3158
2024/08/18 04:43:34 - mmengine - INFO - Iter(train) [ 480/9216] lr: 1.9975e-05 eta: 9:57:16 time: 2.6994 data_time: 0.0119 memory: 12794 loss: 0.3971
2024/08/18 04:43:57 - mmengine - INFO - Iter(train) [ 490/9216] lr: 1.9972e-05 eta: 9:51:22 time: 2.3472 data_time: 0.0109 memory: 12262 loss: 0.2797
2024/08/18 04:44:19 - mmengine - INFO - Iter(train) [ 500/9216] lr: 1.9969e-05 eta: 9:45:08 time: 2.1519 data_time: 0.0105 memory: 11873 loss: 0.3053
2024/08/18 04:44:37 - mmengine - INFO - Iter(train) [ 510/9216] lr: 1.9966e-05 eta: 9:38:16 time: 1.8513 data_time: 0.0103 memory: 11437 loss: 0.3392
2024/08/18 04:45:27 - mmengine - INFO - Iter(train) [ 520/9216] lr: 1.9964e-05 eta: 9:40:20 time: 4.9649 data_time: 0.0129 memory: 19201 loss: 0.2170
2024/08/18 04:46:21 - mmengine - INFO - Iter(train) [ 530/9216] lr: 1.9961e-05 eta: 9:43:26 time: 5.3838 data_time: 0.0149 memory: 17043 loss: 0.2319
2024/08/18 04:47:12 - mmengine - INFO - Iter(train) [ 540/9216] lr: 1.9957e-05 eta: 9:45:50 time: 5.1787 data_time: 0.0151 memory: 16535 loss: 0.1397
2024/08/18 04:48:03 - mmengine - INFO - Iter(train) [ 550/9216] lr: 1.9954e-05 eta: 9:47:47 time: 5.0455 data_time: 0.0147 memory: 16183 loss: 0.1238
2024/08/18 04:48:53 - mmengine - INFO - Iter(train) [ 560/9216] lr: 1.9951e-05 eta: 9:49:29 time: 4.9947 data_time: 0.0146 memory: 16252 loss: 0.1413
2024/08/18 04:49:42 - mmengine - INFO - Iter(train) [ 570/9216] lr: 1.9947e-05 eta: 9:50:59 time: 4.9490 data_time: 0.0146 memory: 16140 loss: 0.1584
2024/08/18 04:50:31 - mmengine - INFO - Iter(train) [ 580/9216] lr: 1.9943e-05 eta: 9:52:18 time: 4.9067 data_time: 0.0147 memory: 16088 loss: 0.1494
2024/08/18 04:51:20 - mmengine - INFO - Iter(train) [ 590/9216] lr: 1.9940e-05 eta: 9:53:28 time: 4.8763 data_time: 0.0144 memory: 16083 loss: 0.1276
2024/08/18 04:52:09 - mmengine - INFO - Iter(train) [ 600/9216] lr: 1.9936e-05 eta: 9:54:32 time: 4.8614 data_time: 0.0145 memory: 16147 loss: 0.1478
2024/08/18 04:52:57 - mmengine - INFO - Iter(train) [ 610/9216] lr: 1.9932e-05 eta: 9:55:31 time: 4.8545 data_time: 0.0142 memory: 16544 loss: 0.1894
2024/08/18 04:53:45 - mmengine - INFO - Iter(train) [ 620/9216] lr: 1.9927e-05 eta: 9:56:20 time: 4.8026 data_time: 0.0146 memory: 16116 loss: 0.1296
2024/08/18 04:54:34 - mmengine - INFO - Iter(train) [ 630/9216] lr: 1.9923e-05 eta: 9:57:11 time: 4.8424 data_time: 0.0145 memory: 15988 loss: 0.1846
2024/08/18 04:55:23 - mmengine - INFO - Iter(train) [ 640/9216] lr: 1.9919e-05 eta: 9:58:03 time: 4.8690 data_time: 0.0155 memory: 16154 loss: 0.1800
2024/08/18 04:56:10 - mmengine - INFO - Iter(train) [ 650/9216] lr: 1.9914e-05 eta: 9:58:38 time: 4.7717 data_time: 0.0147 memory: 16540 loss: 0.1711
2024/08/18 04:56:58 - mmengine - INFO - Iter(train) [ 660/9216] lr: 1.9910e-05 eta: 9:59:08 time: 4.7466 data_time: 0.0150 memory: 15910 loss: 0.1150
2024/08/18 04:57:45 - mmengine - INFO - Iter(train) [ 670/9216] lr: 1.9905e-05 eta: 9:59:37 time: 4.7594 data_time: 0.0146 memory: 15824 loss: 0.0936
2024/08/18 04:58:32 - mmengine - INFO - Iter(train) [ 680/9216] lr: 1.9900e-05 eta: 9:59:58 time: 4.7123 data_time: 0.0143 memory: 15760 loss: 0.1496
2024/08/18 04:59:20 - mmengine - INFO - Iter(train) [ 690/9216] lr: 1.9895e-05 eta: 10:00:18 time: 4.7179 data_time: 0.0146 memory: 15812 loss: 0.1491
2024/08/18 05:00:09 - mmengine - INFO - Iter(train) [ 700/9216] lr: 1.9890e-05 eta: 10:00:57 time: 4.8920 data_time: 0.0149 memory: 15853 loss: 0.1629
2024/08/18 05:00:55 - mmengine - INFO - Iter(train) [ 710/9216] lr: 1.9884e-05 eta: 10:01:00 time: 4.6177 data_time: 0.0144 memory: 15705 loss: 0.1613
2024/08/18 05:01:41 - mmengine - INFO - Iter(train) [ 720/9216] lr: 1.9879e-05 eta: 10:01:05 time: 4.6356 data_time: 0.0148 memory: 15705 loss: 0.1297
2024/08/18 05:02:28 - mmengine - INFO - Iter(train) [ 730/9216] lr: 1.9874e-05 eta: 10:01:19 time: 4.7306 data_time: 0.0146 memory: 15803 loss: 0.1270
2024/08/18 05:03:15 - mmengine - INFO - Iter(train) [ 740/9216] lr: 1.9868e-05 eta: 10:01:24 time: 4.6659 data_time: 0.0150 memory: 15633 loss: 0.1840
2024/08/18 05:04:01 - mmengine - INFO - Iter(train) [ 750/9216] lr: 1.9862e-05 eta: 10:01:17 time: 4.5779 data_time: 0.0149 memory: 15552 loss: 0.1695
2024/08/18 05:04:46 - mmengine - INFO - Iter(train) [ 760/9216] lr: 1.9856e-05 eta: 10:01:07 time: 4.5495 data_time: 0.0149 memory: 15662 loss: 0.1391
2024/08/18 05:05:32 - mmengine - INFO - Iter(train) [ 770/9216] lr: 1.9850e-05 eta: 10:00:55 time: 4.5508 data_time: 0.0145 memory: 15621 loss: 0.1981
2024/08/18 05:06:18 - mmengine - INFO - Iter(train) [ 780/9216] lr: 1.9844e-05 eta: 10:00:47 time: 4.5869 data_time: 0.0145 memory: 15696 loss: 0.1759
2024/08/18 05:07:04 - mmengine - INFO - Iter(train) [ 790/9216] lr: 1.9838e-05 eta: 10:00:44 time: 4.6458 data_time: 0.0145 memory: 15607 loss: 0.1745
2024/08/18 05:07:49 - mmengine - INFO - Iter(train) [ 800/9216] lr: 1.9832e-05 eta: 10:00:26 time: 4.5089 data_time: 0.0144 memory: 15488 loss: 0.2079
2024/08/18 05:08:34 - mmengine - INFO - Iter(train) [ 810/9216] lr: 1.9825e-05 eta: 10:00:08 time: 4.5275 data_time: 0.0144 memory: 15388 loss: 0.1841
2024/08/18 05:09:20 - mmengine - INFO - Iter(train) [ 820/9216] lr: 1.9819e-05 eta: 9:59:49 time: 4.5105 data_time: 0.0145 memory: 15465 loss: 0.1769
2024/08/18 05:10:04 - mmengine - INFO - Iter(train) [ 830/9216] lr: 1.9812e-05 eta: 9:59:22 time: 4.4452 data_time: 0.0147 memory: 15270 loss: 0.2430
2024/08/18 05:10:48 - mmengine - INFO - Iter(train) [ 840/9216] lr: 1.9805e-05 eta: 9:58:53 time: 4.4291 data_time: 0.0151 memory: 15379 loss: 0.2653
2024/08/18 05:11:32 - mmengine - INFO - Iter(train) [ 850/9216] lr: 1.9798e-05 eta: 9:58:19 time: 4.3801 data_time: 0.0142 memory: 15376 loss: 0.1715
2024/08/18 05:12:15 - mmengine - INFO - Iter(train) [ 860/9216] lr: 1.9791e-05 eta: 9:57:35 time: 4.2780 data_time: 0.0145 memory: 15078 loss: 0.2359
2024/08/18 05:12:57 - mmengine - INFO - Iter(train) [ 870/9216] lr: 1.9784e-05 eta: 9:56:43 time: 4.1925 data_time: 0.0143 memory: 15003 loss: 0.2547
2024/08/18 05:13:38 - mmengine - INFO - Iter(train) [ 880/9216] lr: 1.9776e-05 eta: 9:55:43 time: 4.1124 data_time: 0.0143 memory: 14889 loss: 0.2039
2024/08/18 05:14:19 - mmengine - INFO - Iter(train) [ 890/9216] lr: 1.9769e-05 eta: 9:54:42 time: 4.0936 data_time: 0.0139 memory: 15132 loss: 0.3217
2024/08/18 05:14:58 - mmengine - INFO - Iter(train) [ 900/9216] lr: 1.9761e-05 eta: 9:53:23 time: 3.9017 data_time: 0.0134 memory: 14607 loss: 0.3773
2024/08/18 05:15:35 - mmengine - INFO - Iter(train) [ 910/9216] lr: 1.9754e-05 eta: 9:51:47 time: 3.6885 data_time: 0.0129 memory: 14499 loss: 0.1927
2024/08/18 05:16:08 - mmengine - INFO - Iter(train) [ 920/9216] lr: 1.9746e-05 eta: 9:49:42 time: 3.3669 data_time: 0.0128 memory: 14132 loss: 0.1760
2024/08/18 05:16:40 - mmengine - INFO - Iter(train) [ 930/9216] lr: 1.9738e-05 eta: 9:47:23 time: 3.1797 data_time: 0.0123 memory: 13812 loss: 0.1324
2024/08/18 05:17:11 - mmengine - INFO - Iter(train) [ 940/9216] lr: 1.9730e-05 eta: 9:44:59 time: 3.1097 data_time: 0.0121 memory: 13447 loss: 0.1548
2024/08/18 05:17:42 - mmengine - INFO - Iter(train) [ 950/9216] lr: 1.9722e-05 eta: 9:42:38 time: 3.0993 data_time: 0.0120 memory: 13439 loss: 0.1384
2024/08/18 05:18:13 - mmengine - INFO - Iter(train) [ 960/9216] lr: 1.9713e-05 eta: 9:40:12 time: 3.0319 data_time: 0.0121 memory: 13335 loss: 0.1554
2024/08/18 05:18:42 - mmengine - INFO - Iter(train) [ 970/9216] lr: 1.9705e-05 eta: 9:37:42 time: 2.9503 data_time: 0.0121 memory: 13134 loss: 0.1774
2024/08/18 05:19:11 - mmengine - INFO - Iter(train) [ 980/9216] lr: 1.9696e-05 eta: 9:35:12 time: 2.9097 data_time: 0.0120 memory: 13137 loss: 0.1721
2024/08/18 05:19:38 - mmengine - INFO - Iter(train) [ 990/9216] lr: 1.9688e-05 eta: 9:32:24 time: 2.6779 data_time: 0.0120 memory: 12950 loss: 0.2355
2024/08/18 05:20:01 - mmengine - INFO - Exp name: internvl_v2_internlm2_2b_qlora_finetune_copy_20240818_040903
2024/08/18 05:20:01 - mmengine - INFO - Iter(train) [1000/9216] lr: 1.9679e-05 eta: 9:29:06 time: 2.2678 data_time: 0.0107 memory: 12110 loss: 0.1091
2024/08/18 05:20:01 - mmengine - INFO - Saving checkpoint at 1000 iterations
2024/08/18 05:20:24 - mmengine - INFO - Iter(train) [1010/9216] lr: 1.9670e-05 eta: 9:25:53 time: 2.2983 data_time: 0.1877 memory: 11785 loss: 0.1675
2024/08/18 05:20:39 - mmengine - INFO - Iter(train) [1020/9216] lr: 1.9661e-05 eta: 9:21:43 time: 1.5419 data_time: 0.0094 memory: 11232 loss: 0.2658
2024/08/18 05:21:21 - mmengine - INFO - Iter(train) [1030/9216] lr: 1.9652e-05 eta: 9:21:07 time: 4.1800 data_time: 0.0110 memory: 21965 loss: 0.1310
2024/08/18 05:22:17 - mmengine - INFO - Iter(train) [1040/9216] lr: 1.9643e-05 eta: 9:22:21 time: 5.5649 data_time: 0.0144 memory: 17646 loss: 0.1027
2024/08/18 05:23:08 - mmengine - INFO - Iter(train) [1050/9216] lr: 1.9633e-05 eta: 9:23:02 time: 5.1868 data_time: 0.0144 memory:
16567 loss: 0.1146 2024/08/18 05:23:59 - mmengine - INFO - Iter(train) [1060/9216] lr: 1.9624e-05 eta: 9:23:32 time: 5.0631 data_time: 0.0152 memory: 16235 loss: 0.0804 2024/08/18 05:24:49 - mmengine - INFO - Iter(train) [1070/9216] lr: 1.9614e-05 eta: 9:23:51 time: 4.9432 data_time: 0.0145 memory: 16214 loss: 0.0937 2024/08/18 05:25:38 - mmengine - INFO - Iter(train) [1080/9216] lr: 1.9605e-05 eta: 9:24:08 time: 4.9306 data_time: 0.0145 memory: 16335 loss: 0.1290 2024/08/18 05:26:28 - mmengine - INFO - Iter(train) [1090/9216] lr: 1.9595e-05 eta: 9:24:27 time: 4.9816 data_time: 0.0145 memory: 16691 loss: 0.0921 2024/08/18 05:27:17 - mmengine - INFO - Iter(train) [1100/9216] lr: 1.9585e-05 eta: 9:24:42 time: 4.9256 data_time: 0.0144 memory: 16121 loss: 0.1023 2024/08/18 05:28:06 - mmengine - INFO - Iter(train) [1110/9216] lr: 1.9575e-05 eta: 9:24:53 time: 4.8977 data_time: 0.0145 memory: 16097 loss: 0.0649 2024/08/18 05:28:55 - mmengine - INFO - Iter(train) [1120/9216] lr: 1.9564e-05 eta: 9:25:04 time: 4.9133 data_time: 0.0147 memory: 16283 loss: 0.1566 2024/08/18 05:29:43 - mmengine - INFO - Iter(train) [1130/9216] lr: 1.9554e-05 eta: 9:25:07 time: 4.8098 data_time: 0.0144 memory: 15969 loss: 0.1303 2024/08/18 05:30:34 - mmengine - INFO - Iter(train) [1140/9216] lr: 1.9544e-05 eta: 9:25:27 time: 5.0713 data_time: 0.0145 memory: 16000 loss: 0.1400 2024/08/18 05:31:21 - mmengine - INFO - Iter(train) [1150/9216] lr: 1.9533e-05 eta: 9:25:23 time: 4.7474 data_time: 0.0145 memory: 15829 loss: 0.1138 2024/08/18 05:32:09 - mmengine - INFO - Iter(train) [1160/9216] lr: 1.9522e-05 eta: 9:25:18 time: 4.7430 data_time: 0.0145 memory: 16043 loss: 0.2112 2024/08/18 05:32:57 - mmengine - INFO - Iter(train) [1170/9216] lr: 1.9512e-05 eta: 9:25:17 time: 4.8052 data_time: 0.0145 memory: 16740 loss: 0.1125 2024/08/18 05:33:44 - mmengine - INFO - Iter(train) [1180/9216] lr: 1.9501e-05 eta: 9:25:11 time: 4.7388 data_time: 0.0147 memory: 15870 loss: 0.1105 2024/08/18 05:34:31 - mmengine 
- INFO - Iter(train) [1190/9216] lr: 1.9490e-05 eta: 9:25:01 time: 4.6973 data_time: 0.0144 memory: 15831 loss: 0.1409 2024/08/18 05:35:18 - mmengine - INFO - Iter(train) [1200/9216] lr: 1.9479e-05 eta: 9:24:49 time: 4.6892 data_time: 0.0145 memory: 15841 loss: 0.1488 2024/08/18 05:36:05 - mmengine - INFO - Iter(train) [1210/9216] lr: 1.9467e-05 eta: 9:24:36 time: 4.6589 data_time: 0.0147 memory: 16033 loss: 0.1125 2024/08/18 05:36:51 - mmengine - INFO - Iter(train) [1220/9216] lr: 1.9456e-05 eta: 9:24:19 time: 4.6198 data_time: 0.0150 memory: 15703 loss: 0.1086 2024/08/18 05:37:36 - mmengine - INFO - Iter(train) [1230/9216] lr: 1.9444e-05 eta: 9:23:56 time: 4.5356 data_time: 0.0146 memory: 15588 loss: 0.1074 2024/08/18 05:38:22 - mmengine - INFO - Iter(train) [1240/9216] lr: 1.9433e-05 eta: 9:23:34 time: 4.5538 data_time: 0.0144 memory: 15943 loss: 0.1482 2024/08/18 05:39:06 - mmengine - INFO - Iter(train) [1250/9216] lr: 1.9421e-05 eta: 9:23:06 time: 4.4570 data_time: 0.0146 memory: 15624 loss: 0.1790 2024/08/18 05:39:52 - mmengine - INFO - Iter(train) [1260/9216] lr: 1.9409e-05 eta: 9:22:43 time: 4.5513 data_time: 0.0149 memory: 15793 loss: 0.1324 2024/08/18 05:40:37 - mmengine - INFO - Iter(train) [1270/9216] lr: 1.9397e-05 eta: 9:22:19 time: 4.5470 data_time: 0.0148 memory: 15767 loss: 0.1325 2024/08/18 05:41:23 - mmengine - INFO - Iter(train) [1280/9216] lr: 1.9385e-05 eta: 9:21:55 time: 4.5346 data_time: 0.0145 memory: 15557 loss: 0.1583 2024/08/18 05:42:08 - mmengine - INFO - Iter(train) [1290/9216] lr: 1.9373e-05 eta: 9:21:31 time: 4.5573 data_time: 0.0145 memory: 15664 loss: 0.1421 2024/08/18 05:42:54 - mmengine - INFO - Iter(train) [1300/9216] lr: 1.9361e-05 eta: 9:21:08 time: 4.5707 data_time: 0.0152 memory: 15751 loss: 0.1482 2024/08/18 05:43:39 - mmengine - INFO - Iter(train) [1310/9216] lr: 1.9348e-05 eta: 9:20:43 time: 4.5401 data_time: 0.0155 memory: 15427 loss: 0.1911 2024/08/18 05:44:24 - mmengine - INFO - Iter(train) [1320/9216] lr: 1.9336e-05 
eta: 9:20:15 time: 4.4922 data_time: 0.0144 memory: 15365 loss: 0.1286 2024/08/18 05:45:09 - mmengine - INFO - Iter(train) [1330/9216] lr: 1.9323e-05 eta: 9:19:46 time: 4.5003 data_time: 0.0145 memory: 15472 loss: 0.1624 2024/08/18 05:45:54 - mmengine - INFO - Iter(train) [1340/9216] lr: 1.9310e-05 eta: 9:19:19 time: 4.5220 data_time: 0.0145 memory: 15410 loss: 0.1737 2024/08/18 05:46:39 - mmengine - INFO - Iter(train) [1350/9216] lr: 1.9298e-05 eta: 9:18:48 time: 4.4524 data_time: 0.0146 memory: 15538 loss: 0.2134 2024/08/18 05:47:23 - mmengine - INFO - Iter(train) [1360/9216] lr: 1.9285e-05 eta: 9:18:11 time: 4.3659 data_time: 0.0142 memory: 15213 loss: 0.1493 2024/08/18 05:48:06 - mmengine - INFO - Iter(train) [1370/9216] lr: 1.9271e-05 eta: 9:17:35 time: 4.3782 data_time: 0.0144 memory: 15203 loss: 0.1846 2024/08/18 05:48:49 - mmengine - INFO - Iter(train) [1380/9216] lr: 1.9258e-05 eta: 9:16:51 time: 4.2439 data_time: 0.0142 memory: 15108 loss: 0.2265 2024/08/18 05:49:30 - mmengine - INFO - Iter(train) [1390/9216] lr: 1.9245e-05 eta: 9:16:02 time: 4.1384 data_time: 0.0140 memory: 14967 loss: 0.2226 2024/08/18 05:50:11 - mmengine - INFO - Iter(train) [1400/9216] lr: 1.9231e-05 eta: 9:15:09 time: 4.0899 data_time: 0.0143 memory: 14982 loss: 0.1654 2024/08/18 05:50:51 - mmengine - INFO - Iter(train) [1410/9216] lr: 1.9218e-05 eta: 9:14:10 time: 3.9644 data_time: 0.0133 memory: 14868 loss: 0.1808 2024/08/18 05:51:28 - mmengine - INFO - Iter(train) [1420/9216] lr: 1.9204e-05 eta: 9:12:58 time: 3.7125 data_time: 0.0125 memory: 14641 loss: 0.1383 2024/08/18 05:52:01 - mmengine - INFO - Iter(train) [1430/9216] lr: 1.9190e-05 eta: 9:11:26 time: 3.3606 data_time: 0.0125 memory: 13954 loss: 0.1653 2024/08/18 05:52:34 - mmengine - INFO - Iter(train) [1440/9216] lr: 1.9176e-05 eta: 9:09:48 time: 3.2187 data_time: 0.0122 memory: 13850 loss: 0.2122 2024/08/18 05:53:05 - mmengine - INFO - Iter(train) [1450/9216] lr: 1.9162e-05 eta: 9:08:07 time: 3.1474 data_time: 0.0120 
memory: 13486 loss: 0.1064 2024/08/18 05:53:36 - mmengine - INFO - Iter(train) [1460/9216] lr: 1.9148e-05 eta: 9:06:25 time: 3.1030 data_time: 0.0119 memory: 13422 loss: 0.1178 2024/08/18 05:54:06 - mmengine - INFO - Iter(train) [1470/9216] lr: 1.9134e-05 eta: 9:04:39 time: 3.0162 data_time: 0.0119 memory: 13269 loss: 0.1083 2024/08/18 05:54:36 - mmengine - INFO - Iter(train) [1480/9216] lr: 1.9120e-05 eta: 9:02:51 time: 2.9652 data_time: 0.0119 memory: 13225 loss: 0.1730 2024/08/18 05:55:05 - mmengine - INFO - Iter(train) [1490/9216] lr: 1.9105e-05 eta: 9:01:00 time: 2.8816 data_time: 0.0119 memory: 13164 loss: 0.1408 2024/08/18 05:55:33 - mmengine - INFO - Iter(train) [1500/9216] lr: 1.9091e-05 eta: 8:59:06 time: 2.8077 data_time: 0.0123 memory: 12964 loss: 0.2057 2024/08/18 05:55:59 - mmengine - INFO - Iter(train) [1510/9216] lr: 1.9076e-05 eta: 8:57:04 time: 2.6242 data_time: 0.0119 memory: 12775 loss: 0.2794 2024/08/18 05:56:21 - mmengine - INFO - Iter(train) [1520/9216] lr: 1.9061e-05 eta: 8:54:42 time: 2.2019 data_time: 0.0109 memory: 12094 loss: 0.1325 2024/08/18 05:56:39 - mmengine - INFO - Iter(train) [1530/9216] lr: 1.9046e-05 eta: 8:52:02 time: 1.8058 data_time: 0.0100 memory: 11534 loss: 0.1530 2024/08/18 05:57:12 - mmengine - INFO - Iter(train) [1540/9216] lr: 1.9031e-05 eta: 8:50:37 time: 3.2956 data_time: 0.0101 memory: 21992 loss: 0.0907 2024/08/18 05:58:08 - mmengine - INFO - Iter(train) [1550/9216] lr: 1.9016e-05 eta: 8:51:09 time: 5.6218 data_time: 0.0145 memory: 17957 loss: 0.1130 2024/08/18 05:59:00 - mmengine - INFO - Iter(train) [1560/9216] lr: 1.9001e-05 eta: 8:51:15 time: 5.1233 data_time: 0.0144 memory: 16602 loss: 0.0868 2024/08/18 05:59:50 - mmengine - INFO - Iter(train) [1570/9216] lr: 1.8985e-05 eta: 8:51:18 time: 5.0790 data_time: 0.0147 memory: 16491 loss: 0.0626 2024/08/18 06:00:43 - mmengine - INFO - Iter(train) [1580/9216] lr: 1.8970e-05 eta: 8:51:28 time: 5.2406 data_time: 0.0146 memory: 16173 loss: 0.0891 2024/08/18 06:01:32 - 
mmengine - INFO - Iter(train) [1590/9216] lr: 1.8954e-05 eta: 8:51:24 time: 4.9631 data_time: 0.0151 memory: 16183 loss: 0.0889 2024/08/18 06:02:22 - mmengine - INFO - Iter(train) [1600/9216] lr: 1.8939e-05 eta: 8:51:19 time: 4.9680 data_time: 0.0144 memory: 16216 loss: 0.0810 2024/08/18 06:03:12 - mmengine - INFO - Iter(train) [1610/9216] lr: 1.8923e-05 eta: 8:51:14 time: 4.9487 data_time: 0.0146 memory: 16363 loss: 0.0666 2024/08/18 06:04:01 - mmengine - INFO - Iter(train) [1620/9216] lr: 1.8907e-05 eta: 8:51:06 time: 4.9251 data_time: 0.0143 memory: 16535 loss: 0.0827 2024/08/18 06:04:49 - mmengine - INFO - Iter(train) [1630/9216] lr: 1.8891e-05 eta: 8:50:53 time: 4.8232 data_time: 0.0145 memory: 15964 loss: 0.0746 2024/08/18 06:05:38 - mmengine - INFO - Iter(train) [1640/9216] lr: 1.8875e-05 eta: 8:50:44 time: 4.9089 data_time: 0.0148 memory: 15969 loss: 0.0766 2024/08/18 06:06:29 - mmengine - INFO - Iter(train) [1650/9216] lr: 1.8858e-05 eta: 8:50:41 time: 5.0539 data_time: 0.0143 memory: 16026 loss: 0.1453 2024/08/18 06:07:18 - mmengine - INFO - Iter(train) [1660/9216] lr: 1.8842e-05 eta: 8:50:31 time: 4.9211 data_time: 0.0146 memory: 15881 loss: 0.1191 2024/08/18 06:08:08 - mmengine - INFO - Iter(train) [1670/9216] lr: 1.8826e-05 eta: 8:50:23 time: 4.9565 data_time: 0.0148 memory: 15855 loss: 0.1179 2024/08/18 06:08:56 - mmengine - INFO - Iter(train) [1680/9216] lr: 1.8809e-05 eta: 8:50:11 time: 4.8971 data_time: 0.0153 memory: 16173 loss: 0.0862 2024/08/18 06:09:44 - mmengine - INFO - Iter(train) [1690/9216] lr: 1.8792e-05 eta: 8:49:53 time: 4.7639 data_time: 0.0146 memory: 15933 loss: 0.0860 2024/08/18 06:10:31 - mmengine - INFO - Iter(train) [1700/9216] lr: 1.8776e-05 eta: 8:49:30 time: 4.6635 data_time: 0.0144 memory: 15734 loss: 0.1069 2024/08/18 06:11:18 - mmengine - INFO - Iter(train) [1710/9216] lr: 1.8759e-05 eta: 8:49:10 time: 4.7325 data_time: 0.0145 memory: 15872 loss: 0.1514 2024/08/18 06:12:05 - mmengine - INFO - Iter(train) [1720/9216] lr: 
1.8742e-05 eta: 8:48:49 time: 4.7143 data_time: 0.0144 memory: 15713 loss: 0.0853 2024/08/18 06:12:52 - mmengine - INFO - Iter(train) [1730/9216] lr: 1.8725e-05 eta: 8:48:27 time: 4.7018 data_time: 0.0145 memory: 15662 loss: 0.1801 2024/08/18 06:13:39 - mmengine - INFO - Iter(train) [1740/9216] lr: 1.8707e-05 eta: 8:48:02 time: 4.6296 data_time: 0.0144 memory: 15734 loss: 0.0872 2024/08/18 06:14:25 - mmengine - INFO - Iter(train) [1750/9216] lr: 1.8690e-05 eta: 8:47:35 time: 4.6086 data_time: 0.0143 memory: 15671 loss: 0.1123 2024/08/18 06:15:10 - mmengine - INFO - Iter(train) [1760/9216] lr: 1.8673e-05 eta: 8:47:07 time: 4.5844 data_time: 0.0151 memory: 15555 loss: 0.1723 2024/08/18 06:15:57 - mmengine - INFO - Iter(train) [1770/9216] lr: 1.8655e-05 eta: 8:46:41 time: 4.6285 data_time: 0.0149 memory: 15626 loss: 0.1151 2024/08/18 06:16:43 - mmengine - INFO - Iter(train) [1780/9216] lr: 1.8637e-05 eta: 8:46:15 time: 4.6383 data_time: 0.0142 memory: 15557 loss: 0.1322 2024/08/18 06:17:28 - mmengine - INFO - Iter(train) [1790/9216] lr: 1.8620e-05 eta: 8:45:45 time: 4.5348 data_time: 0.0143 memory: 15512 loss: 0.1898 2024/08/18 06:18:14 - mmengine - INFO - Iter(train) [1800/9216] lr: 1.8602e-05 eta: 8:45:15 time: 4.5680 data_time: 0.0145 memory: 15531 loss: 0.1024 2024/08/18 06:19:00 - mmengine - INFO - Iter(train) [1810/9216] lr: 1.8584e-05 eta: 8:44:46 time: 4.5618 data_time: 0.0145 memory: 15476 loss: 0.1095 2024/08/18 06:19:45 - mmengine - INFO - Iter(train) [1820/9216] lr: 1.8566e-05 eta: 8:44:13 time: 4.4973 data_time: 0.0144 memory: 15474 loss: 0.1477 2024/08/18 06:20:29 - mmengine - INFO - Iter(train) [1830/9216] lr: 1.8547e-05 eta: 8:43:39 time: 4.4731 data_time: 0.0147 memory: 15320 loss: 0.1119 2024/08/18 06:21:14 - mmengine - INFO - Iter(train) [1840/9216] lr: 1.8529e-05 eta: 8:43:03 time: 4.4162 data_time: 0.0148 memory: 15246 loss: 0.2279 2024/08/18 06:21:57 - mmengine - INFO - Iter(train) [1850/9216] lr: 1.8511e-05 eta: 8:42:25 time: 4.3553 data_time: 
0.0150 memory: 15306 loss: 0.2425 2024/08/18 06:22:40 - mmengine - INFO - Iter(train) [1860/9216] lr: 1.8492e-05 eta: 8:41:45 time: 4.3145 data_time: 0.0143 memory: 15181 loss: 0.2189 2024/08/18 06:23:23 - mmengine - INFO - Iter(train) [1870/9216] lr: 1.8474e-05 eta: 8:41:02 time: 4.2444 data_time: 0.0140 memory: 15017 loss: 0.1465 2024/08/18 06:24:04 - mmengine - INFO - Iter(train) [1880/9216] lr: 1.8455e-05 eta: 8:40:13 time: 4.0969 data_time: 0.0138 memory: 14887 loss: 0.2018 2024/08/18 06:24:43 - mmengine - INFO - Iter(train) [1890/9216] lr: 1.8436e-05 eta: 8:39:18 time: 3.9444 data_time: 0.0138 memory: 14789 loss: 0.3513 2024/08/18 06:25:21 - mmengine - INFO - Iter(train) [1900/9216] lr: 1.8417e-05 eta: 8:38:17 time: 3.7725 data_time: 0.0133 memory: 14619 loss: 0.1069 2024/08/18 06:25:57 - mmengine - INFO - Iter(train) [1910/9216] lr: 1.8398e-05 eta: 8:37:09 time: 3.5841 data_time: 0.0130 memory: 14413 loss: 0.1197 2024/08/18 06:26:30 - mmengine - INFO - Iter(train) [1920/9216] lr: 1.8379e-05 eta: 8:35:53 time: 3.3550 data_time: 0.0123 memory: 13918 loss: 0.1234 2024/08/18 06:27:03 - mmengine - INFO - Iter(train) [1930/9216] lr: 1.8360e-05 eta: 8:34:34 time: 3.2894 data_time: 0.0122 memory: 13848 loss: 0.1319 2024/08/18 06:27:35 - mmengine - INFO - Iter(train) [1940/9216] lr: 1.8340e-05 eta: 8:33:14 time: 3.2153 data_time: 0.0122 memory: 13773 loss: 0.1035 2024/08/18 06:28:07 - mmengine - INFO - Iter(train) [1950/9216] lr: 1.8321e-05 eta: 8:31:51 time: 3.1484 data_time: 0.0122 memory: 13495 loss: 0.1003 2024/08/18 06:28:38 - mmengine - INFO - Iter(train) [1960/9216] lr: 1.8301e-05 eta: 8:30:29 time: 3.1523 data_time: 0.0123 memory: 13452 loss: 0.1005 2024/08/18 06:29:10 - mmengine - INFO - Iter(train) [1970/9216] lr: 1.8282e-05 eta: 8:29:06 time: 3.1232 data_time: 0.0124 memory: 13452 loss: 0.1136 2024/08/18 06:29:40 - mmengine - INFO - Iter(train) [1980/9216] lr: 1.8262e-05 eta: 8:27:42 time: 3.0603 data_time: 0.0119 memory: 13303 loss: 0.1011 2024/08/18 
06:30:13 - mmengine - INFO - Iter(train) [1990/9216] lr: 1.8242e-05 eta: 8:26:26 time: 3.2775 data_time: 0.0117 memory: 13329 loss: 0.1240 2024/08/18 06:30:42 - mmengine - INFO - Exp name: internvl_v2_internlm2_2b_qlora_finetune_copy_20240818_040903 2024/08/18 06:30:42 - mmengine - INFO - Iter(train) [2000/9216] lr: 1.8222e-05 eta: 8:24:58 time: 2.9359 data_time: 0.0118 memory: 13090 loss: 0.1820 2024/08/18 06:30:42 - mmengine - INFO - Saving checkpoint at 2000 iterations 2024/08/18 06:31:13 - mmengine - INFO - Iter(train) [2010/9216] lr: 1.8202e-05 eta: 8:23:34 time: 3.0163 data_time: 0.1796 memory: 13019 loss: 0.1583 2024/08/18 06:31:38 - mmengine - INFO - Iter(train) [2020/9216] lr: 1.8182e-05 eta: 8:21:52 time: 2.5161 data_time: 0.0113 memory: 12590 loss: 0.1722 2024/08/18 06:32:00 - mmengine - INFO - Iter(train) [2030/9216] lr: 1.8162e-05 eta: 8:20:00 time: 2.2023 data_time: 0.0106 memory: 11901 loss: 0.1400 2024/08/18 06:32:19 - mmengine - INFO - Iter(train) [2040/9216] lr: 1.8141e-05 eta: 8:17:59 time: 1.9203 data_time: 0.0102 memory: 11575 loss: 0.1207 2024/08/18 06:32:42 - mmengine - INFO - Iter(train) [2050/9216] lr: 1.8121e-05 eta: 8:16:11 time: 2.2705 data_time: 0.0090 memory: 20850 loss: 0.1632 2024/08/18 06:33:35 - mmengine - INFO - Iter(train) [2060/9216] lr: 1.8100e-05 eta: 8:16:11 time: 5.3445 data_time: 0.0143 memory: 18009 loss: 0.1029 2024/08/18 06:34:25 - mmengine - INFO - Iter(train) [2070/9216] lr: 1.8080e-05 eta: 8:15:57 time: 4.9459 data_time: 0.0143 memory: 16140 loss: 0.0704 2024/08/18 06:35:13 - mmengine - INFO - Iter(train) [2080/9216] lr: 1.8059e-05 eta: 8:15:40 time: 4.8927 data_time: 0.0149 memory: 16128 loss: 0.0725 2024/08/18 06:36:02 - mmengine - INFO - Iter(train) [2090/9216] lr: 1.8038e-05 eta: 8:15:23 time: 4.8969 data_time: 0.0144 memory: 16173 loss: 0.0718 2024/08/18 06:36:51 - mmengine - INFO - Iter(train) [2100/9216] lr: 1.8017e-05 eta: 8:15:04 time: 4.8474 data_time: 0.0145 memory: 15986 loss: 0.0747 2024/08/18 06:37:38 - 
mmengine - INFO - Iter(train) [2110/9216] lr: 1.7996e-05 eta: 8:14:40 time: 4.7001 data_time: 0.0144 memory: 15767 loss: 0.0967 2024/08/18 06:38:24 - mmengine - INFO - Iter(train) [2120/9216] lr: 1.7975e-05 eta: 8:14:15 time: 4.6620 data_time: 0.0143 memory: 15888 loss: 0.0870 2024/08/18 06:39:11 - mmengine - INFO - Iter(train) [2130/9216] lr: 1.7954e-05 eta: 8:13:48 time: 4.6403 data_time: 0.0156 memory: 15727 loss: 0.1113 2024/08/18 06:39:56 - mmengine - INFO - Iter(train) [2140/9216] lr: 1.7932e-05 eta: 8:13:19 time: 4.5564 data_time: 0.0155 memory: 15586 loss: 0.1310 2024/08/18 06:40:43 - mmengine - INFO - Iter(train) [2150/9216] lr: 1.7911e-05 eta: 8:12:52 time: 4.6412 data_time: 0.0145 memory: 15691 loss: 0.1381 2024/08/18 06:41:28 - mmengine - INFO - Iter(train) [2160/9216] lr: 1.7889e-05 eta: 8:12:21 time: 4.5176 data_time: 0.0144 memory: 15590 loss: 0.1347 2024/08/18 06:42:12 - mmengine - INFO - Iter(train) [2170/9216] lr: 1.7868e-05 eta: 8:11:47 time: 4.4288 data_time: 0.0141 memory: 15533 loss: 0.1400 2024/08/18 06:42:56 - mmengine - INFO - Iter(train) [2180/9216] lr: 1.7846e-05 eta: 8:11:11 time: 4.3716 data_time: 0.0143 memory: 15434 loss: 0.1150 2024/08/18 06:43:39 - mmengine - INFO - Iter(train) [2190/9216] lr: 1.7824e-05 eta: 8:10:32 time: 4.2853 data_time: 0.0143 memory: 15151 loss: 0.1631 2024/08/18 06:44:21 - mmengine - INFO - Iter(train) [2200/9216] lr: 1.7802e-05 eta: 8:09:50 time: 4.1716 data_time: 0.0142 memory: 15037 loss: 0.1518 2024/08/18 06:44:59 - mmengine - INFO - Iter(train) [2210/9216] lr: 1.7780e-05 eta: 8:08:58 time: 3.8833 data_time: 0.0134 memory: 14775 loss: 0.1504 2024/08/18 06:45:34 - mmengine - INFO - Iter(train) [2220/9216] lr: 1.7758e-05 eta: 8:07:55 time: 3.4975 data_time: 0.0125 memory: 14469 loss: 0.1108 2024/08/18 06:46:07 - mmengine - INFO - Iter(train) [2230/9216] lr: 1.7736e-05 eta: 8:06:43 time: 3.2219 data_time: 0.0122 memory: 13964 loss: 0.1377 2024/08/18 06:46:37 - mmengine - INFO - Iter(train) [2240/9216] lr: 
1.7714e-05 eta: 8:05:26 time: 3.0618 data_time: 0.0118 memory: 13462 loss: 0.1051 2024/08/18 06:47:07 - mmengine - INFO - Iter(train) [2250/9216] lr: 1.7691e-05 eta: 8:04:07 time: 2.9584 data_time: 0.0123 memory: 13320 loss: 0.0978 2024/08/18 06:47:35 - mmengine - INFO - Iter(train) [2260/9216] lr: 1.7669e-05 eta: 8:02:44 time: 2.8349 data_time: 0.0119 memory: 13076 loss: 0.1609 2024/08/18 06:48:00 - mmengine - INFO - Iter(train) [2270/9216] lr: 1.7646e-05 eta: 8:01:11 time: 2.4903 data_time: 0.0114 memory: 12607 loss: 0.1645 2024/08/18 06:48:19 - mmengine - INFO - Iter(train) [2280/9216] lr: 1.7623e-05 eta: 7:59:21 time: 1.8902 data_time: 0.0105 memory: 11887 loss: 0.1039 2024/08/18 06:49:21 - mmengine - INFO - Iter(train) [2290/9216] lr: 1.7601e-05 eta: 7:59:41 time: 6.2100 data_time: 0.0136 memory: 22898 loss: 0.2099 2024/08/18 06:50:15 - mmengine - INFO - Iter(train) [2300/9216] lr: 1.7578e-05 eta: 7:59:38 time: 5.4334 data_time: 0.0146 memory: 17567 loss: 0.0797 2024/08/18 06:50:36 - mmengine - INFO - Exp name: internvl_v2_internlm2_2b_qlora_finetune_copy_20240818_040903 2024/08/18 06:50:36 - mmengine - WARNING - Reach the end of the dataloader, it will be restarted and continue to iterate. It is recommended to use `mmengine.dataset.InfiniteSampler` to enable the dataloader to iterate infinitely. 
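Note on the WARNING above: it recommends `mmengine.dataset.InfiniteSampler` so the dataloader is not restarted each time the dataset is exhausted. A minimal sketch of that change follows, assuming the run's config defines a `train_dataloader` dict in the usual mmengine style. The config excerpt in this log does not show the dataloader section, so the surrounding keys are assumptions; only `batch_size = 2` and `dataloader_num_workers = 4` are taken from the logged config.

```python
# Hypothetical config fragment: only the sampler type comes from the warning;
# the layout of train_dataloader is assumed, not read from this run's config.
batch_size = 2                # value shown in the logged config
dataloader_num_workers = 4    # value shown in the logged config

train_dataloader = dict(
    batch_size=batch_size,
    num_workers=dataloader_num_workers,
    # dataset=...,  # keep the dataset entry from the existing config here
    # The runner passes the dataset to the sampler when it builds the loader.
    sampler=dict(type='mmengine.dataset.InfiniteSampler', shuffle=True),
)
```

Swapping the sampler changes the order in which samples are batched. If the current config uses a different sampler on purpose (for example one that groups samples by length), per-iteration time and memory would likely follow a different pattern than the one logged here.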
2024/08/18 06:51:19 - mmengine - INFO - Iter(train) [2310/9216] lr: 1.7555e-05 eta: 8:00:02 time: 6.3485 data_time: 0.2749 memory: 22610 loss: 0.1183 2024/08/18 06:52:14 - mmengine - INFO - Iter(train) [2320/9216] lr: 1.7532e-05 eta: 8:00:01 time: 5.5457 data_time: 0.0145 memory: 18004 loss: 0.0705 2024/08/18 06:53:06 - mmengine - INFO - Iter(train) [2330/9216] lr: 1.7509e-05 eta: 7:59:49 time: 5.1717 data_time: 0.0145 memory: 16735 loss: 0.0639 2024/08/18 06:53:55 - mmengine - INFO - Iter(train) [2340/9216] lr: 1.7485e-05 eta: 7:59:29 time: 4.9212 data_time: 0.0150 memory: 16304 loss: 0.0757 2024/08/18 06:54:45 - mmengine - INFO - Iter(train) [2350/9216] lr: 1.7462e-05 eta: 7:59:09 time: 4.9349 data_time: 0.0153 memory: 16335 loss: 0.0772 2024/08/18 06:55:34 - mmengine - INFO - Iter(train) [2360/9216] lr: 1.7438e-05 eta: 7:58:49 time: 4.9349 data_time: 0.0145 memory: 16069 loss: 0.0823 2024/08/18 06:56:24 - mmengine - INFO - Iter(train) [2370/9216] lr: 1.7415e-05 eta: 7:58:29 time: 4.9633 data_time: 0.0150 memory: 16691 loss: 0.0768 2024/08/18 06:57:13 - mmengine - INFO - Iter(train) [2380/9216] lr: 1.7391e-05 eta: 7:58:08 time: 4.9199 data_time: 0.0150 memory: 16043 loss: 0.0501 2024/08/18 06:58:01 - mmengine - INFO - Iter(train) [2390/9216] lr: 1.7368e-05 eta: 7:57:45 time: 4.8469 data_time: 0.0143 memory: 16169 loss: 0.0681 2024/08/18 06:58:50 - mmengine - INFO - Iter(train) [2400/9216] lr: 1.7344e-05 eta: 7:57:22 time: 4.8915 data_time: 0.0144 memory: 16544 loss: 0.1360 2024/08/18 06:59:38 - mmengine - INFO - Iter(train) [2410/9216] lr: 1.7320e-05 eta: 7:56:58 time: 4.8256 data_time: 0.0144 memory: 15926 loss: 0.0809 2024/08/18 07:00:29 - mmengine - INFO - Iter(train) [2420/9216] lr: 1.7296e-05 eta: 7:56:40 time: 5.0673 data_time: 0.0143 memory: 16005 loss: 0.1640 2024/08/18 07:01:17 - mmengine - INFO - Iter(train) [2430/9216] lr: 1.7272e-05 eta: 7:56:14 time: 4.7921 data_time: 0.0146 memory: 16154 loss: 0.1274 2024/08/18 07:02:04 - mmengine - INFO - 
Iter(train) [2440/9216] lr: 1.7248e-05 eta: 7:55:46 time: 4.7139 data_time: 0.0148 memory: 16043 loss: 0.1941 2024/08/18 07:02:51 - mmengine - INFO - Iter(train) [2450/9216] lr: 1.7223e-05 eta: 7:55:17 time: 4.6912 data_time: 0.0145 memory: 16173 loss: 0.0707 2024/08/18 07:03:38 - mmengine - INFO - Iter(train) [2460/9216] lr: 1.7199e-05 eta: 7:54:47 time: 4.6463 data_time: 0.0148 memory: 15870 loss: 0.1153 2024/08/18 07:04:25 - mmengine - INFO - Iter(train) [2470/9216] lr: 1.7175e-05 eta: 7:54:18 time: 4.6947 data_time: 0.0157 memory: 15807 loss: 0.0960 2024/08/18 07:05:14 - mmengine - INFO - Iter(train) [2480/9216] lr: 1.7150e-05 eta: 7:53:54 time: 4.9012 data_time: 0.0161 memory: 15777 loss: 0.1051 2024/08/18 07:06:04 - mmengine - INFO - Iter(train) [2490/9216] lr: 1.7126e-05 eta: 7:53:34 time: 5.0150 data_time: 0.0145 memory: 15713 loss: 0.0993 2024/08/18 07:06:54 - mmengine - INFO - Iter(train) [2500/9216] lr: 1.7101e-05 eta: 7:53:12 time: 4.9902 data_time: 0.0145 memory: 15705 loss: 0.1174 2024/08/18 07:07:41 - mmengine - INFO - Iter(train) [2510/9216] lr: 1.7076e-05 eta: 7:52:44 time: 4.7801 data_time: 0.0143 memory: 15755 loss: 0.1141 2024/08/18 07:08:28 - mmengine - INFO - Iter(train) [2520/9216] lr: 1.7051e-05 eta: 7:52:14 time: 4.6690 data_time: 0.0150 memory: 15569 loss: 0.1060 2024/08/18 07:09:15 - mmengine - INFO - Iter(train) [2530/9216] lr: 1.7026e-05 eta: 7:51:43 time: 4.6706 data_time: 0.0147 memory: 15793 loss: 0.1096 2024/08/18 07:10:02 - mmengine - INFO - Iter(train) [2540/9216] lr: 1.7001e-05 eta: 7:51:13 time: 4.7206 data_time: 0.0148 memory: 15767 loss: 0.0961 2024/08/18 07:10:50 - mmengine - INFO - Iter(train) [2550/9216] lr: 1.6976e-05 eta: 7:50:45 time: 4.7793 data_time: 0.0150 memory: 15590 loss: 0.1707 2024/08/18 07:11:37 - mmengine - INFO - Iter(train) [2560/9216] lr: 1.6951e-05 eta: 7:50:16 time: 4.7493 data_time: 0.0146 memory: 15710 loss: 0.1506 2024/08/18 07:12:24 - mmengine - INFO - Iter(train) [2570/9216] lr: 1.6925e-05 eta: 
7:49:45 time: 4.6679 data_time: 0.0150 memory: 15555 loss: 0.0860 2024/08/18 07:13:10 - mmengine - INFO - Iter(train) [2580/9216] lr: 1.6900e-05 eta: 7:49:11 time: 4.5735 data_time: 0.0143 memory: 15457 loss: 0.0998 2024/08/18 07:13:55 - mmengine - INFO - Iter(train) [2590/9216] lr: 1.6875e-05 eta: 7:48:35 time: 4.4986 data_time: 0.0143 memory: 15344 loss: 0.1193 2024/08/18 07:14:40 - mmengine - INFO - Iter(train) [2600/9216] lr: 1.6849e-05 eta: 7:48:01 time: 4.5512 data_time: 0.0145 memory: 15410 loss: 0.1436 2024/08/18 07:15:26 - mmengine - INFO - Iter(train) [2610/9216] lr: 1.6823e-05 eta: 7:47:26 time: 4.5518 data_time: 0.0148 memory: 15253 loss: 0.1507 2024/08/18 07:16:11 - mmengine - INFO - Iter(train) [2620/9216] lr: 1.6798e-05 eta: 7:46:51 time: 4.5362 data_time: 0.0150 memory: 15334 loss: 0.1607 2024/08/18 07:16:55 - mmengine - INFO - Iter(train) [2630/9216] lr: 1.6772e-05 eta: 7:46:13 time: 4.4386 data_time: 0.0143 memory: 15306 loss: 0.1821 2024/08/18 07:17:39 - mmengine - INFO - Iter(train) [2640/9216] lr: 1.6746e-05 eta: 7:45:34 time: 4.3715 data_time: 0.0143 memory: 15068 loss: 0.1565 2024/08/18 07:18:23 - mmengine - INFO - Iter(train) [2650/9216] lr: 1.6720e-05 eta: 7:44:55 time: 4.4120 data_time: 0.0144 memory: 15168 loss: 0.1373 2024/08/18 07:19:10 - mmengine - INFO - Iter(train) [2660/9216] lr: 1.6694e-05 eta: 7:44:23 time: 4.6497 data_time: 0.0140 memory: 14901 loss: 0.1276 2024/08/18 07:19:57 - mmengine - INFO - Iter(train) [2670/9216] lr: 1.6668e-05 eta: 7:43:53 time: 4.7600 data_time: 0.0141 memory: 15131 loss: 0.1802 2024/08/18 07:20:45 - mmengine - INFO - Iter(train) [2680/9216] lr: 1.6642e-05 eta: 7:43:22 time: 4.7502 data_time: 0.0136 memory: 14830 loss: 0.1312 2024/08/18 07:21:31 - mmengine - INFO - Iter(train) [2690/9216] lr: 1.6615e-05 eta: 7:42:50 time: 4.6531 data_time: 0.0130 memory: 14657 loss: 0.1058 2024/08/18 07:22:15 - mmengine - INFO - Iter(train) [2700/9216] lr: 1.6589e-05 eta: 7:42:09 time: 4.3492 data_time: 0.0126 memory: 
14167 loss: 0.1274
2024/08/18 07:22:54 - mmengine - INFO - Iter(train) [2710/9216] lr: 1.6562e-05 eta: 7:41:19 time: 3.9437 data_time: 0.0127 memory: 13678 loss: 0.1119
2024/08/18 07:23:32 - mmengine - INFO - Iter(train) [2720/9216] lr: 1.6536e-05 eta: 7:40:26 time: 3.7914 data_time: 0.0124 memory: 13467 loss: 0.0832
2024/08/18 07:24:10 - mmengine - INFO - Iter(train) [2730/9216] lr: 1.6509e-05 eta: 7:39:32 time: 3.7884 data_time: 0.0122 memory: 13369 loss: 0.0913
2024/08/18 07:24:47 - mmengine - INFO - Iter(train) [2740/9216] lr: 1.6482e-05 eta: 7:38:36 time: 3.6760 data_time: 0.0121 memory: 13431 loss: 0.0848
2024/08/18 07:25:23 - mmengine - INFO - Iter(train) [2750/9216] lr: 1.6456e-05 eta: 7:37:37 time: 3.5666 data_time: 0.0120 memory: 13269 loss: 0.1308
2024/08/18 07:25:58 - mmengine - INFO - Iter(train) [2760/9216] lr: 1.6429e-05 eta: 7:36:37 time: 3.4908 data_time: 0.0119 memory: 13196 loss: 0.1644
2024/08/18 07:26:33 - mmengine - INFO - Iter(train) [2770/9216] lr: 1.6402e-05 eta: 7:35:38 time: 3.5240 data_time: 0.0119 memory: 13164 loss: 0.1748
2024/08/18 07:27:07 - mmengine - INFO - Iter(train) [2780/9216] lr: 1.6375e-05 eta: 7:34:37 time: 3.4217 data_time: 0.0118 memory: 12919 loss: 0.1685
2024/08/18 07:27:36 - mmengine - INFO - Iter(train) [2790/9216] lr: 1.6348e-05 eta: 7:33:23 time: 2.8584 data_time: 0.0108 memory: 12420 loss: 0.1390
2024/08/18 07:28:01 - mmengine - INFO - Iter(train) [2800/9216] lr: 1.6320e-05 eta: 7:32:01 time: 2.5062 data_time: 0.0105 memory: 11733 loss: 0.1155
2024/08/18 07:28:21 - mmengine - INFO - Iter(train) [2810/9216] lr: 1.6293e-05 eta: 7:30:29 time: 2.0469 data_time: 0.0099 memory: 11368 loss: 0.0918
2024/08/18 07:29:02 - mmengine - INFO - Iter(train) [2820/9216] lr: 1.6266e-05 eta: 7:29:43 time: 4.0579 data_time: 0.0099 memory: 21992 loss: 0.1309
2024/08/18 07:30:11 - mmengine - INFO - Iter(train) [2830/9216] lr: 1.6238e-05 eta: 7:30:03 time: 6.9825 data_time: 0.0148 memory: 18378 loss: 0.1019
2024/08/18 07:31:13 - mmengine - INFO - Iter(train) [2840/9216] lr: 1.6211e-05 eta: 7:30:03 time: 6.1205 data_time: 0.0147 memory: 17156 loss: 0.0804
2024/08/18 07:32:11 - mmengine - INFO - Iter(train) [2850/9216] lr: 1.6183e-05 eta: 7:29:57 time: 5.8561 data_time: 0.0146 memory: 16535 loss: 0.0974
2024/08/18 07:33:09 - mmengine - INFO - Iter(train) [2860/9216] lr: 1.6156e-05 eta: 7:29:49 time: 5.7822 data_time: 0.0145 memory: 16491 loss: 0.0609
2024/08/18 07:34:06 - mmengine - INFO - Iter(train) [2870/9216] lr: 1.6128e-05 eta: 7:29:38 time: 5.6728 data_time: 0.0147 memory: 16252 loss: 0.0818
2024/08/18 07:35:04 - mmengine - INFO - Iter(train) [2880/9216] lr: 1.6100e-05 eta: 7:29:30 time: 5.8305 data_time: 0.0149 memory: 16245 loss: 0.0844
2024/08/18 07:36:02 - mmengine - INFO - Iter(train) [2890/9216] lr: 1.6072e-05 eta: 7:29:20 time: 5.7485 data_time: 0.0150 memory: 16085 loss: 0.0939
2024/08/18 07:37:00 - mmengine - INFO - Iter(train) [2900/9216] lr: 1.6044e-05 eta: 7:29:12 time: 5.8211 data_time: 0.0145 memory: 16178 loss: 0.0854
2024/08/18 07:37:59 - mmengine - INFO - Iter(train) [2910/9216] lr: 1.6016e-05 eta: 7:29:05 time: 5.9348 data_time: 0.0145 memory: 16535 loss: 0.0697
2024/08/18 07:38:57 - mmengine - INFO - Iter(train) [2920/9216] lr: 1.5988e-05 eta: 7:28:54 time: 5.7350 data_time: 0.0146 memory: 16161 loss: 0.0862
2024/08/18 07:39:54 - mmengine - INFO - Iter(train) [2930/9216] lr: 1.5960e-05 eta: 7:28:42 time: 5.7256 data_time: 0.0145 memory: 16024 loss: 0.0725
2024/08/18 07:40:52 - mmengine - INFO - Iter(train) [2940/9216] lr: 1.5932e-05 eta: 7:28:32 time: 5.7910 data_time: 0.0146 memory: 16026 loss: 0.0799
2024/08/18 07:41:51 - mmengine - INFO - Iter(train) [2950/9216] lr: 1.5903e-05 eta: 7:28:23 time: 5.9246 data_time: 0.0144 memory: 16071 loss: 0.1116
2024/08/18 07:42:50 - mmengine - INFO - Iter(train) [2960/9216] lr: 1.5875e-05 eta: 7:28:15 time: 5.9117 data_time: 0.0143 memory: 15822 loss: 0.0910
2024/08/18 07:43:48 - mmengine - INFO - Iter(train) [2970/9216] lr: 1.5847e-05 eta: 7:28:03 time: 5.7922 data_time: 0.0144 memory: 15819 loss: 0.0835
2024/08/18 07:44:43 - mmengine - INFO - Iter(train) [2980/9216] lr: 1.5818e-05 eta: 7:27:46 time: 5.5527 data_time: 0.0145 memory: 15924 loss: 0.0698
2024/08/18 07:45:39 - mmengine - INFO - Iter(train) [2990/9216] lr: 1.5789e-05 eta: 7:27:29 time: 5.5529 data_time: 0.0144 memory: 15781 loss: 0.0917
2024/08/18 07:46:35 - mmengine - INFO - Exp name: internvl_v2_internlm2_2b_qlora_finetune_copy_20240818_040903
2024/08/18 07:46:35 - mmengine - INFO - Iter(train) [3000/9216] lr: 1.5761e-05 eta: 7:27:13 time: 5.5994 data_time: 0.0144 memory: 15724 loss: 0.0940
2024/08/18 07:46:35 - mmengine - INFO - Saving checkpoint at 3000 iterations
2024/08/18 07:47:33 - mmengine - INFO - Iter(train) [3010/9216] lr: 1.5732e-05 eta: 7:27:00 time: 5.7951 data_time: 0.1970 memory: 15727 loss: 0.0686
2024/08/18 07:48:29 - mmengine - INFO - Iter(train) [3020/9216] lr: 1.5703e-05 eta: 7:26:42 time: 5.5728 data_time: 0.0144 memory: 15705 loss: 0.1536
2024/08/18 07:49:25 - mmengine - INFO - Iter(train) [3030/9216] lr: 1.5674e-05 eta: 7:26:26 time: 5.6545 data_time: 0.0145 memory: 15671 loss: 0.0705
2024/08/18 07:50:20 - mmengine - INFO - Iter(train) [3040/9216] lr: 1.5645e-05 eta: 7:26:07 time: 5.5238 data_time: 0.0145 memory: 15943 loss: 0.1123
2024/08/18 07:51:15 - mmengine - INFO - Iter(train) [3050/9216] lr: 1.5616e-05 eta: 7:25:47 time: 5.4894 data_time: 0.0143 memory: 15624 loss: 0.1287
2024/08/18 07:52:10 - mmengine - INFO - Iter(train) [3060/9216] lr: 1.5587e-05 eta: 7:25:26 time: 5.4527 data_time: 0.0144 memory: 15689 loss: 0.1155
2024/08/18 07:53:03 - mmengine - INFO - Iter(train) [3070/9216] lr: 1.5558e-05 eta: 7:25:02 time: 5.3050 data_time: 0.0145 memory: 15578 loss: 0.1206
2024/08/18 07:53:55 - mmengine - INFO - Iter(train) [3080/9216] lr: 1.5529e-05 eta: 7:24:35 time: 5.1848 data_time: 0.0144 memory: 15533 loss: 0.1038
2024/08/18 07:54:46 - mmengine - INFO - Iter(train) [3090/9216] lr: 1.5499e-05 eta: 7:24:08 time: 5.1443 data_time: 0.0146 memory: 15398 loss: 0.1029
2024/08/18 07:55:40 - mmengine - INFO - Iter(train) [3100/9216] lr: 1.5470e-05 eta: 7:23:44 time: 5.3786 data_time: 0.0144 memory: 15362 loss: 0.1360
2024/08/18 07:56:34 - mmengine - INFO - Iter(train) [3110/9216] lr: 1.5440e-05 eta: 7:23:20 time: 5.3495 data_time: 0.0144 memory: 15474 loss: 0.1548
2024/08/18 07:57:26 - mmengine - INFO - Iter(train) [3120/9216] lr: 1.5411e-05 eta: 7:22:54 time: 5.2073 data_time: 0.0146 memory: 15453 loss: 0.1058
2024/08/18 07:58:18 - mmengine - INFO - Iter(train) [3130/9216] lr: 1.5381e-05 eta: 7:22:26 time: 5.1922 data_time: 0.0144 memory: 15346 loss: 0.1081
2024/08/18 07:59:08 - mmengine - INFO - Iter(train) [3140/9216] lr: 1.5352e-05 eta: 7:21:56 time: 5.0534 data_time: 0.0149 memory: 15244 loss: 0.1865
2024/08/18 07:59:59 - mmengine - INFO - Iter(train) [3150/9216] lr: 1.5322e-05 eta: 7:21:27 time: 5.1131 data_time: 0.0143 memory: 15410 loss: 0.1346
2024/08/18 08:00:51 - mmengine - INFO - Iter(train) [3160/9216] lr: 1.5292e-05 eta: 7:20:59 time: 5.2140 data_time: 0.0151 memory: 15294 loss: 0.2531
2024/08/18 08:01:41 - mmengine - INFO - Iter(train) [3170/9216] lr: 1.5262e-05 eta: 7:20:28 time: 5.0011 data_time: 0.0141 memory: 14934 loss: 0.1826
2024/08/18 08:02:33 - mmengine - INFO - Iter(train) [3180/9216] lr: 1.5232e-05 eta: 7:19:58 time: 5.1368 data_time: 0.0140 memory: 14965 loss: 0.1372
2024/08/18 08:03:21 - mmengine - INFO - Iter(train) [3190/9216] lr: 1.5202e-05 eta: 7:19:24 time: 4.8626 data_time: 0.0133 memory: 14859 loss: 0.0999
2024/08/18 08:04:08 - mmengine - INFO - Iter(train) [3200/9216] lr: 1.5172e-05 eta: 7:18:46 time: 4.6583 data_time: 0.0127 memory: 14541 loss: 0.0897
2024/08/18 08:04:52 - mmengine - INFO - Iter(train) [3210/9216] lr: 1.5142e-05 eta: 7:18:02 time: 4.3751 data_time: 0.0127 memory: 14491 loss: 0.1285
2024/08/18 08:05:32 - mmengine - INFO - Iter(train) [3220/9216] lr: 1.5112e-05 eta: 7:17:12 time: 4.0692 data_time: 0.0128 memory: 13972 loss: 0.0789
2024/08/18 08:06:11 - mmengine - INFO - Iter(train) [3230/9216] lr: 1.5082e-05 eta: 7:16:19 time: 3.8650 data_time: 0.0121 memory: 13850 loss: 0.1413
2024/08/18 08:06:51 - mmengine - INFO - Iter(train) [3240/9216] lr: 1.5052e-05 eta: 7:15:29 time: 4.0263 data_time: 0.0125 memory: 13523 loss: 0.0866
2024/08/18 08:07:31 - mmengine - INFO - Iter(train) [3250/9216] lr: 1.5021e-05 eta: 7:14:37 time: 3.9367 data_time: 0.0119 memory: 13452 loss: 0.0607
2024/08/18 08:08:10 - mmengine - INFO - Iter(train) [3260/9216] lr: 1.4991e-05 eta: 7:13:45 time: 3.8992 data_time: 0.0121 memory: 13452 loss: 0.1051
2024/08/18 08:08:48 - mmengine - INFO - Iter(train) [3270/9216] lr: 1.4960e-05 eta: 7:12:51 time: 3.8059 data_time: 0.0121 memory: 13235 loss: 0.0919
2024/08/18 08:09:23 - mmengine - INFO - Iter(train) [3280/9216] lr: 1.4930e-05 eta: 7:11:52 time: 3.5269 data_time: 0.0118 memory: 13159 loss: 0.1239
2024/08/18 08:09:56 - mmengine - INFO - Iter(train) [3290/9216] lr: 1.4899e-05 eta: 7:10:49 time: 3.2712 data_time: 0.0117 memory: 13019 loss: 0.1552
2024/08/18 08:10:25 - mmengine - INFO - Iter(train) [3300/9216] lr: 1.4869e-05 eta: 7:09:39 time: 2.9228 data_time: 0.0116 memory: 12775 loss: 0.1719
2024/08/18 08:10:50 - mmengine - INFO - Iter(train) [3310/9216] lr: 1.4838e-05 eta: 7:08:23 time: 2.5363 data_time: 0.0103 memory: 12185 loss: 0.0714
2024/08/18 08:11:13 - mmengine - INFO - Iter(train) [3320/9216] lr: 1.4807e-05 eta: 7:07:03 time: 2.2867 data_time: 0.0103 memory: 11785 loss: 0.1532
2024/08/18 08:11:41 - mmengine - INFO - Iter(train) [3330/9216] lr: 1.4776e-05 eta: 7:05:52 time: 2.7663 data_time: 0.0099 memory: 20866 loss: 0.0932
2024/08/18 08:12:42 - mmengine - INFO - Iter(train) [3340/9216] lr: 1.4745e-05 eta: 7:05:39 time: 6.1144 data_time: 0.0147 memory: 19373 loss: 0.0983
2024/08/18 08:13:35 - mmengine - INFO - Iter(train) [3350/9216] lr: 1.4714e-05 eta: 7:05:13 time: 5.3485 data_time: 0.0148 memory: 16566 loss: 0.0710
2024/08/18 08:14:27 - mmengine - INFO - Iter(train) [3360/9216] lr: 1.4683e-05 eta: 7:04:45 time: 5.1910 data_time: 0.0143 memory: 16261 loss: 0.1230
2024/08/18 08:15:19 - mmengine - INFO - Iter(train) [3370/9216] lr: 1.4652e-05 eta: 7:04:15 time: 5.1455 data_time: 0.0144 memory: 16216 loss: 0.0933
2024/08/18 08:16:09 - mmengine - INFO - Iter(train) [3380/9216] lr: 1.4621e-05 eta: 7:03:43 time: 5.0395 data_time: 0.0147 memory: 16183 loss: 0.0563
2024/08/18 08:16:58 - mmengine - INFO - Iter(train) [3390/9216] lr: 1.4590e-05 eta: 7:03:09 time: 4.8881 data_time: 0.0144 memory: 16043 loss: 0.0830
2024/08/18 08:17:47 - mmengine - INFO - Iter(train) [3400/9216] lr: 1.4559e-05 eta: 7:02:34 time: 4.8706 data_time: 0.0149 memory: 16059 loss: 0.0524
2024/08/18 08:18:36 - mmengine - INFO - Iter(train) [3410/9216] lr: 1.4527e-05 eta: 7:01:59 time: 4.8761 data_time: 0.0143 memory: 16283 loss: 0.1103
2024/08/18 08:19:24 - mmengine - INFO - Iter(train) [3420/9216] lr: 1.4496e-05 eta: 7:01:23 time: 4.8184 data_time: 0.0145 memory: 16043 loss: 0.0898
2024/08/18 08:20:12 - mmengine - INFO - Iter(train) [3430/9216] lr: 1.4465e-05 eta: 7:00:47 time: 4.8223 data_time: 0.0163 memory: 16012 loss: 0.1457
2024/08/18 08:21:00 - mmengine - INFO - Iter(train) [3440/9216] lr: 1.4433e-05 eta: 7:00:11 time: 4.8203 data_time: 0.0145 memory: 15919 loss: 0.0668
2024/08/18 08:21:48 - mmengine - INFO - Iter(train) [3450/9216] lr: 1.4402e-05 eta: 6:59:35 time: 4.7741 data_time: 0.0148 memory: 15967 loss: 0.0599
2024/08/18 08:22:36 - mmengine - INFO - Iter(train) [3460/9216] lr: 1.4370e-05 eta: 6:58:58 time: 4.8100 data_time: 0.0143 memory: 16112 loss: 0.0903
2024/08/18 08:23:24 - mmengine - INFO - Iter(train) [3470/9216] lr: 1.4338e-05 eta: 6:58:21 time: 4.7662 data_time: 0.0146 memory: 15910 loss: 0.0728
2024/08/18 08:24:12 - mmengine - INFO - Iter(train) [3480/9216] lr: 1.4307e-05 eta: 6:57:45 time: 4.7908 data_time: 0.0143 memory: 15933 loss: 0.0804
2024/08/18 08:24:59 - mmengine - INFO - Iter(train) [3490/9216] lr: 1.4275e-05 eta: 6:57:07 time: 4.7310 data_time: 0.0143 memory: 15708 loss: 0.0765
2024/08/18 08:25:46 - mmengine - INFO - Iter(train) [3500/9216] lr: 1.4243e-05 eta: 6:56:29 time: 4.7198 data_time: 0.0145 memory: 15772 loss: 0.0818
2024/08/18 08:26:33 - mmengine - INFO - Iter(train) [3510/9216] lr: 1.4211e-05 eta: 6:55:50 time: 4.6830 data_time: 0.0143 memory: 15696 loss: 0.0743
2024/08/18 08:27:19 - mmengine - INFO - Iter(train) [3520/9216] lr: 1.4179e-05 eta: 6:55:11 time: 4.6600 data_time: 0.0144 memory: 15755 loss: 0.0901
2024/08/18 08:28:06 - mmengine - INFO - Iter(train) [3530/9216] lr: 1.4147e-05 eta: 6:54:31 time: 4.6313 data_time: 0.0143 memory: 15689 loss: 0.0739
2024/08/18 08:28:52 - mmengine - INFO - Iter(train) [3540/9216] lr: 1.4115e-05 eta: 6:53:52 time: 4.6362 data_time: 0.0149 memory: 15686 loss: 0.0882
2024/08/18 08:29:38 - mmengine - INFO - Iter(train) [3550/9216] lr: 1.4083e-05 eta: 6:53:11 time: 4.5626 data_time: 0.0148 memory: 15555 loss: 0.1103
2024/08/18 08:30:26 - mmengine - INFO - Iter(train) [3560/9216] lr: 1.4051e-05 eta: 6:52:35 time: 4.8421 data_time: 0.0143 memory: 15614 loss: 0.0933
2024/08/18 08:31:12 - mmengine - INFO - Iter(train) [3570/9216] lr: 1.4019e-05 eta: 6:51:55 time: 4.6160 data_time: 0.0148 memory: 15564 loss: 0.1024
2024/08/18 08:31:58 - mmengine - INFO - Iter(train) [3580/9216] lr: 1.3987e-05 eta: 6:51:13 time: 4.5303 data_time: 0.0144 memory: 15512 loss: 0.0791
2024/08/18 08:32:43 - mmengine - INFO - Iter(train) [3590/9216] lr: 1.3955e-05 eta: 6:50:32 time: 4.5409 data_time: 0.0143 memory: 15607 loss: 0.1081
2024/08/18 08:33:28 - mmengine - INFO - Iter(train) [3600/9216] lr: 1.3922e-05 eta: 6:49:50 time: 4.4681 data_time: 0.0143 memory: 15450 loss: 0.0947
2024/08/18 08:34:13 - mmengine - INFO - Iter(train) [3610/9216] lr: 1.3890e-05 eta: 6:49:08 time: 4.5252 data_time: 0.0143 memory: 15465 loss: 0.1560
2024/08/18 08:34:57 - mmengine - INFO - Iter(train) [3620/9216] lr: 1.3858e-05 eta: 6:48:26 time: 4.4486 data_time: 0.0143 memory: 15336 loss: 0.0960
2024/08/18 08:35:42 - mmengine - INFO - Iter(train) [3630/9216] lr: 1.3825e-05 eta: 6:47:43 time: 4.4383 data_time: 0.0142 memory: 15282 loss: 0.1420
2024/08/18 08:36:26 - mmengine - INFO - Iter(train) [3640/9216] lr: 1.3793e-05 eta: 6:47:00 time: 4.4352 data_time: 0.0161 memory: 15256 loss: 0.2000
2024/08/18 08:37:09 - mmengine - INFO - Iter(train) [3650/9216] lr: 1.3760e-05 eta: 6:46:15 time: 4.3220 data_time: 0.0143 memory: 15161 loss: 0.1168
2024/08/18 08:37:52 - mmengine - INFO - Iter(train) [3660/9216] lr: 1.3728e-05 eta: 6:45:29 time: 4.2068 data_time: 0.0137 memory: 15066 loss: 0.1000
2024/08/18 08:38:33 - mmengine - INFO - Iter(train) [3670/9216] lr: 1.3695e-05 eta: 6:44:42 time: 4.1970 data_time: 0.0141 memory: 15108 loss: 0.1866
2024/08/18 08:39:14 - mmengine - INFO - Iter(train) [3680/9216] lr: 1.3662e-05 eta: 6:43:54 time: 4.0997 data_time: 0.0140 memory: 14827 loss: 0.0924
2024/08/18 08:39:55 - mmengine - INFO - Iter(train) [3690/9216] lr: 1.3630e-05 eta: 6:43:05 time: 4.0057 data_time: 0.0143 memory: 14732 loss: 0.1391
2024/08/18 08:40:32 - mmengine - INFO - Iter(train) [3700/9216] lr: 1.3597e-05 eta: 6:42:12 time: 3.7688 data_time: 0.0126 memory: 14460 loss: 0.1929
2024/08/18 08:41:07 - mmengine - INFO - Iter(train) [3710/9216] lr: 1.3564e-05 eta: 6:41:15 time: 3.5082 data_time: 0.0129 memory: 14322 loss: 0.0893
2024/08/18 08:41:41 - mmengine - INFO - Iter(train) [3720/9216] lr: 1.3531e-05 eta: 6:40:16 time: 3.3448 data_time: 0.0128 memory: 13786 loss: 0.1208
2024/08/18 08:42:14 - mmengine - INFO - Iter(train) [3730/9216] lr: 1.3498e-05 eta: 6:39:17 time: 3.3105 data_time: 0.0124 memory: 13848 loss: 0.1151
2024/08/18 08:42:46 - mmengine - INFO - Iter(train) [3740/9216] lr: 1.3465e-05 eta: 6:38:17 time: 3.2469 data_time: 0.0123 memory: 13519 loss: 0.0942
2024/08/18 08:43:19 - mmengine - INFO - Iter(train) [3750/9216] lr: 1.3432e-05 eta: 6:37:18 time: 3.2845 data_time: 0.0119 memory: 13476 loss: 0.0826
2024/08/18 08:43:52 - mmengine - INFO - Iter(train) [3760/9216] lr: 1.3399e-05 eta: 6:36:19 time: 3.3080 data_time: 0.0121 memory: 13500 loss: 0.0683
2024/08/18 08:44:25 - mmengine - INFO - Iter(train) [3770/9216] lr: 1.3366e-05 eta: 6:35:19 time: 3.2597 data_time: 0.0119 memory: 13420 loss: 0.0730
2024/08/18 08:44:57 - mmengine - INFO - Iter(train) [3780/9216] lr: 1.3333e-05 eta: 6:34:20 time: 3.2360 data_time: 0.0121 memory: 13287 loss: 0.1089
2024/08/18 08:45:30 - mmengine - INFO - Iter(train) [3790/9216] lr: 1.3300e-05 eta: 6:33:21 time: 3.2645 data_time: 0.0121 memory: 13175 loss: 0.1373
2024/08/18 08:46:01 - mmengine - INFO - Iter(train) [3800/9216] lr: 1.3267e-05 eta: 6:32:20 time: 3.1360 data_time: 0.0123 memory: 13024 loss: 0.1599
2024/08/18 08:46:31 - mmengine - INFO - Iter(train) [3810/9216] lr: 1.3234e-05 eta: 6:31:17 time: 2.9811 data_time: 0.0128 memory: 12687 loss: 0.1576
2024/08/18 08:46:57 - mmengine - INFO - Iter(train) [3820/9216] lr: 1.3200e-05 eta: 6:30:10 time: 2.6381 data_time: 0.0114 memory: 12217 loss: 0.0711
2024/08/18 08:47:22 - mmengine - INFO - Iter(train) [3830/9216] lr: 1.3167e-05 eta: 6:28:59 time: 2.4381 data_time: 0.0105 memory: 11762 loss: 0.1364
2024/08/18 08:47:35 - mmengine - INFO - Iter(train) [3840/9216] lr: 1.3134e-05 eta: 6:27:34 time: 1.3475 data_time: 0.0087 memory: 11155 loss: 0.1467
2024/08/18 08:48:41 - mmengine - INFO - Iter(train) [3850/9216] lr: 1.3100e-05 eta: 6:27:22 time: 6.5426 data_time: 0.0135 memory: 18617 loss: 0.1416
2024/08/18 08:49:41 - mmengine - INFO - Iter(train) [3860/9216] lr: 1.3067e-05 eta: 6:27:01 time: 5.9809 data_time: 0.0145 memory: 16793 loss: 0.1815
2024/08/18 08:50:38 - mmengine - INFO - Iter(train) [3870/9216] lr: 1.3033e-05 eta: 6:26:37 time: 5.7236 data_time: 0.0154 memory: 16290 loss: 0.0453
2024/08/18 08:51:33 - mmengine - INFO - Iter(train) [3880/9216] lr: 1.3000e-05 eta: 6:26:11 time: 5.5690 data_time: 0.0150 memory: 16161 loss: 0.0438
2024/08/18 08:52:31 - mmengine - INFO - Iter(train) [3890/9216] lr: 1.2966e-05 eta: 6:25:46 time: 5.7186 data_time: 0.0144 memory: 16214 loss: 0.0629
2024/08/18 08:53:29 - mmengine - INFO - Iter(train) [3900/9216] lr: 1.2933e-05 eta: 6:25:24 time: 5.8825 data_time: 0.0144 memory: 16031 loss: 0.0666
2024/08/18 08:54:27 - mmengine - INFO - Iter(train) [3910/9216] lr: 1.2899e-05 eta: 6:24:59 time: 5.7584 data_time: 0.0161 memory: 16195 loss: 0.0397
2024/08/18 08:55:24 - mmengine - INFO - Iter(train) [3920/9216] lr: 1.2865e-05 eta: 6:24:34 time: 5.7002 data_time: 0.0149 memory: 16363 loss: 0.0612
2024/08/18 08:56:19 - mmengine - INFO - Iter(train) [3930/9216] lr: 1.2832e-05 eta: 6:24:06 time: 5.4935 data_time: 0.0154 memory: 16147 loss: 0.0571
2024/08/18 08:57:13 - mmengine - INFO - Iter(train) [3940/9216] lr: 1.2798e-05 eta: 6:23:36 time: 5.3755 data_time: 0.0144 memory: 15997 loss: 0.0624
2024/08/18 08:58:07 - mmengine - INFO - Iter(train) [3950/9216] lr: 1.2764e-05 eta: 6:23:06 time: 5.3946 data_time: 0.0145 memory: 15995 loss: 0.0550
2024/08/18 08:58:59 - mmengine - INFO - Iter(train) [3960/9216] lr: 1.2730e-05 eta: 6:22:34 time: 5.2595 data_time: 0.0147 memory: 15986 loss: 0.0697
2024/08/18 08:59:51 - mmengine - INFO - Iter(train) [3970/9216] lr: 1.2697e-05 eta: 6:22:01 time: 5.1327 data_time: 0.0146 memory: 16540 loss: 0.1058
2024/08/18 09:00:42 - mmengine - INFO - Iter(train) [3980/9216] lr: 1.2663e-05 eta: 6:21:27 time: 5.1696 data_time: 0.0145 memory: 15969 loss: 0.0542
2024/08/18 09:01:32 - mmengine - INFO - Iter(train) [3990/9216] lr: 1.2629e-05 eta: 6:20:51 time: 4.9225 data_time: 0.0143 memory: 16740 loss: 0.0681
2024/08/18 09:02:20 - mmengine - INFO - Exp name: internvl_v2_internlm2_2b_qlora_finetune_copy_20240818_040903
2024/08/18 09:02:20 - mmengine - INFO - Iter(train) [4000/9216] lr: 1.2595e-05 eta: 6:20:13 time: 4.8028 data_time: 0.0151 memory: 15841 loss: 0.0856
2024/08/18 09:02:20 - mmengine - INFO - Saving checkpoint at 4000 iterations
2024/08/18 09:03:10 - mmengine - INFO - Iter(train) [4010/9216] lr: 1.2561e-05 eta: 6:19:38 time: 5.0749 data_time: 0.2281 memory: 15831 loss: 0.1018
2024/08/18 09:04:00 - mmengine - INFO - Iter(train) [4020/9216] lr: 1.2527e-05 eta: 6:19:02 time: 4.9748 data_time: 0.0145 memory: 16033 loss: 0.1532
2024/08/18 09:04:49 - mmengine - INFO - Iter(train) [4030/9216] lr: 1.2493e-05 eta: 6:18:25 time: 4.8743 data_time: 0.0143 memory: 15853 loss: 0.0777
2024/08/18 09:05:37 - mmengine - INFO - Iter(train) [4040/9216] lr: 1.2459e-05 eta: 6:17:47 time: 4.8690 data_time: 0.0154 memory: 15662 loss: 0.1081
2024/08/18 09:06:26 - mmengine - INFO - Iter(train) [4050/9216] lr: 1.2425e-05 eta: 6:17:10 time: 4.8714 data_time: 0.0142 memory: 15779 loss: 0.0752
2024/08/18 09:07:14 - mmengine - INFO - Iter(train) [4060/9216] lr: 1.2391e-05 eta: 6:16:31 time: 4.7920 data_time: 0.0143 memory: 15803 loss: 0.0689
2024/08/18 09:08:01 - mmengine - INFO - Iter(train) [4070/9216] lr: 1.2357e-05 eta: 6:15:51 time: 4.6787 data_time: 0.0144 memory: 15633 loss: 0.0810
2024/08/18 09:08:47 - mmengine - INFO - Iter(train) [4080/9216] lr: 1.2322e-05 eta: 6:15:10 time: 4.6291 data_time: 0.0151 memory: 15659 loss: 0.1036
2024/08/18 09:09:32 - mmengine - INFO - Iter(train) [4090/9216] lr: 1.2288e-05 eta: 6:14:28 time: 4.5177 data_time: 0.0145 memory: 15529 loss: 0.0821
2024/08/18 09:10:18 - mmengine - INFO - Iter(train) [4100/9216] lr: 1.2254e-05 eta: 6:13:46 time: 4.5492 data_time: 0.0144 memory: 15612 loss: 0.1313
2024/08/18 09:11:04 - mmengine - INFO - Iter(train) [4110/9216] lr: 1.2220e-05 eta: 6:13:05 time: 4.5652 data_time: 0.0143 memory: 15701 loss: 0.1121
2024/08/18 09:11:49 - mmengine - INFO - Iter(train) [4120/9216] lr: 1.2186e-05 eta: 6:12:23 time: 4.5770 data_time: 0.0143 memory: 15751 loss: 0.0966
2024/08/18 09:12:35 - mmengine - INFO - Iter(train) [4130/9216] lr: 1.2151e-05 eta: 6:11:42 time: 4.5421 data_time: 0.0146 memory: 15419 loss: 0.0881
2024/08/18 09:13:20 - mmengine - INFO - Iter(train) [4140/9216] lr: 1.2117e-05 eta: 6:11:00 time: 4.5461 data_time: 0.0143 memory: 15533 loss: 0.0945
2024/08/18 09:14:07 - mmengine - INFO - Iter(train) [4150/9216] lr: 1.2083e-05 eta: 6:10:19 time: 4.6569 data_time: 0.0143 memory: 15472 loss: 0.1127
2024/08/18 09:14:53 - mmengine - INFO - Iter(train) [4160/9216] lr: 1.2048e-05 eta: 6:09:38 time: 4.5993 data_time: 0.0151 memory: 15320 loss: 0.1097
2024/08/18 09:15:39 - mmengine - INFO - Iter(train) [4170/9216] lr: 1.2014e-05 eta: 6:08:56 time: 4.5938 data_time: 0.0143 memory: 15408 loss: 0.1520
2024/08/18 09:16:25 - mmengine - INFO - Iter(train) [4180/9216] lr: 1.1979e-05 eta: 6:08:16 time: 4.6496 data_time: 0.0143 memory: 15239 loss: 0.1292
2024/08/18 09:17:11 - mmengine - INFO - Iter(train) [4190/9216] lr: 1.1945e-05 eta: 6:07:34 time: 4.5633 data_time: 0.0149 memory: 15270 loss: 0.1659
2024/08/18 09:17:54 - mmengine - INFO - Iter(train) [4200/9216] lr: 1.1910e-05 eta: 6:06:49 time: 4.2730 data_time: 0.0140 memory: 15089 loss: 0.1393
2024/08/18 09:18:35 - mmengine - INFO - Iter(train) [4210/9216] lr: 1.1876e-05 eta: 6:06:02 time: 4.1268 data_time: 0.0139 memory: 14837 loss: 0.2619
2024/08/18 09:19:16 - mmengine - INFO - Iter(train) [4220/9216] lr: 1.1841e-05 eta: 6:05:14 time: 4.0842 data_time: 0.0134 memory: 14716 loss: 0.1552
2024/08/18 09:19:53 - mmengine - INFO - Iter(train) [4230/9216] lr: 1.1807e-05 eta: 6:04:23 time: 3.7834 data_time: 0.0128 memory: 14457 loss: 0.1105
2024/08/18 09:20:28 - mmengine - INFO - Iter(train) [4240/9216] lr: 1.1772e-05 eta: 6:03:29 time: 3.4526 data_time: 0.0123 memory: 14006 loss: 0.1239
2024/08/18 09:21:00 - mmengine - INFO - Iter(train) [4250/9216] lr: 1.1738e-05 eta: 6:02:31 time: 3.2323 data_time: 0.0119 memory: 13610 loss: 0.0910
2024/08/18 09:21:33 - mmengine - INFO - Iter(train) [4260/9216] lr: 1.1703e-05 eta: 6:01:34 time: 3.2484 data_time: 0.0121 memory: 13773 loss: 0.0917
2024/08/18 09:22:06 - mmengine - INFO - Iter(train) [4270/9216] lr: 1.1668e-05 eta: 6:00:38 time: 3.2985 data_time: 0.0120 memory: 13565 loss: 0.0944
2024/08/18 09:22:39 - mmengine - INFO - Iter(train) [4280/9216] lr: 1.1634e-05 eta: 5:59:42 time: 3.3315 data_time: 0.0124 memory: 13326 loss: 0.1303
2024/08/18 09:23:13 - mmengine - INFO - Iter(train) [4290/9216] lr: 1.1599e-05 eta: 5:58:47 time: 3.4080 data_time: 0.0120 memory: 13320 loss: 0.0933
2024/08/18 09:23:46 - mmengine - INFO - Iter(train) [4300/9216] lr: 1.1564e-05 eta: 5:57:52 time: 3.3003 data_time: 0.0117 memory: 13100 loss: 0.0966
2024/08/18 09:24:18 - mmengine - INFO - Iter(train) [4310/9216] lr: 1.1530e-05 eta: 5:56:55 time: 3.2137 data_time: 0.0129 memory: 13137 loss: 0.1181
2024/08/18 09:24:48 - mmengine - INFO - Iter(train) [4320/9216] lr: 1.1495e-05 eta: 5:55:55 time: 2.9248 data_time: 0.0117 memory: 12915 loss: 0.1411
2024/08/18 09:25:13 - mmengine - INFO - Iter(train) [4330/9216] lr: 1.1460e-05 eta: 5:54:51 time: 2.5740 data_time: 0.0107 memory: 12060 loss: 0.0886
2024/08/18 09:25:37 - mmengine - INFO - Iter(train) [4340/9216] lr: 1.1425e-05 eta: 5:53:45 time: 2.4079 data_time: 0.0107 memory: 11766 loss: 0.1393
2024/08/18 09:25:56 - mmengine - INFO - Iter(train) [4350/9216] lr: 1.1391e-05 eta: 5:52:35 time: 1.9042 data_time: 0.0095 memory: 11241 loss: 0.1172
2024/08/18 09:26:54 - mmengine - INFO - Iter(train) [4360/9216] lr: 1.1356e-05 eta: 5:52:07 time: 5.7727 data_time: 0.0125 memory: 21632 loss: 0.1381
2024/08/18 09:27:55 - mmengine - INFO - Iter(train) [4370/9216] lr: 1.1321e-05 eta: 5:51:42 time: 6.0644 data_time: 0.0144 memory: 16226 loss: 0.0541
2024/08/18 09:28:56 - mmengine - INFO - Iter(train) [4380/9216] lr: 1.1286e-05 eta: 5:51:18 time: 6.1046 data_time: 0.0144 memory: 16173 loss: 0.0605
2024/08/18 09:29:54 - mmengine - INFO - Iter(train) [4390/9216] lr: 1.1251e-05 eta: 5:50:51 time: 5.8481 data_time: 0.0141 memory: 16064 loss: 0.0703
2024/08/18 09:30:56 - mmengine - INFO - Iter(train) [4400/9216] lr: 1.1216e-05 eta: 5:50:27 time: 6.1441 data_time: 0.0144 memory: 16116 loss: 0.0577
2024/08/18 09:31:54 - mmengine - INFO - Iter(train) [4410/9216] lr: 1.1181e-05 eta: 5:49:59 time: 5.8125 data_time: 0.0144 memory: 15962 loss: 0.0508
2024/08/18 09:32:52 - mmengine - INFO - Iter(train) [4420/9216] lr: 1.1147e-05 eta: 5:49:31 time: 5.8011 data_time: 0.0144 memory: 15788 loss: 0.0700
2024/08/18 09:33:47 - mmengine - INFO - Iter(train) [4430/9216] lr: 1.1112e-05 eta: 5:48:59 time: 5.4985 data_time: 0.0144 memory: 15810 loss: 0.0755
2024/08/18 09:34:43 - mmengine - INFO - Iter(train) [4440/9216] lr: 1.1077e-05 eta: 5:48:29 time: 5.6519 data_time: 0.0145 memory: 15779 loss: 0.0798
2024/08/18 09:35:41 - mmengine - INFO - Iter(train) [4450/9216] lr: 1.1042e-05 eta: 5:48:00 time: 5.7222 data_time: 0.0143 memory: 15777 loss: 0.0543
2024/08/18 09:36:36 - mmengine - INFO - Iter(train) [4460/9216] lr: 1.1007e-05 eta: 5:47:28 time: 5.5392 data_time: 0.0150 memory: 15666 loss: 0.1712
2024/08/18 09:37:30 - mmengine - INFO - Iter(train) [4470/9216] lr: 1.0972e-05 eta: 5:46:55 time: 5.4150 data_time: 0.0146 memory: 15691 loss: 0.1082
2024/08/18 09:38:25 - mmengine - INFO - Iter(train) [4480/9216] lr: 1.0937e-05 eta: 5:46:23 time: 5.4355 data_time: 0.0147 memory: 15493 loss: 0.1367
2024/08/18 09:39:18 - mmengine - INFO - Iter(train) [4490/9216] lr: 1.0902e-05 eta: 5:45:49 time: 5.3188 data_time: 0.0144 memory: 15434 loss: 0.1202
2024/08/18 09:40:10 - mmengine - INFO - Iter(train) [4500/9216] lr: 1.0867e-05 eta: 5:45:14 time: 5.2652 data_time: 0.0143 memory: 15538 loss: 0.1185
2024/08/18 09:40:59 - mmengine - INFO - Iter(train) [4510/9216] lr: 1.0832e-05 eta: 5:44:35 time: 4.8808 data_time: 0.0138 memory: 15012 loss: 0.1221
2024/08/18 09:41:45 - mmengine - INFO - Iter(train) [4520/9216] lr: 1.0797e-05 eta: 5:43:53 time: 4.5484 data_time: 0.0135 memory: 14727 loss: 0.1058
2024/08/18 09:42:25 - mmengine - INFO - Iter(train) [4530/9216] lr: 1.0762e-05 eta: 5:43:05 time: 4.0001 data_time: 0.0129 memory: 14343 loss: 0.1316
2024/08/18 09:43:00 - mmengine - INFO - Iter(train) [4540/9216] lr: 1.0727e-05 eta: 5:42:12 time: 3.5755 data_time: 0.0120 memory: 13559 loss: 0.0932
2024/08/18 09:43:36 - mmengine - INFO - Iter(train) [4550/9216] lr: 1.0692e-05 eta: 5:41:19 time: 3.5148 data_time: 0.0120 memory: 13504 loss: 0.0666
2024/08/18 09:44:10 - mmengine - INFO - Iter(train) [4560/9216] lr: 1.0657e-05 eta: 5:40:26 time: 3.4914 data_time: 0.0118 memory: 13332 loss: 0.1112
2024/08/18 09:44:43 - mmengine - INFO - Iter(train) [4570/9216] lr: 1.0622e-05 eta: 5:39:31 time: 3.2145 data_time: 0.0116 memory: 12974 loss: 0.2499
2024/08/18 09:45:09 - mmengine - INFO - Iter(train) [4580/9216] lr: 1.0587e-05 eta: 5:38:29 time: 2.6467 data_time: 0.0104 memory: 11963 loss: 0.1309
2024/08/18 09:46:03 - mmengine - INFO - Iter(train) [4590/9216] lr: 1.0551e-05 eta: 5:37:55 time: 5.3509 data_time: 0.0113 memory: 22898 loss: 0.1125
2024/08/18 09:47:11 - mmengine - INFO - Iter(train) [4600/9216] lr: 1.0516e-05 eta: 5:37:36 time: 6.8273 data_time: 0.0143 memory: 18518 loss: 0.0654
2024/08/18 09:48:19 - mmengine - INFO - Iter(train) [4610/9216] lr: 1.0481e-05 eta: 5:37:17 time: 6.8605 data_time: 0.2832 memory: 22610 loss: 0.1119
2024/08/18 09:49:25 - mmengine - INFO - Iter(train) [4620/9216] lr: 1.0446e-05 eta: 5:36:54 time: 6.5490 data_time: 0.0148 memory: 19489 loss: 0.0817
2024/08/18 09:50:27 - mmengine - INFO - Iter(train) [4630/9216] lr: 1.0411e-05 eta: 5:36:28 time: 6.1545 data_time: 0.0146 memory: 17381 loss: 0.0788
2024/08/18 09:51:25 - mmengine - INFO - Iter(train) [4640/9216] lr: 1.0376e-05 eta: 5:35:58 time: 5.8545 data_time: 0.0146 memory: 16316 loss: 0.0872
2024/08/18 09:52:23 - mmengine - INFO - Iter(train) [4650/9216] lr: 1.0341e-05 eta: 5:35:27 time: 5.7472 data_time: 0.0144 memory: 16252 loss: 0.0678
2024/08/18 09:53:23 - mmengine - INFO - Iter(train) [4660/9216] lr: 1.0306e-05 eta: 5:34:59 time: 6.0542 data_time: 0.0144 memory: 16245 loss: 0.0732
2024/08/18 09:54:25 - mmengine - INFO - Iter(train) [4670/9216] lr: 1.0271e-05 eta: 5:34:32 time: 6.1664 data_time: 0.0157 memory: 16178 loss: 0.0713
2024/08/18 09:55:26 - mmengine - INFO - Iter(train) [4680/9216] lr: 1.0235e-05 eta: 5:34:05 time: 6.1457 data_time: 0.0155 memory: 16059 loss: 0.0598
2024/08/18 09:56:26 - mmengine - INFO - Iter(train) [4690/9216] lr: 1.0200e-05 eta: 5:33:35 time: 5.9351 data_time: 0.0145 memory: 16062 loss: 0.0515
2024/08/18 09:57:24 - mmengine - INFO - Iter(train) [4700/9216] lr: 1.0165e-05 eta: 5:33:04 time: 5.8426 data_time: 0.0148 memory: 16043 loss: 0.0872
2024/08/18 09:58:23 - mmengine - INFO - Iter(train) [4710/9216] lr: 1.0130e-05 eta: 5:32:34 time: 5.8821 data_time: 0.0147 memory: 16012 loss: 0.1240
2024/08/18 09:59:20 - mmengine - INFO - Iter(train) [4720/9216] lr: 1.0095e-05 eta: 5:32:03 time: 5.7675 data_time: 0.0149 memory: 15986 loss: 0.0528
2024/08/18 10:00:20 - mmengine - INFO - Iter(train) [4730/9216] lr: 1.0060e-05 eta: 5:31:33 time: 5.9530 data_time: 0.0144 memory: 16071 loss: 0.0625
2024/08/18 10:01:18 - mmengine - INFO - Iter(train) [4740/9216] lr: 1.0025e-05 eta: 5:31:02 time: 5.8331 data_time: 0.0145 memory: 15957 loss: 0.1219
2024/08/18 10:02:16 - mmengine - INFO - Iter(train) [4750/9216] lr: 9.9895e-06 eta: 5:30:30 time: 5.7667 data_time: 0.0144 memory: 16740 loss: 0.0599
2024/08/18 10:03:14 - mmengine - INFO - Iter(train) [4760/9216] lr: 9.9543e-06 eta: 5:29:58 time: 5.8109 data_time: 0.0145 memory: 15933 loss: 0.0818
2024/08/18 10:04:12 - mmengine - INFO - Iter(train) [4770/9216] lr: 9.9192e-06 eta: 5:29:27 time: 5.8329 data_time: 0.0145 memory: 15784 loss: 0.0787
2024/08/18 10:05:10 - mmengine - INFO - Iter(train) [4780/9216] lr: 9.8840e-06 eta: 5:28:54 time: 5.7306 data_time: 0.0151 memory: 16033 loss: 0.0709
2024/08/18 10:06:05 - mmengine - INFO - Iter(train) [4790/9216] lr: 9.8489e-06 eta: 5:28:20 time: 5.5358 data_time: 0.0145 memory: 15724 loss: 0.0769
2024/08/18 10:07:02 - mmengine - INFO - Iter(train) [4800/9216] lr: 9.8138e-06 eta: 5:27:46 time: 5.6490 data_time: 0.0147 memory: 15734 loss: 0.0696
2024/08/18 10:07:57 - mmengine - INFO - Iter(train) [4810/9216] lr: 9.7786e-06 eta: 5:27:12 time: 5.5354 data_time: 0.0145 memory: 15659 loss: 0.1362
2024/08/18 10:08:52 - mmengine - INFO - Iter(train) [4820/9216] lr: 9.7435e-06 eta: 5:26:37 time: 5.5267 data_time: 0.0147 memory: 15647 loss: 0.0835
2024/08/18 10:09:48 - mmengine - INFO - Iter(train) [4830/9216] lr: 9.7084e-06 eta: 5:26:03 time: 5.5985 data_time: 0.0152 memory: 15555 loss: 0.1748
2024/08/18 10:10:43 - mmengine - INFO - Iter(train) [4840/9216] lr: 9.6732e-06 eta: 5:25:27 time: 5.5041 data_time: 0.0146 memory: 15793 loss: 0.0814
2024/08/18 10:11:36 - mmengine - INFO - Iter(train) [4850/9216] lr: 9.6381e-06 eta: 5:24:50 time: 5.2426 data_time: 0.0145 memory: 15767 loss: 0.0896
2024/08/18 10:12:26 - mmengine - INFO - Iter(train) [4860/9216] lr: 9.6030e-06 eta: 5:24:11 time: 5.0802 data_time: 0.0145 memory: 15545 loss: 0.0967
2024/08/18 10:13:15 - mmengine - INFO - Iter(train) [4870/9216] lr: 9.5679e-06 eta: 5:23:29 time: 4.8537 data_time: 0.0145 memory: 15493 loss: 0.0956
2024/08/18 10:14:03 - mmengine - INFO - Iter(train) [4880/9216] lr: 9.5328e-06 eta: 5:22:48 time: 4.7833 data_time: 0.0145 memory: 15488 loss: 0.0924
2024/08/18 10:14:50 - mmengine - INFO - Iter(train) [4890/9216] lr: 9.4977e-06 eta: 5:22:05 time: 4.6728 data_time: 0.0149 memory: 15457 loss: 0.1071
2024/08/18 10:15:36 - mmengine - INFO - Iter(train) [4900/9216] lr: 9.4626e-06 eta: 5:21:21 time: 4.6053 data_time: 0.0150 memory: 15374 loss: 0.1045
2024/08/18 10:16:21 - mmengine - INFO - Iter(train) [4910/9216] lr: 9.4275e-06 eta: 5:20:38 time: 4.5822 data_time: 0.0147 memory: 15298 loss: 0.1371
2024/08/18 10:17:07 - mmengine - INFO - Iter(train) [4920/9216] lr: 9.3924e-06 eta: 5:19:54 time: 4.5286 data_time: 0.0145 memory: 15282 loss: 0.1250
2024/08/18 10:17:51 - mmengine - INFO - Iter(train) [4930/9216] lr: 9.3574e-06 eta: 5:19:08 time: 4.4272 data_time: 0.0144 memory: 15538 loss: 0.0895
2024/08/18 10:18:35 - mmengine - INFO - Iter(train) [4940/9216] lr: 9.3223e-06 eta: 5:18:23 time: 4.3765 data_time: 0.0144 memory: 15287 loss: 0.1438
2024/08/18 10:19:17 - mmengine - INFO - Iter(train) [4950/9216] lr: 9.2872e-06 eta: 5:17:36 time: 4.2337 data_time: 0.0145 memory: 14948 loss: 0.0960
2024/08/18 10:19:58 - mmengine - INFO - Iter(train) [4960/9216] lr: 9.2522e-06 eta: 5:16:49 time: 4.1336 data_time: 0.0138 memory: 14830 loss: 0.1701
2024/08/18 10:20:39 - mmengine - INFO - Iter(train) [4970/9216] lr: 9.2172e-06 eta: 5:16:00 time: 4.0073 data_time: 0.0139 memory: 14796 loss: 0.1832
2024/08/18 10:21:17 - mmengine - INFO - Iter(train) [4980/9216] lr: 9.1821e-06 eta: 5:15:10 time: 3.8454 data_time: 0.0132 memory: 14642 loss: 0.0963
2024/08/18 10:21:53 - mmengine - INFO - Iter(train) [4990/9216] lr: 9.1471e-06 eta: 5:14:18 time: 3.5658 data_time: 0.0125 memory: 14375 loss: 0.1377
2024/08/18 10:22:26 - mmengine - INFO - Exp name: internvl_v2_internlm2_2b_qlora_finetune_copy_20240818_040903
2024/08/18 10:22:26 - mmengine - INFO - Iter(train) [5000/9216] lr: 9.1121e-06 eta: 5:13:24 time: 3.3476 data_time: 0.0128 memory: 13964 loss: 0.1252
2024/08/18 10:22:26 - mmengine - INFO - Saving checkpoint at 5000 iterations
2024/08/18 10:23:00 - mmengine - INFO - Iter(train) [5010/9216] lr: 9.0771e-06 eta: 5:12:31 time: 3.4208 data_time: 0.2151 memory: 13850 loss: 0.1441
2024/08/18 10:23:32 - mmengine - INFO - Iter(train) [5020/9216] lr: 9.0421e-06 eta: 5:11:35 time: 3.1567 data_time: 0.0121 memory: 13500 loss: 0.0978
2024/08/18 10:24:03 - mmengine - INFO - Iter(train) [5030/9216] lr: 9.0072e-06 eta: 5:10:39 time: 3.1103 data_time: 0.0118 memory: 13486 loss: 0.0592
2024/08/18 10:24:34 - mmengine - INFO - Iter(train) [5040/9216] lr: 8.9722e-06 eta: 5:09:43 time: 3.0585 data_time: 0.0120 memory: 13443 loss: 0.0954
2024/08/18 10:25:04 - mmengine - INFO - Iter(train) [5050/9216] lr: 8.9372e-06 eta: 5:08:47 time: 3.0124 data_time: 0.0121 memory: 13372 loss: 0.0758
2024/08/18 10:25:34 - mmengine - INFO - Iter(train) [5060/9216] lr: 8.9023e-06 eta: 5:07:51 time: 2.9931 data_time: 0.0120 memory: 13329 loss: 0.1187
2024/08/18 10:26:03 - mmengine - INFO - Iter(train) [5070/9216] lr: 8.8674e-06 eta: 5:06:54 time: 2.9209 data_time: 0.0120 memory: 13159 loss: 0.1130
2024/08/18 10:26:31 - mmengine - INFO - Iter(train) [5080/9216] lr: 8.8325e-06 eta: 5:05:56 time: 2.8632 data_time: 0.0127 memory: 13024 loss: 0.1360
2024/08/18 10:26:58 - mmengine - INFO - Iter(train) [5090/9216] lr: 8.7976e-06 eta: 5:04:58 time: 2.6445 data_time: 0.0119 memory: 12763 loss: 0.1429
2024/08/18 10:27:21 - mmengine - INFO - Iter(train) [5100/9216] lr: 8.7627e-06 eta: 5:03:56 time: 2.2851 data_time: 0.0108 memory: 12060 loss: 0.1011
2024/08/18 10:27:42 - mmengine - INFO - Iter(train) [5110/9216] lr: 8.7278e-06 eta: 5:02:53 time: 2.1154 data_time: 0.0106 memory: 11807 loss: 0.1110
2024/08/18 10:27:56 - mmengine - INFO - Iter(train) [5120/9216] lr: 8.6930e-06 eta: 5:01:44 time: 1.3666 data_time: 0.0094 memory: 11476 loss: 0.1540
2024/08/18 10:29:01 - mmengine - INFO - Iter(train) [5130/9216] lr: 8.6582e-06 eta: 5:01:17 time: 6.4971 data_time: 0.0139 memory: 21965 loss: 0.0872
2024/08/18 10:29:54 - mmengine - INFO - Iter(train) [5140/9216] lr: 8.6234e-06 eta: 5:00:40 time: 5.3382 data_time: 0.0148 memory: 17143 loss: 0.0989
2024/08/18 10:30:47 - mmengine - INFO - Iter(train) [5150/9216] lr: 8.5886e-06 eta: 5:00:02 time: 5.2636 data_time: 0.0145 memory: 16323 loss: 0.0553
2024/08/18 10:31:36 - mmengine - INFO - Iter(train) [5160/9216] lr: 8.5538e-06 eta: 4:59:21 time: 4.8934 data_time: 0.0144 memory: 16209 loss: 0.0670
2024/08/18 10:32:25 - mmengine - INFO - Iter(train) [5170/9216] lr: 8.5190e-06 eta: 4:58:41 time: 4.9194 data_time: 0.0148 memory: 16216 loss: 0.0683
2024/08/18 10:33:17 - mmengine - INFO - Iter(train) [5180/9216] lr: 8.4843e-06 eta: 4:58:03 time: 5.2108 data_time: 0.0149 memory: 16057 loss: 0.0433
2024/08/18 10:34:11 - mmengine - INFO - Iter(train) [5190/9216] lr: 8.4495e-06 eta: 4:57:26 time: 5.3767 data_time: 0.0152 memory: 16195 loss: 0.0522
2024/08/18 10:35:06 - mmengine - INFO - Iter(train) [5200/9216] lr: 8.4148e-06 eta: 4:56:50 time: 5.5685 data_time: 0.0154 memory: 16195 loss: 0.0566
2024/08/18 10:36:08 - mmengine - INFO - Iter(train) [5210/9216] lr: 8.3801e-06 eta: 4:56:19 time: 6.1421 data_time: 0.0153 memory: 16169 loss: 0.0599
2024/08/18 10:37:10 - mmengine - INFO - Iter(train) [5220/9216] lr: 8.3455e-06 eta: 4:55:48 time: 6.2023 data_time: 0.0156 memory: 16544 loss: 0.1108
2024/08/18 10:38:10 - mmengine - INFO - Iter(train) [5230/9216] lr: 8.3108e-06 eta: 4:55:16 time: 6.0113 data_time: 0.0151 memory: 16000 loss: 0.0879
2024/08/18 10:39:09 - mmengine - INFO - Iter(train) [5240/9216] lr: 8.2762e-06 eta: 4:54:42 time: 5.8763 data_time: 0.0150 memory: 15988 loss: 0.0618
2024/08/18 10:40:04 - mmengine - INFO - Iter(train) [5250/9216] lr: 8.2416e-06 eta: 4:54:06 time: 5.5355 data_time: 0.0149 memory: 15967 loss: 0.1217
2024/08/18 10:40:59 - mmengine - INFO - Iter(train) [5260/9216] lr: 8.2070e-06 eta: 4:53:29 time: 5.5340 data_time: 0.0154 memory: 16540 loss: 0.0948
2024/08/18 10:41:55 - mmengine - INFO - Iter(train) [5270/9216] lr: 8.1725e-06 eta: 4:52:53 time: 5.5589 data_time: 0.0151 memory: 15803 loss: 0.0486
2024/08/18 10:42:50 - mmengine - INFO - Iter(train) [5280/9216] lr: 8.1379e-06 eta: 4:52:16 time: 5.4867 data_time: 0.0157 memory: 15807 loss: 0.0811
2024/08/18 10:43:43 - mmengine - INFO - Iter(train) [5290/9216] lr: 8.1034e-06 eta: 4:51:38 time: 5.3182 data_time: 0.0158 memory: 15831 loss: 0.0737
2024/08/18 10:44:38 - mmengine - INFO - Iter(train) [5300/9216] lr: 8.0689e-06 eta: 4:51:01 time: 5.4871 data_time: 0.0149 memory: 15841 loss: 0.0714
2024/08/18 10:45:31 - mmengine - INFO - Iter(train) [5310/9216] lr: 8.0345e-06 eta: 4:50:23 time: 5.3282 data_time: 0.0139 memory: 15853 loss: 0.0763
2024/08/18 10:46:24 - mmengine - INFO - Iter(train) [5320/9216] lr: 8.0000e-06 eta: 4:49:45 time: 5.3217 data_time: 0.0135 memory: 15777 loss: 0.0860
2024/08/18 10:47:17 - mmengine - INFO - Iter(train) [5330/9216] lr: 7.9656e-06 eta: 4:49:06 time: 5.2362 data_time: 0.0133 memory: 15803 loss: 0.0478
2024/08/18 10:48:09 - mmengine - INFO - Iter(train) [5340/9216] lr: 7.9312e-06 eta: 4:48:27 time: 5.2286 data_time: 0.0136 memory: 15691 loss: 0.0671
2024/08/18 10:49:01 - mmengine - INFO - Iter(train) [5350/9216] lr: 7.8968e-06 eta: 4:47:48 time: 5.2531 data_time: 0.0134 memory: 15571 loss: 0.0771
2024/08/18 10:49:54 - mmengine - INFO - Iter(train) [5360/9216] lr: 7.8625e-06 eta: 4:47:08 time: 5.2080 data_time: 0.0135 memory: 15538 loss: 0.0912
2024/08/18 10:50:47 - mmengine - INFO - Iter(train) [5370/9216] lr: 7.8282e-06 eta: 4:46:30 time: 5.3819 data_time: 0.0134 memory: 15612 loss: 0.1159
2024/08/18 10:51:41 - mmengine - INFO - Iter(train) [5380/9216] lr: 7.7939e-06 eta: 4:45:52 time: 5.3974 data_time: 0.0142 memory: 15696 loss: 0.1549
2024/08/18 10:52:34 - mmengine - INFO - Iter(train) [5390/9216] lr: 7.7596e-06 eta: 4:45:13 time: 5.3144 data_time: 0.0143 memory: 15710 loss: 0.1091
2024/08/18 10:53:25 - mmengine - INFO - Iter(train) [5400/9216] lr: 7.7254e-06 eta: 4:44:32 time: 5.0033 data_time: 0.0144 memory: 15403 loss: 0.1298
2024/08/18 10:54:13 - mmengine - INFO - Iter(train) [5410/9216] lr: 7.6912e-06 eta: 4:43:50 time: 4.8605 data_time: 0.0147 memory: 15384 loss: 0.0808
2024/08/18 10:55:01 - mmengine - INFO - Iter(train) [5420/9216] lr: 7.6570e-06 eta: 4:43:07 time: 4.7405 data_time: 0.0148 memory: 15329 loss: 0.1133
2024/08/18 10:55:48 - mmengine - INFO - Iter(train) [5430/9216] lr: 7.6229e-06 eta: 4:42:24 time: 4.7048 data_time: 0.0150 memory: 15173 loss: 0.1325
2024/08/18 10:56:34 - mmengine - INFO - Iter(train) [5440/9216] lr: 7.5887e-06 eta: 4:41:41 time: 4.6293 data_time: 0.0146 memory: 15196 loss: 0.1484
2024/08/18 10:57:21 - mmengine - INFO - Iter(train) [5450/9216] lr: 7.5546e-06 eta: 4:40:57 time: 4.6688 data_time: 0.0149 memory: 15294 loss: 0.1485
2024/08/18 10:58:06 - mmengine - INFO - Iter(train) [5460/9216] lr: 7.5206e-06 eta:
4:40:13 time: 4.5816 data_time: 0.0143 memory: 15091 loss: 0.1182 2024/08/18 10:58:51 - mmengine - INFO - Iter(train) [5470/9216] lr: 7.4866e-06 eta: 4:39:28 time: 4.4278 data_time: 0.0141 memory: 14967 loss: 0.1451 2024/08/18 10:59:35 - mmengine - INFO - Iter(train) [5480/9216] lr: 7.4526e-06 eta: 4:38:43 time: 4.4389 data_time: 0.0141 memory: 15131 loss: 0.1473 2024/08/18 11:00:19 - mmengine - INFO - Iter(train) [5490/9216] lr: 7.4186e-06 eta: 4:37:58 time: 4.4400 data_time: 0.0136 memory: 14647 loss: 0.1095 2024/08/18 11:00:58 - mmengine - INFO - Iter(train) [5500/9216] lr: 7.3847e-06 eta: 4:37:09 time: 3.8964 data_time: 0.0129 memory: 14491 loss: 0.1409 2024/08/18 11:01:33 - mmengine - INFO - Iter(train) [5510/9216] lr: 7.3508e-06 eta: 4:36:18 time: 3.4943 data_time: 0.0128 memory: 13977 loss: 0.0451 2024/08/18 11:02:07 - mmengine - INFO - Iter(train) [5520/9216] lr: 7.3169e-06 eta: 4:35:26 time: 3.3536 data_time: 0.0125 memory: 13642 loss: 0.0747 2024/08/18 11:02:39 - mmengine - INFO - Iter(train) [5530/9216] lr: 7.2831e-06 eta: 4:34:33 time: 3.2423 data_time: 0.0134 memory: 13487 loss: 0.0717 2024/08/18 11:03:11 - mmengine - INFO - Iter(train) [5540/9216] lr: 7.2493e-06 eta: 4:33:40 time: 3.1992 data_time: 0.0121 memory: 13495 loss: 0.0811 2024/08/18 11:03:43 - mmengine - INFO - Iter(train) [5550/9216] lr: 7.2155e-06 eta: 4:32:46 time: 3.1592 data_time: 0.0130 memory: 13440 loss: 0.0850 2024/08/18 11:04:15 - mmengine - INFO - Iter(train) [5560/9216] lr: 7.1818e-06 eta: 4:31:54 time: 3.2087 data_time: 0.0121 memory: 13332 loss: 0.0872 2024/08/18 11:04:47 - mmengine - INFO - Iter(train) [5570/9216] lr: 7.1481e-06 eta: 4:31:01 time: 3.2159 data_time: 0.0122 memory: 13174 loss: 0.1591 2024/08/18 11:05:19 - mmengine - INFO - Iter(train) [5580/9216] lr: 7.1144e-06 eta: 4:30:08 time: 3.1862 data_time: 0.0126 memory: 13164 loss: 0.1406 2024/08/18 11:05:50 - mmengine - INFO - Iter(train) [5590/9216] lr: 7.0808e-06 eta: 4:29:14 time: 3.0846 data_time: 0.0121 memory: 
12950 loss: 0.1367 2024/08/18 11:06:18 - mmengine - INFO - Iter(train) [5600/9216] lr: 7.0472e-06 eta: 4:28:20 time: 2.8626 data_time: 0.0116 memory: 12451 loss: 0.1199 2024/08/18 11:06:46 - mmengine - INFO - Iter(train) [5610/9216] lr: 7.0136e-06 eta: 4:27:24 time: 2.7135 data_time: 0.0110 memory: 11887 loss: 0.0662 2024/08/18 11:07:10 - mmengine - INFO - Iter(train) [5620/9216] lr: 6.9801e-06 eta: 4:26:26 time: 2.4134 data_time: 0.0103 memory: 11681 loss: 0.1183 2024/08/18 11:07:27 - mmengine - INFO - Iter(train) [5630/9216] lr: 6.9466e-06 eta: 4:25:24 time: 1.7173 data_time: 0.0095 memory: 11051 loss: 0.1116 2024/08/18 11:08:27 - mmengine - INFO - Iter(train) [5640/9216] lr: 6.9132e-06 eta: 4:24:50 time: 6.0482 data_time: 0.0124 memory: 21992 loss: 0.1130 2024/08/18 11:09:30 - mmengine - INFO - Iter(train) [5650/9216] lr: 6.8798e-06 eta: 4:24:17 time: 6.2885 data_time: 0.0149 memory: 17198 loss: 0.1576 2024/08/18 11:10:31 - mmengine - INFO - Iter(train) [5660/9216] lr: 6.8464e-06 eta: 4:23:43 time: 6.0491 data_time: 0.0147 memory: 16535 loss: 0.0539 2024/08/18 11:11:29 - mmengine - INFO - Iter(train) [5670/9216] lr: 6.8131e-06 eta: 4:23:07 time: 5.8671 data_time: 0.0148 memory: 16491 loss: 0.0463 2024/08/18 11:12:28 - mmengine - INFO - Iter(train) [5680/9216] lr: 6.7798e-06 eta: 4:22:31 time: 5.8058 data_time: 0.0134 memory: 16138 loss: 0.0568 2024/08/18 11:13:26 - mmengine - INFO - Iter(train) [5690/9216] lr: 6.7465e-06 eta: 4:21:56 time: 5.8982 data_time: 0.0142 memory: 16335 loss: 0.0666 2024/08/18 11:14:23 - mmengine - INFO - Iter(train) [5700/9216] lr: 6.7133e-06 eta: 4:21:19 time: 5.7000 data_time: 0.0147 memory: 16691 loss: 0.0732 2024/08/18 11:15:17 - mmengine - INFO - Iter(train) [5710/9216] lr: 6.6802e-06 eta: 4:20:40 time: 5.3738 data_time: 0.0155 memory: 16038 loss: 0.0468 2024/08/18 11:16:10 - mmengine - INFO - Iter(train) [5720/9216] lr: 6.6470e-06 eta: 4:20:00 time: 5.2568 data_time: 0.0149 memory: 16159 loss: 0.0513 2024/08/18 11:17:01 - mmengine 
- INFO - Iter(train) [5730/9216] lr: 6.6139e-06 eta: 4:19:19 time: 5.1061 data_time: 0.0147 memory: 16097 loss: 0.0424 2024/08/18 11:17:52 - mmengine - INFO - Iter(train) [5740/9216] lr: 6.5809e-06 eta: 4:18:39 time: 5.1174 data_time: 0.0160 memory: 16007 loss: 0.0540 2024/08/18 11:18:42 - mmengine - INFO - Iter(train) [5750/9216] lr: 6.5479e-06 eta: 4:17:57 time: 5.0089 data_time: 0.0158 memory: 15959 loss: 0.0585 2024/08/18 11:19:33 - mmengine - INFO - Iter(train) [5760/9216] lr: 6.5149e-06 eta: 4:17:17 time: 5.1281 data_time: 0.0150 memory: 16024 loss: 0.0907 2024/08/18 11:20:25 - mmengine - INFO - Iter(train) [5770/9216] lr: 6.4820e-06 eta: 4:16:36 time: 5.2090 data_time: 0.0154 memory: 16154 loss: 0.1118 2024/08/18 11:21:16 - mmengine - INFO - Iter(train) [5780/9216] lr: 6.4491e-06 eta: 4:15:55 time: 5.0856 data_time: 0.0154 memory: 15831 loss: 0.0827 2024/08/18 11:22:06 - mmengine - INFO - Iter(train) [5790/9216] lr: 6.4163e-06 eta: 4:15:14 time: 4.9774 data_time: 0.0152 memory: 16173 loss: 0.0647 2024/08/18 11:22:54 - mmengine - INFO - Iter(train) [5800/9216] lr: 6.3835e-06 eta: 4:14:31 time: 4.8093 data_time: 0.0147 memory: 15870 loss: 0.0685 2024/08/18 11:23:42 - mmengine - INFO - Iter(train) [5810/9216] lr: 6.3508e-06 eta: 4:13:48 time: 4.7669 data_time: 0.0139 memory: 15781 loss: 0.1003 2024/08/18 11:24:30 - mmengine - INFO - Iter(train) [5820/9216] lr: 6.3181e-06 eta: 4:13:06 time: 4.8386 data_time: 0.0135 memory: 15732 loss: 0.1155 2024/08/18 11:25:19 - mmengine - INFO - Iter(train) [5830/9216] lr: 6.2855e-06 eta: 4:12:23 time: 4.8916 data_time: 0.0147 memory: 15755 loss: 0.0642 2024/08/18 11:26:11 - mmengine - INFO - Iter(train) [5840/9216] lr: 6.2528e-06 eta: 4:11:43 time: 5.2239 data_time: 0.0151 memory: 15674 loss: 0.0806 2024/08/18 11:27:06 - mmengine - INFO - Iter(train) [5850/9216] lr: 6.2203e-06 eta: 4:11:04 time: 5.4580 data_time: 0.0141 memory: 15674 loss: 0.0745 2024/08/18 11:27:58 - mmengine - INFO - Iter(train) [5860/9216] lr: 6.1878e-06 
eta: 4:10:23 time: 5.2005 data_time: 0.0149 memory: 15552 loss: 0.1382 2024/08/18 11:28:50 - mmengine - INFO - Iter(train) [5870/9216] lr: 6.1553e-06 eta: 4:09:43 time: 5.2205 data_time: 0.0148 memory: 15559 loss: 0.0984 2024/08/18 11:29:43 - mmengine - INFO - Iter(train) [5880/9216] lr: 6.1229e-06 eta: 4:09:02 time: 5.2843 data_time: 0.0145 memory: 15621 loss: 0.1256 2024/08/18 11:30:40 - mmengine - INFO - Iter(train) [5890/9216] lr: 6.0905e-06 eta: 4:08:24 time: 5.6914 data_time: 0.0147 memory: 15559 loss: 0.1171 2024/08/18 11:31:34 - mmengine - INFO - Iter(train) [5900/9216] lr: 6.0582e-06 eta: 4:07:45 time: 5.4389 data_time: 0.0146 memory: 15531 loss: 0.0772 2024/08/18 11:32:27 - mmengine - INFO - Iter(train) [5910/9216] lr: 6.0259e-06 eta: 4:07:05 time: 5.2850 data_time: 0.0147 memory: 15751 loss: 0.0959 2024/08/18 11:33:17 - mmengine - INFO - Iter(train) [5920/9216] lr: 5.9937e-06 eta: 4:06:23 time: 5.0151 data_time: 0.0149 memory: 15474 loss: 0.1062 2024/08/18 11:34:05 - mmengine - INFO - Iter(train) [5930/9216] lr: 5.9615e-06 eta: 4:05:39 time: 4.7286 data_time: 0.0154 memory: 15344 loss: 0.0951 2024/08/18 11:34:56 - mmengine - INFO - Iter(train) [5940/9216] lr: 5.9294e-06 eta: 4:04:58 time: 5.1603 data_time: 0.0154 memory: 15434 loss: 0.1023 2024/08/18 11:35:47 - mmengine - INFO - Iter(train) [5950/9216] lr: 5.8974e-06 eta: 4:04:16 time: 5.0595 data_time: 0.0150 memory: 15410 loss: 0.0879 2024/08/18 11:36:38 - mmengine - INFO - Iter(train) [5960/9216] lr: 5.8653e-06 eta: 4:03:35 time: 5.1358 data_time: 0.0143 memory: 15161 loss: 0.1523 2024/08/18 11:37:28 - mmengine - INFO - Iter(train) [5970/9216] lr: 5.8334e-06 eta: 4:02:53 time: 4.9585 data_time: 0.0146 memory: 15161 loss: 0.0913 2024/08/18 11:38:17 - mmengine - INFO - Iter(train) [5980/9216] lr: 5.8014e-06 eta: 4:02:10 time: 4.8971 data_time: 0.0147 memory: 15168 loss: 0.1365 2024/08/18 11:39:05 - mmengine - INFO - Iter(train) [5990/9216] lr: 5.7696e-06 eta: 4:01:27 time: 4.8352 data_time: 0.0144 
memory: 14998 loss: 0.1147 2024/08/18 11:39:54 - mmengine - INFO - Exp name: internvl_v2_internlm2_2b_qlora_finetune_copy_20240818_040903 2024/08/18 11:39:54 - mmengine - INFO - Iter(train) [6000/9216] lr: 5.7378e-06 eta: 4:00:44 time: 4.8787 data_time: 0.0143 memory: 14787 loss: 0.1823 2024/08/18 11:39:54 - mmengine - INFO - Saving checkpoint at 6000 iterations 2024/08/18 11:40:42 - mmengine - INFO - Iter(train) [6010/9216] lr: 5.7060e-06 eta: 4:00:01 time: 4.8285 data_time: 0.2319 memory: 14699 loss: 0.0966 2024/08/18 11:41:26 - mmengine - INFO - Iter(train) [6020/9216] lr: 5.6743e-06 eta: 3:59:16 time: 4.4201 data_time: 0.0135 memory: 14421 loss: 0.1594 2024/08/18 11:42:07 - mmengine - INFO - Iter(train) [6030/9216] lr: 5.6426e-06 eta: 3:58:29 time: 4.0681 data_time: 0.0129 memory: 14060 loss: 0.0884 2024/08/18 11:42:45 - mmengine - INFO - Iter(train) [6040/9216] lr: 5.6110e-06 eta: 3:57:40 time: 3.7679 data_time: 0.0125 memory: 13673 loss: 0.0877 2024/08/18 11:43:21 - mmengine - INFO - Iter(train) [6050/9216] lr: 5.5795e-06 eta: 3:56:51 time: 3.6533 data_time: 0.0122 memory: 13565 loss: 0.0776 2024/08/18 11:43:58 - mmengine - INFO - Iter(train) [6060/9216] lr: 5.5480e-06 eta: 3:56:02 time: 3.6874 data_time: 0.0123 memory: 13414 loss: 0.0861 2024/08/18 11:44:34 - mmengine - INFO - Iter(train) [6070/9216] lr: 5.5165e-06 eta: 3:55:12 time: 3.5957 data_time: 0.0124 memory: 13422 loss: 0.0773 2024/08/18 11:45:09 - mmengine - INFO - Iter(train) [6080/9216] lr: 5.4852e-06 eta: 3:54:22 time: 3.4777 data_time: 0.0121 memory: 13287 loss: 0.0740 2024/08/18 11:45:41 - mmengine - INFO - Iter(train) [6090/9216] lr: 5.4538e-06 eta: 3:53:31 time: 3.2437 data_time: 0.0120 memory: 13073 loss: 0.1258 2024/08/18 11:46:12 - mmengine - INFO - Iter(train) [6100/9216] lr: 5.4226e-06 eta: 3:52:39 time: 3.0907 data_time: 0.0124 memory: 13137 loss: 0.1380 2024/08/18 11:46:41 - mmengine - INFO - Iter(train) [6110/9216] lr: 5.3913e-06 eta: 3:51:46 time: 2.9026 data_time: 0.0120 memory: 
12758 loss: 0.1718 2024/08/18 11:47:06 - mmengine - INFO - Iter(train) [6120/9216] lr: 5.3602e-06 eta: 3:50:51 time: 2.4941 data_time: 0.0109 memory: 12295 loss: 0.0676 2024/08/18 11:47:30 - mmengine - INFO - Iter(train) [6130/9216] lr: 5.3291e-06 eta: 3:49:56 time: 2.3375 data_time: 0.0108 memory: 11927 loss: 0.1186 2024/08/18 11:47:49 - mmengine - INFO - Iter(train) [6140/9216] lr: 5.2980e-06 eta: 3:48:58 time: 1.9180 data_time: 0.0103 memory: 11241 loss: 0.0941 2024/08/18 11:48:37 - mmengine - INFO - Iter(train) [6150/9216] lr: 5.2671e-06 eta: 3:48:15 time: 4.8386 data_time: 0.0119 memory: 21206 loss: 0.1023 2024/08/18 11:49:40 - mmengine - INFO - Iter(train) [6160/9216] lr: 5.2361e-06 eta: 3:47:40 time: 6.3298 data_time: 0.0151 memory: 17646 loss: 0.0728 2024/08/18 11:50:39 - mmengine - INFO - Iter(train) [6170/9216] lr: 5.2053e-06 eta: 3:47:02 time: 5.8307 data_time: 0.0147 memory: 16474 loss: 0.0571 2024/08/18 11:51:36 - mmengine - INFO - Iter(train) [6180/9216] lr: 5.1745e-06 eta: 3:46:23 time: 5.7010 data_time: 0.0151 memory: 16202 loss: 0.0997 2024/08/18 11:52:34 - mmengine - INFO - Iter(train) [6190/9216] lr: 5.1437e-06 eta: 3:45:45 time: 5.7905 data_time: 0.0151 memory: 16173 loss: 0.0519 2024/08/18 11:53:32 - mmengine - INFO - Iter(train) [6200/9216] lr: 5.1130e-06 eta: 3:45:07 time: 5.7863 data_time: 0.0157 memory: 16216 loss: 0.0782 2024/08/18 11:54:29 - mmengine - INFO - Iter(train) [6210/9216] lr: 5.0824e-06 eta: 3:44:28 time: 5.7717 data_time: 0.0145 memory: 16071 loss: 0.0575 2024/08/18 11:55:27 - mmengine - INFO - Iter(train) [6220/9216] lr: 5.0518e-06 eta: 3:43:50 time: 5.8007 data_time: 0.0144 memory: 16138 loss: 0.0778 2024/08/18 11:56:23 - mmengine - INFO - Iter(train) [6230/9216] lr: 5.0213e-06 eta: 3:43:10 time: 5.5279 data_time: 0.0143 memory: 16283 loss: 0.1156 2024/08/18 11:57:17 - mmengine - INFO - Iter(train) [6240/9216] lr: 4.9909e-06 eta: 3:42:30 time: 5.4755 data_time: 0.0146 memory: 15997 loss: 0.0730 2024/08/18 11:58:12 - mmengine 
- INFO - Iter(train) [6250/9216] lr: 4.9605e-06 eta: 3:41:50 time: 5.4728 data_time: 0.0154 memory: 16116 loss: 0.0538 2024/08/18 11:59:08 - mmengine - INFO - Iter(train) [6260/9216] lr: 4.9302e-06 eta: 3:41:10 time: 5.6118 data_time: 0.0148 memory: 16026 loss: 0.0714 2024/08/18 12:00:07 - mmengine - INFO - Iter(train) [6270/9216] lr: 4.8999e-06 eta: 3:40:32 time: 5.8759 data_time: 0.0144 memory: 15836 loss: 0.0497 2024/08/18 12:01:03 - mmengine - INFO - Iter(train) [6280/9216] lr: 4.8697e-06 eta: 3:39:52 time: 5.6008 data_time: 0.0148 memory: 16043 loss: 0.0719 2024/08/18 12:01:57 - mmengine - INFO - Iter(train) [6290/9216] lr: 4.8396e-06 eta: 3:39:11 time: 5.4347 data_time: 0.0146 memory: 15969 loss: 0.0577 2024/08/18 12:02:52 - mmengine - INFO - Iter(train) [6300/9216] lr: 4.8095e-06 eta: 3:38:31 time: 5.4561 data_time: 0.0145 memory: 15841 loss: 0.0688 2024/08/18 12:03:46 - mmengine - INFO - Iter(train) [6310/9216] lr: 4.7795e-06 eta: 3:37:50 time: 5.3957 data_time: 0.0147 memory: 15739 loss: 0.0826 2024/08/18 12:04:42 - mmengine - INFO - Iter(train) [6320/9216] lr: 4.7496e-06 eta: 3:37:10 time: 5.6257 data_time: 0.0147 memory: 15779 loss: 0.0978 2024/08/18 12:05:35 - mmengine - INFO - Iter(train) [6330/9216] lr: 4.7197e-06 eta: 3:36:29 time: 5.3461 data_time: 0.0144 memory: 15736 loss: 0.0791 2024/08/18 12:06:29 - mmengine - INFO - Iter(train) [6340/9216] lr: 4.6899e-06 eta: 3:35:48 time: 5.3625 data_time: 0.0145 memory: 15734 loss: 0.0629 2024/08/18 12:07:21 - mmengine - INFO - Iter(train) [6350/9216] lr: 4.6601e-06 eta: 3:35:06 time: 5.2337 data_time: 0.0147 memory: 15779 loss: 0.0829 2024/08/18 12:08:16 - mmengine - INFO - Iter(train) [6360/9216] lr: 4.6305e-06 eta: 3:34:25 time: 5.4512 data_time: 0.0145 memory: 15943 loss: 0.1121 2024/08/18 12:09:09 - mmengine - INFO - Iter(train) [6370/9216] lr: 4.6009e-06 eta: 3:33:44 time: 5.3027 data_time: 0.0148 memory: 15633 loss: 0.0840 2024/08/18 12:10:01 - mmengine - INFO - Iter(train) [6380/9216] lr: 4.5713e-06 
eta: 3:33:02 time: 5.2259 data_time: 0.0146 memory: 15659 loss: 0.1061 2024/08/18 12:10:52 - mmengine - INFO - Iter(train) [6390/9216] lr: 4.5418e-06 eta: 3:32:20 time: 5.0948 data_time: 0.0149 memory: 15569 loss: 0.0763 2024/08/18 12:11:45 - mmengine - INFO - Iter(train) [6400/9216] lr: 4.5124e-06 eta: 3:31:38 time: 5.3263 data_time: 0.0147 memory: 15557 loss: 0.0643 2024/08/18 12:12:39 - mmengine - INFO - Iter(train) [6410/9216] lr: 4.4831e-06 eta: 3:30:57 time: 5.3542 data_time: 0.0148 memory: 15701 loss: 0.1194 2024/08/18 12:13:30 - mmengine - INFO - Iter(train) [6420/9216] lr: 4.4538e-06 eta: 3:30:14 time: 5.1304 data_time: 0.0145 memory: 15562 loss: 0.0899 2024/08/18 12:14:19 - mmengine - INFO - Iter(train) [6430/9216] lr: 4.4246e-06 eta: 3:29:31 time: 4.8694 data_time: 0.0148 memory: 15419 loss: 0.1008 2024/08/18 12:15:09 - mmengine - INFO - Iter(train) [6440/9216] lr: 4.3955e-06 eta: 3:28:48 time: 5.0136 data_time: 0.0143 memory: 15533 loss: 0.0942 2024/08/18 12:15:58 - mmengine - INFO - Iter(train) [6450/9216] lr: 4.3664e-06 eta: 3:28:04 time: 4.9239 data_time: 0.0148 memory: 15472 loss: 0.0616 2024/08/18 12:16:49 - mmengine - INFO - Iter(train) [6460/9216] lr: 4.3374e-06 eta: 3:27:22 time: 5.1082 data_time: 0.0149 memory: 15379 loss: 0.1882 2024/08/18 12:17:40 - mmengine - INFO - Iter(train) [6470/9216] lr: 4.3085e-06 eta: 3:26:39 time: 5.0375 data_time: 0.0146 memory: 15410 loss: 0.1708 2024/08/18 12:18:28 - mmengine - INFO - Iter(train) [6480/9216] lr: 4.2796e-06 eta: 3:25:55 time: 4.7710 data_time: 0.0141 memory: 15158 loss: 0.1337 2024/08/18 12:19:13 - mmengine - INFO - Iter(train) [6490/9216] lr: 4.2508e-06 eta: 3:25:10 time: 4.5525 data_time: 0.0149 memory: 15175 loss: 0.1248 2024/08/18 12:19:57 - mmengine - INFO - Iter(train) [6500/9216] lr: 4.2221e-06 eta: 3:24:24 time: 4.3504 data_time: 0.0142 memory: 14936 loss: 0.1350 2024/08/18 12:20:39 - mmengine - INFO - Iter(train) [6510/9216] lr: 4.1934e-06 eta: 3:23:38 time: 4.2001 data_time: 0.0142 
memory: 14852 loss: 0.1671 2024/08/18 12:21:19 - mmengine - INFO - Iter(train) [6520/9216] lr: 4.1649e-06 eta: 3:22:50 time: 4.0406 data_time: 0.0132 memory: 14868 loss: 0.0964 2024/08/18 12:21:58 - mmengine - INFO - Iter(train) [6530/9216] lr: 4.1364e-06 eta: 3:22:03 time: 3.8736 data_time: 0.0128 memory: 14549 loss: 0.0900 2024/08/18 12:22:34 - mmengine - INFO - Iter(train) [6540/9216] lr: 4.1079e-06 eta: 3:21:14 time: 3.6343 data_time: 0.0127 memory: 14314 loss: 0.0801 2024/08/18 12:23:08 - mmengine - INFO - Iter(train) [6550/9216] lr: 4.0796e-06 eta: 3:20:24 time: 3.3907 data_time: 0.0137 memory: 14123 loss: 0.1314 2024/08/18 12:23:41 - mmengine - INFO - Iter(train) [6560/9216] lr: 4.0513e-06 eta: 3:19:34 time: 3.2675 data_time: 0.0128 memory: 13804 loss: 0.0773 2024/08/18 12:24:11 - mmengine - INFO - Iter(train) [6570/9216] lr: 4.0231e-06 eta: 3:18:43 time: 3.0850 data_time: 0.0124 memory: 13523 loss: 0.0815 2024/08/18 12:24:43 - mmengine - INFO - Iter(train) [6580/9216] lr: 3.9950e-06 eta: 3:17:53 time: 3.1224 data_time: 0.0120 memory: 13452 loss: 0.0603 2024/08/18 12:25:13 - mmengine - INFO - Iter(train) [6590/9216] lr: 3.9669e-06 eta: 3:17:02 time: 3.0301 data_time: 0.0127 memory: 13424 loss: 0.0681 2024/08/18 12:25:43 - mmengine - INFO - Iter(train) [6600/9216] lr: 3.9389e-06 eta: 3:16:11 time: 3.0100 data_time: 0.0119 memory: 13196 loss: 0.0861 2024/08/18 12:26:12 - mmengine - INFO - Iter(train) [6610/9216] lr: 3.9110e-06 eta: 3:15:20 time: 2.9158 data_time: 0.0121 memory: 13159 loss: 0.0854 2024/08/18 12:26:41 - mmengine - INFO - Iter(train) [6620/9216] lr: 3.8832e-06 eta: 3:14:28 time: 2.8835 data_time: 0.0119 memory: 12991 loss: 0.1032 2024/08/18 12:27:08 - mmengine - INFO - Iter(train) [6630/9216] lr: 3.8554e-06 eta: 3:13:36 time: 2.6835 data_time: 0.0121 memory: 12775 loss: 0.1379 2024/08/18 12:27:32 - mmengine - INFO - Iter(train) [6640/9216] lr: 3.8277e-06 eta: 3:12:43 time: 2.3847 data_time: 0.0106 memory: 12278 loss: 0.0823 2024/08/18 12:27:51 - 
mmengine - INFO - Iter(train) [6650/9216] lr: 3.8001e-06 eta: 3:11:48 time: 1.9470 data_time: 0.0104 memory: 11662 loss: 0.0985 2024/08/18 12:28:23 - mmengine - INFO - Iter(train) [6660/9216] lr: 3.7726e-06 eta: 3:10:59 time: 3.2129 data_time: 0.0106 memory: 19211 loss: 0.1234 2024/08/18 12:29:17 - mmengine - INFO - Iter(train) [6670/9216] lr: 3.7451e-06 eta: 3:10:17 time: 5.3262 data_time: 0.0141 memory: 17413 loss: 0.0696 2024/08/18 12:30:10 - mmengine - INFO - Iter(train) [6680/9216] lr: 3.7177e-06 eta: 3:09:35 time: 5.3094 data_time: 0.0137 memory: 16226 loss: 0.0647 2024/08/18 12:30:59 - mmengine - INFO - Iter(train) [6690/9216] lr: 3.6904e-06 eta: 3:08:52 time: 4.9118 data_time: 0.0137 memory: 16535 loss: 0.0691 2024/08/18 12:31:47 - mmengine - INFO - Iter(train) [6700/9216] lr: 3.6632e-06 eta: 3:08:09 time: 4.8387 data_time: 0.0160 memory: 15969 loss: 0.0561 2024/08/18 12:32:36 - mmengine - INFO - Iter(train) [6710/9216] lr: 3.6361e-06 eta: 3:07:25 time: 4.8669 data_time: 0.0158 memory: 15974 loss: 0.0840 2024/08/18 12:33:24 - mmengine - INFO - Iter(train) [6720/9216] lr: 3.6090e-06 eta: 3:06:42 time: 4.8381 data_time: 0.0146 memory: 15879 loss: 0.0869 2024/08/18 12:34:12 - mmengine - INFO - Iter(train) [6730/9216] lr: 3.5820e-06 eta: 3:05:58 time: 4.7544 data_time: 0.0143 memory: 15824 loss: 0.0806 2024/08/18 12:34:59 - mmengine - INFO - Iter(train) [6740/9216] lr: 3.5551e-06 eta: 3:05:14 time: 4.7333 data_time: 0.0141 memory: 15872 loss: 0.0916 2024/08/18 12:35:46 - mmengine - INFO - Iter(train) [6750/9216] lr: 3.5283e-06 eta: 3:04:29 time: 4.6648 data_time: 0.0144 memory: 15703 loss: 0.0681 2024/08/18 12:36:32 - mmengine - INFO - Iter(train) [6760/9216] lr: 3.5015e-06 eta: 3:03:45 time: 4.6263 data_time: 0.0144 memory: 15796 loss: 0.0671 2024/08/18 12:37:18 - mmengine - INFO - Iter(train) [6770/9216] lr: 3.4748e-06 eta: 3:03:01 time: 4.6271 data_time: 0.0150 memory: 15689 loss: 0.1122 2024/08/18 12:38:04 - mmengine - INFO - Iter(train) [6780/9216] lr: 
3.4483e-06 eta: 3:02:16 time: 4.5851 data_time: 0.0149 memory: 15557 loss: 0.0631 2024/08/18 12:38:49 - mmengine - INFO - Iter(train) [6790/9216] lr: 3.4217e-06 eta: 3:01:31 time: 4.5150 data_time: 0.0144 memory: 15465 loss: 0.1179 2024/08/18 12:39:33 - mmengine - INFO - Iter(train) [6800/9216] lr: 3.3953e-06 eta: 3:00:46 time: 4.3805 data_time: 0.0143 memory: 15322 loss: 0.0954 2024/08/18 12:40:17 - mmengine - INFO - Iter(train) [6810/9216] lr: 3.3690e-06 eta: 3:00:01 time: 4.3680 data_time: 0.0142 memory: 15408 loss: 0.1266 2024/08/18 12:40:59 - mmengine - INFO - Iter(train) [6820/9216] lr: 3.3427e-06 eta: 2:59:15 time: 4.2161 data_time: 0.0144 memory: 15203 loss: 0.1920 2024/08/18 12:41:39 - mmengine - INFO - Iter(train) [6830/9216] lr: 3.3165e-06 eta: 2:58:28 time: 3.9952 data_time: 0.0135 memory: 14830 loss: 0.0705 2024/08/18 12:42:14 - mmengine - INFO - Iter(train) [6840/9216] lr: 3.2904e-06 eta: 2:57:40 time: 3.5002 data_time: 0.0127 memory: 14343 loss: 0.1187 2024/08/18 12:42:46 - mmengine - INFO - Iter(train) [6850/9216] lr: 3.2644e-06 eta: 2:56:51 time: 3.1696 data_time: 0.0121 memory: 13581 loss: 0.0979 2024/08/18 12:43:16 - mmengine - INFO - Iter(train) [6860/9216] lr: 3.2385e-06 eta: 2:56:01 time: 3.0575 data_time: 0.0122 memory: 13500 loss: 0.0594 2024/08/18 12:43:46 - mmengine - INFO - Iter(train) [6870/9216] lr: 3.2126e-06 eta: 2:55:11 time: 2.9870 data_time: 0.0118 memory: 13243 loss: 0.1045 2024/08/18 12:44:12 - mmengine - INFO - Iter(train) [6880/9216] lr: 3.1869e-06 eta: 2:54:20 time: 2.5878 data_time: 0.0114 memory: 12867 loss: 0.2032 2024/08/18 12:44:34 - mmengine - INFO - Iter(train) [6890/9216] lr: 3.1612e-06 eta: 2:53:27 time: 2.2448 data_time: 0.0099 memory: 22898 loss: 0.0968 2024/08/18 12:45:34 - mmengine - INFO - Iter(train) [6900/9216] lr: 3.1356e-06 eta: 2:52:48 time: 5.9542 data_time: 0.0144 memory: 21154 loss: 0.1258 2024/08/18 12:46:27 - mmengine - INFO - Iter(train) [6910/9216] lr: 3.1101e-06 eta: 2:52:06 time: 5.3108 data_time: 
0.0143 memory: 17381 loss: 0.0747 2024/08/18 12:47:30 - mmengine - INFO - Iter(train) [6920/9216] lr: 3.0847e-06 eta: 2:51:27 time: 6.2985 data_time: 0.2937 memory: 22610 loss: 0.1400 2024/08/18 12:48:23 - mmengine - INFO - Iter(train) [6930/9216] lr: 3.0593e-06 eta: 2:50:45 time: 5.3098 data_time: 0.0152 memory: 17381 loss: 0.0865 2024/08/18 12:49:13 - mmengine - INFO - Iter(train) [6940/9216] lr: 3.0341e-06 eta: 2:50:02 time: 5.0187 data_time: 0.0150 memory: 16209 loss: 0.0471 2024/08/18 12:50:03 - mmengine - INFO - Iter(train) [6950/9216] lr: 3.0089e-06 eta: 2:49:19 time: 4.9924 data_time: 0.0145 memory: 16159 loss: 0.0618 2024/08/18 12:50:53 - mmengine - INFO - Iter(train) [6960/9216] lr: 2.9838e-06 eta: 2:48:35 time: 4.9757 data_time: 0.0144 memory: 16150 loss: 0.0457 2024/08/18 12:51:43 - mmengine - INFO - Iter(train) [6970/9216] lr: 2.9588e-06 eta: 2:47:52 time: 4.9687 data_time: 0.0152 memory: 16059 loss: 0.0702 2024/08/18 12:52:33 - mmengine - INFO - Iter(train) [6980/9216] lr: 2.9339e-06 eta: 2:47:09 time: 4.9811 data_time: 0.0150 memory: 16075 loss: 0.0567 2024/08/18 12:53:22 - mmengine - INFO - Iter(train) [6990/9216] lr: 2.9091e-06 eta: 2:46:25 time: 4.9066 data_time: 0.0151 memory: 16024 loss: 0.0489 2024/08/18 12:54:10 - mmengine - INFO - Exp name: internvl_v2_internlm2_2b_qlora_finetune_copy_20240818_040903 2024/08/18 12:54:10 - mmengine - INFO - Iter(train) [7000/9216] lr: 2.8843e-06 eta: 2:45:42 time: 4.8740 data_time: 0.0148 memory: 15971 loss: 0.0491 2024/08/18 12:54:10 - mmengine - INFO - Saving checkpoint at 7000 iterations 2024/08/18 12:55:00 - mmengine - INFO - Iter(train) [7010/9216] lr: 2.8597e-06 eta: 2:44:58 time: 5.0068 data_time: 0.2250 memory: 15962 loss: 0.0535 2024/08/18 12:55:48 - mmengine - INFO - Iter(train) [7020/9216] lr: 2.8351e-06 eta: 2:44:14 time: 4.7725 data_time: 0.0148 memory: 15836 loss: 0.0747 2024/08/18 12:56:36 - mmengine - INFO - Iter(train) [7030/9216] lr: 2.8107e-06 eta: 2:43:30 time: 4.7893 data_time: 0.0154 
memory: 15855 loss: 0.0716 2024/08/18 12:57:25 - mmengine - INFO - Iter(train) [7040/9216] lr: 2.7863e-06 eta: 2:42:47 time: 4.8663 data_time: 0.0157 memory: 15957 loss: 0.1202 2024/08/18 12:58:13 - mmengine - INFO - Iter(train) [7050/9216] lr: 2.7620e-06 eta: 2:42:03 time: 4.8706 data_time: 0.0156 memory: 16173 loss: 0.0599 2024/08/18 12:59:01 - mmengine - INFO - Iter(train) [7060/9216] lr: 2.7378e-06 eta: 2:41:19 time: 4.7152 data_time: 0.0151 memory: 15810 loss: 0.0721 2024/08/18 12:59:47 - mmengine - INFO - Iter(train) [7070/9216] lr: 2.7137e-06 eta: 2:40:34 time: 4.6565 data_time: 0.0146 memory: 15781 loss: 0.0892 2024/08/18 13:00:36 - mmengine - INFO - Iter(train) [7080/9216] lr: 2.6897e-06 eta: 2:39:51 time: 4.8587 data_time: 0.0148 memory: 15831 loss: 0.0659 2024/08/18 13:01:23 - mmengine - INFO - Iter(train) [7090/9216] lr: 2.6657e-06 eta: 2:39:06 time: 4.6843 data_time: 0.0131 memory: 15853 loss: 0.1169 2024/08/18 13:02:09 - mmengine - INFO - Iter(train) [7100/9216] lr: 2.6419e-06 eta: 2:38:22 time: 4.6885 data_time: 0.0136 memory: 15755 loss: 0.0747 2024/08/18 13:02:56 - mmengine - INFO - Iter(train) [7110/9216] lr: 2.6181e-06 eta: 2:37:38 time: 4.6530 data_time: 0.0135 memory: 15581 loss: 0.0727 2024/08/18 13:03:42 - mmengine - INFO - Iter(train) [7120/9216] lr: 2.5945e-06 eta: 2:36:53 time: 4.6469 data_time: 0.0134 memory: 15796 loss: 0.0789 2024/08/18 13:04:28 - mmengine - INFO - Iter(train) [7130/9216] lr: 2.5709e-06 eta: 2:36:08 time: 4.5174 data_time: 0.0154 memory: 15505 loss: 0.0975 2024/08/18 13:05:14 - mmengine - INFO - Iter(train) [7140/9216] lr: 2.5474e-06 eta: 2:35:24 time: 4.5995 data_time: 0.0143 memory: 15691 loss: 0.0848 2024/08/18 13:05:59 - mmengine - INFO - Iter(train) [7150/9216] lr: 2.5240e-06 eta: 2:34:39 time: 4.5876 data_time: 0.0147 memory: 15621 loss: 0.1208 2024/08/18 13:06:46 - mmengine - INFO - Iter(train) [7160/9216] lr: 2.5007e-06 eta: 2:33:55 time: 4.6201 data_time: 0.0148 memory: 15590 loss: 0.1064 2024/08/18 13:07:31 - 
mmengine - INFO - Iter(train) [7170/9216] lr: 2.4775e-06 eta: 2:33:10 time: 4.5325 data_time: 0.0146 memory: 15529 loss: 0.0814 2024/08/18 13:08:16 - mmengine - INFO - Iter(train) [7180/9216] lr: 2.4544e-06 eta: 2:32:25 time: 4.4932 data_time: 0.0147 memory: 15476 loss: 0.0763 2024/08/18 13:09:01 - mmengine - INFO - Iter(train) [7190/9216] lr: 2.4314e-06 eta: 2:31:40 time: 4.4742 data_time: 0.0146 memory: 15474 loss: 0.0973 2024/08/18 13:09:46 - mmengine - INFO - Iter(train) [7200/9216] lr: 2.4085e-06 eta: 2:30:55 time: 4.5689 data_time: 0.0151 memory: 15410 loss: 0.1179 2024/08/18 13:10:31 - mmengine - INFO - Iter(train) [7210/9216] lr: 2.3857e-06 eta: 2:30:10 time: 4.4323 data_time: 0.0146 memory: 15244 loss: 0.1539 2024/08/18 13:11:15 - mmengine - INFO - Iter(train) [7220/9216] lr: 2.3629e-06 eta: 2:29:25 time: 4.4260 data_time: 0.0143 memory: 15408 loss: 0.1181 2024/08/18 13:11:59 - mmengine - INFO - Iter(train) [7230/9216] lr: 2.3403e-06 eta: 2:28:40 time: 4.4098 data_time: 0.0146 memory: 15410 loss: 0.1175 2024/08/18 13:12:42 - mmengine - INFO - Iter(train) [7240/9216] lr: 2.3178e-06 eta: 2:27:54 time: 4.3193 data_time: 0.0144 memory: 15276 loss: 0.1095 2024/08/18 13:13:24 - mmengine - INFO - Iter(train) [7250/9216] lr: 2.2953e-06 eta: 2:27:09 time: 4.1659 data_time: 0.0148 memory: 14934 loss: 0.0900 2024/08/18 13:14:05 - mmengine - INFO - Iter(train) [7260/9216] lr: 2.2730e-06 eta: 2:26:23 time: 4.0773 data_time: 0.0145 memory: 14791 loss: 0.2116 2024/08/18 13:14:44 - mmengine - INFO - Iter(train) [7270/9216] lr: 2.2507e-06 eta: 2:25:36 time: 3.9359 data_time: 0.0139 memory: 14813 loss: 0.0641 2024/08/18 13:15:21 - mmengine - INFO - Iter(train) [7280/9216] lr: 2.2285e-06 eta: 2:24:49 time: 3.7340 data_time: 0.0129 memory: 14473 loss: 0.1489 2024/08/18 13:15:56 - mmengine - INFO - Iter(train) [7290/9216] lr: 2.2065e-06 eta: 2:24:02 time: 3.5063 data_time: 0.0131 memory: 14248 loss: 0.1049 2024/08/18 13:16:30 - mmengine - INFO - Iter(train) [7300/9216] lr: 
2.1845e-06 eta: 2:23:14 time: 3.3202 data_time: 0.0125 memory: 13926 loss: 0.1204 2024/08/18 13:17:01 - mmengine - INFO - Iter(train) [7310/9216] lr: 2.1626e-06 eta: 2:22:26 time: 3.1770 data_time: 0.0122 memory: 13812 loss: 0.0983 2024/08/18 13:17:33 - mmengine - INFO - Iter(train) [7320/9216] lr: 2.1409e-06 eta: 2:21:37 time: 3.1311 data_time: 0.0124 memory: 13476 loss: 0.0764 2024/08/18 13:18:04 - mmengine - INFO - Iter(train) [7330/9216] lr: 2.1192e-06 eta: 2:20:49 time: 3.1093 data_time: 0.0122 memory: 13500 loss: 0.0621 2024/08/18 13:18:35 - mmengine - INFO - Iter(train) [7340/9216] lr: 2.0976e-06 eta: 2:20:01 time: 3.0706 data_time: 0.0132 memory: 13441 loss: 0.1027 2024/08/18 13:19:05 - mmengine - INFO - Iter(train) [7350/9216] lr: 2.0761e-06 eta: 2:19:12 time: 3.0392 data_time: 0.0125 memory: 13329 loss: 0.0663 2024/08/18 13:19:35 - mmengine - INFO - Iter(train) [7360/9216] lr: 2.0547e-06 eta: 2:18:24 time: 2.9941 data_time: 0.0122 memory: 13243 loss: 0.1247 2024/08/18 13:20:04 - mmengine - INFO - Iter(train) [7370/9216] lr: 2.0334e-06 eta: 2:17:35 time: 2.8956 data_time: 0.0125 memory: 13069 loss: 0.1141 2024/08/18 13:20:32 - mmengine - INFO - Iter(train) [7380/9216] lr: 2.0122e-06 eta: 2:16:46 time: 2.8117 data_time: 0.0120 memory: 12950 loss: 0.0990 2024/08/18 13:20:58 - mmengine - INFO - Iter(train) [7390/9216] lr: 1.9911e-06 eta: 2:15:57 time: 2.5706 data_time: 0.0119 memory: 12621 loss: 0.1313 2024/08/18 13:21:20 - mmengine - INFO - Iter(train) [7400/9216] lr: 1.9701e-06 eta: 2:15:07 time: 2.2811 data_time: 0.0106 memory: 12185 loss: 0.0603 2024/08/18 13:21:42 - mmengine - INFO - Iter(train) [7410/9216] lr: 1.9493e-06 eta: 2:14:16 time: 2.1127 data_time: 0.0111 memory: 11758 loss: 0.0750 2024/08/18 13:21:59 - mmengine - INFO - Iter(train) [7420/9216] lr: 1.9285e-06 eta: 2:13:25 time: 1.7596 data_time: 0.0100 memory: 11400 loss: 0.0757 2024/08/18 13:22:39 - mmengine - INFO - Iter(train) [7430/9216] lr: 1.9078e-06 eta: 2:12:39 time: 4.0302 data_time: 
0.0112 memory: 20866 loss: 0.0918 2024/08/18 13:23:34 - mmengine - INFO - Iter(train) [7440/9216] lr: 1.8872e-06 eta: 2:11:57 time: 5.4650 data_time: 0.0148 memory: 17409 loss: 0.0882 2024/08/18 13:24:26 - mmengine - INFO - Iter(train) [7450/9216] lr: 1.8667e-06 eta: 2:11:15 time: 5.2334 data_time: 0.0147 memory: 16589 loss: 0.0573 2024/08/18 13:25:18 - mmengine - INFO - Iter(train) [7460/9216] lr: 1.8463e-06 eta: 2:10:32 time: 5.1152 data_time: 0.0148 memory: 16535 loss: 0.0404 2024/08/18 13:26:07 - mmengine - INFO - Iter(train) [7470/9216] lr: 1.8260e-06 eta: 2:09:48 time: 4.9856 data_time: 0.0144 memory: 16133 loss: 0.0723 2024/08/18 13:26:57 - mmengine - INFO - Iter(train) [7480/9216] lr: 1.8058e-06 eta: 2:09:05 time: 4.9676 data_time: 0.0140 memory: 16245 loss: 0.0496 2024/08/18 13:27:46 - mmengine - INFO - Iter(train) [7490/9216] lr: 1.7857e-06 eta: 2:08:21 time: 4.9287 data_time: 0.0136 memory: 16133 loss: 0.0369 2024/08/18 13:28:37 - mmengine - INFO - Iter(train) [7500/9216] lr: 1.7657e-06 eta: 2:07:38 time: 5.0116 data_time: 0.0154 memory: 16173 loss: 0.0780 2024/08/18 13:29:25 - mmengine - INFO - Iter(train) [7510/9216] lr: 1.7458e-06 eta: 2:06:54 time: 4.8464 data_time: 0.0138 memory: 15983 loss: 0.0508 2024/08/18 13:30:16 - mmengine - INFO - Iter(train) [7520/9216] lr: 1.7260e-06 eta: 2:06:11 time: 5.0819 data_time: 0.0154 memory: 16544 loss: 0.1176 2024/08/18 13:31:04 - mmengine - INFO - Iter(train) [7530/9216] lr: 1.7064e-06 eta: 2:05:27 time: 4.8141 data_time: 0.0146 memory: 15959 loss: 0.0577 2024/08/18 13:31:52 - mmengine - INFO - Iter(train) [7540/9216] lr: 1.6868e-06 eta: 2:04:43 time: 4.8172 data_time: 0.0149 memory: 16026 loss: 0.0676 2024/08/18 13:32:40 - mmengine - INFO - Iter(train) [7550/9216] lr: 1.6673e-06 eta: 2:03:59 time: 4.8224 data_time: 0.0147 memory: 16112 loss: 0.0507 2024/08/18 13:33:28 - mmengine - INFO - Iter(train) [7560/9216] lr: 1.6479e-06 eta: 2:03:15 time: 4.7823 data_time: 0.0148 memory: 15796 loss: 0.0879 2024/08/18 
13:34:17 - mmengine - INFO - Iter(train) [7570/9216] lr: 1.6286e-06 eta: 2:02:32 time: 4.8696 data_time: 0.0154 memory: 16740 loss: 0.0411 2024/08/18 13:35:04 - mmengine - INFO - Iter(train) [7580/9216] lr: 1.6095e-06 eta: 2:01:47 time: 4.7377 data_time: 0.0142 memory: 15933 loss: 0.0780 2024/08/18 13:35:51 - mmengine - INFO - Iter(train) [7590/9216] lr: 1.5904e-06 eta: 2:01:03 time: 4.7155 data_time: 0.0150 memory: 15758 loss: 0.0915 2024/08/18 13:36:38 - mmengine - INFO - Iter(train) [7600/9216] lr: 1.5714e-06 eta: 2:00:19 time: 4.6879 data_time: 0.0145 memory: 15732 loss: 0.0847 2024/08/18 13:37:25 - mmengine - INFO - Iter(train) [7610/9216] lr: 1.5526e-06 eta: 1:59:35 time: 4.6500 data_time: 0.0146 memory: 15616 loss: 0.0761 2024/08/18 13:38:11 - mmengine - INFO - Iter(train) [7620/9216] lr: 1.5338e-06 eta: 1:58:50 time: 4.6121 data_time: 0.0145 memory: 15705 loss: 0.1438 2024/08/18 13:38:57 - mmengine - INFO - Iter(train) [7630/9216] lr: 1.5152e-06 eta: 1:58:06 time: 4.5980 data_time: 0.0146 memory: 15803 loss: 0.0624 2024/08/18 13:39:43 - mmengine - INFO - Iter(train) [7640/9216] lr: 1.4966e-06 eta: 1:57:22 time: 4.6113 data_time: 0.0146 memory: 15671 loss: 0.0799 2024/08/18 13:40:28 - mmengine - INFO - Iter(train) [7650/9216] lr: 1.4782e-06 eta: 1:56:37 time: 4.5445 data_time: 0.0145 memory: 15559 loss: 0.0805 2024/08/18 13:41:15 - mmengine - INFO - Iter(train) [7660/9216] lr: 1.4599e-06 eta: 1:55:53 time: 4.6761 data_time: 0.0151 memory: 15793 loss: 0.0773 2024/08/18 13:42:01 - mmengine - INFO - Iter(train) [7670/9216] lr: 1.4416e-06 eta: 1:55:08 time: 4.5948 data_time: 0.0146 memory: 15538 loss: 0.0742 2024/08/18 13:42:47 - mmengine - INFO - Iter(train) [7680/9216] lr: 1.4235e-06 eta: 1:54:24 time: 4.6092 data_time: 0.0152 memory: 15557 loss: 0.0698 2024/08/18 13:43:33 - mmengine - INFO - Iter(train) [7690/9216] lr: 1.4055e-06 eta: 1:53:40 time: 4.5867 data_time: 0.0162 memory: 15710 loss: 0.1144 2024/08/18 13:44:19 - mmengine - INFO - Iter(train) 
[7700/9216] lr: 1.3876e-06 eta: 1:52:55 time: 4.5897 data_time: 0.0143 memory: 15751 loss: 0.1108 2024/08/18 13:45:05 - mmengine - INFO - Iter(train) [7710/9216] lr: 1.3698e-06 eta: 1:52:11 time: 4.5529 data_time: 0.0139 memory: 15427 loss: 0.1293 2024/08/18 13:45:50 - mmengine - INFO - Iter(train) [7720/9216] lr: 1.3521e-06 eta: 1:51:26 time: 4.5296 data_time: 0.0150 memory: 15533 loss: 0.0926 2024/08/18 13:46:35 - mmengine - INFO - Iter(train) [7730/9216] lr: 1.3345e-06 eta: 1:50:41 time: 4.5097 data_time: 0.0147 memory: 15346 loss: 0.1197 2024/08/18 13:47:19 - mmengine - INFO - Iter(train) [7740/9216] lr: 1.3170e-06 eta: 1:49:57 time: 4.4286 data_time: 0.0148 memory: 15379 loss: 0.1072 2024/08/18 13:48:03 - mmengine - INFO - Iter(train) [7750/9216] lr: 1.2996e-06 eta: 1:49:12 time: 4.3639 data_time: 0.0143 memory: 15256 loss: 0.2206 2024/08/18 13:48:46 - mmengine - INFO - Iter(train) [7760/9216] lr: 1.2823e-06 eta: 1:48:27 time: 4.3426 data_time: 0.0152 memory: 15287 loss: 0.1363 2024/08/18 13:49:28 - mmengine - INFO - Iter(train) [7770/9216] lr: 1.2652e-06 eta: 1:47:42 time: 4.2051 data_time: 0.0145 memory: 15049 loss: 0.1648 2024/08/18 13:50:09 - mmengine - INFO - Iter(train) [7780/9216] lr: 1.2481e-06 eta: 1:46:56 time: 4.0700 data_time: 0.0144 memory: 14822 loss: 0.1239 2024/08/18 13:50:48 - mmengine - INFO - Iter(train) [7790/9216] lr: 1.2312e-06 eta: 1:46:10 time: 3.9279 data_time: 0.0137 memory: 14643 loss: 0.1332 2024/08/18 13:51:26 - mmengine - INFO - Iter(train) [7800/9216] lr: 1.2143e-06 eta: 1:45:25 time: 3.7852 data_time: 0.0132 memory: 14619 loss: 0.0642 2024/08/18 13:52:03 - mmengine - INFO - Iter(train) [7810/9216] lr: 1.1976e-06 eta: 1:44:38 time: 3.6444 data_time: 0.0129 memory: 14491 loss: 0.1071 2024/08/18 13:52:37 - mmengine - INFO - Iter(train) [7820/9216] lr: 1.1810e-06 eta: 1:43:52 time: 3.3902 data_time: 0.0135 memory: 13956 loss: 0.1014 2024/08/18 13:53:09 - mmengine - INFO - Iter(train) [7830/9216] lr: 1.1645e-06 eta: 1:43:05 time: 
3.2328 data_time: 0.0123 memory: 13618 loss: 0.0769 2024/08/18 13:53:41 - mmengine - INFO - Iter(train) [7840/9216] lr: 1.1481e-06 eta: 1:42:18 time: 3.1939 data_time: 0.0130 memory: 13812 loss: 0.0787 2024/08/18 13:54:12 - mmengine - INFO - Iter(train) [7850/9216] lr: 1.1318e-06 eta: 1:41:31 time: 3.1395 data_time: 0.0131 memory: 13388 loss: 0.0594 2024/08/18 13:54:43 - mmengine - INFO - Iter(train) [7860/9216] lr: 1.1156e-06 eta: 1:40:44 time: 3.0976 data_time: 0.0126 memory: 13443 loss: 0.0506 2024/08/18 13:55:13 - mmengine - INFO - Iter(train) [7870/9216] lr: 1.0995e-06 eta: 1:39:57 time: 3.0151 data_time: 0.0124 memory: 13332 loss: 0.1363 2024/08/18 13:55:43 - mmengine - INFO - Iter(train) [7880/9216] lr: 1.0836e-06 eta: 1:39:10 time: 2.9330 data_time: 0.0127 memory: 13164 loss: 0.0898 2024/08/18 13:56:12 - mmengine - INFO - Iter(train) [7890/9216] lr: 1.0677e-06 eta: 1:38:23 time: 2.9238 data_time: 0.0127 memory: 13071 loss: 0.1488 2024/08/18 13:56:39 - mmengine - INFO - Iter(train) [7900/9216] lr: 1.0520e-06 eta: 1:37:36 time: 2.7454 data_time: 0.0121 memory: 12841 loss: 0.1688 2024/08/18 13:57:04 - mmengine - INFO - Iter(train) [7910/9216] lr: 1.0363e-06 eta: 1:36:48 time: 2.4295 data_time: 0.0113 memory: 12326 loss: 0.0693 2024/08/18 13:57:26 - mmengine - INFO - Iter(train) [7920/9216] lr: 1.0208e-06 eta: 1:36:00 time: 2.2067 data_time: 0.0109 memory: 11901 loss: 0.1015 2024/08/18 13:57:44 - mmengine - INFO - Iter(train) [7930/9216] lr: 1.0054e-06 eta: 1:35:11 time: 1.8529 data_time: 0.0105 memory: 11662 loss: 0.1609 2024/08/18 13:58:20 - mmengine - INFO - Iter(train) [7940/9216] lr: 9.9009e-07 eta: 1:34:25 time: 3.5328 data_time: 0.0105 memory: 21965 loss: 0.1057 2024/08/18 13:59:20 - mmengine - INFO - Iter(train) [7950/9216] lr: 9.7490e-07 eta: 1:33:43 time: 6.0336 data_time: 0.0147 memory: 19143 loss: 0.0906 2024/08/18 14:00:15 - mmengine - INFO - Iter(train) [7960/9216] lr: 9.5983e-07 eta: 1:33:01 time: 5.5331 data_time: 0.0149 memory: 17020 loss: 
0.1552 2024/08/18 14:01:06 - mmengine - INFO - Iter(train) [7970/9216] lr: 9.4486e-07 eta: 1:32:17 time: 5.0513 data_time: 0.0150 memory: 16390 loss: 0.0881 2024/08/18 14:01:56 - mmengine - INFO - Iter(train) [7980/9216] lr: 9.3000e-07 eta: 1:31:33 time: 4.9890 data_time: 0.0163 memory: 16491 loss: 0.0470 2024/08/18 14:02:45 - mmengine - INFO - Iter(train) [7990/9216] lr: 9.1526e-07 eta: 1:30:50 time: 4.9260 data_time: 0.0144 memory: 16252 loss: 0.0745 2024/08/18 14:03:34 - mmengine - INFO - Exp name: internvl_v2_internlm2_2b_qlora_finetune_copy_20240818_040903 2024/08/18 14:03:34 - mmengine - INFO - Iter(train) [8000/9216] lr: 9.0063e-07 eta: 1:30:06 time: 4.8822 data_time: 0.0150 memory: 16128 loss: 0.0669 2024/08/18 14:03:34 - mmengine - INFO - Saving checkpoint at 8000 iterations 2024/08/18 14:04:26 - mmengine - INFO - Iter(train) [8010/9216] lr: 8.8611e-07 eta: 1:29:23 time: 5.2444 data_time: 0.3460 memory: 16199 loss: 0.0731 2024/08/18 14:05:15 - mmengine - INFO - Iter(train) [8020/9216] lr: 8.7171e-07 eta: 1:28:39 time: 4.8442 data_time: 0.0154 memory: 16043 loss: 0.0470 2024/08/18 14:06:04 - mmengine - INFO - Iter(train) [8030/9216] lr: 8.5741e-07 eta: 1:27:55 time: 4.9297 data_time: 0.0152 memory: 16535 loss: 0.0505 2024/08/18 14:06:53 - mmengine - INFO - Iter(train) [8040/9216] lr: 8.4323e-07 eta: 1:27:11 time: 4.9481 data_time: 0.0147 memory: 16064 loss: 0.0607 2024/08/18 14:07:42 - mmengine - INFO - Iter(train) [8050/9216] lr: 8.2917e-07 eta: 1:26:27 time: 4.8156 data_time: 0.0146 memory: 15921 loss: 0.0708 2024/08/18 14:08:31 - mmengine - INFO - Iter(train) [8060/9216] lr: 8.1521e-07 eta: 1:25:44 time: 4.9694 data_time: 0.0152 memory: 16012 loss: 0.0940 2024/08/18 14:09:20 - mmengine - INFO - Iter(train) [8070/9216] lr: 8.0137e-07 eta: 1:25:00 time: 4.8662 data_time: 0.0147 memory: 15853 loss: 0.0986 2024/08/18 14:10:09 - mmengine - INFO - Iter(train) [8080/9216] lr: 7.8764e-07 eta: 1:24:16 time: 4.8646 data_time: 0.0151 memory: 16540 loss: 0.0843 
2024/08/18 14:10:56 - mmengine - INFO - Iter(train) [8090/9216] lr: 7.7403e-07 eta: 1:23:32 time: 4.7637 data_time: 0.0147 memory: 15812 loss: 0.0607 2024/08/18 14:11:44 - mmengine - INFO - Iter(train) [8100/9216] lr: 7.6053e-07 eta: 1:22:48 time: 4.8251 data_time: 0.0148 memory: 15870 loss: 0.0602 2024/08/18 14:12:32 - mmengine - INFO - Iter(train) [8110/9216] lr: 7.4715e-07 eta: 1:22:04 time: 4.7252 data_time: 0.0147 memory: 15713 loss: 0.0766 2024/08/18 14:13:19 - mmengine - INFO - Iter(train) [8120/9216] lr: 7.3388e-07 eta: 1:21:19 time: 4.7022 data_time: 0.0148 memory: 15841 loss: 0.0667 2024/08/18 14:14:06 - mmengine - INFO - Iter(train) [8130/9216] lr: 7.2072e-07 eta: 1:20:35 time: 4.7040 data_time: 0.0153 memory: 15703 loss: 0.0605 2024/08/18 14:14:52 - mmengine - INFO - Iter(train) [8140/9216] lr: 7.0768e-07 eta: 1:19:51 time: 4.6094 data_time: 0.0149 memory: 15777 loss: 0.0586 2024/08/18 14:15:38 - mmengine - INFO - Iter(train) [8150/9216] lr: 6.9475e-07 eta: 1:19:07 time: 4.6418 data_time: 0.0149 memory: 15691 loss: 0.0653 2024/08/18 14:16:24 - mmengine - INFO - Iter(train) [8160/9216] lr: 6.8194e-07 eta: 1:18:22 time: 4.5963 data_time: 0.0141 memory: 15943 loss: 0.1017 2024/08/18 14:17:10 - mmengine - INFO - Iter(train) [8170/9216] lr: 6.6924e-07 eta: 1:17:38 time: 4.5435 data_time: 0.0145 memory: 15614 loss: 0.1488 2024/08/18 14:17:56 - mmengine - INFO - Iter(train) [8180/9216] lr: 6.5666e-07 eta: 1:16:53 time: 4.6076 data_time: 0.0142 memory: 15662 loss: 0.0780 2024/08/18 14:18:42 - mmengine - INFO - Iter(train) [8190/9216] lr: 6.4419e-07 eta: 1:16:09 time: 4.6737 data_time: 0.0138 memory: 15767 loss: 0.1039 2024/08/18 14:19:28 - mmengine - INFO - Iter(train) [8200/9216] lr: 6.3184e-07 eta: 1:15:25 time: 4.5326 data_time: 0.0148 memory: 15696 loss: 0.0893 2024/08/18 14:20:12 - mmengine - INFO - Iter(train) [8210/9216] lr: 6.1961e-07 eta: 1:14:40 time: 4.4349 data_time: 0.0150 memory: 15562 loss: 0.0954 2024/08/18 14:20:57 - mmengine - INFO - 
Iter(train) [8220/9216] lr: 6.0749e-07 eta: 1:13:56 time: 4.4757 data_time: 0.0148 memory: 15460 loss: 0.0836 2024/08/18 14:21:41 - mmengine - INFO - Iter(train) [8230/9216] lr: 5.9549e-07 eta: 1:13:11 time: 4.4382 data_time: 0.0147 memory: 15465 loss: 0.1017 2024/08/18 14:22:25 - mmengine - INFO - Iter(train) [8240/9216] lr: 5.8360e-07 eta: 1:12:27 time: 4.4152 data_time: 0.0150 memory: 15294 loss: 0.1080 2024/08/18 14:23:10 - mmengine - INFO - Iter(train) [8250/9216] lr: 5.7183e-07 eta: 1:11:42 time: 4.4608 data_time: 0.0148 memory: 15472 loss: 0.0902 2024/08/18 14:23:54 - mmengine - INFO - Iter(train) [8260/9216] lr: 5.6017e-07 eta: 1:10:57 time: 4.4172 data_time: 0.0146 memory: 15538 loss: 0.1323 2024/08/18 14:24:38 - mmengine - INFO - Iter(train) [8270/9216] lr: 5.4863e-07 eta: 1:10:13 time: 4.3908 data_time: 0.0144 memory: 15213 loss: 0.1040 2024/08/18 14:25:22 - mmengine - INFO - Iter(train) [8280/9216] lr: 5.3721e-07 eta: 1:09:28 time: 4.3705 data_time: 0.0147 memory: 15158 loss: 0.1487 2024/08/18 14:26:05 - mmengine - INFO - Iter(train) [8290/9216] lr: 5.2591e-07 eta: 1:08:44 time: 4.3049 data_time: 0.0144 memory: 15203 loss: 0.1283 2024/08/18 14:26:47 - mmengine - INFO - Iter(train) [8300/9216] lr: 5.1472e-07 eta: 1:07:59 time: 4.1701 data_time: 0.0147 memory: 15017 loss: 0.0908 2024/08/18 14:27:27 - mmengine - INFO - Iter(train) [8310/9216] lr: 5.0365e-07 eta: 1:07:14 time: 4.0851 data_time: 0.0139 memory: 14818 loss: 0.0956 2024/08/18 14:28:08 - mmengine - INFO - Iter(train) [8320/9216] lr: 4.9270e-07 eta: 1:06:29 time: 4.0698 data_time: 0.0141 memory: 14965 loss: 0.1295 2024/08/18 14:28:47 - mmengine - INFO - Iter(train) [8330/9216] lr: 4.8186e-07 eta: 1:05:44 time: 3.8637 data_time: 0.0134 memory: 14657 loss: 0.1018 2024/08/18 14:29:22 - mmengine - INFO - Iter(train) [8340/9216] lr: 4.7114e-07 eta: 1:04:58 time: 3.5557 data_time: 0.0133 memory: 14306 loss: 0.1152 2024/08/18 14:29:55 - mmengine - INFO - Iter(train) [8350/9216] lr: 4.6054e-07 eta: 
1:04:12 time: 3.2960 data_time: 0.0127 memory: 14132 loss: 0.1015 2024/08/18 14:30:30 - mmengine - INFO - Iter(train) [8360/9216] lr: 4.5006e-07 eta: 1:03:27 time: 3.4253 data_time: 0.0126 memory: 13848 loss: 0.1000 2024/08/18 14:31:01 - mmengine - INFO - Iter(train) [8370/9216] lr: 4.3970e-07 eta: 1:02:41 time: 3.1313 data_time: 0.0120 memory: 13566 loss: 0.0797 2024/08/18 14:31:32 - mmengine - INFO - Iter(train) [8380/9216] lr: 4.2945e-07 eta: 1:01:55 time: 3.0901 data_time: 0.0123 memory: 13414 loss: 0.0830 2024/08/18 14:32:02 - mmengine - INFO - Iter(train) [8390/9216] lr: 4.1932e-07 eta: 1:01:09 time: 3.0539 data_time: 0.0123 memory: 13268 loss: 0.0710 2024/08/18 14:32:32 - mmengine - INFO - Iter(train) [8400/9216] lr: 4.0931e-07 eta: 1:00:24 time: 2.9843 data_time: 0.0121 memory: 13287 loss: 0.0787 2024/08/18 14:33:01 - mmengine - INFO - Iter(train) [8410/9216] lr: 3.9942e-07 eta: 0:59:38 time: 2.8799 data_time: 0.0129 memory: 13204 loss: 0.1286 2024/08/18 14:33:28 - mmengine - INFO - Iter(train) [8420/9216] lr: 3.8965e-07 eta: 0:58:52 time: 2.7538 data_time: 0.0120 memory: 12995 loss: 0.1455 2024/08/18 14:33:52 - mmengine - INFO - Iter(train) [8430/9216] lr: 3.7999e-07 eta: 0:58:05 time: 2.3413 data_time: 0.0109 memory: 12375 loss: 0.0865 2024/08/18 14:34:12 - mmengine - INFO - Iter(train) [8440/9216] lr: 3.7046e-07 eta: 0:57:19 time: 2.0166 data_time: 0.0105 memory: 11834 loss: 0.1395 2024/08/18 14:34:36 - mmengine - INFO - Iter(train) [8450/9216] lr: 3.6104e-07 eta: 0:56:33 time: 2.4043 data_time: 0.0096 memory: 21992 loss: 0.1076 2024/08/18 14:35:34 - mmengine - INFO - Iter(train) [8460/9216] lr: 3.5174e-07 eta: 0:55:50 time: 5.7929 data_time: 0.0147 memory: 19273 loss: 0.0921 2024/08/18 14:36:26 - mmengine - INFO - Iter(train) [8470/9216] lr: 3.4256e-07 eta: 0:55:06 time: 5.2333 data_time: 0.0150 memory: 16852 loss: 0.0665 2024/08/18 14:37:17 - mmengine - INFO - Iter(train) [8480/9216] lr: 3.3350e-07 eta: 0:54:22 time: 5.0295 data_time: 0.0147 memory: 
16295 loss: 0.0756 2024/08/18 14:38:06 - mmengine - INFO - Iter(train) [8490/9216] lr: 3.2456e-07 eta: 0:53:38 time: 4.9058 data_time: 0.0148 memory: 16247 loss: 0.0532 2024/08/18 14:38:55 - mmengine - INFO - Iter(train) [8500/9216] lr: 3.1574e-07 eta: 0:52:54 time: 4.9327 data_time: 0.0150 memory: 16335 loss: 0.0776 2024/08/18 14:39:44 - mmengine - INFO - Iter(train) [8510/9216] lr: 3.0704e-07 eta: 0:52:10 time: 4.9233 data_time: 0.0151 memory: 16691 loss: 0.0800 2024/08/18 14:40:33 - mmengine - INFO - Iter(train) [8520/9216] lr: 2.9846e-07 eta: 0:51:26 time: 4.8926 data_time: 0.0151 memory: 16102 loss: 0.0481 2024/08/18 14:41:22 - mmengine - INFO - Iter(train) [8530/9216] lr: 2.9000e-07 eta: 0:50:42 time: 4.9189 data_time: 0.0164 memory: 16062 loss: 0.0400 2024/08/18 14:42:12 - mmengine - INFO - Iter(train) [8540/9216] lr: 2.8166e-07 eta: 0:49:59 time: 4.9359 data_time: 0.0150 memory: 16169 loss: 0.0507 2024/08/18 14:43:01 - mmengine - INFO - Iter(train) [8550/9216] lr: 2.7344e-07 eta: 0:49:15 time: 4.9311 data_time: 0.0151 memory: 16116 loss: 0.0535 2024/08/18 14:43:50 - mmengine - INFO - Iter(train) [8560/9216] lr: 2.6534e-07 eta: 0:48:31 time: 4.9021 data_time: 0.0154 memory: 16024 loss: 0.0589 2024/08/18 14:44:39 - mmengine - INFO - Iter(train) [8570/9216] lr: 2.5735e-07 eta: 0:47:46 time: 4.8526 data_time: 0.0146 memory: 15988 loss: 0.0646 2024/08/18 14:45:27 - mmengine - INFO - Iter(train) [8580/9216] lr: 2.4949e-07 eta: 0:47:02 time: 4.8338 data_time: 0.0149 memory: 16154 loss: 0.0920 2024/08/18 14:46:15 - mmengine - INFO - Iter(train) [8590/9216] lr: 2.4175e-07 eta: 0:46:18 time: 4.8094 data_time: 0.0148 memory: 16043 loss: 0.0780 2024/08/18 14:47:02 - mmengine - INFO - Iter(train) [8600/9216] lr: 2.3413e-07 eta: 0:45:34 time: 4.7334 data_time: 0.0147 memory: 15767 loss: 0.0619 2024/08/18 14:47:50 - mmengine - INFO - Iter(train) [8610/9216] lr: 2.2663e-07 eta: 0:44:50 time: 4.7904 data_time: 0.0149 memory: 15841 loss: 0.0678 2024/08/18 14:48:37 - mmengine 
- INFO - Iter(train) [8620/9216] lr: 2.1926e-07 eta: 0:44:06 time: 4.7218 data_time: 0.0148 memory: 15807 loss: 0.0668 2024/08/18 14:49:25 - mmengine - INFO - Iter(train) [8630/9216] lr: 2.1200e-07 eta: 0:43:22 time: 4.7896 data_time: 0.0150 memory: 16033 loss: 0.0672 2024/08/18 14:50:12 - mmengine - INFO - Iter(train) [8640/9216] lr: 2.0486e-07 eta: 0:42:37 time: 4.6691 data_time: 0.0149 memory: 15724 loss: 0.0797 2024/08/18 14:50:58 - mmengine - INFO - Iter(train) [8650/9216] lr: 1.9784e-07 eta: 0:41:53 time: 4.6209 data_time: 0.0147 memory: 15689 loss: 0.0726 2024/08/18 14:51:45 - mmengine - INFO - Iter(train) [8660/9216] lr: 1.9095e-07 eta: 0:41:09 time: 4.6557 data_time: 0.0147 memory: 15779 loss: 0.0729 2024/08/18 14:52:31 - mmengine - INFO - Iter(train) [8670/9216] lr: 1.8418e-07 eta: 0:40:24 time: 4.6554 data_time: 0.0156 memory: 15755 loss: 0.1111 2024/08/18 14:53:18 - mmengine - INFO - Iter(train) [8680/9216] lr: 1.7752e-07 eta: 0:39:40 time: 4.6185 data_time: 0.0148 memory: 15626 loss: 0.1079 2024/08/18 14:54:04 - mmengine - INFO - Iter(train) [8690/9216] lr: 1.7099e-07 eta: 0:38:56 time: 4.6429 data_time: 0.0151 memory: 15545 loss: 0.1046 2024/08/18 14:54:50 - mmengine - INFO - Iter(train) [8700/9216] lr: 1.6458e-07 eta: 0:38:12 time: 4.6427 data_time: 0.0150 memory: 15701 loss: 0.1058 2024/08/18 14:55:36 - mmengine - INFO - Iter(train) [8710/9216] lr: 1.5829e-07 eta: 0:37:27 time: 4.5441 data_time: 0.0148 memory: 15526 loss: 0.1007 2024/08/18 14:56:21 - mmengine - INFO - Iter(train) [8720/9216] lr: 1.5213e-07 eta: 0:36:43 time: 4.5025 data_time: 0.0148 memory: 15398 loss: 0.0852 2024/08/18 14:57:05 - mmengine - INFO - Iter(train) [8730/9216] lr: 1.4608e-07 eta: 0:35:58 time: 4.4165 data_time: 0.0148 memory: 15320 loss: 0.0883 2024/08/18 14:57:49 - mmengine - INFO - Iter(train) [8740/9216] lr: 1.4016e-07 eta: 0:35:14 time: 4.3821 data_time: 0.0146 memory: 15279 loss: 0.0861 2024/08/18 14:58:33 - mmengine - INFO - Iter(train) [8750/9216] lr: 1.3435e-07 
eta: 0:34:30 time: 4.4488 data_time: 0.0147 memory: 15246 loss: 0.0736 2024/08/18 14:59:17 - mmengine - INFO - Iter(train) [8760/9216] lr: 1.2867e-07 eta: 0:33:45 time: 4.3651 data_time: 0.0148 memory: 15334 loss: 0.1242 2024/08/18 15:00:00 - mmengine - INFO - Iter(train) [8770/9216] lr: 1.2312e-07 eta: 0:33:01 time: 4.3098 data_time: 0.0149 memory: 15151 loss: 0.1534 2024/08/18 15:00:45 - mmengine - INFO - Iter(train) [8780/9216] lr: 1.1768e-07 eta: 0:32:16 time: 4.5087 data_time: 0.0143 memory: 15149 loss: 0.1190 2024/08/18 15:01:27 - mmengine - INFO - Iter(train) [8790/9216] lr: 1.1237e-07 eta: 0:31:32 time: 4.1728 data_time: 0.0145 memory: 15108 loss: 0.1328 2024/08/18 15:02:08 - mmengine - INFO - Iter(train) [8800/9216] lr: 1.0717e-07 eta: 0:30:47 time: 4.0865 data_time: 0.0141 memory: 14889 loss: 0.1536 2024/08/18 15:02:48 - mmengine - INFO - Iter(train) [8810/9216] lr: 1.0210e-07 eta: 0:30:02 time: 4.0303 data_time: 0.0138 memory: 15115 loss: 0.1430 2024/08/18 15:03:27 - mmengine - INFO - Iter(train) [8820/9216] lr: 9.7156e-08 eta: 0:29:18 time: 3.8732 data_time: 0.0136 memory: 14604 loss: 0.0945 2024/08/18 15:04:03 - mmengine - INFO - Iter(train) [8830/9216] lr: 9.2331e-08 eta: 0:28:33 time: 3.6630 data_time: 0.0131 memory: 14388 loss: 0.1028 2024/08/18 15:04:37 - mmengine - INFO - Iter(train) [8840/9216] lr: 8.7628e-08 eta: 0:27:48 time: 3.3606 data_time: 0.0129 memory: 13985 loss: 0.0869 2024/08/18 15:05:09 - mmengine - INFO - Iter(train) [8850/9216] lr: 8.3047e-08 eta: 0:27:03 time: 3.2390 data_time: 0.0125 memory: 13778 loss: 0.0612 2024/08/18 15:05:42 - mmengine - INFO - Iter(train) [8860/9216] lr: 7.8589e-08 eta: 0:26:18 time: 3.2081 data_time: 0.0133 memory: 13850 loss: 0.1272 2024/08/18 15:06:13 - mmengine - INFO - Iter(train) [8870/9216] lr: 7.4253e-08 eta: 0:25:34 time: 3.1384 data_time: 0.0122 memory: 13504 loss: 0.0803 2024/08/18 15:06:45 - mmengine - INFO - Iter(train) [8880/9216] lr: 7.0040e-08 eta: 0:24:49 time: 3.2176 data_time: 0.0124 
memory: 13428 loss: 0.0567 2024/08/18 15:07:16 - mmengine - INFO - Iter(train) [8890/9216] lr: 6.5950e-08 eta: 0:24:04 time: 3.0591 data_time: 0.0123 memory: 13372 loss: 0.0776 2024/08/18 15:07:46 - mmengine - INFO - Iter(train) [8900/9216] lr: 6.1982e-08 eta: 0:23:19 time: 3.0051 data_time: 0.0121 memory: 13194 loss: 0.0950 2024/08/18 15:08:15 - mmengine - INFO - Iter(train) [8910/9216] lr: 5.8137e-08 eta: 0:22:34 time: 2.9515 data_time: 0.0120 memory: 13187 loss: 0.1027 2024/08/18 15:08:44 - mmengine - INFO - Iter(train) [8920/9216] lr: 5.4414e-08 eta: 0:21:50 time: 2.8884 data_time: 0.0121 memory: 13019 loss: 0.0903 2024/08/18 15:09:10 - mmengine - INFO - Iter(train) [8930/9216] lr: 5.0815e-08 eta: 0:21:05 time: 2.6313 data_time: 0.0120 memory: 12758 loss: 0.2508 2024/08/18 15:09:33 - mmengine - INFO - Iter(train) [8940/9216] lr: 4.7338e-08 eta: 0:20:20 time: 2.2883 data_time: 0.0107 memory: 12219 loss: 0.0711 2024/08/18 15:09:54 - mmengine - INFO - Iter(train) [8950/9216] lr: 4.3984e-08 eta: 0:19:35 time: 2.0517 data_time: 0.0112 memory: 11807 loss: 0.1595 2024/08/18 15:10:08 - mmengine - INFO - Iter(train) [8960/9216] lr: 4.0754e-08 eta: 0:18:50 time: 1.4570 data_time: 0.0100 memory: 11376 loss: 0.1062 2024/08/18 15:11:07 - mmengine - INFO - Iter(train) [8970/9216] lr: 3.7646e-08 eta: 0:18:06 time: 5.8600 data_time: 0.0146 memory: 21044 loss: 0.0618 2024/08/18 15:11:58 - mmengine - INFO - Iter(train) [8980/9216] lr: 3.4661e-08 eta: 0:17:22 time: 5.1287 data_time: 0.0155 memory: 16323 loss: 0.0740 2024/08/18 15:12:48 - mmengine - INFO - Iter(train) [8990/9216] lr: 3.1799e-08 eta: 0:16:38 time: 5.0149 data_time: 0.0148 memory: 16214 loss: 0.0399 2024/08/18 15:13:37 - mmengine - INFO - Exp name: internvl_v2_internlm2_2b_qlora_finetune_copy_20240818_040903 2024/08/18 15:13:37 - mmengine - INFO - Iter(train) [9000/9216] lr: 2.9061e-08 eta: 0:15:54 time: 4.8907 data_time: 0.0151 memory: 16195 loss: 0.0815 2024/08/18 15:13:37 - mmengine - INFO - Saving checkpoint at 
9000 iterations 2024/08/18 15:14:28 - mmengine - INFO - Iter(train) [9010/9216] lr: 2.6445e-08 eta: 0:15:10 time: 5.0281 data_time: 0.1943 memory: 16283 loss: 0.1239 2024/08/18 15:15:16 - mmengine - INFO - Iter(train) [9020/9216] lr: 2.3953e-08 eta: 0:14:26 time: 4.8311 data_time: 0.0149 memory: 16005 loss: 0.1446 2024/08/18 15:16:04 - mmengine - INFO - Iter(train) [9030/9216] lr: 2.1583e-08 eta: 0:13:42 time: 4.7561 data_time: 0.0150 memory: 15872 loss: 0.0863 2024/08/18 15:16:51 - mmengine - INFO - Iter(train) [9040/9216] lr: 1.9338e-08 eta: 0:12:58 time: 4.7370 data_time: 0.0149 memory: 15872 loss: 0.0764 2024/08/18 15:17:38 - mmengine - INFO - Iter(train) [9050/9216] lr: 1.7215e-08 eta: 0:12:13 time: 4.6972 data_time: 0.0147 memory: 15777 loss: 0.0925 2024/08/18 15:18:25 - mmengine - INFO - Iter(train) [9060/9216] lr: 1.5215e-08 eta: 0:11:29 time: 4.6788 data_time: 0.0159 memory: 15701 loss: 0.0631 2024/08/18 15:19:11 - mmengine - INFO - Iter(train) [9070/9216] lr: 1.3339e-08 eta: 0:10:45 time: 4.6670 data_time: 0.0149 memory: 15633 loss: 0.0978 2024/08/18 15:19:57 - mmengine - INFO - Iter(train) [9080/9216] lr: 1.1586e-08 eta: 0:10:01 time: 4.5843 data_time: 0.0151 memory: 15524 loss: 0.0809 2024/08/18 15:20:43 - mmengine - INFO - Iter(train) [9090/9216] lr: 9.9570e-09 eta: 0:09:17 time: 4.5664 data_time: 0.0149 memory: 15559 loss: 0.0865 2024/08/18 15:21:28 - mmengine - INFO - Iter(train) [9100/9216] lr: 8.4509e-09 eta: 0:08:32 time: 4.5302 data_time: 0.0150 memory: 15555 loss: 0.0671 2024/08/18 15:22:13 - mmengine - INFO - Iter(train) [9110/9216] lr: 7.0682e-09 eta: 0:07:48 time: 4.4400 data_time: 0.0151 memory: 15453 loss: 0.0956 2024/08/18 15:22:56 - mmengine - INFO - Iter(train) [9120/9216] lr: 5.8089e-09 eta: 0:07:04 time: 4.3126 data_time: 0.0147 memory: 15239 loss: 0.1006 2024/08/18 15:23:38 - mmengine - INFO - Iter(train) [9130/9216] lr: 4.6730e-09 eta: 0:06:20 time: 4.1910 data_time: 0.0146 memory: 15061 loss: 0.1277 2024/08/18 15:24:16 - mmengine - 
INFO - Iter(train) [9140/9216] lr: 3.6606e-09 eta: 0:05:36 time: 3.8723 data_time: 0.0137 memory: 14985 loss: 0.1082 2024/08/18 15:24:49 - mmengine - INFO - Iter(train) [9150/9216] lr: 2.7716e-09 eta: 0:04:51 time: 3.2636 data_time: 0.0127 memory: 13954 loss: 0.1056 2024/08/18 15:25:20 - mmengine - INFO - Iter(train) [9160/9216] lr: 2.0060e-09 eta: 0:04:07 time: 3.0686 data_time: 0.0123 memory: 13434 loss: 0.0854 2024/08/18 15:25:50 - mmengine - INFO - Iter(train) [9170/9216] lr: 1.3639e-09 eta: 0:03:23 time: 2.9887 data_time: 0.0124 memory: 13256 loss: 0.0815 2024/08/18 15:26:17 - mmengine - INFO - Iter(train) [9180/9216] lr: 8.4526e-10 eta: 0:02:38 time: 2.7180 data_time: 0.0120 memory: 12931 loss: 0.1178 2024/08/18 15:26:36 - mmengine - INFO - Iter(train) [9190/9216] lr: 4.5011e-10 eta: 0:01:54 time: 1.9019 data_time: 0.0106 memory: 11887 loss: 0.0830 2024/08/18 15:27:24 - mmengine - INFO - Iter(train) [9200/9216] lr: 1.7844e-10 eta: 0:01:10 time: 4.7860 data_time: 0.0122 memory: 22898 loss: 0.1664 2024/08/18 15:28:17 - mmengine - INFO - Iter(train) [9210/9216] lr: 3.0255e-11 eta: 0:00:26 time: 5.3397 data_time: 0.0161 memory: 17381 loss: 0.0925 2024/08/18 15:28:47 - mmengine - INFO - Saving checkpoint at 9216 iterations