1a93bc80361511f081f0b04fa60560c8/0000755000175100017510000000000015013300464015051 5ustar jenkinsjenkins1a93bc80361511f081f0b04fa60560c8/test_qwen_grpo_vllm_test_qwen_grpo_vllm.log0000644000175100017510001314365415013302050026032 0ustar jenkinsjenkins============================= test session starts ============================== platform linux -- Python 3.9.19, pytest-6.2.5, py-1.11.0, pluggy-1.5.0 rootdir: /home/jenkins/mindspore/testcases/testcases/tests/st, configfile: ../../../../../sault/virtual_test/virtualenv_0013/sault/config/pytest.ini plugins: forked-1.6.0, xdist-1.32.0, anyio-4.9.0 collected 1 item test_qwen_grpo_vllm.py WORKDIR is /home/jenkins/mindspore/testcases/testcases/tests/st PYTHONPATH is /home/jenkins/mindspore/testcases/testcases/tests/st/../../:/home/jenkins/mindspore/testcases/testcases/tests/st/mindformers/:/home/jenkins/mindspore/testcases/testcases/tests/st:/usr/local/Ascend/ascend-toolkit/latest/python/site-packages:/usr/local/Ascend/ascend-toolkit/latest/opp/built-in/op_impl/ai_core/tbe:/home/jenkins/mindspore/testcases/testcases: /home/jenkins/anaconda3/envs/ci39mindrlhf/lib/python3.9/site-packages/numpy/core/getlimits.py:549: UserWarning: The value of the smallest subnormal for type is zero. setattr(self, word, getattr(machar, word).flat[0]) /home/jenkins/anaconda3/envs/ci39mindrlhf/lib/python3.9/site-packages/numpy/core/getlimits.py:89: UserWarning: The value of the smallest subnormal for type is zero. return self._float_to_str(self.smallest_subnormal) /home/jenkins/anaconda3/envs/ci39mindrlhf/lib/python3.9/site-packages/numpy/core/getlimits.py:549: UserWarning: The value of the smallest subnormal for type is zero. setattr(self, word, getattr(machar, word).flat[0]) /home/jenkins/anaconda3/envs/ci39mindrlhf/lib/python3.9/site-packages/numpy/core/getlimits.py:89: UserWarning: The value of the smallest subnormal for type is zero. return self._float_to_str(self.smallest_subnormal) 0%| | 0/4 [00:00 type is zero. setattr(self, word, getattr(machar, word).flat[0]) /home/jenkins/anaconda3/envs/ci39mindrlhf/lib/python3.9/site-packages/numpy/core/getlimits.py:89: UserWarning: The value of the smallest subnormal for type is zero. return self._float_to_str(self.smallest_subnormal) /home/jenkins/anaconda3/envs/ci39mindrlhf/lib/python3.9/site-packages/numpy/core/getlimits.py:549: UserWarning: The value of the smallest subnormal for type is zero. setattr(self, word, getattr(machar, word).flat[0]) /home/jenkins/anaconda3/envs/ci39mindrlhf/lib/python3.9/site-packages/numpy/core/getlimits.py:89: UserWarning: The value of the smallest subnormal for type is zero. return self._float_to_str(self.smallest_subnormal) Start worker process with rank id:0, log file:/home/jenkins/mindspore/testcases/testcases/tests/st/qwen2_vllm_log/worker_0.log. Environment variable [RANK_ID=0] is exported. Start worker process with rank id:1, log file:/home/jenkins/mindspore/testcases/testcases/tests/st/qwen2_vllm_log/worker_1.log. Environment variable [RANK_ID=1] is exported. Start worker process with rank id:2, log file:/home/jenkins/mindspore/testcases/testcases/tests/st/qwen2_vllm_log/worker_2.log. Environment variable [RANK_ID=2] is exported. Start worker process with rank id:3, log file:/home/jenkins/mindspore/testcases/testcases/tests/st/qwen2_vllm_log/worker_3.log. Environment variable [RANK_ID=3] is exported. Start worker process with rank id:4, log file:/home/jenkins/mindspore/testcases/testcases/tests/st/qwen2_vllm_log/worker_4.log. Environment variable [RANK_ID=4] is exported. Start worker process with rank id:5, log file:/home/jenkins/mindspore/testcases/testcases/tests/st/qwen2_vllm_log/worker_5.log. Environment variable [RANK_ID=5] is exported. Start worker process with rank id:6, log file:/home/jenkins/mindspore/testcases/testcases/tests/st/qwen2_vllm_log/worker_6.log. Environment variable [RANK_ID=6] is exported. Start worker process with rank id:7, log file:/home/jenkins/mindspore/testcases/testcases/tests/st/qwen2_vllm_log/worker_7.log. Environment variable [RANK_ID=7] is exported. /home/jenkins/anaconda3/envs/ci39mindrlhf/lib/python3.9/site-packages/numpy/core/getlimits.py:549: UserWarning: The value of the smallest subnormal for type is zero. setattr(self, word, getattr(machar, word).flat[0]) /home/jenkins/anaconda3/envs/ci39mindrlhf/lib/python3.9/site-packages/numpy/core/getlimits.py:89: UserWarning: The value of the smallest subnormal for type is zero. return self._float_to_str(self.smallest_subnormal) /home/jenkins/anaconda3/envs/ci39mindrlhf/lib/python3.9/site-packages/numpy/core/getlimits.py:549: UserWarning: The value of the smallest subnormal for type is zero. setattr(self, word, getattr(machar, word).flat[0]) /home/jenkins/anaconda3/envs/ci39mindrlhf/lib/python3.9/site-packages/numpy/core/getlimits.py:89: UserWarning: The value of the smallest subnormal for type is zero. return self._float_to_str(self.smallest_subnormal) /home/jenkins/anaconda3/envs/ci39mindrlhf/lib/python3.9/site-packages/numpy/core/getlimits.py:549: UserWarning: The value of the smallest subnormal for type is zero. setattr(self, word, getattr(machar, word).flat[0]) /home/jenkins/anaconda3/envs/ci39mindrlhf/lib/python3.9/site-packages/numpy/core/getlimits.py:89: UserWarning: The value of the smallest subnormal for type is zero. return self._float_to_str(self.smallest_subnormal) /home/jenkins/anaconda3/envs/ci39mindrlhf/lib/python3.9/site-packages/numpy/core/getlimits.py:549: UserWarning: The value of the smallest subnormal for type is zero. setattr(self, word, getattr(machar, word).flat[0]) /home/jenkins/anaconda3/envs/ci39mindrlhf/lib/python3.9/site-packages/numpy/core/getlimits.py:89: UserWarning: The value of the smallest subnormal for type is zero. return self._float_to_str(self.smallest_subnormal) /home/jenkins/anaconda3/envs/ci39mindrlhf/lib/python3.9/site-packages/numpy/core/getlimits.py:549: UserWarning: The value of the smallest subnormal for type is zero. setattr(self, word, getattr(machar, word).flat[0]) /home/jenkins/anaconda3/envs/ci39mindrlhf/lib/python3.9/site-packages/numpy/core/getlimits.py:89: UserWarning: The value of the smallest subnormal for type is zero. return self._float_to_str(self.smallest_subnormal) /home/jenkins/anaconda3/envs/ci39mindrlhf/lib/python3.9/site-packages/numpy/core/getlimits.py:549: UserWarning: The value of the smallest subnormal for type is zero. setattr(self, word, getattr(machar, word).flat[0]) /home/jenkins/anaconda3/envs/ci39mindrlhf/lib/python3.9/site-packages/numpy/core/getlimits.py:89: UserWarning: The value of the smallest subnormal for type is zero. return self._float_to_str(self.smallest_subnormal) /home/jenkins/anaconda3/envs/ci39mindrlhf/lib/python3.9/site-packages/numpy/core/getlimits.py:549: UserWarning: The value of the smallest subnormal for type is zero. setattr(self, word, getattr(machar, word).flat[0]) /home/jenkins/anaconda3/envs/ci39mindrlhf/lib/python3.9/site-packages/numpy/core/getlimits.py:89: UserWarning: The value of the smallest subnormal for type is zero. return self._float_to_str(self.smallest_subnormal) /home/jenkins/anaconda3/envs/ci39mindrlhf/lib/python3.9/site-packages/numpy/core/getlimits.py:549: UserWarning: The value of the smallest subnormal for type is zero. setattr(self, word, getattr(machar, word).flat[0]) /home/jenkins/anaconda3/envs/ci39mindrlhf/lib/python3.9/site-packages/numpy/core/getlimits.py:89: UserWarning: The value of the smallest subnormal for type is zero. return self._float_to_str(self.smallest_subnormal) /home/jenkins/anaconda3/envs/ci39mindrlhf/lib/python3.9/site-packages/numpy/core/getlimits.py:549: UserWarning: The value of the smallest subnormal for type is zero. setattr(self, word, getattr(machar, word).flat[0]) /home/jenkins/anaconda3/envs/ci39mindrlhf/lib/python3.9/site-packages/numpy/core/getlimits.py:89: UserWarning: The value of the smallest subnormal for type is zero. return self._float_to_str(self.smallest_subnormal) /home/jenkins/anaconda3/envs/ci39mindrlhf/lib/python3.9/site-packages/numpy/core/getlimits.py:549: UserWarning: The value of the smallest subnormal for type is zero. setattr(self, word, getattr(machar, word).flat[0]) /home/jenkins/anaconda3/envs/ci39mindrlhf/lib/python3.9/site-packages/numpy/core/getlimits.py:89: UserWarning: The value of the smallest subnormal for type is zero. return self._float_to_str(self.smallest_subnormal) /home/jenkins/anaconda3/envs/ci39mindrlhf/lib/python3.9/site-packages/numpy/core/getlimits.py:549: UserWarning: The value of the smallest subnormal for type is zero. setattr(self, word, getattr(machar, word).flat[0]) /home/jenkins/anaconda3/envs/ci39mindrlhf/lib/python3.9/site-packages/numpy/core/getlimits.py:89: UserWarning: The value of the smallest subnormal for type is zero. return self._float_to_str(self.smallest_subnormal) /home/jenkins/anaconda3/envs/ci39mindrlhf/lib/python3.9/site-packages/numpy/core/getlimits.py:549: UserWarning: The value of the smallest subnormal for type is zero. setattr(self, word, getattr(machar, word).flat[0]) /home/jenkins/anaconda3/envs/ci39mindrlhf/lib/python3.9/site-packages/numpy/core/getlimits.py:89: UserWarning: The value of the smallest subnormal for type is zero. return self._float_to_str(self.smallest_subnormal) /home/jenkins/anaconda3/envs/ci39mindrlhf/lib/python3.9/site-packages/numpy/core/getlimits.py:549: UserWarning: The value of the smallest subnormal for type is zero. setattr(self, word, getattr(machar, word).flat[0]) /home/jenkins/anaconda3/envs/ci39mindrlhf/lib/python3.9/site-packages/numpy/core/getlimits.py:89: UserWarning: The value of the smallest subnormal for type is zero. return self._float_to_str(self.smallest_subnormal) /home/jenkins/anaconda3/envs/ci39mindrlhf/lib/python3.9/site-packages/numpy/core/getlimits.py:549: UserWarning: The value of the smallest subnormal for type is zero. setattr(self, word, getattr(machar, word).flat[0]) /home/jenkins/anaconda3/envs/ci39mindrlhf/lib/python3.9/site-packages/numpy/core/getlimits.py:89: UserWarning: The value of the smallest subnormal for type is zero. return self._float_to_str(self.smallest_subnormal) /home/jenkins/anaconda3/envs/ci39mindrlhf/lib/python3.9/site-packages/numpy/core/getlimits.py:549: UserWarning: The value of the smallest subnormal for type is zero. setattr(self, word, getattr(machar, word).flat[0]) /home/jenkins/anaconda3/envs/ci39mindrlhf/lib/python3.9/site-packages/numpy/core/getlimits.py:89: UserWarning: The value of the smallest subnormal for type is zero. return self._float_to_str(self.smallest_subnormal) /home/jenkins/anaconda3/envs/ci39mindrlhf/lib/python3.9/site-packages/numpy/core/getlimits.py:549: UserWarning: The value of the smallest subnormal for type is zero. setattr(self, word, getattr(machar, word).flat[0]) /home/jenkins/anaconda3/envs/ci39mindrlhf/lib/python3.9/site-packages/numpy/core/getlimits.py:89: UserWarning: The value of the smallest subnormal for type is zero. return self._float_to_str(self.smallest_subnormal) 2025-05-21 15:31:38,393 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/trainer/spmd/grpo_trainer.py:163] - INFO - GRPOTrainer: _init_grpo_configs Namespace(config='./qwen2_5_vllm/grpo_config_st.yaml', custom_model_name='qwen', dataset_file='/home/jenkins/mindspore/testcases/testcases/tests/st/dataset/mini_gsm8k.mindrecord', tokenizer_dir='/home/jenkins/mindspore/testcases/testcases/tests/st/qwen2_5/', actor_checkpoint_path='', ref_checkpoint_path='', generate_checkpoint_path='', verifier_function='format_reward', verifier_weight='1.0', tensorboard=None, save_checkpoint_dir='/home/jenkins/mindspore/testcases/testcases/tests/st/ckpt/train') in main task 2025-05-21 15:31:38,420 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/trainer/spmd/grpo_trainer.py:172] - INFO - vllm mode: VllmMode.ORIGIN, hf_config_path: ./config.json 2025-05-21 15:31:38,613 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/trainer/spmd/grpo_trainer.py:163] - INFO - GRPOTrainer: _init_grpo_configs Namespace(config='./qwen2_5_vllm/grpo_config_st.yaml', custom_model_name='qwen', dataset_file='/home/jenkins/mindspore/testcases/testcases/tests/st/dataset/mini_gsm8k.mindrecord', tokenizer_dir='/home/jenkins/mindspore/testcases/testcases/tests/st/qwen2_5/', actor_checkpoint_path='', ref_checkpoint_path='', generate_checkpoint_path='', verifier_function='format_reward', verifier_weight='1.0', tensorboard=None, save_checkpoint_dir='/home/jenkins/mindspore/testcases/testcases/tests/st/ckpt/train') in main task 2025-05-21 15:31:38,639 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/trainer/spmd/grpo_trainer.py:172] - INFO - vllm mode: VllmMode.ORIGIN, hf_config_path: ./config.json 2025-05-21 15:31:38,794 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/trainer/spmd/grpo_trainer.py:163] - INFO - GRPOTrainer: _init_grpo_configs Namespace(config='./qwen2_5_vllm/grpo_config_st.yaml', custom_model_name='qwen', dataset_file='/home/jenkins/mindspore/testcases/testcases/tests/st/dataset/mini_gsm8k.mindrecord', tokenizer_dir='/home/jenkins/mindspore/testcases/testcases/tests/st/qwen2_5/', actor_checkpoint_path='', ref_checkpoint_path='', generate_checkpoint_path='', verifier_function='format_reward', verifier_weight='1.0', tensorboard=None, save_checkpoint_dir='/home/jenkins/mindspore/testcases/testcases/tests/st/ckpt/train') in main task 2025-05-21 15:31:38,804 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/trainer/spmd/grpo_trainer.py:213] - INFO - GRPOTrainer: _init_reward_fn 2025-05-21 15:31:38,804 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/trainer/spmd/grpo_trainer.py:222] - INFO - verifier_function_list:['format_reward'] 2025-05-21 15:31:38,805 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/trainer/spmd/grpo_trainer.py:223] - INFO - verifier_weight:[1.0] 2025-05-21 15:31:38,805 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/trainer/spmd/grpo_trainer.py:87] - INFO - GRPOTrainer: start init workers 2025-05-21 15:31:38,805 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:70] - INFO - init InferWorker 2025-05-21 15:31:38,820 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/trainer/spmd/grpo_trainer.py:172] - INFO - vllm mode: VllmMode.ORIGIN, hf_config_path: ./config.json 2025-05-21 15:31:38,828 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config runner_config is empty. 2025-05-21 15:31:38,828 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config context is empty. 2025-05-21 15:31:38,828 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config parallel is empty. 2025-05-21 15:31:38,829 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config trainer is empty. 2025-05-21 15:31:38,829 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config moe_config is empty. 2025-05-21 15:31:38,829 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config parallel_config is empty. 2025-05-21 15:31:38,830 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:77] - INFO - generate parallel_config:{'data_parallel': 1, 'model_parallel': 4, 'pipeline_stage': 2, 'expert_parallel': 1, 'use_seq_parallel': True, 'micro_batch_num': 4, 'vocab_emb_dp': False, 'context_parallel': 1} 2025-05-21 15:31:38,830 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:87] - INFO - launch actor roll out sft_config_infer.use_parallel True 2025-05-21 15:31:38,830 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:89] - INFO - sft_config_infer.context:{'mode': 0, 'device_target': 'Ascend', 'max_call_depth': 10000, 'max_device_memory': '55GB', 'save_graphs': False, 'save_graphs_path': './graph', 'device_id': 0, 'jit_config': {'jit_level': 'O0'}, 'memory_optimize_level': 'O0', 'ascend_config': {'precision_mode': 'must_keep_origin_dtype'}} 2025-05-21 15:31:38,846 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/trainer/spmd/grpo_trainer.py:163] - INFO - GRPOTrainer: _init_grpo_configs Namespace(config='./qwen2_5_vllm/grpo_config_st.yaml', custom_model_name='qwen', dataset_file='/home/jenkins/mindspore/testcases/testcases/tests/st/dataset/mini_gsm8k.mindrecord', tokenizer_dir='/home/jenkins/mindspore/testcases/testcases/tests/st/qwen2_5/', actor_checkpoint_path='', ref_checkpoint_path='', generate_checkpoint_path='', verifier_function='format_reward', verifier_weight='1.0', tensorboard=None, save_checkpoint_dir='/home/jenkins/mindspore/testcases/testcases/tests/st/ckpt/train') in main task 2025-05-21 15:31:38,852 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/trainer/spmd/grpo_trainer.py:163] - INFO - GRPOTrainer: _init_grpo_configs Namespace(config='./qwen2_5_vllm/grpo_config_st.yaml', custom_model_name='qwen', dataset_file='/home/jenkins/mindspore/testcases/testcases/tests/st/dataset/mini_gsm8k.mindrecord', tokenizer_dir='/home/jenkins/mindspore/testcases/testcases/tests/st/qwen2_5/', actor_checkpoint_path='', ref_checkpoint_path='', generate_checkpoint_path='', verifier_function='format_reward', verifier_weight='1.0', tensorboard=None, save_checkpoint_dir='/home/jenkins/mindspore/testcases/testcases/tests/st/ckpt/train') in main task 2025-05-21 15:31:38,873 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/trainer/spmd/grpo_trainer.py:172] - INFO - vllm mode: VllmMode.ORIGIN, hf_config_path: ./config.json 2025-05-21 15:31:38,878 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/trainer/spmd/grpo_trainer.py:172] - INFO - vllm mode: VllmMode.ORIGIN, hf_config_path: ./config.json 2025-05-21 15:31:39,026 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/trainer/spmd/grpo_trainer.py:163] - INFO - GRPOTrainer: _init_grpo_configs Namespace(config='./qwen2_5_vllm/grpo_config_st.yaml', custom_model_name='qwen', dataset_file='/home/jenkins/mindspore/testcases/testcases/tests/st/dataset/mini_gsm8k.mindrecord', tokenizer_dir='/home/jenkins/mindspore/testcases/testcases/tests/st/qwen2_5/', actor_checkpoint_path='', ref_checkpoint_path='', generate_checkpoint_path='', verifier_function='format_reward', verifier_weight='1.0', tensorboard=None, save_checkpoint_dir='/home/jenkins/mindspore/testcases/testcases/tests/st/ckpt/train') in main task 2025-05-21 15:31:39,026 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/trainer/spmd/grpo_trainer.py:213] - INFO - GRPOTrainer: _init_reward_fn 2025-05-21 15:31:39,026 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/trainer/spmd/grpo_trainer.py:222] - INFO - verifier_function_list:['format_reward'] 2025-05-21 15:31:39,027 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/trainer/spmd/grpo_trainer.py:223] - INFO - verifier_weight:[1.0] 2025-05-21 15:31:39,027 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/trainer/spmd/grpo_trainer.py:87] - INFO - GRPOTrainer: start init workers 2025-05-21 15:31:39,027 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:70] - INFO - init InferWorker 2025-05-21 15:31:39,050 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config runner_config is empty. 2025-05-21 15:31:39,050 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config context is empty. 2025-05-21 15:31:39,050 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config parallel is empty. 2025-05-21 15:31:39,050 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config trainer is empty. 2025-05-21 15:31:39,051 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config moe_config is empty. 2025-05-21 15:31:39,051 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config parallel_config is empty. 2025-05-21 15:31:39,051 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:77] - INFO - generate parallel_config:{'data_parallel': 1, 'model_parallel': 4, 'pipeline_stage': 2, 'expert_parallel': 1, 'use_seq_parallel': True, 'micro_batch_num': 4, 'vocab_emb_dp': False, 'context_parallel': 1} 2025-05-21 15:31:39,051 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/trainer/spmd/grpo_trainer.py:172] - INFO - vllm mode: VllmMode.ORIGIN, hf_config_path: ./config.json 2025-05-21 15:31:39,051 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:87] - INFO - launch actor roll out sft_config_infer.use_parallel True 2025-05-21 15:31:39,052 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:89] - INFO - sft_config_infer.context:{'mode': 0, 'device_target': 'Ascend', 'max_call_depth': 10000, 'max_device_memory': '55GB', 'save_graphs': False, 'save_graphs_path': './graph', 'device_id': 0, 'jit_config': {'jit_level': 'O0'}, 'memory_optimize_level': 'O0', 'ascend_config': {'precision_mode': 'must_keep_origin_dtype'}} 2025-05-21 15:31:39,126 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/trainer/spmd/grpo_trainer.py:163] - INFO - GRPOTrainer: _init_grpo_configs Namespace(config='./qwen2_5_vllm/grpo_config_st.yaml', custom_model_name='qwen', dataset_file='/home/jenkins/mindspore/testcases/testcases/tests/st/dataset/mini_gsm8k.mindrecord', tokenizer_dir='/home/jenkins/mindspore/testcases/testcases/tests/st/qwen2_5/', actor_checkpoint_path='', ref_checkpoint_path='', generate_checkpoint_path='', verifier_function='format_reward', verifier_weight='1.0', tensorboard=None, save_checkpoint_dir='/home/jenkins/mindspore/testcases/testcases/tests/st/ckpt/train') in main task 2025-05-21 15:31:39,153 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/trainer/spmd/grpo_trainer.py:172] - INFO - vllm mode: VllmMode.ORIGIN, hf_config_path: ./config.json 2025-05-21 15:31:39,160 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/trainer/spmd/grpo_trainer.py:163] - INFO - GRPOTrainer: _init_grpo_configs Namespace(config='./qwen2_5_vllm/grpo_config_st.yaml', custom_model_name='qwen', dataset_file='/home/jenkins/mindspore/testcases/testcases/tests/st/dataset/mini_gsm8k.mindrecord', tokenizer_dir='/home/jenkins/mindspore/testcases/testcases/tests/st/qwen2_5/', actor_checkpoint_path='', ref_checkpoint_path='', generate_checkpoint_path='', verifier_function='format_reward', verifier_weight='1.0', tensorboard=None, save_checkpoint_dir='/home/jenkins/mindspore/testcases/testcases/tests/st/ckpt/train') in main task 2025-05-21 15:31:39,188 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/trainer/spmd/grpo_trainer.py:172] - INFO - vllm mode: VllmMode.ORIGIN, hf_config_path: ./config.json 2025-05-21 15:31:39,207 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/trainer/spmd/grpo_trainer.py:213] - INFO - GRPOTrainer: _init_reward_fn 2025-05-21 15:31:39,208 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/trainer/spmd/grpo_trainer.py:222] - INFO - verifier_function_list:['format_reward'] 2025-05-21 15:31:39,208 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/trainer/spmd/grpo_trainer.py:223] - INFO - verifier_weight:[1.0] 2025-05-21 15:31:39,208 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/trainer/spmd/grpo_trainer.py:87] - INFO - GRPOTrainer: start init workers 2025-05-21 15:31:39,209 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:70] - INFO - init InferWorker 2025-05-21 15:31:39,232 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config runner_config is empty. 2025-05-21 15:31:39,232 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config context is empty. 2025-05-21 15:31:39,233 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config parallel is empty. 2025-05-21 15:31:39,233 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config trainer is empty. 2025-05-21 15:31:39,233 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config moe_config is empty. 2025-05-21 15:31:39,234 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config parallel_config is empty. 2025-05-21 15:31:39,234 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:77] - INFO - generate parallel_config:{'data_parallel': 1, 'model_parallel': 4, 'pipeline_stage': 2, 'expert_parallel': 1, 'use_seq_parallel': True, 'micro_batch_num': 4, 'vocab_emb_dp': False, 'context_parallel': 1} 2025-05-21 15:31:39,234 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:87] - INFO - launch actor roll out sft_config_infer.use_parallel True 2025-05-21 15:31:39,235 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:89] - INFO - sft_config_infer.context:{'mode': 0, 'device_target': 'Ascend', 'max_call_depth': 10000, 'max_device_memory': '55GB', 'save_graphs': False, 'save_graphs_path': './graph', 'device_id': 0, 'jit_config': {'jit_level': 'O0'}, 'memory_optimize_level': 'O0', 'ascend_config': {'precision_mode': 'must_keep_origin_dtype'}} 2025-05-21 15:31:39,283 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/trainer/spmd/grpo_trainer.py:213] - INFO - GRPOTrainer: _init_reward_fn 2025-05-21 15:31:39,283 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/trainer/spmd/grpo_trainer.py:222] - INFO - verifier_function_list:['format_reward'] 2025-05-21 15:31:39,283 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/trainer/spmd/grpo_trainer.py:223] - INFO - verifier_weight:[1.0] 2025-05-21 15:31:39,283 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/trainer/spmd/grpo_trainer.py:87] - INFO - GRPOTrainer: start init workers 2025-05-21 15:31:39,284 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:70] - INFO - init InferWorker 2025-05-21 15:31:39,303 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/trainer/spmd/grpo_trainer.py:213] - INFO - GRPOTrainer: _init_reward_fn 2025-05-21 15:31:39,304 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/trainer/spmd/grpo_trainer.py:222] - INFO - verifier_function_list:['format_reward'] 2025-05-21 15:31:39,304 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/trainer/spmd/grpo_trainer.py:223] - INFO - verifier_weight:[1.0] 2025-05-21 15:31:39,304 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/trainer/spmd/grpo_trainer.py:87] - INFO - GRPOTrainer: start init workers 2025-05-21 15:31:39,304 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:70] - INFO - init InferWorker 2025-05-21 15:31:39,307 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config runner_config is empty. 2025-05-21 15:31:39,307 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config context is empty. 2025-05-21 15:31:39,308 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config parallel is empty. 2025-05-21 15:31:39,308 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config trainer is empty. 2025-05-21 15:31:39,308 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config moe_config is empty. 2025-05-21 15:31:39,308 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config parallel_config is empty. 2025-05-21 15:31:39,309 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:77] - INFO - generate parallel_config:{'data_parallel': 1, 'model_parallel': 4, 'pipeline_stage': 2, 'expert_parallel': 1, 'use_seq_parallel': True, 'micro_batch_num': 4, 'vocab_emb_dp': False, 'context_parallel': 1} 2025-05-21 15:31:39,309 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:87] - INFO - launch actor roll out sft_config_infer.use_parallel True 2025-05-21 15:31:39,309 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:89] - INFO - sft_config_infer.context:{'mode': 0, 'device_target': 'Ascend', 'max_call_depth': 10000, 'max_device_memory': '55GB', 'save_graphs': False, 'save_graphs_path': './graph', 'device_id': 0, 'jit_config': {'jit_level': 'O0'}, 'memory_optimize_level': 'O0', 'ascend_config': {'precision_mode': 'must_keep_origin_dtype'}} 2025-05-21 15:31:39,328 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config runner_config is empty. 2025-05-21 15:31:39,328 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config context is empty. 2025-05-21 15:31:39,328 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config parallel is empty. 2025-05-21 15:31:39,329 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config trainer is empty. 2025-05-21 15:31:39,329 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config moe_config is empty. 2025-05-21 15:31:39,329 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config parallel_config is empty. 2025-05-21 15:31:39,330 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:77] - INFO - generate parallel_config:{'data_parallel': 1, 'model_parallel': 4, 'pipeline_stage': 2, 'expert_parallel': 1, 'use_seq_parallel': True, 'micro_batch_num': 4, 'vocab_emb_dp': False, 'context_parallel': 1} 2025-05-21 15:31:39,330 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:87] - INFO - launch actor roll out sft_config_infer.use_parallel True 2025-05-21 15:31:39,330 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:89] - INFO - sft_config_infer.context:{'mode': 0, 'device_target': 'Ascend', 'max_call_depth': 10000, 'max_device_memory': '55GB', 'save_graphs': False, 'save_graphs_path': './graph', 'device_id': 0, 'jit_config': {'jit_level': 'O0'}, 'memory_optimize_level': 'O0', 'ascend_config': {'precision_mode': 'must_keep_origin_dtype'}} 2025-05-21 15:31:39,441 - mindformers./output/log[mindformers/core/context/build_context.py:168] - INFO - Predict context config, jit_level: O0, infer_boost: on 2025-05-21 15:31:39,474 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/trainer/spmd/grpo_trainer.py:213] - INFO - GRPOTrainer: _init_reward_fn 2025-05-21 15:31:39,475 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/trainer/spmd/grpo_trainer.py:222] - INFO - verifier_function_list:['format_reward'] 2025-05-21 15:31:39,475 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/trainer/spmd/grpo_trainer.py:223] - INFO - verifier_weight:[1.0] 2025-05-21 15:31:39,475 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/trainer/spmd/grpo_trainer.py:87] - INFO - GRPOTrainer: start init workers 2025-05-21 15:31:39,476 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:70] - INFO - init InferWorker 2025-05-21 15:31:39,499 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config runner_config is empty. 2025-05-21 15:31:39,499 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config context is empty. 2025-05-21 15:31:39,499 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config parallel is empty. 2025-05-21 15:31:39,499 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config trainer is empty. 2025-05-21 15:31:39,500 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config moe_config is empty. 2025-05-21 15:31:39,500 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config parallel_config is empty. 2025-05-21 15:31:39,500 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:77] - INFO - generate parallel_config:{'data_parallel': 1, 'model_parallel': 4, 'pipeline_stage': 2, 'expert_parallel': 1, 'use_seq_parallel': True, 'micro_batch_num': 4, 'vocab_emb_dp': False, 'context_parallel': 1} 2025-05-21 15:31:39,500 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:87] - INFO - launch actor roll out sft_config_infer.use_parallel True 2025-05-21 15:31:39,501 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:89] - INFO - sft_config_infer.context:{'mode': 0, 'device_target': 'Ascend', 'max_call_depth': 10000, 'max_device_memory': '55GB', 'save_graphs': False, 'save_graphs_path': './graph', 'device_id': 0, 'jit_config': {'jit_level': 'O0'}, 'memory_optimize_level': 'O0', 'ascend_config': {'precision_mode': 'must_keep_origin_dtype'}} 2025-05-21 15:31:39,565 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/trainer/spmd/grpo_trainer.py:213] - INFO - GRPOTrainer: _init_reward_fn 2025-05-21 15:31:39,565 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/trainer/spmd/grpo_trainer.py:222] - INFO - verifier_function_list:['format_reward'] 2025-05-21 15:31:39,566 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/trainer/spmd/grpo_trainer.py:223] - INFO - verifier_weight:[1.0] 2025-05-21 15:31:39,566 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/trainer/spmd/grpo_trainer.py:87] - INFO - GRPOTrainer: start init workers 2025-05-21 15:31:39,566 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:70] - INFO - init InferWorker 2025-05-21 15:31:39,579 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/trainer/spmd/grpo_trainer.py:213] - INFO - GRPOTrainer: _init_reward_fn 2025-05-21 15:31:39,579 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/trainer/spmd/grpo_trainer.py:222] - INFO - verifier_function_list:['format_reward'] 2025-05-21 15:31:39,579 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/trainer/spmd/grpo_trainer.py:223] - INFO - verifier_weight:[1.0] 2025-05-21 15:31:39,579 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/trainer/spmd/grpo_trainer.py:87] - INFO - GRPOTrainer: start init workers 2025-05-21 15:31:39,580 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:70] - INFO - init InferWorker 2025-05-21 15:31:39,589 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config runner_config is empty. 2025-05-21 15:31:39,590 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config context is empty. 2025-05-21 15:31:39,590 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config parallel is empty. 2025-05-21 15:31:39,590 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config trainer is empty. 2025-05-21 15:31:39,591 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config moe_config is empty. 2025-05-21 15:31:39,591 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config parallel_config is empty. 2025-05-21 15:31:39,591 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:77] - INFO - generate parallel_config:{'data_parallel': 1, 'model_parallel': 4, 'pipeline_stage': 2, 'expert_parallel': 1, 'use_seq_parallel': True, 'micro_batch_num': 4, 'vocab_emb_dp': False, 'context_parallel': 1} 2025-05-21 15:31:39,591 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:87] - INFO - launch actor roll out sft_config_infer.use_parallel True 2025-05-21 15:31:39,592 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:89] - INFO - sft_config_infer.context:{'mode': 0, 'device_target': 'Ascend', 'max_call_depth': 10000, 'max_device_memory': '55GB', 'save_graphs': False, 'save_graphs_path': './graph', 'device_id': 0, 'jit_config': {'jit_level': 'O0'}, 'memory_optimize_level': 'O0', 'ascend_config': {'precision_mode': 'must_keep_origin_dtype'}} 2025-05-21 15:31:39,603 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config runner_config is empty. 2025-05-21 15:31:39,603 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config context is empty. 2025-05-21 15:31:39,603 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config parallel is empty. 2025-05-21 15:31:39,603 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config trainer is empty. 2025-05-21 15:31:39,604 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config moe_config is empty. 2025-05-21 15:31:39,604 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config parallel_config is empty. 2025-05-21 15:31:39,604 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:77] - INFO - generate parallel_config:{'data_parallel': 1, 'model_parallel': 4, 'pipeline_stage': 2, 'expert_parallel': 1, 'use_seq_parallel': True, 'micro_batch_num': 4, 'vocab_emb_dp': False, 'context_parallel': 1} 2025-05-21 15:31:39,605 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:87] - INFO - launch actor roll out sft_config_infer.use_parallel True 2025-05-21 15:31:39,605 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:89] - INFO - sft_config_infer.context:{'mode': 0, 'device_target': 'Ascend', 'max_call_depth': 10000, 'max_device_memory': '55GB', 'save_graphs': False, 'save_graphs_path': './graph', 'device_id': 0, 'jit_config': {'jit_level': 'O0'}, 'memory_optimize_level': 'O0', 'ascend_config': {'precision_mode': 'must_keep_origin_dtype'}} 2025-05-21 15:31:39,655 - mindformers./output/log[mindformers/core/context/build_context.py:168] - INFO - Predict context config, jit_level: O0, infer_boost: on 2025-05-21 15:31:39,866 - mindformers./output/log[mindformers/core/context/build_context.py:168] - INFO - Predict context config, jit_level: O0, infer_boost: on 2025-05-21 15:31:39,949 - mindformers./output/log[mindformers/core/context/build_context.py:168] - INFO - Predict context config, jit_level: O0, infer_boost: on 2025-05-21 15:31:39,950 - mindformers./output/log[mindformers/core/context/build_context.py:168] - INFO - Predict context config, jit_level: O0, infer_boost: on 2025-05-21 15:31:40,110 - mindformers./output/log[mindformers/core/context/build_context.py:168] - INFO - Predict context config, jit_level: O0, infer_boost: on 2025-05-21 15:31:40,215 - mindformers./output/log[mindformers/core/context/build_context.py:168] - INFO - Predict context config, jit_level: O0, infer_boost: on 2025-05-21 15:31:40,221 - mindformers./output/log[mindformers/core/context/build_context.py:168] - INFO - Predict context config, jit_level: O0, infer_boost: on [MS_ALLOC_CONF]Runtime config: enable_vmm:False [MS_ALLOC_CONF]Runtime config: enable_vmm:False [MS_ALLOC_CONF]Runtime config: enable_vmm:False [MS_ALLOC_CONF]Runtime config: enable_vmm:False [MS_ALLOC_CONF]Runtime config: enable_vmm:False [MS_ALLOC_CONF]Runtime config: enable_vmm:False [MS_ALLOC_CONF]Runtime config: enable_vmm:False [MS_ALLOC_CONF]Runtime config: enable_vmm:False 2025-05-21 15:31:43,759 - mindformers./output/log[mindformers/tools/utils.py:181] - INFO - set strategy path to './output/strategy/ckpt_strategy_rank_3.ckpt' 2025-05-21 15:31:43,762 - mindformers./output/log[mindformers/core/context/build_context.py:383] - INFO - cann workqueue cpus: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 100, 101, 102, 103, 104, 105, 106, 107, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, 124, 125, 126, 127, 128, 129, 130, 131, 132, 133, 134, 135, 136, 137, 138, 139, 140, 141, 142, 143, 144, 145, 146, 147, 148, 149, 150, 151, 152, 153, 154, 155, 156, 157, 158, 159, 160, 161, 162, 163, 164, 165, 166, 167, 168, 169, 170, 171, 172, 173, 174, 175, 176, 177, 178, 179, 180, 181, 182, 183, 184, 185, 186, 187, 188, 189, 190, 191, 192, 193, 194, 195, 196, 197, 198, 199, 200, 201, 202, 203, 204, 205, 206, 207, 208, 209, 210, 211, 212, 213, 214, 215, 216, 217, 218, 219, 220, 221, 222, 223, 224, 225, 226, 227, 228, 229, 230, 231, 232, 233, 234, 235, 236, 237, 238, 239, 240, 241, 242, 243, 244, 245, 246, 247, 248, 249, 250, 251, 252, 253, 254] 2025-05-21 15:31:43,762 - mindformers./output/log[mindformers/core/context/build_context.py:387] - WARNING - CANN use cpus: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 100, 101, 102, 103, 104, 105, 106, 107, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, 124, 125, 126, 127, 128, 129, 130, 131, 132, 133, 134, 135, 136, 137, 138, 139, 140, 141, 142, 143, 144, 145, 146, 147, 148, 149, 150, 151, 152, 153, 154, 155, 156, 157, 158, 159, 160, 161, 162, 163, 164, 165, 166, 167, 168, 169, 170, 171, 172, 173, 174, 175, 176, 177, 178, 179, 180, 181, 182, 183, 184, 185, 186, 187, 188, 189, 190, 191, 192, 193, 194, 195, 196, 197, 198, 199, 200, 201, 202, 203, 204, 205, 206, 207, 208, 209, 210, 211, 212, 213, 214, 215, 216, 217, 218, 219, 220, 221, 222, 223, 224, 225, 226, 227, 228, 229, 230, 231, 232, 233, 234, 235, 236, 237, 238, 239, 240, 241, 242, 243, 244, 245, 246, 247, 248, 249, 250, 251, 252, 253, 254], model get empty cpu list, disable binding cores 2025-05-21 15:31:43,763 - mindformers./output/log[mindformers/core/context/build_context.py:395] - INFO - cpu_affinity, rank_id: 3, device_num: 8 2025-05-21 15:31:43,780 - mindformers./output/log[mindformers/tools/utils.py:181] - INFO - set strategy path to './output/strategy/ckpt_strategy_rank_2.ckpt' 2025-05-21 15:31:43,783 - mindformers./output/log[mindformers/core/context/build_context.py:383] - INFO - cann workqueue cpus: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 100, 101, 102, 103, 104, 105, 106, 107, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, 124, 125, 126, 127, 128, 129, 130, 131, 132, 133, 134, 135, 136, 137, 138, 139, 140, 141, 142, 143, 144, 145, 146, 147, 148, 149, 150, 151, 152, 153, 154, 155, 156, 157, 158, 159, 160, 161, 162, 163, 164, 165, 166, 167, 168, 169, 170, 171, 172, 173, 174, 175, 176, 177, 178, 179, 180, 181, 182, 183, 184, 185, 186, 187, 188, 189, 190, 191, 192, 193, 194, 195, 196, 197, 198, 199, 200, 201, 202, 203, 204, 205, 206, 207, 208, 209, 210, 211, 212, 213, 214, 215, 216, 217, 218, 219, 220, 221, 222, 223, 224, 225, 226, 227, 228, 229, 230, 231, 232, 233, 234, 235, 236, 237, 238, 239, 240, 241, 242, 243, 244, 245, 246, 247, 248, 249, 250, 251, 252, 253, 254] 2025-05-21 15:31:43,784 - mindformers./output/log[mindformers/core/context/build_context.py:387] - WARNING - CANN use cpus: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 100, 101, 102, 103, 104, 105, 106, 107, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, 124, 125, 126, 127, 128, 129, 130, 131, 132, 133, 134, 135, 136, 137, 138, 139, 140, 141, 142, 143, 144, 145, 146, 147, 148, 149, 150, 151, 152, 153, 154, 155, 156, 157, 158, 159, 160, 161, 162, 163, 164, 165, 166, 167, 168, 169, 170, 171, 172, 173, 174, 175, 176, 177, 178, 179, 180, 181, 182, 183, 184, 185, 186, 187, 188, 189, 190, 191, 192, 193, 194, 195, 196, 197, 198, 199, 200, 201, 202, 203, 204, 205, 206, 207, 208, 209, 210, 211, 212, 213, 214, 215, 216, 217, 218, 219, 220, 221, 222, 223, 224, 225, 226, 227, 228, 229, 230, 231, 232, 233, 234, 235, 236, 237, 238, 239, 240, 241, 242, 243, 244, 245, 246, 247, 248, 249, 250, 251, 252, 253, 254], model get empty cpu list, disable binding cores 2025-05-21 15:31:43,784 - mindformers./output/log[mindformers/core/context/build_context.py:395] - INFO - cpu_affinity, rank_id: 2, device_num: 8 2025-05-21 15:31:43,865 - mindformers./output/log[mindformers/tools/utils.py:181] - INFO - set strategy path to './output/strategy/ckpt_strategy_rank_1.ckpt' 2025-05-21 15:31:43,869 - mindformers./output/log[mindformers/core/context/build_context.py:383] - INFO - cann workqueue cpus: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 100, 101, 102, 103, 104, 105, 106, 107, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, 124, 125, 126, 127, 128, 129, 130, 131, 132, 133, 134, 135, 136, 137, 138, 139, 140, 141, 142, 143, 144, 145, 146, 147, 148, 149, 150, 151, 152, 153, 154, 155, 156, 157, 158, 159, 160, 161, 162, 163, 164, 165, 166, 167, 168, 169, 170, 171, 172, 173, 174, 175, 176, 177, 178, 179, 180, 181, 182, 183, 184, 185, 186, 187, 188, 189, 190, 191, 192, 193, 194, 195, 196, 197, 198, 199, 200, 201, 202, 203, 204, 205, 206, 207, 208, 209, 210, 211, 212, 213, 214, 215, 216, 217, 218, 219, 220, 221, 222, 223, 224, 225, 226, 227, 228, 229, 230, 231, 232, 233, 234, 235, 236, 237, 238, 239, 240, 241, 242, 243, 244, 245, 246, 247, 248, 249, 250, 251, 252, 253, 254] 2025-05-21 15:31:43,869 - mindformers./output/log[mindformers/core/context/build_context.py:387] - WARNING - CANN use cpus: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 100, 101, 102, 103, 104, 105, 106, 107, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, 124, 125, 126, 127, 128, 129, 130, 131, 132, 133, 134, 135, 136, 137, 138, 139, 140, 141, 142, 143, 144, 145, 146, 147, 148, 149, 150, 151, 152, 153, 154, 155, 156, 157, 158, 159, 160, 161, 162, 163, 164, 165, 166, 167, 168, 169, 170, 171, 172, 173, 174, 175, 176, 177, 178, 179, 180, 181, 182, 183, 184, 185, 186, 187, 188, 189, 190, 191, 192, 193, 194, 195, 196, 197, 198, 199, 200, 201, 202, 203, 204, 205, 206, 207, 208, 209, 210, 211, 212, 213, 214, 215, 216, 217, 218, 219, 220, 221, 222, 223, 224, 225, 226, 227, 228, 229, 230, 231, 232, 233, 234, 235, 236, 237, 238, 239, 240, 241, 242, 243, 244, 245, 246, 247, 248, 249, 250, 251, 252, 253, 254], model get empty cpu list, disable binding cores 2025-05-21 15:31:43,874 - mindformers./output/log[mindformers/core/context/build_context.py:395] - INFO - cpu_affinity, rank_id: 1, device_num: 8 2025-05-21 15:31:43,897 - mindformers./output/log[mindformers/tools/utils.py:181] - INFO - set strategy path to './output/strategy/ckpt_strategy_rank_5.ckpt' 2025-05-21 15:31:43,900 - mindformers./output/log[mindformers/core/context/build_context.py:383] - INFO - cann workqueue cpus: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 100, 101, 102, 103, 104, 105, 106, 107, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, 124, 125, 126, 127, 128, 129, 130, 131, 132, 133, 134, 135, 136, 137, 138, 139, 140, 141, 142, 143, 144, 145, 146, 147, 148, 149, 150, 151, 152, 153, 154, 155, 156, 157, 158, 159, 160, 161, 162, 163, 164, 165, 166, 167, 168, 169, 170, 171, 172, 173, 174, 175, 176, 177, 178, 179, 180, 181, 182, 183, 184, 185, 186, 187, 188, 189, 190, 191, 192, 193, 194, 195, 196, 197, 198, 199, 200, 201, 202, 203, 204, 205, 206, 207, 208, 209, 210, 211, 212, 213, 214, 215, 216, 217, 218, 219, 220, 221, 222, 223, 224, 225, 226, 227, 228, 229, 230, 231, 232, 233, 234, 235, 236, 237, 238, 239, 240, 241, 242, 243, 244, 245, 246, 247, 248, 249, 250, 251, 252, 253, 254] 2025-05-21 15:31:43,900 - mindformers./output/log[mindformers/core/context/build_context.py:387] - WARNING - CANN use cpus: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 100, 101, 102, 103, 104, 105, 106, 107, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, 124, 125, 126, 127, 128, 129, 130, 131, 132, 133, 134, 135, 136, 137, 138, 139, 140, 141, 142, 143, 144, 145, 146, 147, 148, 149, 150, 151, 152, 153, 154, 155, 156, 157, 158, 159, 160, 161, 162, 163, 164, 165, 166, 167, 168, 169, 170, 171, 172, 173, 174, 175, 176, 177, 178, 179, 180, 181, 182, 183, 184, 185, 186, 187, 188, 189, 190, 191, 192, 193, 194, 195, 196, 197, 198, 199, 200, 201, 202, 203, 204, 205, 206, 207, 208, 209, 210, 211, 212, 213, 214, 215, 216, 217, 218, 219, 220, 221, 222, 223, 224, 225, 226, 227, 228, 229, 230, 231, 232, 233, 234, 235, 236, 237, 238, 239, 240, 241, 242, 243, 244, 245, 246, 247, 248, 249, 250, 251, 252, 253, 254], model get empty cpu list, disable binding cores 2025-05-21 15:31:43,901 - mindformers./output/log[mindformers/core/context/build_context.py:395] - INFO - cpu_affinity, rank_id: 5, device_num: 8 2025-05-21 15:31:44,031 - mindformers./output/log[mindformers/tools/utils.py:181] - INFO - set strategy path to './output/strategy/ckpt_strategy_rank_7.ckpt' 2025-05-21 15:31:44,033 - mindformers./output/log[mindformers/core/context/build_context.py:383] - INFO - cann workqueue cpus: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 100, 101, 102, 103, 104, 105, 106, 107, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, 124, 125, 126, 127, 128, 129, 130, 131, 132, 133, 134, 135, 136, 137, 138, 139, 140, 141, 142, 143, 144, 145, 146, 147, 148, 149, 150, 151, 152, 153, 154, 155, 156, 157, 158, 159, 160, 161, 162, 163, 164, 165, 166, 167, 168, 169, 170, 171, 172, 173, 174, 175, 176, 177, 178, 179, 180, 181, 182, 183, 184, 185, 186, 187, 188, 189, 190, 191, 192, 193, 194, 195, 196, 197, 198, 199, 200, 201, 202, 203, 204, 205, 206, 207, 208, 209, 210, 211, 212, 213, 214, 215, 216, 217, 218, 219, 220, 221, 222, 223, 224, 225, 226, 227, 228, 229, 230, 231, 232, 233, 234, 235, 236, 237, 238, 239, 240, 241, 242, 243, 244, 245, 246, 247, 248, 249, 250, 251, 252, 253, 254] 2025-05-21 15:31:44,034 - mindformers./output/log[mindformers/core/context/build_context.py:387] - WARNING - CANN use cpus: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 100, 101, 102, 103, 104, 105, 106, 107, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, 124, 125, 126, 127, 128, 129, 130, 131, 132, 133, 134, 135, 136, 137, 138, 139, 140, 141, 142, 143, 144, 145, 146, 147, 148, 149, 150, 151, 152, 153, 154, 155, 156, 157, 158, 159, 160, 161, 162, 163, 164, 165, 166, 167, 168, 169, 170, 171, 172, 173, 174, 175, 176, 177, 178, 179, 180, 181, 182, 183, 184, 185, 186, 187, 188, 189, 190, 191, 192, 193, 194, 195, 196, 197, 198, 199, 200, 201, 202, 203, 204, 205, 206, 207, 208, 209, 210, 211, 212, 213, 214, 215, 216, 217, 218, 219, 220, 221, 222, 223, 224, 225, 226, 227, 228, 229, 230, 231, 232, 233, 234, 235, 236, 237, 238, 239, 240, 241, 242, 243, 244, 245, 246, 247, 248, 249, 250, 251, 252, 253, 254], model get empty cpu list, disable binding cores 2025-05-21 15:31:44,034 - mindformers./output/log[mindformers/core/context/build_context.py:395] - INFO - cpu_affinity, rank_id: 7, device_num: 8 2025-05-21 15:31:44,074 - mindformers./output/log[mindformers/tools/utils.py:181] - INFO - set strategy path to './output/strategy/ckpt_strategy_rank_0.ckpt' 2025-05-21 15:31:44,077 - mindformers./output/log[mindformers/core/context/build_context.py:383] - INFO - cann workqueue cpus: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 100, 101, 102, 103, 104, 105, 106, 107, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, 124, 125, 126, 127, 128, 129, 130, 131, 132, 133, 134, 135, 136, 137, 138, 139, 140, 141, 142, 143, 144, 145, 146, 147, 148, 149, 150, 151, 152, 153, 154, 155, 156, 157, 158, 159, 160, 161, 162, 163, 164, 165, 166, 167, 168, 169, 170, 171, 172, 173, 174, 175, 176, 177, 178, 179, 180, 181, 182, 183, 184, 185, 186, 187, 188, 189, 190, 191, 192, 193, 194, 195, 196, 197, 198, 199, 200, 201, 202, 203, 204, 205, 206, 207, 208, 209, 210, 211, 212, 213, 214, 215, 216, 217, 218, 219, 220, 221, 222, 223, 224, 225, 226, 227, 228, 229, 230, 231, 232, 233, 234, 235, 236, 237, 238, 239, 240, 241, 242, 243, 244, 245, 246, 247, 248, 249, 250, 251, 252, 253, 254] 2025-05-21 15:31:44,077 - mindformers./output/log[mindformers/core/context/build_context.py:387] - WARNING - CANN use cpus: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 100, 101, 102, 103, 104, 105, 106, 107, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, 124, 125, 126, 127, 128, 129, 130, 131, 132, 133, 134, 135, 136, 137, 138, 139, 140, 141, 142, 143, 144, 145, 146, 147, 148, 149, 150, 151, 152, 153, 154, 155, 156, 157, 158, 159, 160, 161, 162, 163, 164, 165, 166, 167, 168, 169, 170, 171, 172, 173, 174, 175, 176, 177, 178, 179, 180, 181, 182, 183, 184, 185, 186, 187, 188, 189, 190, 191, 192, 193, 194, 195, 196, 197, 198, 199, 200, 201, 202, 203, 204, 205, 206, 207, 208, 209, 210, 211, 212, 213, 214, 215, 216, 217, 218, 219, 220, 221, 222, 223, 224, 225, 226, 227, 228, 229, 230, 231, 232, 233, 234, 235, 236, 237, 238, 239, 240, 241, 242, 243, 244, 245, 246, 247, 248, 249, 250, 251, 252, 253, 254], model get empty cpu list, disable binding cores 2025-05-21 15:31:44,078 - mindformers./output/log[mindformers/core/context/build_context.py:395] - INFO - cpu_affinity, rank_id: 0, device_num: 8 2025-05-21 15:31:44,130 - mindformers./output/log[mindformers/tools/utils.py:181] - INFO - set strategy path to './output/strategy/ckpt_strategy_rank_4.ckpt' 2025-05-21 15:31:44,132 - mindformers./output/log[mindformers/core/context/build_context.py:383] - INFO - cann workqueue cpus: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 100, 101, 102, 103, 104, 105, 106, 107, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, 124, 125, 126, 127, 128, 129, 130, 131, 132, 133, 134, 135, 136, 137, 138, 139, 140, 141, 142, 143, 144, 145, 146, 147, 148, 149, 150, 151, 152, 153, 154, 155, 156, 157, 158, 159, 160, 161, 162, 163, 164, 165, 166, 167, 168, 169, 170, 171, 172, 173, 174, 175, 176, 177, 178, 179, 180, 181, 182, 183, 184, 185, 186, 187, 188, 189, 190, 191, 192, 193, 194, 195, 196, 197, 198, 199, 200, 201, 202, 203, 204, 205, 206, 207, 208, 209, 210, 211, 212, 213, 214, 215, 216, 217, 218, 219, 220, 221, 222, 223, 224, 225, 226, 227, 228, 229, 230, 231, 232, 233, 234, 235, 236, 237, 238, 239, 240, 241, 242, 243, 244, 245, 246, 247, 248, 249, 250, 251, 252, 253, 254] 2025-05-21 15:31:44,133 - mindformers./output/log[mindformers/core/context/build_context.py:387] - WARNING - CANN use cpus: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 100, 101, 102, 103, 104, 105, 106, 107, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, 124, 125, 126, 127, 128, 129, 130, 131, 132, 133, 134, 135, 136, 137, 138, 139, 140, 141, 142, 143, 144, 145, 146, 147, 148, 149, 150, 151, 152, 153, 154, 155, 156, 157, 158, 159, 160, 161, 162, 163, 164, 165, 166, 167, 168, 169, 170, 171, 172, 173, 174, 175, 176, 177, 178, 179, 180, 181, 182, 183, 184, 185, 186, 187, 188, 189, 190, 191, 192, 193, 194, 195, 196, 197, 198, 199, 200, 201, 202, 203, 204, 205, 206, 207, 208, 209, 210, 211, 212, 213, 214, 215, 216, 217, 218, 219, 220, 221, 222, 223, 224, 225, 226, 227, 228, 229, 230, 231, 232, 233, 234, 235, 236, 237, 238, 239, 240, 241, 242, 243, 244, 245, 246, 247, 248, 249, 250, 251, 252, 253, 254], model get empty cpu list, disable binding cores 2025-05-21 15:31:44,133 - mindformers./output/log[mindformers/core/context/build_context.py:395] - INFO - cpu_affinity, rank_id: 4, device_num: 8 2025-05-21 15:31:44,161 - mindformers./output/log[mindformers/tools/utils.py:181] - INFO - set strategy path to './output/strategy/ckpt_strategy_rank_6.ckpt' 2025-05-21 15:31:44,164 - mindformers./output/log[mindformers/core/context/build_context.py:383] - INFO - cann workqueue cpus: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 100, 101, 102, 103, 104, 105, 106, 107, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, 124, 125, 126, 127, 128, 129, 130, 131, 132, 133, 134, 135, 136, 137, 138, 139, 140, 141, 142, 143, 144, 145, 146, 147, 148, 149, 150, 151, 152, 153, 154, 155, 156, 157, 158, 159, 160, 161, 162, 163, 164, 165, 166, 167, 168, 169, 170, 171, 172, 173, 174, 175, 176, 177, 178, 179, 180, 181, 182, 183, 184, 185, 186, 187, 188, 189, 190, 191, 192, 193, 194, 195, 196, 197, 198, 199, 200, 201, 202, 203, 204, 205, 206, 207, 208, 209, 210, 211, 212, 213, 214, 215, 216, 217, 218, 219, 220, 221, 222, 223, 224, 225, 226, 227, 228, 229, 230, 231, 232, 233, 234, 235, 236, 237, 238, 239, 240, 241, 242, 243, 244, 245, 246, 247, 248, 249, 250, 251, 252, 253, 254] 2025-05-21 15:31:44,164 - mindformers./output/log[mindformers/core/context/build_context.py:387] - WARNING - CANN use cpus: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 100, 101, 102, 103, 104, 105, 106, 107, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, 124, 125, 126, 127, 128, 129, 130, 131, 132, 133, 134, 135, 136, 137, 138, 139, 140, 141, 142, 143, 144, 145, 146, 147, 148, 149, 150, 151, 152, 153, 154, 155, 156, 157, 158, 159, 160, 161, 162, 163, 164, 165, 166, 167, 168, 169, 170, 171, 172, 173, 174, 175, 176, 177, 178, 179, 180, 181, 182, 183, 184, 185, 186, 187, 188, 189, 190, 191, 192, 193, 194, 195, 196, 197, 198, 199, 200, 201, 202, 203, 204, 205, 206, 207, 208, 209, 210, 211, 212, 213, 214, 215, 216, 217, 218, 219, 220, 221, 222, 223, 224, 225, 226, 227, 228, 229, 230, 231, 232, 233, 234, 235, 236, 237, 238, 239, 240, 241, 242, 243, 244, 245, 246, 247, 248, 249, 250, 251, 252, 253, 254], model get empty cpu list, disable binding cores 2025-05-21 15:31:44,165 - mindformers./output/log[mindformers/core/context/build_context.py:395] - INFO - cpu_affinity, rank_id: 6, device_num: 8 2025-05-21 15:31:50,590 - mindformers./output/log[mindformers/core/parallel_config.py:41] - INFO - initial moe_config from dict: {'expert_num': 1, 'capacity_factor': 1.1, 'aux_loss_factor': 0.05, 'num_experts_chosen': 1, 'expert_group_size': None, 'group_wise_a2a': False, 'comp_comm_parallel': False, 'comp_comm_parallel_degree': 2, 'save_token_distribution': False, 'cur_layer': 0, 'enable_cold_hot_expert': False, 'update_step': 10000, 'hot_expert_num': 0, 'cold_token_percent': 1.0, 'moe_module_name': '', 'routing_policy': 'TopkRouterV1', 'norm_topk_prob': True, 'enable_sdrop': False, 'use_fused_ops_topkrouter': False, 'router_dense_type': 'float32', 'shared_expert_num': 0, 'use_shared_expert_gating': False, 'max_router_load': 131072, 'topk_method': 'greedy', 'topk_group': None, 'n_group': None, 'first_k_dense_replace': True, 'moe_intermediate_size': 1407, 'routed_scaling_factor': 1.0, 'aux_loss_types': None, 'aux_loss_factors': None, 'z_loss_factor': 0.0, 'balance_via_topk_bias': False, 'topk_bias_update_rate': 0.0, 'use_allgather_dispatcher': False, 'moe_shared_expert_overlap': False, 'expert_model_parallel': None, 'use_gating_sigmoid': False, 'enable_deredundency': False, 'npu_nums_per_device': 1, 'use_gmm': False, 'enable_gmm_safe_tokens': False, 'use_fused_ops_permute': False, 'callback_moe_droprate': False} 2025-05-21 15:31:50,592 - mindformers./output/log[mindformers/core/parallel_config.py:61] - INFO - initial parallel_config from dict: {'data_parallel': 1, 'model_parallel': 4, 'pipeline_stage': 2, 'expert_parallel': 1, 'use_seq_parallel': True, 'micro_batch_num': 4, 'vocab_emb_dp': False, 'context_parallel': 1} 2025-05-21 15:31:50,592 - mindformers./output/log[mindformers/core/parallel_config.py:63] - INFO - pipeline_stage = 2 > 1, vocab_emd_dp will be reset to False. 2025-05-21 15:31:50,594 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:101] - INFO - sft_config_infer: {'runner_config': {'batch_size': 1, 'epochs': 1, 'sink_mode': 1, 'sink_size': 1, 'gradient_accumulation_steps': 1, 'num_classes': 1, 'stop_step': 0}, 'context': {'mode': 0, 'device_target': 'Ascend', 'max_call_depth': 10000, 'max_device_memory': '55GB', 'save_graphs': False, 'save_graphs_path': './graph', 'device_id': 0, 'jit_config': {'jit_level': 'O0'}, 'memory_optimize_level': 'O0', 'ascend_config': {'precision_mode': 'must_keep_origin_dtype'}}, 'parallel': {'parallel_mode': 1, 'full_batch': True, 'search_mode': 'sharding_propagation', 'enable_parallel_optimizer': False, 'gradients_mean': False, 'enable_alltoall': False, 'strategy_ckpt_save_file': './ckpt_strategy.ckpt'}, 'trainer': {}, 'model': {'model_config': {'type': 'LlamaConfig', 'batch_size': 1, 'seq_length': 8192, 'hidden_size': 3584, 'num_layers': 2, 'num_heads': 28, 'n_kv_heads': 4, 'vocab_size': 152064, 'intermediate_size': 18944, 'max_position_embeddings': 32768, 'qkv_has_bias': True, 'rms_norm_eps': 1e-06, 'theta': 1000000.0, 'emb_dropout_prob': 0.0, 'eos_token_id': [151645, 151643], 'pad_token_id': 151643, 'bos_token_id': 151643, 'compute_dtype': 'bfloat16', 'layernorm_compute_type': 'float32', 'softmax_compute_type': 'float16', 'rotary_dtype': 'bfloat16', 'param_init_type': 'float32', 'use_past': True, 'use_flash_attention': True, 'block_size': 32, 'num_blocks': 1024, 'use_past_shard': False, 'offset': 0, 'checkpoint_name_or_path': '', 'repetition_penalty': 1.0, 'max_decode_length': 512, 'min_decode_length': 2, 'temperature': 1.2, 'top_k': 50, 'top_p': 1.0, 'do_sample': True, 'is_dynamic': True, 'qkv_concat': False, 'auto_map': {'AutoTokenizer': ['qwen2_tokenizer.Qwen2Tokenizer', None]}, 'parallel_config': }, 'arch': {'type': 'LlamaForCausalLM'}}, 'moe_config': , 'parallel_config': , 'processor': {'return_tensors': 'ms', 'tokenizer': {'model_max_length': 32768, 'vocab_file': '/path/vocab.json', 'merges_file': '/path/merges.txt', 'unk_token': '<|endoftext|>', 'pad_token': '<|endoftext|>', 'eos_token': '<|im_end|>', 'chat_template': "{% for message in messages %}{% if loop.first and messages[0]['role'] != 'system' %}{{ '<|im_start|>system\nYou are a helpful assistant.<|im_end|>\n' }}{% endif %}{{'<|im_start|>' + message['role'] + '\n' + message['content'] + '<|im_end|>' + '\n'}}{% endfor %}{% if add_generation_prompt %}{{ '<|im_start|>assistant\n' }}{% endif %}", 'type': 'Qwen2Tokenizer', 'auto_register': 'qwen2_tokenizer.Qwen2Tokenizer'}, 'type': 'Qwen2Processor'}, 'seed': 1, 'output_dir': './output', 'run_mode': 'predict', 'use_parallel': True, 'resume_training': False, 'load_checkpoint': '', 'load_ckpt_format': 'ckpt', 'auto_trans_ckpt': False, 'transform_process_num': 1, 'src_strategy_path_or_dir': '', 'only_save_strategy': False, 'load_ckpt_async': False, 'use_legacy': True, 'do_eval': False, 'eval_step_interval': 100, 'eval_epoch_interval': -1, 'ignore_data_skip': False, 'data_skip_steps': None, 'profile': False, 'profile_communication': False, 'profile_memory': True, 'init_start_profile': False, 'profile_start_step': 1, 'profile_stop_step': 10, 'profile_rank_ids': None, 'profile_pipeline': False, 'profile_level': 1, 'mstx': False, 'layer_scale': False, 'layer_decay': 0.65, 'lr_scale': False, 'lr_scale_factor': 256, 'micro_batch_interleave_num': 1, 'remote_save_url': None, 'save_file': None, 'input_data': None, 'predict_batch_size': None, 'adapter_id': None, 'exclude_cann_cpu': False, 'train_precision_sync': None, 'infer_precision_sync': None, 'postprocess_use_numpy': False, 'local_rank': 3, 'device_num': 8, 'swap_config': , 'recompute_config': } 2025-05-21 15:31:50,919 - mindformers./output/log[mindformers/core/parallel_config.py:41] - INFO - initial moe_config from dict: {'expert_num': 1, 'capacity_factor': 1.1, 'aux_loss_factor': 0.05, 'num_experts_chosen': 1, 'expert_group_size': None, 'group_wise_a2a': False, 'comp_comm_parallel': False, 'comp_comm_parallel_degree': 2, 'save_token_distribution': False, 'cur_layer': 0, 'enable_cold_hot_expert': False, 'update_step': 10000, 'hot_expert_num': 0, 'cold_token_percent': 1.0, 'moe_module_name': '', 'routing_policy': 'TopkRouterV1', 'norm_topk_prob': True, 'enable_sdrop': False, 'use_fused_ops_topkrouter': False, 'router_dense_type': 'float32', 'shared_expert_num': 0, 'use_shared_expert_gating': False, 'max_router_load': 131072, 'topk_method': 'greedy', 'topk_group': None, 'n_group': None, 'first_k_dense_replace': True, 'moe_intermediate_size': 1407, 'routed_scaling_factor': 1.0, 'aux_loss_types': None, 'aux_loss_factors': None, 'z_loss_factor': 0.0, 'balance_via_topk_bias': False, 'topk_bias_update_rate': 0.0, 'use_allgather_dispatcher': False, 'moe_shared_expert_overlap': False, 'expert_model_parallel': None, 'use_gating_sigmoid': False, 'enable_deredundency': False, 'npu_nums_per_device': 1, 'use_gmm': False, 'enable_gmm_safe_tokens': False, 'use_fused_ops_permute': False, 'callback_moe_droprate': False} 2025-05-21 15:31:50,921 - mindformers./output/log[mindformers/core/parallel_config.py:61] - INFO - initial parallel_config from dict: {'data_parallel': 1, 'model_parallel': 4, 'pipeline_stage': 2, 'expert_parallel': 1, 'use_seq_parallel': True, 'micro_batch_num': 4, 'vocab_emb_dp': False, 'context_parallel': 1} 2025-05-21 15:31:50,921 - mindformers./output/log[mindformers/core/parallel_config.py:63] - INFO - pipeline_stage = 2 > 1, vocab_emd_dp will be reset to False. 2025-05-21 15:31:50,923 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:101] - INFO - sft_config_infer: {'runner_config': {'batch_size': 1, 'epochs': 1, 'sink_mode': 1, 'sink_size': 1, 'gradient_accumulation_steps': 1, 'num_classes': 1, 'stop_step': 0}, 'context': {'mode': 0, 'device_target': 'Ascend', 'max_call_depth': 10000, 'max_device_memory': '55GB', 'save_graphs': False, 'save_graphs_path': './graph', 'device_id': 0, 'jit_config': {'jit_level': 'O0'}, 'memory_optimize_level': 'O0', 'ascend_config': {'precision_mode': 'must_keep_origin_dtype'}}, 'parallel': {'parallel_mode': 1, 'full_batch': True, 'search_mode': 'sharding_propagation', 'enable_parallel_optimizer': False, 'gradients_mean': False, 'enable_alltoall': False, 'strategy_ckpt_save_file': './ckpt_strategy.ckpt'}, 'trainer': {}, 'model': {'model_config': {'type': 'LlamaConfig', 'batch_size': 1, 'seq_length': 8192, 'hidden_size': 3584, 'num_layers': 2, 'num_heads': 28, 'n_kv_heads': 4, 'vocab_size': 152064, 'intermediate_size': 18944, 'max_position_embeddings': 32768, 'qkv_has_bias': True, 'rms_norm_eps': 1e-06, 'theta': 1000000.0, 'emb_dropout_prob': 0.0, 'eos_token_id': [151645, 151643], 'pad_token_id': 151643, 'bos_token_id': 151643, 'compute_dtype': 'bfloat16', 'layernorm_compute_type': 'float32', 'softmax_compute_type': 'float16', 'rotary_dtype': 'bfloat16', 'param_init_type': 'float32', 'use_past': True, 'use_flash_attention': True, 'block_size': 32, 'num_blocks': 1024, 'use_past_shard': False, 'offset': 0, 'checkpoint_name_or_path': '', 'repetition_penalty': 1.0, 'max_decode_length': 512, 'min_decode_length': 2, 'temperature': 1.2, 'top_k': 50, 'top_p': 1.0, 'do_sample': True, 'is_dynamic': True, 'qkv_concat': False, 'auto_map': {'AutoTokenizer': ['qwen2_tokenizer.Qwen2Tokenizer', None]}, 'parallel_config': }, 'arch': {'type': 'LlamaForCausalLM'}}, 'moe_config': , 'parallel_config': , 'processor': {'return_tensors': 'ms', 'tokenizer': {'model_max_length': 32768, 'vocab_file': '/path/vocab.json', 'merges_file': '/path/merges.txt', 'unk_token': '<|endoftext|>', 'pad_token': '<|endoftext|>', 'eos_token': '<|im_end|>', 'chat_template': "{% for message in messages %}{% if loop.first and messages[0]['role'] != 'system' %}{{ '<|im_start|>system\nYou are a helpful assistant.<|im_end|>\n' }}{% endif %}{{'<|im_start|>' + message['role'] + '\n' + message['content'] + '<|im_end|>' + '\n'}}{% endfor %}{% if add_generation_prompt %}{{ '<|im_start|>assistant\n' }}{% endif %}", 'type': 'Qwen2Tokenizer', 'auto_register': 'qwen2_tokenizer.Qwen2Tokenizer'}, 'type': 'Qwen2Processor'}, 'seed': 1, 'output_dir': './output', 'run_mode': 'predict', 'use_parallel': True, 'resume_training': False, 'load_checkpoint': '', 'load_ckpt_format': 'ckpt', 'auto_trans_ckpt': False, 'transform_process_num': 1, 'src_strategy_path_or_dir': '', 'only_save_strategy': False, 'load_ckpt_async': False, 'use_legacy': True, 'do_eval': False, 'eval_step_interval': 100, 'eval_epoch_interval': -1, 'ignore_data_skip': False, 'data_skip_steps': None, 'profile': False, 'profile_communication': False, 'profile_memory': True, 'init_start_profile': False, 'profile_start_step': 1, 'profile_stop_step': 10, 'profile_rank_ids': None, 'profile_pipeline': False, 'profile_level': 1, 'mstx': False, 'layer_scale': False, 'layer_decay': 0.65, 'lr_scale': False, 'lr_scale_factor': 256, 'micro_batch_interleave_num': 1, 'remote_save_url': None, 'save_file': None, 'input_data': None, 'predict_batch_size': None, 'adapter_id': None, 'exclude_cann_cpu': False, 'train_precision_sync': None, 'infer_precision_sync': None, 'postprocess_use_numpy': False, 'local_rank': 1, 'device_num': 8, 'swap_config': , 'recompute_config': } 2025-05-21 15:31:51,207 - mindformers./output/log[mindformers/core/parallel_config.py:41] - INFO - initial moe_config from dict: {'expert_num': 1, 'capacity_factor': 1.1, 'aux_loss_factor': 0.05, 'num_experts_chosen': 1, 'expert_group_size': None, 'group_wise_a2a': False, 'comp_comm_parallel': False, 'comp_comm_parallel_degree': 2, 'save_token_distribution': False, 'cur_layer': 0, 'enable_cold_hot_expert': False, 'update_step': 10000, 'hot_expert_num': 0, 'cold_token_percent': 1.0, 'moe_module_name': '', 'routing_policy': 'TopkRouterV1', 'norm_topk_prob': True, 'enable_sdrop': False, 'use_fused_ops_topkrouter': False, 'router_dense_type': 'float32', 'shared_expert_num': 0, 'use_shared_expert_gating': False, 'max_router_load': 131072, 'topk_method': 'greedy', 'topk_group': None, 'n_group': None, 'first_k_dense_replace': True, 'moe_intermediate_size': 1407, 'routed_scaling_factor': 1.0, 'aux_loss_types': None, 'aux_loss_factors': None, 'z_loss_factor': 0.0, 'balance_via_topk_bias': False, 'topk_bias_update_rate': 0.0, 'use_allgather_dispatcher': False, 'moe_shared_expert_overlap': False, 'expert_model_parallel': None, 'use_gating_sigmoid': False, 'enable_deredundency': False, 'npu_nums_per_device': 1, 'use_gmm': False, 'enable_gmm_safe_tokens': False, 'use_fused_ops_permute': False, 'callback_moe_droprate': False} 2025-05-21 15:31:51,209 - mindformers./output/log[mindformers/core/parallel_config.py:61] - INFO - initial parallel_config from dict: {'data_parallel': 1, 'model_parallel': 4, 'pipeline_stage': 2, 'expert_parallel': 1, 'use_seq_parallel': True, 'micro_batch_num': 4, 'vocab_emb_dp': False, 'context_parallel': 1} 2025-05-21 15:31:51,209 - mindformers./output/log[mindformers/core/parallel_config.py:63] - INFO - pipeline_stage = 2 > 1, vocab_emd_dp will be reset to False. 2025-05-21 15:31:51,211 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:101] - INFO - sft_config_infer: {'runner_config': {'batch_size': 1, 'epochs': 1, 'sink_mode': 1, 'sink_size': 1, 'gradient_accumulation_steps': 1, 'num_classes': 1, 'stop_step': 0}, 'context': {'mode': 0, 'device_target': 'Ascend', 'max_call_depth': 10000, 'max_device_memory': '55GB', 'save_graphs': False, 'save_graphs_path': './graph', 'device_id': 0, 'jit_config': {'jit_level': 'O0'}, 'memory_optimize_level': 'O0', 'ascend_config': {'precision_mode': 'must_keep_origin_dtype'}}, 'parallel': {'parallel_mode': 1, 'full_batch': True, 'search_mode': 'sharding_propagation', 'enable_parallel_optimizer': False, 'gradients_mean': False, 'enable_alltoall': False, 'strategy_ckpt_save_file': './ckpt_strategy.ckpt'}, 'trainer': {}, 'model': {'model_config': {'type': 'LlamaConfig', 'batch_size': 1, 'seq_length': 8192, 'hidden_size': 3584, 'num_layers': 2, 'num_heads': 28, 'n_kv_heads': 4, 'vocab_size': 152064, 'intermediate_size': 18944, 'max_position_embeddings': 32768, 'qkv_has_bias': True, 'rms_norm_eps': 1e-06, 'theta': 1000000.0, 'emb_dropout_prob': 0.0, 'eos_token_id': [151645, 151643], 'pad_token_id': 151643, 'bos_token_id': 151643, 'compute_dtype': 'bfloat16', 'layernorm_compute_type': 'float32', 'softmax_compute_type': 'float16', 'rotary_dtype': 'bfloat16', 'param_init_type': 'float32', 'use_past': True, 'use_flash_attention': True, 'block_size': 32, 'num_blocks': 1024, 'use_past_shard': False, 'offset': 0, 'checkpoint_name_or_path': '', 'repetition_penalty': 1.0, 'max_decode_length': 512, 'min_decode_length': 2, 'temperature': 1.2, 'top_k': 50, 'top_p': 1.0, 'do_sample': True, 'is_dynamic': True, 'qkv_concat': False, 'auto_map': {'AutoTokenizer': ['qwen2_tokenizer.Qwen2Tokenizer', None]}, 'parallel_config': }, 'arch': {'type': 'LlamaForCausalLM'}}, 'moe_config': , 'parallel_config': , 'processor': {'return_tensors': 'ms', 'tokenizer': {'model_max_length': 32768, 'vocab_file': '/path/vocab.json', 'merges_file': '/path/merges.txt', 'unk_token': '<|endoftext|>', 'pad_token': '<|endoftext|>', 'eos_token': '<|im_end|>', 'chat_template': "{% for message in messages %}{% if loop.first and messages[0]['role'] != 'system' %}{{ '<|im_start|>system\nYou are a helpful assistant.<|im_end|>\n' }}{% endif %}{{'<|im_start|>' + message['role'] + '\n' + message['content'] + '<|im_end|>' + '\n'}}{% endfor %}{% if add_generation_prompt %}{{ '<|im_start|>assistant\n' }}{% endif %}", 'type': 'Qwen2Tokenizer', 'auto_register': 'qwen2_tokenizer.Qwen2Tokenizer'}, 'type': 'Qwen2Processor'}, 'seed': 1, 'output_dir': './output', 'run_mode': 'predict', 'use_parallel': True, 'resume_training': False, 'load_checkpoint': '', 'load_ckpt_format': 'ckpt', 'auto_trans_ckpt': False, 'transform_process_num': 1, 'src_strategy_path_or_dir': '', 'only_save_strategy': False, 'load_ckpt_async': False, 'use_legacy': True, 'do_eval': False, 'eval_step_interval': 100, 'eval_epoch_interval': -1, 'ignore_data_skip': False, 'data_skip_steps': None, 'profile': False, 'profile_communication': False, 'profile_memory': True, 'init_start_profile': False, 'profile_start_step': 1, 'profile_stop_step': 10, 'profile_rank_ids': None, 'profile_pipeline': False, 'profile_level': 1, 'mstx': False, 'layer_scale': False, 'layer_decay': 0.65, 'lr_scale': False, 'lr_scale_factor': 256, 'micro_batch_interleave_num': 1, 'remote_save_url': None, 'save_file': None, 'input_data': None, 'predict_batch_size': None, 'adapter_id': None, 'exclude_cann_cpu': False, 'train_precision_sync': None, 'infer_precision_sync': None, 'postprocess_use_numpy': False, 'local_rank': 4, 'device_num': 8, 'swap_config': , 'recompute_config': } 2025-05-21 15:31:51,226 - mindformers./output/log[mindformers/core/parallel_config.py:41] - INFO - initial moe_config from dict: {'expert_num': 1, 'capacity_factor': 1.1, 'aux_loss_factor': 0.05, 'num_experts_chosen': 1, 'expert_group_size': None, 'group_wise_a2a': False, 'comp_comm_parallel': False, 'comp_comm_parallel_degree': 2, 'save_token_distribution': False, 'cur_layer': 0, 'enable_cold_hot_expert': False, 'update_step': 10000, 'hot_expert_num': 0, 'cold_token_percent': 1.0, 'moe_module_name': '', 'routing_policy': 'TopkRouterV1', 'norm_topk_prob': True, 'enable_sdrop': False, 'use_fused_ops_topkrouter': False, 'router_dense_type': 'float32', 'shared_expert_num': 0, 'use_shared_expert_gating': False, 'max_router_load': 131072, 'topk_method': 'greedy', 'topk_group': None, 'n_group': None, 'first_k_dense_replace': True, 'moe_intermediate_size': 1407, 'routed_scaling_factor': 1.0, 'aux_loss_types': None, 'aux_loss_factors': None, 'z_loss_factor': 0.0, 'balance_via_topk_bias': False, 'topk_bias_update_rate': 0.0, 'use_allgather_dispatcher': False, 'moe_shared_expert_overlap': False, 'expert_model_parallel': None, 'use_gating_sigmoid': False, 'enable_deredundency': False, 'npu_nums_per_device': 1, 'use_gmm': False, 'enable_gmm_safe_tokens': False, 'use_fused_ops_permute': False, 'callback_moe_droprate': False} 2025-05-21 15:31:51,227 - mindformers./output/log[mindformers/core/parallel_config.py:61] - INFO - initial parallel_config from dict: {'data_parallel': 1, 'model_parallel': 4, 'pipeline_stage': 2, 'expert_parallel': 1, 'use_seq_parallel': True, 'micro_batch_num': 4, 'vocab_emb_dp': False, 'context_parallel': 1} 2025-05-21 15:31:51,227 - mindformers./output/log[mindformers/core/parallel_config.py:63] - INFO - pipeline_stage = 2 > 1, vocab_emd_dp will be reset to False. 2025-05-21 15:31:51,230 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:101] - INFO - sft_config_infer: {'runner_config': {'batch_size': 1, 'epochs': 1, 'sink_mode': 1, 'sink_size': 1, 'gradient_accumulation_steps': 1, 'num_classes': 1, 'stop_step': 0}, 'context': {'mode': 0, 'device_target': 'Ascend', 'max_call_depth': 10000, 'max_device_memory': '55GB', 'save_graphs': False, 'save_graphs_path': './graph', 'device_id': 0, 'jit_config': {'jit_level': 'O0'}, 'memory_optimize_level': 'O0', 'ascend_config': {'precision_mode': 'must_keep_origin_dtype'}}, 'parallel': {'parallel_mode': 1, 'full_batch': True, 'search_mode': 'sharding_propagation', 'enable_parallel_optimizer': False, 'gradients_mean': False, 'enable_alltoall': False, 'strategy_ckpt_save_file': './ckpt_strategy.ckpt'}, 'trainer': {}, 'model': {'model_config': {'type': 'LlamaConfig', 'batch_size': 1, 'seq_length': 8192, 'hidden_size': 3584, 'num_layers': 2, 'num_heads': 28, 'n_kv_heads': 4, 'vocab_size': 152064, 'intermediate_size': 18944, 'max_position_embeddings': 32768, 'qkv_has_bias': True, 'rms_norm_eps': 1e-06, 'theta': 1000000.0, 'emb_dropout_prob': 0.0, 'eos_token_id': [151645, 151643], 'pad_token_id': 151643, 'bos_token_id': 151643, 'compute_dtype': 'bfloat16', 'layernorm_compute_type': 'float32', 'softmax_compute_type': 'float16', 'rotary_dtype': 'bfloat16', 'param_init_type': 'float32', 'use_past': True, 'use_flash_attention': True, 'block_size': 32, 'num_blocks': 1024, 'use_past_shard': False, 'offset': 0, 'checkpoint_name_or_path': '', 'repetition_penalty': 1.0, 'max_decode_length': 512, 'min_decode_length': 2, 'temperature': 1.2, 'top_k': 50, 'top_p': 1.0, 'do_sample': True, 'is_dynamic': True, 'qkv_concat': False, 'auto_map': {'AutoTokenizer': ['qwen2_tokenizer.Qwen2Tokenizer', None]}, 'parallel_config': }, 'arch': {'type': 'LlamaForCausalLM'}}, 'moe_config': , 'parallel_config': , 'processor': {'return_tensors': 'ms', 'tokenizer': {'model_max_length': 32768, 'vocab_file': '/path/vocab.json', 'merges_file': '/path/merges.txt', 'unk_token': '<|endoftext|>', 'pad_token': '<|endoftext|>', 'eos_token': '<|im_end|>', 'chat_template': "{% for message in messages %}{% if loop.first and messages[0]['role'] != 'system' %}{{ '<|im_start|>system\nYou are a helpful assistant.<|im_end|>\n' }}{% endif %}{{'<|im_start|>' + message['role'] + '\n' + message['content'] + '<|im_end|>' + '\n'}}{% endfor %}{% if add_generation_prompt %}{{ '<|im_start|>assistant\n' }}{% endif %}", 'type': 'Qwen2Tokenizer', 'auto_register': 'qwen2_tokenizer.Qwen2Tokenizer'}, 'type': 'Qwen2Processor'}, 'seed': 1, 'output_dir': './output', 'run_mode': 'predict', 'use_parallel': True, 'resume_training': False, 'load_checkpoint': '', 'load_ckpt_format': 'ckpt', 'auto_trans_ckpt': False, 'transform_process_num': 1, 'src_strategy_path_or_dir': '', 'only_save_strategy': False, 'load_ckpt_async': False, 'use_legacy': True, 'do_eval': False, 'eval_step_interval': 100, 'eval_epoch_interval': -1, 'ignore_data_skip': False, 'data_skip_steps': None, 'profile': False, 'profile_communication': False, 'profile_memory': True, 'init_start_profile': False, 'profile_start_step': 1, 'profile_stop_step': 10, 'profile_rank_ids': None, 'profile_pipeline': False, 'profile_level': 1, 'mstx': False, 'layer_scale': False, 'layer_decay': 0.65, 'lr_scale': False, 'lr_scale_factor': 256, 'micro_batch_interleave_num': 1, 'remote_save_url': None, 'save_file': None, 'input_data': None, 'predict_batch_size': None, 'adapter_id': None, 'exclude_cann_cpu': False, 'train_precision_sync': None, 'infer_precision_sync': None, 'postprocess_use_numpy': False, 'local_rank': 2, 'device_num': 8, 'swap_config': , 'recompute_config': } 2025-05-21 15:31:51,421 - mindformers./output/log[mindformers/core/parallel_config.py:41] - INFO - initial moe_config from dict: {'expert_num': 1, 'capacity_factor': 1.1, 'aux_loss_factor': 0.05, 'num_experts_chosen': 1, 'expert_group_size': None, 'group_wise_a2a': False, 'comp_comm_parallel': False, 'comp_comm_parallel_degree': 2, 'save_token_distribution': False, 'cur_layer': 0, 'enable_cold_hot_expert': False, 'update_step': 10000, 'hot_expert_num': 0, 'cold_token_percent': 1.0, 'moe_module_name': '', 'routing_policy': 'TopkRouterV1', 'norm_topk_prob': True, 'enable_sdrop': False, 'use_fused_ops_topkrouter': False, 'router_dense_type': 'float32', 'shared_expert_num': 0, 'use_shared_expert_gating': False, 'max_router_load': 131072, 'topk_method': 'greedy', 'topk_group': None, 'n_group': None, 'first_k_dense_replace': True, 'moe_intermediate_size': 1407, 'routed_scaling_factor': 1.0, 'aux_loss_types': None, 'aux_loss_factors': None, 'z_loss_factor': 0.0, 'balance_via_topk_bias': False, 'topk_bias_update_rate': 0.0, 'use_allgather_dispatcher': False, 'moe_shared_expert_overlap': False, 'expert_model_parallel': None, 'use_gating_sigmoid': False, 'enable_deredundency': False, 'npu_nums_per_device': 1, 'use_gmm': False, 'enable_gmm_safe_tokens': False, 'use_fused_ops_permute': False, 'callback_moe_droprate': False} 2025-05-21 15:31:51,423 - mindformers./output/log[mindformers/core/parallel_config.py:61] - INFO - initial parallel_config from dict: {'data_parallel': 1, 'model_parallel': 4, 'pipeline_stage': 2, 'expert_parallel': 1, 'use_seq_parallel': True, 'micro_batch_num': 4, 'vocab_emb_dp': False, 'context_parallel': 1} 2025-05-21 15:31:51,424 - mindformers./output/log[mindformers/core/parallel_config.py:63] - INFO - pipeline_stage = 2 > 1, vocab_emd_dp will be reset to False. 2025-05-21 15:31:51,427 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:101] - INFO - sft_config_infer: {'runner_config': {'batch_size': 1, 'epochs': 1, 'sink_mode': 1, 'sink_size': 1, 'gradient_accumulation_steps': 1, 'num_classes': 1, 'stop_step': 0}, 'context': {'mode': 0, 'device_target': 'Ascend', 'max_call_depth': 10000, 'max_device_memory': '55GB', 'save_graphs': False, 'save_graphs_path': './graph', 'device_id': 0, 'jit_config': {'jit_level': 'O0'}, 'memory_optimize_level': 'O0', 'ascend_config': {'precision_mode': 'must_keep_origin_dtype'}}, 'parallel': {'parallel_mode': 1, 'full_batch': True, 'search_mode': 'sharding_propagation', 'enable_parallel_optimizer': False, 'gradients_mean': False, 'enable_alltoall': False, 'strategy_ckpt_save_file': './ckpt_strategy.ckpt'}, 'trainer': {}, 'model': {'model_config': {'type': 'LlamaConfig', 'batch_size': 1, 'seq_length': 8192, 'hidden_size': 3584, 'num_layers': 2, 'num_heads': 28, 'n_kv_heads': 4, 'vocab_size': 152064, 'intermediate_size': 18944, 'max_position_embeddings': 32768, 'qkv_has_bias': True, 'rms_norm_eps': 1e-06, 'theta': 1000000.0, 'emb_dropout_prob': 0.0, 'eos_token_id': [151645, 151643], 'pad_token_id': 151643, 'bos_token_id': 151643, 'compute_dtype': 'bfloat16', 'layernorm_compute_type': 'float32', 'softmax_compute_type': 'float16', 'rotary_dtype': 'bfloat16', 'param_init_type': 'float32', 'use_past': True, 'use_flash_attention': True, 'block_size': 32, 'num_blocks': 1024, 'use_past_shard': False, 'offset': 0, 'checkpoint_name_or_path': '', 'repetition_penalty': 1.0, 'max_decode_length': 512, 'min_decode_length': 2, 'temperature': 1.2, 'top_k': 50, 'top_p': 1.0, 'do_sample': True, 'is_dynamic': True, 'qkv_concat': False, 'auto_map': {'AutoTokenizer': ['qwen2_tokenizer.Qwen2Tokenizer', None]}, 'parallel_config': }, 'arch': {'type': 'LlamaForCausalLM'}}, 'moe_config': , 'parallel_config': , 'processor': {'return_tensors': 'ms', 'tokenizer': {'model_max_length': 32768, 'vocab_file': '/path/vocab.json', 'merges_file': '/path/merges.txt', 'unk_token': '<|endoftext|>', 'pad_token': '<|endoftext|>', 'eos_token': '<|im_end|>', 'chat_template': "{% for message in messages %}{% if loop.first and messages[0]['role'] != 'system' %}{{ '<|im_start|>system\nYou are a helpful assistant.<|im_end|>\n' }}{% endif %}{{'<|im_start|>' + message['role'] + '\n' + message['content'] + '<|im_end|>' + '\n'}}{% endfor %}{% if add_generation_prompt %}{{ '<|im_start|>assistant\n' }}{% endif %}", 'type': 'Qwen2Tokenizer', 'auto_register': 'qwen2_tokenizer.Qwen2Tokenizer'}, 'type': 'Qwen2Processor'}, 'seed': 1, 'output_dir': './output', 'run_mode': 'predict', 'use_parallel': True, 'resume_training': False, 'load_checkpoint': '', 'load_ckpt_format': 'ckpt', 'auto_trans_ckpt': False, 'transform_process_num': 1, 'src_strategy_path_or_dir': '', 'only_save_strategy': False, 'load_ckpt_async': False, 'use_legacy': True, 'do_eval': False, 'eval_step_interval': 100, 'eval_epoch_interval': -1, 'ignore_data_skip': False, 'data_skip_steps': None, 'profile': False, 'profile_communication': False, 'profile_memory': True, 'init_start_profile': False, 'profile_start_step': 1, 'profile_stop_step': 10, 'profile_rank_ids': None, 'profile_pipeline': False, 'profile_level': 1, 'mstx': False, 'layer_scale': False, 'layer_decay': 0.65, 'lr_scale': False, 'lr_scale_factor': 256, 'micro_batch_interleave_num': 1, 'remote_save_url': None, 'save_file': None, 'input_data': None, 'predict_batch_size': None, 'adapter_id': None, 'exclude_cann_cpu': False, 'train_precision_sync': None, 'infer_precision_sync': None, 'postprocess_use_numpy': False, 'local_rank': 5, 'device_num': 8, 'swap_config': , 'recompute_config': } 2025-05-21 15:31:51,428 - mindformers./output/log[mindformers/core/parallel_config.py:41] - INFO - initial moe_config from dict: {'expert_num': 1, 'capacity_factor': 1.1, 'aux_loss_factor': 0.05, 'num_experts_chosen': 1, 'expert_group_size': None, 'group_wise_a2a': False, 'comp_comm_parallel': False, 'comp_comm_parallel_degree': 2, 'save_token_distribution': False, 'cur_layer': 0, 'enable_cold_hot_expert': False, 'update_step': 10000, 'hot_expert_num': 0, 'cold_token_percent': 1.0, 'moe_module_name': '', 'routing_policy': 'TopkRouterV1', 'norm_topk_prob': True, 'enable_sdrop': False, 'use_fused_ops_topkrouter': False, 'router_dense_type': 'float32', 'shared_expert_num': 0, 'use_shared_expert_gating': False, 'max_router_load': 131072, 'topk_method': 'greedy', 'topk_group': None, 'n_group': None, 'first_k_dense_replace': True, 'moe_intermediate_size': 1407, 'routed_scaling_factor': 1.0, 'aux_loss_types': None, 'aux_loss_factors': None, 'z_loss_factor': 0.0, 'balance_via_topk_bias': False, 'topk_bias_update_rate': 0.0, 'use_allgather_dispatcher': False, 'moe_shared_expert_overlap': False, 'expert_model_parallel': None, 'use_gating_sigmoid': False, 'enable_deredundency': False, 'npu_nums_per_device': 1, 'use_gmm': False, 'enable_gmm_safe_tokens': False, 'use_fused_ops_permute': False, 'callback_moe_droprate': False} 2025-05-21 15:31:51,430 - mindformers./output/log[mindformers/core/parallel_config.py:61] - INFO - initial parallel_config from dict: {'data_parallel': 1, 'model_parallel': 4, 'pipeline_stage': 2, 'expert_parallel': 1, 'use_seq_parallel': True, 'micro_batch_num': 4, 'vocab_emb_dp': False, 'context_parallel': 1} 2025-05-21 15:31:51,430 - mindformers./output/log[mindformers/core/parallel_config.py:63] - INFO - pipeline_stage = 2 > 1, vocab_emd_dp will be reset to False. 2025-05-21 15:31:51,432 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:101] - INFO - sft_config_infer: {'runner_config': {'batch_size': 1, 'epochs': 1, 'sink_mode': 1, 'sink_size': 1, 'gradient_accumulation_steps': 1, 'num_classes': 1, 'stop_step': 0}, 'context': {'mode': 0, 'device_target': 'Ascend', 'max_call_depth': 10000, 'max_device_memory': '55GB', 'save_graphs': False, 'save_graphs_path': './graph', 'device_id': 0, 'jit_config': {'jit_level': 'O0'}, 'memory_optimize_level': 'O0', 'ascend_config': {'precision_mode': 'must_keep_origin_dtype'}}, 'parallel': {'parallel_mode': 1, 'full_batch': True, 'search_mode': 'sharding_propagation', 'enable_parallel_optimizer': False, 'gradients_mean': False, 'enable_alltoall': False, 'strategy_ckpt_save_file': './ckpt_strategy.ckpt'}, 'trainer': {}, 'model': {'model_config': {'type': 'LlamaConfig', 'batch_size': 1, 'seq_length': 8192, 'hidden_size': 3584, 'num_layers': 2, 'num_heads': 28, 'n_kv_heads': 4, 'vocab_size': 152064, 'intermediate_size': 18944, 'max_position_embeddings': 32768, 'qkv_has_bias': True, 'rms_norm_eps': 1e-06, 'theta': 1000000.0, 'emb_dropout_prob': 0.0, 'eos_token_id': [151645, 151643], 'pad_token_id': 151643, 'bos_token_id': 151643, 'compute_dtype': 'bfloat16', 'layernorm_compute_type': 'float32', 'softmax_compute_type': 'float16', 'rotary_dtype': 'bfloat16', 'param_init_type': 'float32', 'use_past': True, 'use_flash_attention': True, 'block_size': 32, 'num_blocks': 1024, 'use_past_shard': False, 'offset': 0, 'checkpoint_name_or_path': '', 'repetition_penalty': 1.0, 'max_decode_length': 512, 'min_decode_length': 2, 'temperature': 1.2, 'top_k': 50, 'top_p': 1.0, 'do_sample': True, 'is_dynamic': True, 'qkv_concat': False, 'auto_map': {'AutoTokenizer': ['qwen2_tokenizer.Qwen2Tokenizer', None]}, 'parallel_config': }, 'arch': {'type': 'LlamaForCausalLM'}}, 'moe_config': , 'parallel_config': , 'processor': {'return_tensors': 'ms', 'tokenizer': {'model_max_length': 32768, 'vocab_file': '/path/vocab.json', 'merges_file': '/path/merges.txt', 'unk_token': '<|endoftext|>', 'pad_token': '<|endoftext|>', 'eos_token': '<|im_end|>', 'chat_template': "{% for message in messages %}{% if loop.first and messages[0]['role'] != 'system' %}{{ '<|im_start|>system\nYou are a helpful assistant.<|im_end|>\n' }}{% endif %}{{'<|im_start|>' + message['role'] + '\n' + message['content'] + '<|im_end|>' + '\n'}}{% endfor %}{% if add_generation_prompt %}{{ '<|im_start|>assistant\n' }}{% endif %}", 'type': 'Qwen2Tokenizer', 'auto_register': 'qwen2_tokenizer.Qwen2Tokenizer'}, 'type': 'Qwen2Processor'}, 'seed': 1, 'output_dir': './output', 'run_mode': 'predict', 'use_parallel': True, 'resume_training': False, 'load_checkpoint': '', 'load_ckpt_format': 'ckpt', 'auto_trans_ckpt': False, 'transform_process_num': 1, 'src_strategy_path_or_dir': '', 'only_save_strategy': False, 'load_ckpt_async': False, 'use_legacy': True, 'do_eval': False, 'eval_step_interval': 100, 'eval_epoch_interval': -1, 'ignore_data_skip': False, 'data_skip_steps': None, 'profile': False, 'profile_communication': False, 'profile_memory': True, 'init_start_profile': False, 'profile_start_step': 1, 'profile_stop_step': 10, 'profile_rank_ids': None, 'profile_pipeline': False, 'profile_level': 1, 'mstx': False, 'layer_scale': False, 'layer_decay': 0.65, 'lr_scale': False, 'lr_scale_factor': 256, 'micro_batch_interleave_num': 1, 'remote_save_url': None, 'save_file': None, 'input_data': None, 'predict_batch_size': None, 'adapter_id': None, 'exclude_cann_cpu': False, 'train_precision_sync': None, 'infer_precision_sync': None, 'postprocess_use_numpy': False, 'local_rank': 7, 'device_num': 8, 'swap_config': , 'recompute_config': } 2025-05-21 15:31:51,486 - mindformers./output/log[mindformers/core/parallel_config.py:41] - INFO - initial moe_config from dict: {'expert_num': 1, 'capacity_factor': 1.1, 'aux_loss_factor': 0.05, 'num_experts_chosen': 1, 'expert_group_size': None, 'group_wise_a2a': False, 'comp_comm_parallel': False, 'comp_comm_parallel_degree': 2, 'save_token_distribution': False, 'cur_layer': 0, 'enable_cold_hot_expert': False, 'update_step': 10000, 'hot_expert_num': 0, 'cold_token_percent': 1.0, 'moe_module_name': '', 'routing_policy': 'TopkRouterV1', 'norm_topk_prob': True, 'enable_sdrop': False, 'use_fused_ops_topkrouter': False, 'router_dense_type': 'float32', 'shared_expert_num': 0, 'use_shared_expert_gating': False, 'max_router_load': 131072, 'topk_method': 'greedy', 'topk_group': None, 'n_group': None, 'first_k_dense_replace': True, 'moe_intermediate_size': 1407, 'routed_scaling_factor': 1.0, 'aux_loss_types': None, 'aux_loss_factors': None, 'z_loss_factor': 0.0, 'balance_via_topk_bias': False, 'topk_bias_update_rate': 0.0, 'use_allgather_dispatcher': False, 'moe_shared_expert_overlap': False, 'expert_model_parallel': None, 'use_gating_sigmoid': False, 'enable_deredundency': False, 'npu_nums_per_device': 1, 'use_gmm': False, 'enable_gmm_safe_tokens': False, 'use_fused_ops_permute': False, 'callback_moe_droprate': False} 2025-05-21 15:31:51,487 - mindformers./output/log[mindformers/core/parallel_config.py:61] - INFO - initial parallel_config from dict: {'data_parallel': 1, 'model_parallel': 4, 'pipeline_stage': 2, 'expert_parallel': 1, 'use_seq_parallel': True, 'micro_batch_num': 4, 'vocab_emb_dp': False, 'context_parallel': 1} 2025-05-21 15:31:51,488 - mindformers./output/log[mindformers/core/parallel_config.py:63] - INFO - pipeline_stage = 2 > 1, vocab_emd_dp will be reset to False. 2025-05-21 15:31:51,490 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:101] - INFO - sft_config_infer: {'runner_config': {'batch_size': 1, 'epochs': 1, 'sink_mode': 1, 'sink_size': 1, 'gradient_accumulation_steps': 1, 'num_classes': 1, 'stop_step': 0}, 'context': {'mode': 0, 'device_target': 'Ascend', 'max_call_depth': 10000, 'max_device_memory': '55GB', 'save_graphs': False, 'save_graphs_path': './graph', 'device_id': 0, 'jit_config': {'jit_level': 'O0'}, 'memory_optimize_level': 'O0', 'ascend_config': {'precision_mode': 'must_keep_origin_dtype'}}, 'parallel': {'parallel_mode': 1, 'full_batch': True, 'search_mode': 'sharding_propagation', 'enable_parallel_optimizer': False, 'gradients_mean': False, 'enable_alltoall': False, 'strategy_ckpt_save_file': './ckpt_strategy.ckpt'}, 'trainer': {}, 'model': {'model_config': {'type': 'LlamaConfig', 'batch_size': 1, 'seq_length': 8192, 'hidden_size': 3584, 'num_layers': 2, 'num_heads': 28, 'n_kv_heads': 4, 'vocab_size': 152064, 'intermediate_size': 18944, 'max_position_embeddings': 32768, 'qkv_has_bias': True, 'rms_norm_eps': 1e-06, 'theta': 1000000.0, 'emb_dropout_prob': 0.0, 'eos_token_id': [151645, 151643], 'pad_token_id': 151643, 'bos_token_id': 151643, 'compute_dtype': 'bfloat16', 'layernorm_compute_type': 'float32', 'softmax_compute_type': 'float16', 'rotary_dtype': 'bfloat16', 'param_init_type': 'float32', 'use_past': True, 'use_flash_attention': True, 'block_size': 32, 'num_blocks': 1024, 'use_past_shard': False, 'offset': 0, 'checkpoint_name_or_path': '', 'repetition_penalty': 1.0, 'max_decode_length': 512, 'min_decode_length': 2, 'temperature': 1.2, 'top_k': 50, 'top_p': 1.0, 'do_sample': True, 'is_dynamic': True, 'qkv_concat': False, 'auto_map': {'AutoTokenizer': ['qwen2_tokenizer.Qwen2Tokenizer', None]}, 'parallel_config': }, 'arch': {'type': 'LlamaForCausalLM'}}, 'moe_config': , 'parallel_config': , 'processor': {'return_tensors': 'ms', 'tokenizer': {'model_max_length': 32768, 'vocab_file': '/path/vocab.json', 'merges_file': '/path/merges.txt', 'unk_token': '<|endoftext|>', 'pad_token': '<|endoftext|>', 'eos_token': '<|im_end|>', 'chat_template': "{% for message in messages %}{% if loop.first and messages[0]['role'] != 'system' %}{{ '<|im_start|>system\nYou are a helpful assistant.<|im_end|>\n' }}{% endif %}{{'<|im_start|>' + message['role'] + '\n' + message['content'] + '<|im_end|>' + '\n'}}{% endfor %}{% if add_generation_prompt %}{{ '<|im_start|>assistant\n' }}{% endif %}", 'type': 'Qwen2Tokenizer', 'auto_register': 'qwen2_tokenizer.Qwen2Tokenizer'}, 'type': 'Qwen2Processor'}, 'seed': 1, 'output_dir': './output', 'run_mode': 'predict', 'use_parallel': True, 'resume_training': False, 'load_checkpoint': '', 'load_ckpt_format': 'ckpt', 'auto_trans_ckpt': False, 'transform_process_num': 1, 'src_strategy_path_or_dir': '', 'only_save_strategy': False, 'load_ckpt_async': False, 'use_legacy': True, 'do_eval': False, 'eval_step_interval': 100, 'eval_epoch_interval': -1, 'ignore_data_skip': False, 'data_skip_steps': None, 'profile': False, 'profile_communication': False, 'profile_memory': True, 'init_start_profile': False, 'profile_start_step': 1, 'profile_stop_step': 10, 'profile_rank_ids': None, 'profile_pipeline': False, 'profile_level': 1, 'mstx': False, 'layer_scale': False, 'layer_decay': 0.65, 'lr_scale': False, 'lr_scale_factor': 256, 'micro_batch_interleave_num': 1, 'remote_save_url': None, 'save_file': None, 'input_data': None, 'predict_batch_size': None, 'adapter_id': None, 'exclude_cann_cpu': False, 'train_precision_sync': None, 'infer_precision_sync': None, 'postprocess_use_numpy': False, 'local_rank': 6, 'device_num': 8, 'swap_config': , 'recompute_config': } 2025-05-21 15:31:51,526 - mindformers./output/log[mindformers/core/parallel_config.py:41] - INFO - initial moe_config from dict: {'expert_num': 1, 'capacity_factor': 1.1, 'aux_loss_factor': 0.05, 'num_experts_chosen': 1, 'expert_group_size': None, 'group_wise_a2a': False, 'comp_comm_parallel': False, 'comp_comm_parallel_degree': 2, 'save_token_distribution': False, 'cur_layer': 0, 'enable_cold_hot_expert': False, 'update_step': 10000, 'hot_expert_num': 0, 'cold_token_percent': 1.0, 'moe_module_name': '', 'routing_policy': 'TopkRouterV1', 'norm_topk_prob': True, 'enable_sdrop': False, 'use_fused_ops_topkrouter': False, 'router_dense_type': 'float32', 'shared_expert_num': 0, 'use_shared_expert_gating': False, 'max_router_load': 131072, 'topk_method': 'greedy', 'topk_group': None, 'n_group': None, 'first_k_dense_replace': True, 'moe_intermediate_size': 1407, 'routed_scaling_factor': 1.0, 'aux_loss_types': None, 'aux_loss_factors': None, 'z_loss_factor': 0.0, 'balance_via_topk_bias': False, 'topk_bias_update_rate': 0.0, 'use_allgather_dispatcher': False, 'moe_shared_expert_overlap': False, 'expert_model_parallel': None, 'use_gating_sigmoid': False, 'enable_deredundency': False, 'npu_nums_per_device': 1, 'use_gmm': False, 'enable_gmm_safe_tokens': False, 'use_fused_ops_permute': False, 'callback_moe_droprate': False} 2025-05-21 15:31:51,527 - mindformers./output/log[mindformers/core/parallel_config.py:61] - INFO - initial parallel_config from dict: {'data_parallel': 1, 'model_parallel': 4, 'pipeline_stage': 2, 'expert_parallel': 1, 'use_seq_parallel': True, 'micro_batch_num': 4, 'vocab_emb_dp': False, 'context_parallel': 1} 2025-05-21 15:31:51,527 - mindformers./output/log[mindformers/core/parallel_config.py:63] - INFO - pipeline_stage = 2 > 1, vocab_emd_dp will be reset to False. 2025-05-21 15:31:51,529 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:101] - INFO - sft_config_infer: {'runner_config': {'batch_size': 1, 'epochs': 1, 'sink_mode': 1, 'sink_size': 1, 'gradient_accumulation_steps': 1, 'num_classes': 1, 'stop_step': 0}, 'context': {'mode': 0, 'device_target': 'Ascend', 'max_call_depth': 10000, 'max_device_memory': '55GB', 'save_graphs': False, 'save_graphs_path': './graph', 'device_id': 0, 'jit_config': {'jit_level': 'O0'}, 'memory_optimize_level': 'O0', 'ascend_config': {'precision_mode': 'must_keep_origin_dtype'}}, 'parallel': {'parallel_mode': 1, 'full_batch': True, 'search_mode': 'sharding_propagation', 'enable_parallel_optimizer': False, 'gradients_mean': False, 'enable_alltoall': False, 'strategy_ckpt_save_file': './ckpt_strategy.ckpt'}, 'trainer': {}, 'model': {'model_config': {'type': 'LlamaConfig', 'batch_size': 1, 'seq_length': 8192, 'hidden_size': 3584, 'num_layers': 2, 'num_heads': 28, 'n_kv_heads': 4, 'vocab_size': 152064, 'intermediate_size': 18944, 'max_position_embeddings': 32768, 'qkv_has_bias': True, 'rms_norm_eps': 1e-06, 'theta': 1000000.0, 'emb_dropout_prob': 0.0, 'eos_token_id': [151645, 151643], 'pad_token_id': 151643, 'bos_token_id': 151643, 'compute_dtype': 'bfloat16', 'layernorm_compute_type': 'float32', 'softmax_compute_type': 'float16', 'rotary_dtype': 'bfloat16', 'param_init_type': 'float32', 'use_past': True, 'use_flash_attention': True, 'block_size': 32, 'num_blocks': 1024, 'use_past_shard': False, 'offset': 0, 'checkpoint_name_or_path': '', 'repetition_penalty': 1.0, 'max_decode_length': 512, 'min_decode_length': 2, 'temperature': 1.2, 'top_k': 50, 'top_p': 1.0, 'do_sample': True, 'is_dynamic': True, 'qkv_concat': False, 'auto_map': {'AutoTokenizer': ['qwen2_tokenizer.Qwen2Tokenizer', None]}, 'parallel_config': }, 'arch': {'type': 'LlamaForCausalLM'}}, 'moe_config': , 'parallel_config': , 'processor': {'return_tensors': 'ms', 'tokenizer': {'model_max_length': 32768, 'vocab_file': '/path/vocab.json', 'merges_file': '/path/merges.txt', 'unk_token': '<|endoftext|>', 'pad_token': '<|endoftext|>', 'eos_token': '<|im_end|>', 'chat_template': "{% for message in messages %}{% if loop.first and messages[0]['role'] != 'system' %}{{ '<|im_start|>system\nYou are a helpful assistant.<|im_end|>\n' }}{% endif %}{{'<|im_start|>' + message['role'] + '\n' + message['content'] + '<|im_end|>' + '\n'}}{% endfor %}{% if add_generation_prompt %}{{ '<|im_start|>assistant\n' }}{% endif %}", 'type': 'Qwen2Tokenizer', 'auto_register': 'qwen2_tokenizer.Qwen2Tokenizer'}, 'type': 'Qwen2Processor'}, 'seed': 1, 'output_dir': './output', 'run_mode': 'predict', 'use_parallel': True, 'resume_training': False, 'load_checkpoint': '', 'load_ckpt_format': 'ckpt', 'auto_trans_ckpt': False, 'transform_process_num': 1, 'src_strategy_path_or_dir': '', 'only_save_strategy': False, 'load_ckpt_async': False, 'use_legacy': True, 'do_eval': False, 'eval_step_interval': 100, 'eval_epoch_interval': -1, 'ignore_data_skip': False, 'data_skip_steps': None, 'profile': False, 'profile_communication': False, 'profile_memory': True, 'init_start_profile': False, 'profile_start_step': 1, 'profile_stop_step': 10, 'profile_rank_ids': None, 'profile_pipeline': False, 'profile_level': 1, 'mstx': False, 'layer_scale': False, 'layer_decay': 0.65, 'lr_scale': False, 'lr_scale_factor': 256, 'micro_batch_interleave_num': 1, 'remote_save_url': None, 'save_file': None, 'input_data': None, 'predict_batch_size': None, 'adapter_id': None, 'exclude_cann_cpu': False, 'train_precision_sync': None, 'infer_precision_sync': None, 'postprocess_use_numpy': False, 'local_rank': 0, 'device_num': 8, 'swap_config': , 'recompute_config': } tp_group is:True dp_group is:True data_parallel_group:dp-0-4 tensor_model_parallel_group:tp-0-1-2-3 2025-05-21 15:31:52,656 - mindformers./output/log[mindformers/models/modeling_utils.py:1494] - INFO - model built, but weights is unloaded, since the config has no checkpoint_name_or_path attribute or checkpoint_name_or_path is None. 2025-05-21 15:31:52,662 - mindformers./output/log[mindformers/version_control.py:140] - INFO - The Lazy Inline compilation acceleration feature is turned on. 2025-05-21 15:31:52,663 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/models/grpo_models.py:208] - INFO - num_iterations: 1, epsilon_low: 0.2, epsilon_high: 0.2 tp_group is:True dp_group is:True data_parallel_group:dp-1-5 tensor_model_parallel_group:tp-0-1-2-3 2025-05-21 15:31:52,666 - mindformers./output/log[mindformers/models/modeling_utils.py:1494] - INFO - model built, but weights is unloaded, since the config has no checkpoint_name_or_path attribute or checkpoint_name_or_path is None. 2025-05-21 15:31:52,667 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/trainer/spmd/grpo_trainer.py:252] - INFO - GRPOTrainer: _init_grpo_infer_dataset, dataset dir /home/jenkins/mindspore/testcases/testcases/tests/st/dataset/mini_gsm8k.mindrecord 2025-05-21 15:31:52,672 - mindformers./output/log[mindformers/version_control.py:140] - INFO - The Lazy Inline compilation acceleration feature is turned on. tp_group is:True dp_group is:True data_parallel_group:dp-3-7 tensor_model_parallel_group:tp-0-1-2-3 2025-05-21 15:31:52,673 - mindformers./output/log[mindformers/models/modeling_utils.py:1494] - INFO - model built, but weights is unloaded, since the config has no checkpoint_name_or_path attribute or checkpoint_name_or_path is None. 2025-05-21 15:31:52,674 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/models/grpo_models.py:208] - INFO - num_iterations: 1, epsilon_low: 0.2, epsilon_high: 0.2 2025-05-21 15:31:52,678 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/trainer/spmd/grpo_trainer.py:252] - INFO - GRPOTrainer: _init_grpo_infer_dataset, dataset dir /home/jenkins/mindspore/testcases/testcases/tests/st/dataset/mini_gsm8k.mindrecord 2025-05-21 15:31:52,681 - mindformers./output/log[mindformers/version_control.py:140] - INFO - The Lazy Inline compilation acceleration feature is turned on. 2025-05-21 15:31:52,683 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/models/grpo_models.py:208] - INFO - num_iterations: 1, epsilon_low: 0.2, epsilon_high: 0.2 tp_group is:True dp_group is:True data_parallel_group:dp-2-6 tensor_model_parallel_group:tp-0-1-2-3 2025-05-21 15:31:52,684 - mindformers./output/log[mindformers/models/modeling_utils.py:1494] - INFO - model built, but weights is unloaded, since the config has no checkpoint_name_or_path attribute or checkpoint_name_or_path is None. 2025-05-21 15:31:52,686 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/ref_worker.py:49] - INFO - init RefWorker 2025-05-21 15:31:52,688 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/trainer/spmd/grpo_trainer.py:252] - INFO - GRPOTrainer: _init_grpo_infer_dataset, dataset dir /home/jenkins/mindspore/testcases/testcases/tests/st/dataset/mini_gsm8k.mindrecord 2025-05-21 15:31:52,692 - mindformers./output/log[mindformers/version_control.py:140] - INFO - The Lazy Inline compilation acceleration feature is turned on. 2025-05-21 15:31:52,694 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/models/grpo_models.py:208] - INFO - num_iterations: 1, epsilon_low: 0.2, epsilon_high: 0.2 2025-05-21 15:31:52,699 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/ref_worker.py:49] - INFO - init RefWorker 2025-05-21 15:31:52,700 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/trainer/spmd/grpo_trainer.py:252] - INFO - GRPOTrainer: _init_grpo_infer_dataset, dataset dir /home/jenkins/mindspore/testcases/testcases/tests/st/dataset/mini_gsm8k.mindrecord 2025-05-21 15:31:52,708 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/ref_worker.py:49] - INFO - init RefWorker 2025-05-21 15:31:52,711 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/ref_worker.py:49] - INFO - init RefWorker 2025-05-21 15:31:52,719 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config runner_config is empty. 2025-05-21 15:31:52,720 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config context is empty. 2025-05-21 15:31:52,720 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config parallel is empty. 2025-05-21 15:31:52,720 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config trainer is empty. 2025-05-21 15:31:52,720 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config moe_config is empty. 2025-05-21 15:31:52,721 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config parallel_config is empty. 2025-05-21 15:31:52,721 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/ref_worker.py:57] - INFO - ref parallel_config:{'data_parallel': 1, 'model_parallel': 4, 'pipeline_stage': 2, 'expert_parallel': 1, 'use_seq_parallel': True, 'micro_batch_num': 4, 'vocab_emb_dp': False, 'context_parallel': 1} 2025-05-21 15:31:52,721 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/ref_worker.py:58] - INFO - grpo_config.ref_config.recompute_config:{'recompute': False, 'select_recompute': False, 'parallel_optimizer_comm_recompute': False, 'mp_comm_recompute': True, 'recompute_slice_activation': False} 2025-05-21 15:31:52,726 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/ref_worker.py:87] - INFO - ref_model_config:LlamaConfig { "attn_proj_has_bias": false, "auto_map": { "AutoTokenizer": [ "qwen2_tokenizer.Qwen2Tokenizer", null ] }, "batch_size": 1, "block_size": 32, "bos_token_id": 151643, "calculate_per_token_loss": false, "checkpoint_name_or_path": "", "chunk_prefill": false, "compute_dtype": "bfloat16", "do_sample": true, "emb_dropout_prob": 0.0, "embedding_init_type": "float32", "eos_token_id": [ 151645, 151643 ], "extend_method": "None", "ffn_dim_multiplier": null, "fine_grain_interleave": 1, "fused_rms_norm": true, "hidden_size": 3584, "ignore_token_id": -100, "init_method_std": 0.01, "input_sliced_sig": false, "intermediate_size": 18944, "is_dynamic": true, "layernorm_compute_type": "float32", "llm_backend": "", "max_decode_length": 2048, "max_position_embedding": 8192, "max_position_embeddings": 32768, "mindformers_version": "1.6.0", "model_name": "llama", "model_type": "llama", "multiple_of": 256, "n_kv_heads": 4, "num_blocks": 1024, "num_heads": 28, "num_layers": 2, "offset": 0, "pad_token_id": 151643, "parallel_config": { "micro_batch_num": 4, "model_parallel": 4, "pipeline_stage": 2, "use_seq_parallel": true, "vocab_emb_dp": false }, "parallel_decoding_params": null, "parallel_optimizer": false, "param_init_type": "float32", "pp_interleave_num": 1, "qkv_concat": false, "qkv_has_bias": true, "quant_config": null, "repetition_penalty": 1.0, "residual_dtype": "bfloat16", "return_hidden_states": false, "rms_norm_eps": 1e-06, "rmsnorm_compute_2d": false, "rotary_dtype": "bfloat16", "scaling_factor": 1.0, "seq_length": 8192, "softmax_compute_type": "float16", "stage_num": 0, "start_stage": 0, "temperature": 1.2, "theta": 1000000.0, "tie_word_embeddings": false, "top_k": 50, "top_p": 1.0, "type": "LlamaConfig", "use_attn_mask_compression": false, "use_eod_attn_mask_compression": false, "use_flash_attention": true, "use_past": false, "use_past_shard": false, "use_ring_attention": false, "use_rope_slice": false, "vocab_size": 152064 } 2025-05-21 15:31:52,726 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/ref_worker.py:91] - INFO - start create pipeline ref_pipeline1-5 2025-05-21 15:31:52,734 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config runner_config is empty. 2025-05-21 15:31:52,735 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config context is empty. 2025-05-21 15:31:52,735 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config parallel is empty. 2025-05-21 15:31:52,735 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config trainer is empty. 2025-05-21 15:31:52,736 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config moe_config is empty. 2025-05-21 15:31:52,736 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config parallel_config is empty. 2025-05-21 15:31:52,736 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/ref_worker.py:57] - INFO - ref parallel_config:{'data_parallel': 1, 'model_parallel': 4, 'pipeline_stage': 2, 'expert_parallel': 1, 'use_seq_parallel': True, 'micro_batch_num': 4, 'vocab_emb_dp': False, 'context_parallel': 1} 2025-05-21 15:31:52,737 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/ref_worker.py:58] - INFO - grpo_config.ref_config.recompute_config:{'recompute': False, 'select_recompute': False, 'parallel_optimizer_comm_recompute': False, 'mp_comm_recompute': True, 'recompute_slice_activation': False} 2025-05-21 15:31:52,741 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config runner_config is empty. 2025-05-21 15:31:52,741 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config context is empty. 2025-05-21 15:31:52,741 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config parallel is empty. 2025-05-21 15:31:52,741 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config trainer is empty. 2025-05-21 15:31:52,742 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config moe_config is empty. 2025-05-21 15:31:52,742 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config parallel_config is empty. 2025-05-21 15:31:52,742 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/ref_worker.py:87] - INFO - ref_model_config:LlamaConfig { "attn_proj_has_bias": false, "auto_map": { "AutoTokenizer": [ "qwen2_tokenizer.Qwen2Tokenizer", null ] }, "batch_size": 1, "block_size": 32, "bos_token_id": 151643, "calculate_per_token_loss": false, "checkpoint_name_or_path": "", "chunk_prefill": false, "compute_dtype": "bfloat16", "do_sample": true, "emb_dropout_prob": 0.0, "embedding_init_type": "float32", "eos_token_id": [ 151645, 151643 ], "extend_method": "None", "ffn_dim_multiplier": null, "fine_grain_interleave": 1, "fused_rms_norm": true, "hidden_size": 3584, "ignore_token_id": -100, "init_method_std": 0.01, "input_sliced_sig": false, "intermediate_size": 18944, "is_dynamic": true, "layernorm_compute_type": "float32", "llm_backend": "", "max_decode_length": 2048, "max_position_embedding": 8192, "max_position_embeddings": 32768, "mindformers_version": "1.6.0", "model_name": "llama", "model_type": "llama", "multiple_of": 256, "n_kv_heads": 4, "num_blocks": 1024, "num_heads": 28, "num_layers": 2, "offset": 0, "pad_token_id": 151643, "parallel_config": { "micro_batch_num": 4, "model_parallel": 4, "pipeline_stage": 2, "use_seq_parallel": true, "vocab_emb_dp": false }, "parallel_decoding_params": null, "parallel_optimizer": false, "param_init_type": "float32", "pp_interleave_num": 1, "qkv_concat": false, "qkv_has_bias": true, "quant_config": null, "repetition_penalty": 1.0, "residual_dtype": "bfloat16", "return_hidden_states": false, "rms_norm_eps": 1e-06, "rmsnorm_compute_2d": false, "rotary_dtype": "bfloat16", "scaling_factor": 1.0, "seq_length": 8192, "softmax_compute_type": "float16", "stage_num": 0, "start_stage": 0, "temperature": 1.2, "theta": 1000000.0, "tie_word_embeddings": false, "top_k": 50, "top_p": 1.0, "type": "LlamaConfig", "use_attn_mask_compression": false, "use_eod_attn_mask_compression": false, "use_flash_attention": true, "use_past": false, "use_past_shard": false, "use_ring_attention": false, "use_rope_slice": false, "vocab_size": 152064 } 2025-05-21 15:31:52,742 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/ref_worker.py:57] - INFO - ref parallel_config:{'data_parallel': 1, 'model_parallel': 4, 'pipeline_stage': 2, 'expert_parallel': 1, 'use_seq_parallel': True, 'micro_batch_num': 4, 'vocab_emb_dp': False, 'context_parallel': 1} 2025-05-21 15:31:52,742 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/ref_worker.py:91] - INFO - start create pipeline ref_pipeline3-7 2025-05-21 15:31:52,742 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/ref_worker.py:58] - INFO - grpo_config.ref_config.recompute_config:{'recompute': False, 'select_recompute': False, 'parallel_optimizer_comm_recompute': False, 'mp_comm_recompute': True, 'recompute_slice_activation': False} 2025-05-21 15:31:52,745 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config runner_config is empty. 2025-05-21 15:31:52,746 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config context is empty. 2025-05-21 15:31:52,746 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config parallel is empty. 2025-05-21 15:31:52,746 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config trainer is empty. 2025-05-21 15:31:52,746 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config moe_config is empty. 2025-05-21 15:31:52,747 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/ref_worker.py:87] - INFO - ref_model_config:LlamaConfig { "attn_proj_has_bias": false, "auto_map": { "AutoTokenizer": [ "qwen2_tokenizer.Qwen2Tokenizer", null ] }, "batch_size": 1, "block_size": 32, "bos_token_id": 151643, "calculate_per_token_loss": false, "checkpoint_name_or_path": "", "chunk_prefill": false, "compute_dtype": "bfloat16", "do_sample": true, "emb_dropout_prob": 0.0, "embedding_init_type": "float32", "eos_token_id": [ 151645, 151643 ], "extend_method": "None", "ffn_dim_multiplier": null, "fine_grain_interleave": 1, "fused_rms_norm": true, "hidden_size": 3584, "ignore_token_id": -100, "init_method_std": 0.01, "input_sliced_sig": false, "intermediate_size": 18944, "is_dynamic": true, "layernorm_compute_type": "float32", "llm_backend": "", "max_decode_length": 2048, "max_position_embedding": 8192, "max_position_embeddings": 32768, "mindformers_version": "1.6.0", "model_name": "llama", "model_type": "llama", "multiple_of": 256, "n_kv_heads": 4, "num_blocks": 1024, "num_heads": 28, "num_layers": 2, "offset": 0, "pad_token_id": 151643, "parallel_config": { "micro_batch_num": 4, "model_parallel": 4, "pipeline_stage": 2, "use_seq_parallel": true, "vocab_emb_dp": false }, "parallel_decoding_params": null, "parallel_optimizer": false, "param_init_type": "float32", "pp_interleave_num": 1, "qkv_concat": false, "qkv_has_bias": true, "quant_config": null, "repetition_penalty": 1.0, "residual_dtype": "bfloat16", "return_hidden_states": false, "rms_norm_eps": 1e-06, "rmsnorm_compute_2d": false, "rotary_dtype": "bfloat16", "scaling_factor": 1.0, "seq_length": 8192, "softmax_compute_type": "float16", "stage_num": 0, "start_stage": 0, "temperature": 1.2, "theta": 1000000.0, "tie_word_embeddings": false, "top_k": 50, "top_p": 1.0, "type": "LlamaConfig", "use_attn_mask_compression": false, "use_eod_attn_mask_compression": false, "use_flash_attention": true, "use_past": false, "use_past_shard": false, "use_ring_attention": false, "use_rope_slice": false, "vocab_size": 152064 } 2025-05-21 15:31:52,747 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config parallel_config is empty. 2025-05-21 15:31:52,747 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/ref_worker.py:91] - INFO - start create pipeline ref_pipeline0-4 2025-05-21 15:31:52,747 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/ref_worker.py:57] - INFO - ref parallel_config:{'data_parallel': 1, 'model_parallel': 4, 'pipeline_stage': 2, 'expert_parallel': 1, 'use_seq_parallel': True, 'micro_batch_num': 4, 'vocab_emb_dp': False, 'context_parallel': 1} 2025-05-21 15:31:52,747 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/ref_worker.py:58] - INFO - grpo_config.ref_config.recompute_config:{'recompute': False, 'select_recompute': False, 'parallel_optimizer_comm_recompute': False, 'mp_comm_recompute': True, 'recompute_slice_activation': False} 2025-05-21 15:31:52,753 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/ref_worker.py:87] - INFO - ref_model_config:LlamaConfig { "attn_proj_has_bias": false, "auto_map": { "AutoTokenizer": [ "qwen2_tokenizer.Qwen2Tokenizer", null ] }, "batch_size": 1, "block_size": 32, "bos_token_id": 151643, "calculate_per_token_loss": false, "checkpoint_name_or_path": "", "chunk_prefill": false, "compute_dtype": "bfloat16", "do_sample": true, "emb_dropout_prob": 0.0, "embedding_init_type": "float32", "eos_token_id": [ 151645, 151643 ], "extend_method": "None", "ffn_dim_multiplier": null, "fine_grain_interleave": 1, "fused_rms_norm": true, "hidden_size": 3584, "ignore_token_id": -100, "init_method_std": 0.01, "input_sliced_sig": false, "intermediate_size": 18944, "is_dynamic": true, "layernorm_compute_type": "float32", "llm_backend": "", "max_decode_length": 2048, "max_position_embedding": 8192, "max_position_embeddings": 32768, "mindformers_version": "1.6.0", "model_name": "llama", "model_type": "llama", "multiple_of": 256, "n_kv_heads": 4, "num_blocks": 1024, "num_heads": 28, "num_layers": 2, "offset": 0, "pad_token_id": 151643, "parallel_config": { "micro_batch_num": 4, "model_parallel": 4, "pipeline_stage": 2, "use_seq_parallel": true, "vocab_emb_dp": false }, "parallel_decoding_params": null, "parallel_optimizer": false, "param_init_type": "float32", "pp_interleave_num": 1, "qkv_concat": false, "qkv_has_bias": true, "quant_config": null, "repetition_penalty": 1.0, "residual_dtype": "bfloat16", "return_hidden_states": false, "rms_norm_eps": 1e-06, "rmsnorm_compute_2d": false, "rotary_dtype": "bfloat16", "scaling_factor": 1.0, "seq_length": 8192, "softmax_compute_type": "float16", "stage_num": 0, "start_stage": 0, "temperature": 1.2, "theta": 1000000.0, "tie_word_embeddings": false, "top_k": 50, "top_p": 1.0, "type": "LlamaConfig", "use_attn_mask_compression": false, "use_eod_attn_mask_compression": false, "use_flash_attention": true, "use_past": false, "use_past_shard": false, "use_ring_attention": false, "use_rope_slice": false, "vocab_size": 152064 } 2025-05-21 15:31:52,754 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/ref_worker.py:91] - INFO - start create pipeline ref_pipeline2-6 tp_group is:True dp_group is:True data_parallel_group:dp-0-4 tensor_model_parallel_group:tp-4-5-6-7 2025-05-21 15:31:52,856 - mindformers./output/log[mindformers/models/modeling_utils.py:1494] - INFO - model built, but weights is unloaded, since the config has no checkpoint_name_or_path attribute or checkpoint_name_or_path is None. 2025-05-21 15:31:52,863 - mindformers./output/log[mindformers/version_control.py:140] - INFO - The Lazy Inline compilation acceleration feature is turned on. 2025-05-21 15:31:52,865 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/models/grpo_models.py:208] - INFO - num_iterations: 1, epsilon_low: 0.2, epsilon_high: 0.2 2025-05-21 15:31:52,869 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/trainer/spmd/grpo_trainer.py:252] - INFO - GRPOTrainer: _init_grpo_infer_dataset, dataset dir /home/jenkins/mindspore/testcases/testcases/tests/st/dataset/mini_gsm8k.mindrecord 2025-05-21 15:31:52,878 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/ref_worker.py:49] - INFO - init RefWorker 2025-05-21 15:31:52,910 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config runner_config is empty. 2025-05-21 15:31:52,911 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config context is empty. 2025-05-21 15:31:52,911 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config parallel is empty. 2025-05-21 15:31:52,911 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config trainer is empty. 2025-05-21 15:31:52,911 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config moe_config is empty. 2025-05-21 15:31:52,912 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config parallel_config is empty. 2025-05-21 15:31:52,912 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/ref_worker.py:57] - INFO - ref parallel_config:{'data_parallel': 1, 'model_parallel': 4, 'pipeline_stage': 2, 'expert_parallel': 1, 'use_seq_parallel': True, 'micro_batch_num': 4, 'vocab_emb_dp': False, 'context_parallel': 1} 2025-05-21 15:31:52,912 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/ref_worker.py:58] - INFO - grpo_config.ref_config.recompute_config:{'recompute': False, 'select_recompute': False, 'parallel_optimizer_comm_recompute': False, 'mp_comm_recompute': True, 'recompute_slice_activation': False} 2025-05-21 15:31:52,917 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/ref_worker.py:87] - INFO - ref_model_config:LlamaConfig { "attn_proj_has_bias": false, "auto_map": { "AutoTokenizer": [ "qwen2_tokenizer.Qwen2Tokenizer", null ] }, "batch_size": 1, "block_size": 32, "bos_token_id": 151643, "calculate_per_token_loss": false, "checkpoint_name_or_path": "", "chunk_prefill": false, "compute_dtype": "bfloat16", "do_sample": true, "emb_dropout_prob": 0.0, "embedding_init_type": "float32", "eos_token_id": [ 151645, 151643 ], "extend_method": "None", "ffn_dim_multiplier": null, "fine_grain_interleave": 1, "fused_rms_norm": true, "hidden_size": 3584, "ignore_token_id": -100, "init_method_std": 0.01, "input_sliced_sig": false, "intermediate_size": 18944, "is_dynamic": true, "layernorm_compute_type": "float32", "llm_backend": "", "max_decode_length": 2048, "max_position_embedding": 8192, "max_position_embeddings": 32768, "mindformers_version": "1.6.0", "model_name": "llama", "model_type": "llama", "multiple_of": 256, "n_kv_heads": 4, "num_blocks": 1024, "num_heads": 28, "num_layers": 2, "offset": 0, "pad_token_id": 151643, "parallel_config": { "micro_batch_num": 4, "model_parallel": 4, "pipeline_stage": 2, "use_seq_parallel": true, "vocab_emb_dp": false }, "parallel_decoding_params": null, "parallel_optimizer": false, "param_init_type": "float32", "pp_interleave_num": 1, "qkv_concat": false, "qkv_has_bias": true, "quant_config": null, "repetition_penalty": 1.0, "residual_dtype": "bfloat16", "return_hidden_states": false, "rms_norm_eps": 1e-06, "rmsnorm_compute_2d": false, "rotary_dtype": "bfloat16", "scaling_factor": 1.0, "seq_length": 8192, "softmax_compute_type": "float16", "stage_num": 0, "start_stage": 0, "temperature": 1.2, "theta": 1000000.0, "tie_word_embeddings": false, "top_k": 50, "top_p": 1.0, "type": "LlamaConfig", "use_attn_mask_compression": false, "use_eod_attn_mask_compression": false, "use_flash_attention": true, "use_past": false, "use_past_shard": false, "use_ring_attention": false, "use_rope_slice": false, "vocab_size": 152064 } 2025-05-21 15:31:52,917 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/ref_worker.py:91] - INFO - start create pipeline ref_pipeline0-4 2025-05-21 15:31:52,970 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/ref_worker.py:93] - INFO - end create pipeline ref_pipeline0-4 2025-05-21 15:31:52,972 - mindformers./output/log[mindformers/version_control.py:140] - INFO - The Lazy Inline compilation acceleration feature is turned on. 2025-05-21 15:31:52,975 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/ref_worker.py:93] - INFO - end create pipeline ref_pipeline0-4 2025-05-21 15:31:52,976 - mindformers./output/log[mindformers/models/llama/llama.py:508] - INFO - Predict run mode: True 2025-05-21 15:31:52,976 - mindformers./output/log[mindformers/models/llama/llama.py:510] - INFO - use_flash_attention is set to True when run_mode is predict and is_dynamic is True. 2025-05-21 15:31:52,977 - mindformers./output/log[mindformers/version_control.py:140] - INFO - The Lazy Inline compilation acceleration feature is turned on. 2025-05-21 15:31:52,979 - mindformers./output/log[mindformers/models/llama/llama.py:108] - INFO - MoE config is None, use normal FFN tp_group is:True dp_group is:True data_parallel_group:dp-3-7 tensor_model_parallel_group:tp-4-5-6-7 2025-05-21 15:31:52,980 - mindformers./output/log[mindformers/models/modeling_utils.py:1494] - INFO - model built, but weights is unloaded, since the config has no checkpoint_name_or_path attribute or checkpoint_name_or_path is None. 2025-05-21 15:31:52,980 - mindformers./output/log[mindformers/models/llama/llama.py:508] - INFO - Predict run mode: True 2025-05-21 15:31:52,981 - mindformers./output/log[mindformers/models/llama/llama.py:510] - INFO - use_flash_attention is set to True when run_mode is predict and is_dynamic is True. 2025-05-21 15:31:52,983 - mindformers./output/log[mindformers/models/llama/llama.py:108] - INFO - MoE config is None, use normal FFN 2025-05-21 15:31:52,986 - mindformers./output/log[mindformers/version_control.py:140] - INFO - The Lazy Inline compilation acceleration feature is turned on. tp_group is:True dp_group is:True data_parallel_group:dp-1-5 tensor_model_parallel_group:tp-4-5-6-7 2025-05-21 15:31:52,986 - mindformers./output/log[mindformers/models/modeling_utils.py:1494] - INFO - model built, but weights is unloaded, since the config has no checkpoint_name_or_path attribute or checkpoint_name_or_path is None. 2025-05-21 15:31:52,988 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/models/grpo_models.py:208] - INFO - num_iterations: 1, epsilon_low: 0.2, epsilon_high: 0.2 2025-05-21 15:31:52,992 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/trainer/spmd/grpo_trainer.py:252] - INFO - GRPOTrainer: _init_grpo_infer_dataset, dataset dir /home/jenkins/mindspore/testcases/testcases/tests/st/dataset/mini_gsm8k.mindrecord 2025-05-21 15:31:52,992 - mindformers./output/log[mindformers/version_control.py:140] - INFO - The Lazy Inline compilation acceleration feature is turned on. 2025-05-21 15:31:52,994 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/models/grpo_models.py:208] - INFO - num_iterations: 1, epsilon_low: 0.2, epsilon_high: 0.2 2025-05-21 15:31:52,998 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/trainer/spmd/grpo_trainer.py:252] - INFO - GRPOTrainer: _init_grpo_infer_dataset, dataset dir /home/jenkins/mindspore/testcases/testcases/tests/st/dataset/mini_gsm8k.mindrecord 2025-05-21 15:31:52,999 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/ref_worker.py:49] - INFO - init RefWorker 2025-05-21 15:31:53,007 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/ref_worker.py:49] - INFO - init RefWorker tp_group is:True dp_group is:True data_parallel_group:dp-2-6 tensor_model_parallel_group:tp-4-5-6-7 2025-05-21 15:31:53,023 - mindformers./output/log[mindformers/models/modeling_utils.py:1494] - INFO - model built, but weights is unloaded, since the config has no checkpoint_name_or_path attribute or checkpoint_name_or_path is None. 2025-05-21 15:31:53,030 - mindformers./output/log[mindformers/version_control.py:140] - INFO - The Lazy Inline compilation acceleration feature is turned on. 2025-05-21 15:31:53,032 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/models/grpo_models.py:208] - INFO - num_iterations: 1, epsilon_low: 0.2, epsilon_high: 0.2 2025-05-21 15:31:53,033 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config runner_config is empty. 2025-05-21 15:31:53,033 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config context is empty. 2025-05-21 15:31:53,034 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config parallel is empty. 2025-05-21 15:31:53,034 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config trainer is empty. 2025-05-21 15:31:53,034 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config moe_config is empty. 2025-05-21 15:31:53,034 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config parallel_config is empty. 2025-05-21 15:31:53,035 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/ref_worker.py:57] - INFO - ref parallel_config:{'data_parallel': 1, 'model_parallel': 4, 'pipeline_stage': 2, 'expert_parallel': 1, 'use_seq_parallel': True, 'micro_batch_num': 4, 'vocab_emb_dp': False, 'context_parallel': 1} 2025-05-21 15:31:53,035 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/ref_worker.py:58] - INFO - grpo_config.ref_config.recompute_config:{'recompute': False, 'select_recompute': False, 'parallel_optimizer_comm_recompute': False, 'mp_comm_recompute': True, 'recompute_slice_activation': False} 2025-05-21 15:31:53,037 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/trainer/spmd/grpo_trainer.py:252] - INFO - GRPOTrainer: _init_grpo_infer_dataset, dataset dir /home/jenkins/mindspore/testcases/testcases/tests/st/dataset/mini_gsm8k.mindrecord 2025-05-21 15:31:53,039 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/ref_worker.py:87] - INFO - ref_model_config:LlamaConfig { "attn_proj_has_bias": false, "auto_map": { "AutoTokenizer": [ "qwen2_tokenizer.Qwen2Tokenizer", null ] }, "batch_size": 1, "block_size": 32, "bos_token_id": 151643, "calculate_per_token_loss": false, "checkpoint_name_or_path": "", "chunk_prefill": false, "compute_dtype": "bfloat16", "do_sample": true, "emb_dropout_prob": 0.0, "embedding_init_type": "float32", "eos_token_id": [ 151645, 151643 ], "extend_method": "None", "ffn_dim_multiplier": null, "fine_grain_interleave": 1, "fused_rms_norm": true, "hidden_size": 3584, "ignore_token_id": -100, "init_method_std": 0.01, "input_sliced_sig": false, "intermediate_size": 18944, "is_dynamic": true, "layernorm_compute_type": "float32", "llm_backend": "", "max_decode_length": 2048, "max_position_embedding": 8192, "max_position_embeddings": 32768, "mindformers_version": "1.6.0", "model_name": "llama", "model_type": "llama", "multiple_of": 256, "n_kv_heads": 4, "num_blocks": 1024, "num_heads": 28, "num_layers": 2, "offset": 0, "pad_token_id": 151643, "parallel_config": { "micro_batch_num": 4, "model_parallel": 4, "pipeline_stage": 2, "use_seq_parallel": true, "vocab_emb_dp": false }, "parallel_decoding_params": null, "parallel_optimizer": false, "param_init_type": "float32", "pp_interleave_num": 1, "qkv_concat": false, "qkv_has_bias": true, "quant_config": null, "repetition_penalty": 1.0, "residual_dtype": "bfloat16", "return_hidden_states": false, "rms_norm_eps": 1e-06, "rmsnorm_compute_2d": false, "rotary_dtype": "bfloat16", "scaling_factor": 1.0, "seq_length": 8192, "softmax_compute_type": "float16", "stage_num": 0, "start_stage": 0, "temperature": 1.2, "theta": 1000000.0, "tie_word_embeddings": false, "top_k": 50, "top_p": 1.0, "type": "LlamaConfig", "use_attn_mask_compression": false, "use_eod_attn_mask_compression": false, "use_flash_attention": true, "use_past": false, "use_past_shard": false, "use_ring_attention": false, "use_rope_slice": false, "vocab_size": 152064 } 2025-05-21 15:31:53,040 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/ref_worker.py:91] - INFO - start create pipeline ref_pipeline3-7 2025-05-21 15:31:53,041 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config runner_config is empty. 2025-05-21 15:31:53,041 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config context is empty. 2025-05-21 15:31:53,041 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config parallel is empty. 2025-05-21 15:31:53,041 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config trainer is empty. 2025-05-21 15:31:53,042 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config moe_config is empty. 2025-05-21 15:31:53,042 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config parallel_config is empty. 2025-05-21 15:31:53,042 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/ref_worker.py:57] - INFO - ref parallel_config:{'data_parallel': 1, 'model_parallel': 4, 'pipeline_stage': 2, 'expert_parallel': 1, 'use_seq_parallel': True, 'micro_batch_num': 4, 'vocab_emb_dp': False, 'context_parallel': 1} 2025-05-21 15:31:53,043 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/ref_worker.py:58] - INFO - grpo_config.ref_config.recompute_config:{'recompute': False, 'select_recompute': False, 'parallel_optimizer_comm_recompute': False, 'mp_comm_recompute': True, 'recompute_slice_activation': False} 2025-05-21 15:31:53,045 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/ref_worker.py:49] - INFO - init RefWorker 2025-05-21 15:31:53,047 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/ref_worker.py:87] - INFO - ref_model_config:LlamaConfig { "attn_proj_has_bias": false, "auto_map": { "AutoTokenizer": [ "qwen2_tokenizer.Qwen2Tokenizer", null ] }, "batch_size": 1, "block_size": 32, "bos_token_id": 151643, "calculate_per_token_loss": false, "checkpoint_name_or_path": "", "chunk_prefill": false, "compute_dtype": "bfloat16", "do_sample": true, "emb_dropout_prob": 0.0, "embedding_init_type": "float32", "eos_token_id": [ 151645, 151643 ], "extend_method": "None", "ffn_dim_multiplier": null, "fine_grain_interleave": 1, "fused_rms_norm": true, "hidden_size": 3584, "ignore_token_id": -100, "init_method_std": 0.01, "input_sliced_sig": false, "intermediate_size": 18944, "is_dynamic": true, "layernorm_compute_type": "float32", "llm_backend": "", "max_decode_length": 2048, "max_position_embedding": 8192, "max_position_embeddings": 32768, "mindformers_version": "1.6.0", "model_name": "llama", "model_type": "llama", "multiple_of": 256, "n_kv_heads": 4, "num_blocks": 1024, "num_heads": 28, "num_layers": 2, "offset": 0, "pad_token_id": 151643, "parallel_config": { "micro_batch_num": 4, "model_parallel": 4, "pipeline_stage": 2, "use_seq_parallel": true, "vocab_emb_dp": false }, "parallel_decoding_params": null, "parallel_optimizer": false, "param_init_type": "float32", "pp_interleave_num": 1, "qkv_concat": false, "qkv_has_bias": true, "quant_config": null, "repetition_penalty": 1.0, "residual_dtype": "bfloat16", "return_hidden_states": false, "rms_norm_eps": 1e-06, "rmsnorm_compute_2d": false, "rotary_dtype": "bfloat16", "scaling_factor": 1.0, "seq_length": 8192, "softmax_compute_type": "float16", "stage_num": 0, "start_stage": 0, "temperature": 1.2, "theta": 1000000.0, "tie_word_embeddings": false, "top_k": 50, "top_p": 1.0, "type": "LlamaConfig", "use_attn_mask_compression": false, "use_eod_attn_mask_compression": false, "use_flash_attention": true, "use_past": false, "use_past_shard": false, "use_ring_attention": false, "use_rope_slice": false, "vocab_size": 152064 } 2025-05-21 15:31:53,048 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/ref_worker.py:91] - INFO - start create pipeline ref_pipeline1-5 2025-05-21 15:31:53,078 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config runner_config is empty. 2025-05-21 15:31:53,078 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config context is empty. 2025-05-21 15:31:53,079 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config parallel is empty. 2025-05-21 15:31:53,079 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config trainer is empty. 2025-05-21 15:31:53,079 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config moe_config is empty. 2025-05-21 15:31:53,079 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config parallel_config is empty. 2025-05-21 15:31:53,080 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/ref_worker.py:57] - INFO - ref parallel_config:{'data_parallel': 1, 'model_parallel': 4, 'pipeline_stage': 2, 'expert_parallel': 1, 'use_seq_parallel': True, 'micro_batch_num': 4, 'vocab_emb_dp': False, 'context_parallel': 1} 2025-05-21 15:31:53,080 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/ref_worker.py:58] - INFO - grpo_config.ref_config.recompute_config:{'recompute': False, 'select_recompute': False, 'parallel_optimizer_comm_recompute': False, 'mp_comm_recompute': True, 'recompute_slice_activation': False} 2025-05-21 15:31:53,084 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/ref_worker.py:87] - INFO - ref_model_config:LlamaConfig { "attn_proj_has_bias": false, "auto_map": { "AutoTokenizer": [ "qwen2_tokenizer.Qwen2Tokenizer", null ] }, "batch_size": 1, "block_size": 32, "bos_token_id": 151643, "calculate_per_token_loss": false, "checkpoint_name_or_path": "", "chunk_prefill": false, "compute_dtype": "bfloat16", "do_sample": true, "emb_dropout_prob": 0.0, "embedding_init_type": "float32", "eos_token_id": [ 151645, 151643 ], "extend_method": "None", "ffn_dim_multiplier": null, "fine_grain_interleave": 1, "fused_rms_norm": true, "hidden_size": 3584, "ignore_token_id": -100, "init_method_std": 0.01, "input_sliced_sig": false, "intermediate_size": 18944, "is_dynamic": true, "layernorm_compute_type": "float32", "llm_backend": "", "max_decode_length": 2048, "max_position_embedding": 8192, "max_position_embeddings": 32768, "mindformers_version": "1.6.0", "model_name": "llama", "model_type": "llama", "multiple_of": 256, "n_kv_heads": 4, "num_blocks": 1024, "num_heads": 28, "num_layers": 2, "offset": 0, "pad_token_id": 151643, "parallel_config": { "micro_batch_num": 4, "model_parallel": 4, "pipeline_stage": 2, "use_seq_parallel": true, "vocab_emb_dp": false }, "parallel_decoding_params": null, "parallel_optimizer": false, "param_init_type": "float32", "pp_interleave_num": 1, "qkv_concat": false, "qkv_has_bias": true, "quant_config": null, "repetition_penalty": 1.0, "residual_dtype": "bfloat16", "return_hidden_states": false, "rms_norm_eps": 1e-06, "rmsnorm_compute_2d": false, "rotary_dtype": "bfloat16", "scaling_factor": 1.0, "seq_length": 8192, "softmax_compute_type": "float16", "stage_num": 0, "start_stage": 0, "temperature": 1.2, "theta": 1000000.0, "tie_word_embeddings": false, "top_k": 50, "top_p": 1.0, "type": "LlamaConfig", "use_attn_mask_compression": false, "use_eod_attn_mask_compression": false, "use_flash_attention": true, "use_past": false, "use_past_shard": false, "use_ring_attention": false, "use_rope_slice": false, "vocab_size": 152064 } 2025-05-21 15:31:53,085 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/ref_worker.py:91] - INFO - start create pipeline ref_pipeline2-6 2025-05-21 15:31:53,107 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/ref_worker.py:93] - INFO - end create pipeline ref_pipeline3-7 2025-05-21 15:31:53,109 - mindformers./output/log[mindformers/version_control.py:140] - INFO - The Lazy Inline compilation acceleration feature is turned on. 2025-05-21 15:31:53,111 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/ref_worker.py:93] - INFO - end create pipeline ref_pipeline1-5 2025-05-21 15:31:53,113 - mindformers./output/log[mindformers/models/llama/llama.py:508] - INFO - Predict run mode: True 2025-05-21 15:31:53,113 - mindformers./output/log[mindformers/models/llama/llama.py:510] - INFO - use_flash_attention is set to True when run_mode is predict and is_dynamic is True. 2025-05-21 15:31:53,113 - mindformers./output/log[mindformers/version_control.py:140] - INFO - The Lazy Inline compilation acceleration feature is turned on. 2025-05-21 15:31:53,116 - mindformers./output/log[mindformers/models/llama/llama.py:108] - INFO - MoE config is None, use normal FFN 2025-05-21 15:31:53,117 - mindformers./output/log[mindformers/models/llama/llama.py:508] - INFO - Predict run mode: True 2025-05-21 15:31:53,118 - mindformers./output/log[mindformers/models/llama/llama.py:510] - INFO - use_flash_attention is set to True when run_mode is predict and is_dynamic is True. 2025-05-21 15:31:53,120 - mindformers./output/log[mindformers/models/llama/llama.py:108] - INFO - MoE config is None, use normal FFN 2025-05-21 15:31:53,147 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/ref_worker.py:93] - INFO - end create pipeline ref_pipeline2-6 2025-05-21 15:31:53,149 - mindformers./output/log[mindformers/version_control.py:140] - INFO - The Lazy Inline compilation acceleration feature is turned on. 2025-05-21 15:31:53,154 - mindformers./output/log[mindformers/models/llama/llama.py:508] - INFO - Predict run mode: True 2025-05-21 15:31:53,154 - mindformers./output/log[mindformers/models/llama/llama.py:510] - INFO - use_flash_attention is set to True when run_mode is predict and is_dynamic is True. 2025-05-21 15:31:53,158 - mindformers./output/log[mindformers/models/llama/llama.py:108] - INFO - MoE config is None, use normal FFN 2025-05-21 15:31:53,284 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/ref_worker.py:93] - INFO - end create pipeline ref_pipeline3-7 2025-05-21 15:31:53,286 - mindformers./output/log[mindformers/version_control.py:140] - INFO - The Lazy Inline compilation acceleration feature is turned on. 2025-05-21 15:31:53,291 - mindformers./output/log[mindformers/models/llama/llama.py:508] - INFO - Predict run mode: True 2025-05-21 15:31:53,291 - mindformers./output/log[mindformers/models/llama/llama.py:510] - INFO - use_flash_attention is set to True when run_mode is predict and is_dynamic is True. 2025-05-21 15:31:53,292 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/ref_worker.py:93] - INFO - end create pipeline ref_pipeline1-5 2025-05-21 15:31:53,294 - mindformers./output/log[mindformers/version_control.py:140] - INFO - The Lazy Inline compilation acceleration feature is turned on. 2025-05-21 15:31:53,294 - mindformers./output/log[mindformers/models/llama/llama.py:108] - INFO - MoE config is None, use normal FFN 2025-05-21 15:31:53,298 - mindformers./output/log[mindformers/models/llama/llama.py:508] - INFO - Predict run mode: True 2025-05-21 15:31:53,298 - mindformers./output/log[mindformers/models/llama/llama.py:510] - INFO - use_flash_attention is set to True when run_mode is predict and is_dynamic is True. 2025-05-21 15:31:53,301 - mindformers./output/log[mindformers/models/llama/llama.py:108] - INFO - MoE config is None, use normal FFN 2025-05-21 15:31:53,328 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/ref_worker.py:93] - INFO - end create pipeline ref_pipeline2-6 2025-05-21 15:31:53,331 - mindformers./output/log[mindformers/version_control.py:140] - INFO - The Lazy Inline compilation acceleration feature is turned on. 2025-05-21 15:31:53,335 - mindformers./output/log[mindformers/models/llama/llama.py:508] - INFO - Predict run mode: True 2025-05-21 15:31:53,336 - mindformers./output/log[mindformers/models/llama/llama.py:510] - INFO - use_flash_attention is set to True when run_mode is predict and is_dynamic is True. 2025-05-21 15:31:53,339 - mindformers./output/log[mindformers/models/llama/llama.py:108] - INFO - MoE config is None, use normal FFN 2025-05-21 15:31:53,471 - mindformers./output/log[mindformers/models/utils.py:190] - INFO - num_layers per stage: [[1, 1]] 2025-05-21 15:31:53,471 - mindformers./output/log[mindformers/models/utils.py:191] - INFO - Accumulated num_layers per stage: [[1, 2]] 2025-05-21 15:31:53,471 - mindformers./output/log[mindformers/models/utils.py:193] - INFO - Pipeline id list with start_stage: [0, 1] 2025-05-21 15:31:53,472 - mindformers./output/log[mindformers/models/utils.py:194] - INFO - Interleave id list: [0, 0] 2025-05-21 15:31:53,472 - mindformers./output/log[mindformers/models/utils.py:212] - INFO - Formative layer_recompute: [[0, 0]] 2025-05-21 15:31:53,472 - mindformers./output/log[mindformers/models/utils.py:214] - INFO - The configuration of select_recompute_exclude and select_comm_recompute_exclude have the highest priority. 2025-05-21 15:31:53,472 - mindformers./output/log[mindformers/models/utils.py:220] - INFO - Formative select_recompute: {'feed_forward\\.mul': [[0, 0]], 'feed_forward\\.w1\\.activation\\.silu': [[0, 0]]} 2025-05-21 15:31:53,472 - mindformers./output/log[mindformers/models/utils.py:221] - INFO - Formative select_comm_recompute: {'.*\\.norm': [[0, 0]]} 2025-05-21 15:31:53,473 - mindformers./output/log[mindformers/models/utils.py:222] - INFO - Formative select_recompute_exclude: {} 2025-05-21 15:31:53,473 - mindformers./output/log[mindformers/models/utils.py:223] - INFO - Formative select_comm_recompute_exclude: {} 2025-05-21 15:31:53,501 - mindformers./output/log[mindformers/version_control.py:76] - INFO - Predict enable lazy inline. 2025-05-21 15:31:53,529 - mindformers./output/log[mindformers/version_control.py:76] - INFO - Predict enable lazy inline. 2025-05-21 15:31:53,540 - mindformers./output/log[mindformers/models/modeling_utils.py:1494] - INFO - model built, but weights is unloaded, since the config has no checkpoint_name_or_path attribute or checkpoint_name_or_path is None. 2025-05-21 15:31:53,546 - mindformers./output/log[mindformers/models/utils.py:190] - INFO - num_layers per stage: [[1, 1]] 2025-05-21 15:31:53,547 - mindformers./output/log[mindformers/models/utils.py:191] - INFO - Accumulated num_layers per stage: [[1, 2]] 2025-05-21 15:31:53,547 - mindformers./output/log[mindformers/models/utils.py:193] - INFO - Pipeline id list with start_stage: [0, 1] 2025-05-21 15:31:53,547 - mindformers./output/log[mindformers/models/utils.py:194] - INFO - Interleave id list: [0, 0] 2025-05-21 15:31:53,547 - mindformers./output/log[mindformers/models/utils.py:212] - INFO - Formative layer_recompute: [[0, 0]] 2025-05-21 15:31:53,548 - mindformers./output/log[mindformers/models/utils.py:214] - INFO - The configuration of select_recompute_exclude and select_comm_recompute_exclude have the highest priority. 2025-05-21 15:31:53,548 - mindformers./output/log[mindformers/models/utils.py:220] - INFO - Formative select_recompute: {'feed_forward\\.mul': [[0, 0]], 'feed_forward\\.w1\\.activation\\.silu': [[0, 0]]} 2025-05-21 15:31:53,548 - mindformers./output/log[mindformers/models/utils.py:221] - INFO - Formative select_comm_recompute: {'.*\\.norm': [[0, 0]]} 2025-05-21 15:31:53,548 - mindformers./output/log[mindformers/models/utils.py:222] - INFO - Formative select_recompute_exclude: {} 2025-05-21 15:31:53,548 - mindformers./output/log[mindformers/models/utils.py:223] - INFO - Formative select_comm_recompute_exclude: {} 2025-05-21 15:31:53,548 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/train_worker.py:66] - INFO - init TrainWorker 2025-05-21 15:31:53,569 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config runner_config is empty. 2025-05-21 15:31:53,570 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config context is empty. 2025-05-21 15:31:53,570 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config parallel is empty. 2025-05-21 15:31:53,570 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config trainer is empty. 2025-05-21 15:31:53,570 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config moe_config is empty. 2025-05-21 15:31:53,571 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config parallel_config is empty. 2025-05-21 15:31:53,571 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config recompute_config is empty. 2025-05-21 15:31:53,571 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config swap_config is empty. 2025-05-21 15:31:53,571 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config runner_wrapper is empty. 2025-05-21 15:31:53,571 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config optimizer is empty. 2025-05-21 15:31:53,572 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config lr_schedule is empty. 2025-05-21 15:31:53,572 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config metric is empty. 2025-05-21 15:31:53,572 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config train_dataset is empty. 2025-05-21 15:31:53,572 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config train_dataset_task is empty. 2025-05-21 15:31:53,572 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config callbacks is empty. 2025-05-21 15:31:53,573 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config monitor_config is empty. 2025-05-21 15:31:53,573 - mindformers./output/log[mindformers/tools/register/template.py:683] - WARNING - Some configs in yaml are useless for finetune: ['processor'] 2025-05-21 15:31:53,573 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/train_worker.py:75] - INFO - actor parallel_config:{'data_parallel': 1, 'model_parallel': 4, 'pipeline_stage': 2, 'expert_parallel': 1, 'use_seq_parallel': True, 'micro_batch_num': 4, 'vocab_emb_dp': False, 'context_parallel': 1} 2025-05-21 15:31:53,574 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/train_worker.py:76] - INFO - grpo_config.actor_config.recompute_config:{'recompute': False, 'select_recompute': False, 'parallel_optimizer_comm_recompute': False, 'mp_comm_recompute': True, 'recompute_slice_activation': False} 2025-05-21 15:31:53,575 - mindformers./output/log[mindformers/version_control.py:76] - INFO - Predict enable lazy inline. 2025-05-21 15:31:53,576 - mindformers./output/log[mindformers/version_control.py:140] - INFO - The Lazy Inline compilation acceleration feature is turned on. 2025-05-21 15:31:53,579 - mindformers./output/log[mindformers/models/llama/llama.py:508] - INFO - Predict run mode: False 2025-05-21 15:31:53,582 - mindformers./output/log[mindformers/models/llama/llama.py:108] - INFO - MoE config is None, use normal FFN 2025-05-21 15:31:53,601 - mindformers./output/log[mindformers/version_control.py:76] - INFO - Predict enable lazy inline. 2025-05-21 15:31:53,611 - mindformers./output/log[mindformers/models/utils.py:190] - INFO - num_layers per stage: [[1, 1]] 2025-05-21 15:31:53,612 - mindformers./output/log[mindformers/models/modeling_utils.py:1494] - INFO - model built, but weights is unloaded, since the config has no checkpoint_name_or_path attribute or checkpoint_name_or_path is None. 2025-05-21 15:31:53,612 - mindformers./output/log[mindformers/models/utils.py:191] - INFO - Accumulated num_layers per stage: [[1, 2]] 2025-05-21 15:31:53,612 - mindformers./output/log[mindformers/models/utils.py:193] - INFO - Pipeline id list with start_stage: [0, 1] 2025-05-21 15:31:53,612 - mindformers./output/log[mindformers/models/utils.py:194] - INFO - Interleave id list: [0, 0] 2025-05-21 15:31:53,613 - mindformers./output/log[mindformers/models/utils.py:212] - INFO - Formative layer_recompute: [[0, 0]] 2025-05-21 15:31:53,613 - mindformers./output/log[mindformers/models/utils.py:214] - INFO - The configuration of select_recompute_exclude and select_comm_recompute_exclude have the highest priority. 2025-05-21 15:31:53,613 - mindformers./output/log[mindformers/models/utils.py:220] - INFO - Formative select_recompute: {'feed_forward\\.mul': [[0, 0]], 'feed_forward\\.w1\\.activation\\.silu': [[0, 0]]} 2025-05-21 15:31:53,614 - mindformers./output/log[mindformers/models/utils.py:221] - INFO - Formative select_comm_recompute: {'.*\\.norm': [[0, 0]]} 2025-05-21 15:31:53,614 - mindformers./output/log[mindformers/models/utils.py:222] - INFO - Formative select_recompute_exclude: {} 2025-05-21 15:31:53,614 - mindformers./output/log[mindformers/models/utils.py:223] - INFO - Formative select_comm_recompute_exclude: {} 2025-05-21 15:31:53,620 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/train_worker.py:66] - INFO - init TrainWorker 2025-05-21 15:31:53,628 - mindformers./output/log[mindformers/models/utils.py:190] - INFO - num_layers per stage: [[1, 1]] 2025-05-21 15:31:53,629 - mindformers./output/log[mindformers/models/utils.py:191] - INFO - Accumulated num_layers per stage: [[1, 2]] 2025-05-21 15:31:53,629 - mindformers./output/log[mindformers/models/utils.py:193] - INFO - Pipeline id list with start_stage: [0, 1] 2025-05-21 15:31:53,629 - mindformers./output/log[mindformers/models/utils.py:194] - INFO - Interleave id list: [0, 0] 2025-05-21 15:31:53,630 - mindformers./output/log[mindformers/models/utils.py:212] - INFO - Formative layer_recompute: [[0, 0]] 2025-05-21 15:31:53,630 - mindformers./output/log[mindformers/models/utils.py:214] - INFO - The configuration of select_recompute_exclude and select_comm_recompute_exclude have the highest priority. 2025-05-21 15:31:53,630 - mindformers./output/log[mindformers/models/utils.py:220] - INFO - Formative select_recompute: {'feed_forward\\.mul': [[0, 0]], 'feed_forward\\.w1\\.activation\\.silu': [[0, 0]]} 2025-05-21 15:31:53,630 - mindformers./output/log[mindformers/models/utils.py:221] - INFO - Formative select_comm_recompute: {'.*\\.norm': [[0, 0]]} 2025-05-21 15:31:53,630 - mindformers./output/log[mindformers/models/utils.py:222] - INFO - Formative select_recompute_exclude: {} 2025-05-21 15:31:53,630 - mindformers./output/log[mindformers/models/utils.py:223] - INFO - Formative select_comm_recompute_exclude: {} 2025-05-21 15:31:53,640 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config runner_config is empty. 2025-05-21 15:31:53,640 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config context is empty. 2025-05-21 15:31:53,640 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config parallel is empty. 2025-05-21 15:31:53,640 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config trainer is empty. 2025-05-21 15:31:53,641 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config moe_config is empty. 2025-05-21 15:31:53,641 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config parallel_config is empty. 2025-05-21 15:31:53,641 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config recompute_config is empty. 2025-05-21 15:31:53,642 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config swap_config is empty. 2025-05-21 15:31:53,642 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config runner_wrapper is empty. 2025-05-21 15:31:53,642 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config optimizer is empty. 2025-05-21 15:31:53,642 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config lr_schedule is empty. 2025-05-21 15:31:53,642 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config metric is empty. 2025-05-21 15:31:53,642 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config train_dataset is empty. 2025-05-21 15:31:53,643 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config train_dataset_task is empty. 2025-05-21 15:31:53,643 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config callbacks is empty. 2025-05-21 15:31:53,643 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config monitor_config is empty. 2025-05-21 15:31:53,643 - mindformers./output/log[mindformers/tools/register/template.py:683] - WARNING - Some configs in yaml are useless for finetune: ['processor'] 2025-05-21 15:31:53,643 - mindformers./output/log[mindformers/version_control.py:76] - INFO - Predict enable lazy inline. 2025-05-21 15:31:53,644 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/train_worker.py:75] - INFO - actor parallel_config:{'data_parallel': 1, 'model_parallel': 4, 'pipeline_stage': 2, 'expert_parallel': 1, 'use_seq_parallel': True, 'micro_batch_num': 4, 'vocab_emb_dp': False, 'context_parallel': 1} 2025-05-21 15:31:53,644 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/train_worker.py:76] - INFO - grpo_config.actor_config.recompute_config:{'recompute': False, 'select_recompute': False, 'parallel_optimizer_comm_recompute': False, 'mp_comm_recompute': True, 'recompute_slice_activation': False} 2025-05-21 15:31:53,646 - mindformers./output/log[mindformers/version_control.py:140] - INFO - The Lazy Inline compilation acceleration feature is turned on. 2025-05-21 15:31:53,649 - mindformers./output/log[mindformers/models/llama/llama.py:508] - INFO - Predict run mode: False 2025-05-21 15:31:53,652 - mindformers./output/log[mindformers/models/llama/llama.py:108] - INFO - MoE config is None, use normal FFN 2025-05-21 15:31:53,657 - mindformers./output/log[mindformers/version_control.py:76] - INFO - Predict enable lazy inline. 2025-05-21 15:31:53,662 - mindformers./output/log[mindformers/models/utils.py:190] - INFO - num_layers per stage: [[1, 1]] 2025-05-21 15:31:53,663 - mindformers./output/log[mindformers/models/utils.py:191] - INFO - Accumulated num_layers per stage: [[1, 2]] 2025-05-21 15:31:53,663 - mindformers./output/log[mindformers/models/utils.py:193] - INFO - Pipeline id list with start_stage: [0, 1] 2025-05-21 15:31:53,663 - mindformers./output/log[mindformers/models/utils.py:194] - INFO - Interleave id list: [0, 0] 2025-05-21 15:31:53,663 - mindformers./output/log[mindformers/models/utils.py:212] - INFO - Formative layer_recompute: [[0, 0]] 2025-05-21 15:31:53,663 - mindformers./output/log[mindformers/models/utils.py:214] - INFO - The configuration of select_recompute_exclude and select_comm_recompute_exclude have the highest priority. 2025-05-21 15:31:53,664 - mindformers./output/log[mindformers/models/utils.py:220] - INFO - Formative select_recompute: {'feed_forward\\.mul': [[0, 0]], 'feed_forward\\.w1\\.activation\\.silu': [[0, 0]]} 2025-05-21 15:31:53,664 - mindformers./output/log[mindformers/models/utils.py:221] - INFO - Formative select_comm_recompute: {'.*\\.norm': [[0, 0]]} 2025-05-21 15:31:53,664 - mindformers./output/log[mindformers/models/utils.py:222] - INFO - Formative select_recompute_exclude: {} 2025-05-21 15:31:53,664 - mindformers./output/log[mindformers/models/utils.py:223] - INFO - Formative select_comm_recompute_exclude: {} 2025-05-21 15:31:53,670 - mindformers./output/log[mindformers/version_control.py:76] - INFO - Predict enable lazy inline. 2025-05-21 15:31:53,682 - mindformers./output/log[mindformers/models/modeling_utils.py:1494] - INFO - model built, but weights is unloaded, since the config has no checkpoint_name_or_path attribute or checkpoint_name_or_path is None. 2025-05-21 15:31:53,685 - mindformers./output/log[mindformers/version_control.py:76] - INFO - Predict enable lazy inline. 2025-05-21 15:31:53,691 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/train_worker.py:66] - INFO - init TrainWorker 2025-05-21 15:31:53,694 - mindformers./output/log[mindformers/version_control.py:76] - INFO - Predict enable lazy inline. 2025-05-21 15:31:53,696 - mindformers./output/log[mindformers/models/modeling_utils.py:1494] - INFO - model built, but weights is unloaded, since the config has no checkpoint_name_or_path attribute or checkpoint_name_or_path is None. 2025-05-21 15:31:53,705 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/train_worker.py:66] - INFO - init TrainWorker 2025-05-21 15:31:53,712 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config runner_config is empty. 2025-05-21 15:31:53,712 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config context is empty. 2025-05-21 15:31:53,712 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config parallel is empty. 2025-05-21 15:31:53,712 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config trainer is empty. 2025-05-21 15:31:53,713 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config moe_config is empty. 2025-05-21 15:31:53,713 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config parallel_config is empty. 2025-05-21 15:31:53,713 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config recompute_config is empty. 2025-05-21 15:31:53,714 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config swap_config is empty. 2025-05-21 15:31:53,714 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config runner_wrapper is empty. 2025-05-21 15:31:53,714 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config optimizer is empty. 2025-05-21 15:31:53,714 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config lr_schedule is empty. 2025-05-21 15:31:53,714 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config metric is empty. 2025-05-21 15:31:53,715 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config train_dataset is empty. 2025-05-21 15:31:53,715 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config train_dataset_task is empty. 2025-05-21 15:31:53,715 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config callbacks is empty. 2025-05-21 15:31:53,715 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config monitor_config is empty. 2025-05-21 15:31:53,715 - mindformers./output/log[mindformers/tools/register/template.py:683] - WARNING - Some configs in yaml are useless for finetune: ['processor'] 2025-05-21 15:31:53,716 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/train_worker.py:75] - INFO - actor parallel_config:{'data_parallel': 1, 'model_parallel': 4, 'pipeline_stage': 2, 'expert_parallel': 1, 'use_seq_parallel': True, 'micro_batch_num': 4, 'vocab_emb_dp': False, 'context_parallel': 1} 2025-05-21 15:31:53,716 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/train_worker.py:76] - INFO - grpo_config.actor_config.recompute_config:{'recompute': False, 'select_recompute': False, 'parallel_optimizer_comm_recompute': False, 'mp_comm_recompute': True, 'recompute_slice_activation': False} 2025-05-21 15:31:53,718 - mindformers./output/log[mindformers/version_control.py:140] - INFO - The Lazy Inline compilation acceleration feature is turned on. 2025-05-21 15:31:53,721 - mindformers./output/log[mindformers/version_control.py:76] - INFO - Predict enable lazy inline. 2025-05-21 15:31:53,722 - mindformers./output/log[mindformers/models/llama/llama.py:508] - INFO - Predict run mode: False 2025-05-21 15:31:53,725 - mindformers./output/log[mindformers/models/llama/llama.py:108] - INFO - MoE config is None, use normal FFN 2025-05-21 15:31:53,725 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config runner_config is empty. 2025-05-21 15:31:53,725 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config context is empty. 2025-05-21 15:31:53,725 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config parallel is empty. 2025-05-21 15:31:53,726 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config trainer is empty. 2025-05-21 15:31:53,726 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config moe_config is empty. 2025-05-21 15:31:53,726 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config parallel_config is empty. 2025-05-21 15:31:53,726 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config recompute_config is empty. 2025-05-21 15:31:53,727 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config swap_config is empty. 2025-05-21 15:31:53,727 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config runner_wrapper is empty. 2025-05-21 15:31:53,727 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config optimizer is empty. 2025-05-21 15:31:53,727 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config lr_schedule is empty. 2025-05-21 15:31:53,727 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config metric is empty. 2025-05-21 15:31:53,728 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config train_dataset is empty. 2025-05-21 15:31:53,728 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config train_dataset_task is empty. 2025-05-21 15:31:53,728 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config callbacks is empty. 2025-05-21 15:31:53,728 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config monitor_config is empty. 2025-05-21 15:31:53,728 - mindformers./output/log[mindformers/tools/register/template.py:683] - WARNING - Some configs in yaml are useless for finetune: ['processor'] 2025-05-21 15:31:53,729 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/train_worker.py:75] - INFO - actor parallel_config:{'data_parallel': 1, 'model_parallel': 4, 'pipeline_stage': 2, 'expert_parallel': 1, 'use_seq_parallel': True, 'micro_batch_num': 4, 'vocab_emb_dp': False, 'context_parallel': 1} 2025-05-21 15:31:53,729 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/train_worker.py:76] - INFO - grpo_config.actor_config.recompute_config:{'recompute': False, 'select_recompute': False, 'parallel_optimizer_comm_recompute': False, 'mp_comm_recompute': True, 'recompute_slice_activation': False} 2025-05-21 15:31:53,731 - mindformers./output/log[mindformers/version_control.py:140] - INFO - The Lazy Inline compilation acceleration feature is turned on. 2025-05-21 15:31:53,733 - mindformers./output/log[mindformers/models/modeling_utils.py:1494] - INFO - model built, but weights is unloaded, since the config has no checkpoint_name_or_path attribute or checkpoint_name_or_path is None. 2025-05-21 15:31:53,735 - mindformers./output/log[mindformers/models/llama/llama.py:508] - INFO - Predict run mode: False 2025-05-21 15:31:53,738 - mindformers./output/log[mindformers/models/llama/llama.py:108] - INFO - MoE config is None, use normal FFN 2025-05-21 15:31:53,741 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/train_worker.py:66] - INFO - init TrainWorker 2025-05-21 15:31:53,759 - mindformers./output/log[mindformers/models/utils.py:190] - INFO - num_layers per stage: [[1, 1]] 2025-05-21 15:31:53,759 - mindformers./output/log[mindformers/models/utils.py:191] - INFO - Accumulated num_layers per stage: [[1, 2]] 2025-05-21 15:31:53,759 - mindformers./output/log[mindformers/models/utils.py:193] - INFO - Pipeline id list with start_stage: [0, 1] 2025-05-21 15:31:53,760 - mindformers./output/log[mindformers/models/utils.py:194] - INFO - Interleave id list: [0, 0] 2025-05-21 15:31:53,760 - mindformers./output/log[mindformers/models/utils.py:212] - INFO - Formative layer_recompute: [[0, 0]] 2025-05-21 15:31:53,760 - mindformers./output/log[mindformers/models/utils.py:214] - INFO - The configuration of select_recompute_exclude and select_comm_recompute_exclude have the highest priority. 2025-05-21 15:31:53,760 - mindformers./output/log[mindformers/models/utils.py:220] - INFO - Formative select_recompute: {'feed_forward\\.mul': [[0, 0]], 'feed_forward\\.w1\\.activation\\.silu': [[0, 0]]} 2025-05-21 15:31:53,761 - mindformers./output/log[mindformers/models/utils.py:221] - INFO - Formative select_comm_recompute: {'.*\\.norm': [[0, 0]]} 2025-05-21 15:31:53,761 - mindformers./output/log[mindformers/models/utils.py:222] - INFO - Formative select_recompute_exclude: {} 2025-05-21 15:31:53,761 - mindformers./output/log[mindformers/models/utils.py:223] - INFO - Formative select_comm_recompute_exclude: {} 2025-05-21 15:31:53,763 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config runner_config is empty. 2025-05-21 15:31:53,763 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config context is empty. 2025-05-21 15:31:53,764 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config parallel is empty. 2025-05-21 15:31:53,764 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config trainer is empty. 2025-05-21 15:31:53,764 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config moe_config is empty. 2025-05-21 15:31:53,764 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config parallel_config is empty. 2025-05-21 15:31:53,765 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config recompute_config is empty. 2025-05-21 15:31:53,765 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config swap_config is empty. 2025-05-21 15:31:53,765 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config runner_wrapper is empty. 2025-05-21 15:31:53,766 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config optimizer is empty. 2025-05-21 15:31:53,766 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config lr_schedule is empty. 2025-05-21 15:31:53,766 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config metric is empty. 2025-05-21 15:31:53,766 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config train_dataset is empty. 2025-05-21 15:31:53,766 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config train_dataset_task is empty. 2025-05-21 15:31:53,767 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config callbacks is empty. 2025-05-21 15:31:53,767 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config monitor_config is empty. 2025-05-21 15:31:53,767 - mindformers./output/log[mindformers/tools/register/template.py:683] - WARNING - Some configs in yaml are useless for finetune: ['processor'] 2025-05-21 15:31:53,767 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/train_worker.py:75] - INFO - actor parallel_config:{'data_parallel': 1, 'model_parallel': 4, 'pipeline_stage': 2, 'expert_parallel': 1, 'use_seq_parallel': True, 'micro_batch_num': 4, 'vocab_emb_dp': False, 'context_parallel': 1} 2025-05-21 15:31:53,768 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/train_worker.py:76] - INFO - grpo_config.actor_config.recompute_config:{'recompute': False, 'select_recompute': False, 'parallel_optimizer_comm_recompute': False, 'mp_comm_recompute': True, 'recompute_slice_activation': False} 2025-05-21 15:31:53,770 - mindformers./output/log[mindformers/version_control.py:140] - INFO - The Lazy Inline compilation acceleration feature is turned on. 2025-05-21 15:31:53,775 - mindformers./output/log[mindformers/models/llama/llama.py:508] - INFO - Predict run mode: False 2025-05-21 15:31:53,779 - mindformers./output/log[mindformers/models/llama/llama.py:108] - INFO - MoE config is None, use normal FFN 2025-05-21 15:31:53,794 - mindformers./output/log[mindformers/version_control.py:76] - INFO - Predict enable lazy inline. 2025-05-21 15:31:53,810 - mindformers./output/log[mindformers/models/utils.py:190] - INFO - num_layers per stage: [[1, 1]] 2025-05-21 15:31:53,810 - mindformers./output/log[mindformers/models/utils.py:191] - INFO - Accumulated num_layers per stage: [[1, 2]] 2025-05-21 15:31:53,811 - mindformers./output/log[mindformers/models/utils.py:193] - INFO - Pipeline id list with start_stage: [0, 1] 2025-05-21 15:31:53,811 - mindformers./output/log[mindformers/models/utils.py:194] - INFO - Interleave id list: [0, 0] 2025-05-21 15:31:53,811 - mindformers./output/log[mindformers/models/utils.py:212] - INFO - Formative layer_recompute: [[0, 0]] 2025-05-21 15:31:53,811 - mindformers./output/log[mindformers/models/utils.py:214] - INFO - The configuration of select_recompute_exclude and select_comm_recompute_exclude have the highest priority. 2025-05-21 15:31:53,812 - mindformers./output/log[mindformers/models/utils.py:220] - INFO - Formative select_recompute: {'feed_forward\\.mul': [[0, 0]], 'feed_forward\\.w1\\.activation\\.silu': [[0, 0]]} 2025-05-21 15:31:53,812 - mindformers./output/log[mindformers/models/utils.py:221] - INFO - Formative select_comm_recompute: {'.*\\.norm': [[0, 0]]} 2025-05-21 15:31:53,812 - mindformers./output/log[mindformers/models/utils.py:222] - INFO - Formative select_recompute_exclude: {} 2025-05-21 15:31:53,812 - mindformers./output/log[mindformers/models/utils.py:223] - INFO - Formative select_comm_recompute_exclude: {} 2025-05-21 15:31:53,827 - mindformers./output/log[mindformers/version_control.py:76] - INFO - Predict enable lazy inline. 2025-05-21 15:31:53,839 - mindformers./output/log[mindformers/models/utils.py:190] - INFO - num_layers per stage: [[1, 1]] 2025-05-21 15:31:53,840 - mindformers./output/log[mindformers/models/utils.py:191] - INFO - Accumulated num_layers per stage: [[1, 2]] 2025-05-21 15:31:53,840 - mindformers./output/log[mindformers/models/utils.py:193] - INFO - Pipeline id list with start_stage: [0, 1] 2025-05-21 15:31:53,840 - mindformers./output/log[mindformers/models/utils.py:194] - INFO - Interleave id list: [0, 0] 2025-05-21 15:31:53,840 - mindformers./output/log[mindformers/models/modeling_utils.py:1494] - INFO - model built, but weights is unloaded, since the config has no checkpoint_name_or_path attribute or checkpoint_name_or_path is None. 2025-05-21 15:31:53,840 - mindformers./output/log[mindformers/models/utils.py:212] - INFO - Formative layer_recompute: [[0, 0]] 2025-05-21 15:31:53,841 - mindformers./output/log[mindformers/models/utils.py:214] - INFO - The configuration of select_recompute_exclude and select_comm_recompute_exclude have the highest priority. 2025-05-21 15:31:53,841 - mindformers./output/log[mindformers/version_control.py:76] - INFO - Predict enable lazy inline. 2025-05-21 15:31:53,841 - mindformers./output/log[mindformers/models/utils.py:220] - INFO - Formative select_recompute: {'feed_forward\\.mul': [[0, 0]], 'feed_forward\\.w1\\.activation\\.silu': [[0, 0]]} 2025-05-21 15:31:53,841 - mindformers./output/log[mindformers/models/utils.py:221] - INFO - Formative select_comm_recompute: {'.*\\.norm': [[0, 0]]} 2025-05-21 15:31:53,841 - mindformers./output/log[mindformers/models/utils.py:222] - INFO - Formative select_recompute_exclude: {} 2025-05-21 15:31:53,842 - mindformers./output/log[mindformers/models/utils.py:223] - INFO - Formative select_comm_recompute_exclude: {} 2025-05-21 15:31:53,849 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/train_worker.py:66] - INFO - init TrainWorker 2025-05-21 15:31:53,869 - mindformers./output/log[mindformers/version_control.py:76] - INFO - Predict enable lazy inline. 2025-05-21 15:31:53,870 - mindformers./output/log[mindformers/version_control.py:76] - INFO - Predict enable lazy inline. 2025-05-21 15:31:53,871 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config runner_config is empty. 2025-05-21 15:31:53,871 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config context is empty. 2025-05-21 15:31:53,871 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config parallel is empty. 2025-05-21 15:31:53,871 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config trainer is empty. 2025-05-21 15:31:53,872 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config moe_config is empty. 2025-05-21 15:31:53,872 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config parallel_config is empty. 2025-05-21 15:31:53,872 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config recompute_config is empty. 2025-05-21 15:31:53,872 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config swap_config is empty. 2025-05-21 15:31:53,873 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config runner_wrapper is empty. 2025-05-21 15:31:53,873 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config optimizer is empty. 2025-05-21 15:31:53,873 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config lr_schedule is empty. 2025-05-21 15:31:53,873 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config metric is empty. 2025-05-21 15:31:53,874 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config train_dataset is empty. 2025-05-21 15:31:53,874 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config train_dataset_task is empty. 2025-05-21 15:31:53,874 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config callbacks is empty. 2025-05-21 15:31:53,874 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config monitor_config is empty. 2025-05-21 15:31:53,874 - mindformers./output/log[mindformers/tools/register/template.py:683] - WARNING - Some configs in yaml are useless for finetune: ['processor'] 2025-05-21 15:31:53,875 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/train_worker.py:75] - INFO - actor parallel_config:{'data_parallel': 1, 'model_parallel': 4, 'pipeline_stage': 2, 'expert_parallel': 1, 'use_seq_parallel': True, 'micro_batch_num': 4, 'vocab_emb_dp': False, 'context_parallel': 1} 2025-05-21 15:31:53,875 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/train_worker.py:76] - INFO - grpo_config.actor_config.recompute_config:{'recompute': False, 'select_recompute': False, 'parallel_optimizer_comm_recompute': False, 'mp_comm_recompute': True, 'recompute_slice_activation': False} 2025-05-21 15:31:53,878 - mindformers./output/log[mindformers/version_control.py:140] - INFO - The Lazy Inline compilation acceleration feature is turned on. 2025-05-21 15:31:53,880 - mindformers./output/log[mindformers/models/modeling_utils.py:1494] - INFO - model built, but weights is unloaded, since the config has no checkpoint_name_or_path attribute or checkpoint_name_or_path is None. 2025-05-21 15:31:53,882 - mindformers./output/log[mindformers/models/llama/llama.py:508] - INFO - Predict run mode: False 2025-05-21 15:31:53,885 - mindformers./output/log[mindformers/models/llama/llama.py:108] - INFO - MoE config is None, use normal FFN 2025-05-21 15:31:53,888 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/train_worker.py:66] - INFO - init TrainWorker 2025-05-21 15:31:53,898 - mindformers./output/log[mindformers/version_control.py:76] - INFO - Predict enable lazy inline. 2025-05-21 15:31:53,909 - mindformers./output/log[mindformers/models/modeling_utils.py:1494] - INFO - model built, but weights is unloaded, since the config has no checkpoint_name_or_path attribute or checkpoint_name_or_path is None. 2025-05-21 15:31:53,909 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config runner_config is empty. 2025-05-21 15:31:53,909 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config context is empty. 2025-05-21 15:31:53,910 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config parallel is empty. 2025-05-21 15:31:53,910 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config trainer is empty. 2025-05-21 15:31:53,910 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config moe_config is empty. 2025-05-21 15:31:53,910 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config parallel_config is empty. 2025-05-21 15:31:53,911 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config recompute_config is empty. 2025-05-21 15:31:53,911 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config swap_config is empty. 2025-05-21 15:31:53,911 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config runner_wrapper is empty. 2025-05-21 15:31:53,911 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config optimizer is empty. 2025-05-21 15:31:53,911 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config lr_schedule is empty. 2025-05-21 15:31:53,912 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config metric is empty. 2025-05-21 15:31:53,912 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config train_dataset is empty. 2025-05-21 15:31:53,912 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config train_dataset_task is empty. 2025-05-21 15:31:53,912 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config callbacks is empty. 2025-05-21 15:31:53,912 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config monitor_config is empty. 2025-05-21 15:31:53,912 - mindformers./output/log[mindformers/tools/register/template.py:683] - WARNING - Some configs in yaml are useless for finetune: ['processor'] 2025-05-21 15:31:53,913 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/train_worker.py:75] - INFO - actor parallel_config:{'data_parallel': 1, 'model_parallel': 4, 'pipeline_stage': 2, 'expert_parallel': 1, 'use_seq_parallel': True, 'micro_batch_num': 4, 'vocab_emb_dp': False, 'context_parallel': 1} 2025-05-21 15:31:53,913 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/train_worker.py:76] - INFO - grpo_config.actor_config.recompute_config:{'recompute': False, 'select_recompute': False, 'parallel_optimizer_comm_recompute': False, 'mp_comm_recompute': True, 'recompute_slice_activation': False} 2025-05-21 15:31:53,915 - mindformers./output/log[mindformers/version_control.py:140] - INFO - The Lazy Inline compilation acceleration feature is turned on. 2025-05-21 15:31:53,917 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/train_worker.py:66] - INFO - init TrainWorker 2025-05-21 15:31:53,919 - mindformers./output/log[mindformers/models/llama/llama.py:508] - INFO - Predict run mode: False 2025-05-21 15:31:53,922 - mindformers./output/log[mindformers/models/llama/llama.py:108] - INFO - MoE config is None, use normal FFN 2025-05-21 15:31:53,939 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config runner_config is empty. 2025-05-21 15:31:53,939 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config context is empty. 2025-05-21 15:31:53,939 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config parallel is empty. 2025-05-21 15:31:53,940 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config trainer is empty. 2025-05-21 15:31:53,940 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config moe_config is empty. 2025-05-21 15:31:53,940 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config parallel_config is empty. 2025-05-21 15:31:53,940 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config recompute_config is empty. 2025-05-21 15:31:53,941 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config swap_config is empty. 2025-05-21 15:31:53,941 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config runner_wrapper is empty. 2025-05-21 15:31:53,941 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config optimizer is empty. 2025-05-21 15:31:53,941 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config lr_schedule is empty. 2025-05-21 15:31:53,942 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config metric is empty. 2025-05-21 15:31:53,942 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config train_dataset is empty. 2025-05-21 15:31:53,942 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config train_dataset_task is empty. 2025-05-21 15:31:53,942 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config callbacks is empty. 2025-05-21 15:31:53,942 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config monitor_config is empty. 2025-05-21 15:31:53,943 - mindformers./output/log[mindformers/tools/register/template.py:683] - WARNING - Some configs in yaml are useless for finetune: ['processor'] 2025-05-21 15:31:53,943 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/train_worker.py:75] - INFO - actor parallel_config:{'data_parallel': 1, 'model_parallel': 4, 'pipeline_stage': 2, 'expert_parallel': 1, 'use_seq_parallel': True, 'micro_batch_num': 4, 'vocab_emb_dp': False, 'context_parallel': 1} 2025-05-21 15:31:53,943 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/train_worker.py:76] - INFO - grpo_config.actor_config.recompute_config:{'recompute': False, 'select_recompute': False, 'parallel_optimizer_comm_recompute': False, 'mp_comm_recompute': True, 'recompute_slice_activation': False} 2025-05-21 15:31:53,946 - mindformers./output/log[mindformers/version_control.py:140] - INFO - The Lazy Inline compilation acceleration feature is turned on. 2025-05-21 15:31:53,951 - mindformers./output/log[mindformers/models/llama/llama.py:508] - INFO - Predict run mode: False 2025-05-21 15:31:53,954 - mindformers./output/log[mindformers/models/llama/llama.py:108] - INFO - MoE config is None, use normal FFN 2025-05-21 15:31:54,528 - mindformers./output/log[mindformers/models/utils.py:190] - INFO - num_layers per stage: [[1, 1]] 2025-05-21 15:31:54,528 - mindformers./output/log[mindformers/models/utils.py:191] - INFO - Accumulated num_layers per stage: [[1, 2]] 2025-05-21 15:31:54,528 - mindformers./output/log[mindformers/models/utils.py:193] - INFO - Pipeline id list with start_stage: [0, 1] 2025-05-21 15:31:54,529 - mindformers./output/log[mindformers/models/utils.py:194] - INFO - Interleave id list: [0, 0] 2025-05-21 15:31:54,529 - mindformers./output/log[mindformers/models/utils.py:212] - INFO - Formative layer_recompute: [[0, 0]] 2025-05-21 15:31:54,529 - mindformers./output/log[mindformers/models/utils.py:214] - INFO - The configuration of select_recompute_exclude and select_comm_recompute_exclude have the highest priority. 2025-05-21 15:31:54,529 - mindformers./output/log[mindformers/models/utils.py:220] - INFO - Formative select_recompute: {'feed_forward\\.mul': [[0, 0]], 'feed_forward\\.w1\\.activation\\.silu': [[0, 0]]} 2025-05-21 15:31:54,530 - mindformers./output/log[mindformers/models/utils.py:221] - INFO - Formative select_comm_recompute: {'.*\\.norm': [[0, 0]]} 2025-05-21 15:31:54,530 - mindformers./output/log[mindformers/models/utils.py:222] - INFO - Formative select_recompute_exclude: {} 2025-05-21 15:31:54,530 - mindformers./output/log[mindformers/models/utils.py:223] - INFO - Formative select_comm_recompute_exclude: {} 2025-05-21 15:31:54,593 - mindformers./output/log[mindformers/models/modeling_utils.py:1494] - INFO - model built, but weights is unloaded, since the config has no checkpoint_name_or_path attribute or checkpoint_name_or_path is None. 2025-05-21 15:31:54,601 - mindformers./output/log[mindformers/version_control.py:140] - INFO - The Lazy Inline compilation acceleration feature is turned on. 2025-05-21 15:31:54,602 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/models/grpo_models.py:208] - INFO - num_iterations: 1, epsilon_low: 0.2, epsilon_high: 0.2 2025-05-21 15:31:54,605 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/train_worker.py:273] - INFO - pipeline cell 2025-05-21 15:31:54,659 - mindformers./output/log[mindformers/models/utils.py:190] - INFO - num_layers per stage: [[1, 1]] 2025-05-21 15:31:54,659 - mindformers./output/log[mindformers/models/utils.py:191] - INFO - Accumulated num_layers per stage: [[1, 2]] 2025-05-21 15:31:54,660 - mindformers./output/log[mindformers/models/utils.py:193] - INFO - Pipeline id list with start_stage: [0, 1] 2025-05-21 15:31:54,660 - mindformers./output/log[mindformers/models/utils.py:194] - INFO - Interleave id list: [0, 0] 2025-05-21 15:31:54,660 - mindformers./output/log[mindformers/models/utils.py:212] - INFO - Formative layer_recompute: [[0, 0]] 2025-05-21 15:31:54,660 - mindformers./output/log[mindformers/models/utils.py:214] - INFO - The configuration of select_recompute_exclude and select_comm_recompute_exclude have the highest priority. 2025-05-21 15:31:54,660 - mindformers./output/log[mindformers/models/utils.py:220] - INFO - Formative select_recompute: {'feed_forward\\.mul': [[0, 0]], 'feed_forward\\.w1\\.activation\\.silu': [[0, 0]]} 2025-05-21 15:31:54,661 - mindformers./output/log[mindformers/models/utils.py:221] - INFO - Formative select_comm_recompute: {'.*\\.norm': [[0, 0]]} 2025-05-21 15:31:54,661 - mindformers./output/log[mindformers/models/utils.py:222] - INFO - Formative select_recompute_exclude: {} 2025-05-21 15:31:54,661 - mindformers./output/log[mindformers/models/utils.py:223] - INFO - Formative select_comm_recompute_exclude: {} 2025-05-21 15:31:54,675 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/train_worker.py:320] - INFO - pipeline cell 2025-05-21 15:31:54,694 - mindformers./output/log[mindformers/models/utils.py:190] - INFO - num_layers per stage: [[1, 1]] 2025-05-21 15:31:54,695 - mindformers./output/log[mindformers/models/utils.py:191] - INFO - Accumulated num_layers per stage: [[1, 2]] 2025-05-21 15:31:54,695 - mindformers./output/log[mindformers/models/utils.py:193] - INFO - Pipeline id list with start_stage: [0, 1] 2025-05-21 15:31:54,695 - mindformers./output/log[mindformers/models/utils.py:194] - INFO - Interleave id list: [0, 0] 2025-05-21 15:31:54,696 - mindformers./output/log[mindformers/models/utils.py:212] - INFO - Formative layer_recompute: [[0, 0]] 2025-05-21 15:31:54,696 - mindformers./output/log[mindformers/models/utils.py:214] - INFO - The configuration of select_recompute_exclude and select_comm_recompute_exclude have the highest priority. 2025-05-21 15:31:54,696 - mindformers./output/log[mindformers/models/utils.py:220] - INFO - Formative select_recompute: {'feed_forward\\.mul': [[0, 0]], 'feed_forward\\.w1\\.activation\\.silu': [[0, 0]]} 2025-05-21 15:31:54,696 - mindformers./output/log[mindformers/models/utils.py:221] - INFO - Formative select_comm_recompute: {'.*\\.norm': [[0, 0]]} 2025-05-21 15:31:54,696 - mindformers./output/log[mindformers/models/utils.py:222] - INFO - Formative select_recompute_exclude: {} 2025-05-21 15:31:54,696 - mindformers./output/log[mindformers/models/utils.py:223] - INFO - Formative select_comm_recompute_exclude: {} 2025-05-21 15:31:54,706 - mindformers./output/log[mindformers/models/utils.py:190] - INFO - num_layers per stage: [[1, 1]] 2025-05-21 15:31:54,706 - mindformers./output/log[mindformers/models/utils.py:191] - INFO - Accumulated num_layers per stage: [[1, 2]] 2025-05-21 15:31:54,706 - mindformers./output/log[mindformers/models/utils.py:193] - INFO - Pipeline id list with start_stage: [0, 1] 2025-05-21 15:31:54,707 - mindformers./output/log[mindformers/models/utils.py:194] - INFO - Interleave id list: [0, 0] 2025-05-21 15:31:54,707 - mindformers./output/log[mindformers/models/utils.py:212] - INFO - Formative layer_recompute: [[0, 0]] 2025-05-21 15:31:54,707 - mindformers./output/log[mindformers/models/utils.py:214] - INFO - The configuration of select_recompute_exclude and select_comm_recompute_exclude have the highest priority. 2025-05-21 15:31:54,707 - mindformers./output/log[mindformers/models/utils.py:220] - INFO - Formative select_recompute: {'feed_forward\\.mul': [[0, 0]], 'feed_forward\\.w1\\.activation\\.silu': [[0, 0]]} 2025-05-21 15:31:54,707 - mindformers./output/log[mindformers/models/utils.py:221] - INFO - Formative select_comm_recompute: {'.*\\.norm': [[0, 0]]} 2025-05-21 15:31:54,708 - mindformers./output/log[mindformers/models/utils.py:222] - INFO - Formative select_recompute_exclude: {} 2025-05-21 15:31:54,708 - mindformers./output/log[mindformers/models/utils.py:223] - INFO - Formative select_comm_recompute_exclude: {} 2025-05-21 15:31:54,722 - mindformers./output/log[mindformers/models/modeling_utils.py:1494] - INFO - model built, but weights is unloaded, since the config has no checkpoint_name_or_path attribute or checkpoint_name_or_path is None. 2025-05-21 15:31:54,729 - mindformers./output/log[mindformers/version_control.py:140] - INFO - The Lazy Inline compilation acceleration feature is turned on. 2025-05-21 15:31:54,731 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/models/grpo_models.py:208] - INFO - num_iterations: 1, epsilon_low: 0.2, epsilon_high: 0.2 2025-05-21 15:31:54,733 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/train_worker.py:273] - INFO - pipeline cell 2025-05-21 15:31:54,743 - mindformers./output/log[mindformers/models/utils.py:190] - INFO - num_layers per stage: [[1, 1]] 2025-05-21 15:31:54,743 - mindformers./output/log[mindformers/models/utils.py:191] - INFO - Accumulated num_layers per stage: [[1, 2]] 2025-05-21 15:31:54,743 - mindformers./output/log[mindformers/models/utils.py:193] - INFO - Pipeline id list with start_stage: [0, 1] 2025-05-21 15:31:54,743 - mindformers./output/log[mindformers/models/utils.py:194] - INFO - Interleave id list: [0, 0] 2025-05-21 15:31:54,744 - mindformers./output/log[mindformers/models/utils.py:212] - INFO - Formative layer_recompute: [[0, 0]] 2025-05-21 15:31:54,744 - mindformers./output/log[mindformers/models/utils.py:214] - INFO - The configuration of select_recompute_exclude and select_comm_recompute_exclude have the highest priority. 2025-05-21 15:31:54,744 - mindformers./output/log[mindformers/models/utils.py:220] - INFO - Formative select_recompute: {'feed_forward\\.mul': [[0, 0]], 'feed_forward\\.w1\\.activation\\.silu': [[0, 0]]} 2025-05-21 15:31:54,744 - mindformers./output/log[mindformers/models/utils.py:221] - INFO - Formative select_comm_recompute: {'.*\\.norm': [[0, 0]]} 2025-05-21 15:31:54,744 - mindformers./output/log[mindformers/models/utils.py:222] - INFO - Formative select_recompute_exclude: {} 2025-05-21 15:31:54,745 - mindformers./output/log[mindformers/models/utils.py:223] - INFO - Formative select_comm_recompute_exclude: {} 2025-05-21 15:31:54,751 - mindformers./output/log[mindformers/models/utils.py:190] - INFO - num_layers per stage: [[1, 1]] 2025-05-21 15:31:54,752 - mindformers./output/log[mindformers/models/utils.py:191] - INFO - Accumulated num_layers per stage: [[1, 2]] 2025-05-21 15:31:54,752 - mindformers./output/log[mindformers/models/utils.py:193] - INFO - Pipeline id list with start_stage: [0, 1] 2025-05-21 15:31:54,752 - mindformers./output/log[mindformers/models/utils.py:194] - INFO - Interleave id list: [0, 0] 2025-05-21 15:31:54,752 - mindformers./output/log[mindformers/models/utils.py:212] - INFO - Formative layer_recompute: [[0, 0]] 2025-05-21 15:31:54,753 - mindformers./output/log[mindformers/models/utils.py:214] - INFO - The configuration of select_recompute_exclude and select_comm_recompute_exclude have the highest priority. 2025-05-21 15:31:54,753 - mindformers./output/log[mindformers/models/utils.py:220] - INFO - Formative select_recompute: {'feed_forward\\.mul': [[0, 0]], 'feed_forward\\.w1\\.activation\\.silu': [[0, 0]]} 2025-05-21 15:31:54,753 - mindformers./output/log[mindformers/models/utils.py:221] - INFO - Formative select_comm_recompute: {'.*\\.norm': [[0, 0]]} 2025-05-21 15:31:54,753 - mindformers./output/log[mindformers/models/utils.py:222] - INFO - Formative select_recompute_exclude: {} 2025-05-21 15:31:54,753 - mindformers./output/log[mindformers/models/utils.py:223] - INFO - Formative select_comm_recompute_exclude: {} 2025-05-21 15:31:54,755 - mindformers./output/log[mindformers/models/modeling_utils.py:1494] - INFO - model built, but weights is unloaded, since the config has no checkpoint_name_or_path attribute or checkpoint_name_or_path is None. 2025-05-21 15:31:54,763 - mindformers./output/log[mindformers/version_control.py:140] - INFO - The Lazy Inline compilation acceleration feature is turned on. 2025-05-21 15:31:54,764 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/models/grpo_models.py:208] - INFO - num_iterations: 1, epsilon_low: 0.2, epsilon_high: 0.2 2025-05-21 15:31:54,766 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/train_worker.py:273] - INFO - pipeline cell 2025-05-21 15:31:54,768 - mindformers./output/log[mindformers/models/modeling_utils.py:1494] - INFO - model built, but weights is unloaded, since the config has no checkpoint_name_or_path attribute or checkpoint_name_or_path is None. 2025-05-21 15:31:54,775 - mindformers./output/log[mindformers/version_control.py:140] - INFO - The Lazy Inline compilation acceleration feature is turned on. 2025-05-21 15:31:54,777 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/models/grpo_models.py:208] - INFO - num_iterations: 1, epsilon_low: 0.2, epsilon_high: 0.2 2025-05-21 15:31:54,779 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/train_worker.py:273] - INFO - pipeline cell 2025-05-21 15:31:54,803 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/train_worker.py:320] - INFO - pipeline cell 2025-05-21 15:31:54,804 - mindformers./output/log[mindformers/models/modeling_utils.py:1494] - INFO - model built, but weights is unloaded, since the config has no checkpoint_name_or_path attribute or checkpoint_name_or_path is None. 2025-05-21 15:31:54,812 - mindformers./output/log[mindformers/version_control.py:140] - INFO - The Lazy Inline compilation acceleration feature is turned on. 2025-05-21 15:31:54,813 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/models/grpo_models.py:208] - INFO - num_iterations: 1, epsilon_low: 0.2, epsilon_high: 0.2 2025-05-21 15:31:54,815 - mindformers./output/log[mindformers/models/modeling_utils.py:1494] - INFO - model built, but weights is unloaded, since the config has no checkpoint_name_or_path attribute or checkpoint_name_or_path is None. 2025-05-21 15:31:54,816 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/train_worker.py:273] - INFO - pipeline cell 2025-05-21 15:31:54,823 - mindformers./output/log[mindformers/version_control.py:140] - INFO - The Lazy Inline compilation acceleration feature is turned on. 2025-05-21 15:31:54,824 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/models/grpo_models.py:208] - INFO - num_iterations: 1, epsilon_low: 0.2, epsilon_high: 0.2 2025-05-21 15:31:54,827 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/train_worker.py:273] - INFO - pipeline cell 2025-05-21 15:31:54,830 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/train_worker.py:320] - INFO - pipeline cell 2025-05-21 15:31:54,853 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/train_worker.py:320] - INFO - pipeline cell 2025-05-21 15:31:54,875 - mindformers./output/log[mindformers/models/utils.py:190] - INFO - num_layers per stage: [[1, 1]] 2025-05-21 15:31:54,875 - mindformers./output/log[mindformers/models/utils.py:191] - INFO - Accumulated num_layers per stage: [[1, 2]] 2025-05-21 15:31:54,876 - mindformers./output/log[mindformers/models/utils.py:193] - INFO - Pipeline id list with start_stage: [0, 1] 2025-05-21 15:31:54,876 - mindformers./output/log[mindformers/models/utils.py:194] - INFO - Interleave id list: [0, 0] 2025-05-21 15:31:54,876 - mindformers./output/log[mindformers/models/utils.py:212] - INFO - Formative layer_recompute: [[0, 0]] 2025-05-21 15:31:54,876 - mindformers./output/log[mindformers/models/utils.py:214] - INFO - The configuration of select_recompute_exclude and select_comm_recompute_exclude have the highest priority. 2025-05-21 15:31:54,876 - mindformers./output/log[mindformers/models/utils.py:220] - INFO - Formative select_recompute: {'feed_forward\\.mul': [[0, 0]], 'feed_forward\\.w1\\.activation\\.silu': [[0, 0]]} 2025-05-21 15:31:54,877 - mindformers./output/log[mindformers/models/utils.py:221] - INFO - Formative select_comm_recompute: {'.*\\.norm': [[0, 0]]} 2025-05-21 15:31:54,877 - mindformers./output/log[mindformers/models/utils.py:222] - INFO - Formative select_recompute_exclude: {} 2025-05-21 15:31:54,877 - mindformers./output/log[mindformers/models/utils.py:223] - INFO - Formative select_comm_recompute_exclude: {} 2025-05-21 15:31:54,888 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/train_worker.py:320] - INFO - pipeline cell 2025-05-21 15:31:54,893 - mindformers./output/log[mindformers/models/utils.py:190] - INFO - num_layers per stage: [[1, 1]] 2025-05-21 15:31:54,894 - mindformers./output/log[mindformers/models/utils.py:191] - INFO - Accumulated num_layers per stage: [[1, 2]] 2025-05-21 15:31:54,894 - mindformers./output/log[mindformers/models/utils.py:193] - INFO - Pipeline id list with start_stage: [0, 1] 2025-05-21 15:31:54,894 - mindformers./output/log[mindformers/models/utils.py:194] - INFO - Interleave id list: [0, 0] 2025-05-21 15:31:54,894 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/train_worker.py:320] - INFO - pipeline cell 2025-05-21 15:31:54,894 - mindformers./output/log[mindformers/models/utils.py:212] - INFO - Formative layer_recompute: [[0, 0]] 2025-05-21 15:31:54,894 - mindformers./output/log[mindformers/models/utils.py:214] - INFO - The configuration of select_recompute_exclude and select_comm_recompute_exclude have the highest priority. 2025-05-21 15:31:54,895 - mindformers./output/log[mindformers/models/utils.py:220] - INFO - Formative select_recompute: {'feed_forward\\.mul': [[0, 0]], 'feed_forward\\.w1\\.activation\\.silu': [[0, 0]]} 2025-05-21 15:31:54,895 - mindformers./output/log[mindformers/models/utils.py:221] - INFO - Formative select_comm_recompute: {'.*\\.norm': [[0, 0]]} 2025-05-21 15:31:54,895 - mindformers./output/log[mindformers/models/utils.py:222] - INFO - Formative select_recompute_exclude: {} 2025-05-21 15:31:54,895 - mindformers./output/log[mindformers/models/utils.py:223] - INFO - Formative select_comm_recompute_exclude: {} 2025-05-21 15:31:54,937 - mindformers./output/log[mindformers/models/modeling_utils.py:1494] - INFO - model built, but weights is unloaded, since the config has no checkpoint_name_or_path attribute or checkpoint_name_or_path is None. 2025-05-21 15:31:54,945 - mindformers./output/log[mindformers/version_control.py:140] - INFO - The Lazy Inline compilation acceleration feature is turned on. 2025-05-21 15:31:54,946 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/models/grpo_models.py:208] - INFO - num_iterations: 1, epsilon_low: 0.2, epsilon_high: 0.2 2025-05-21 15:31:54,949 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/train_worker.py:273] - INFO - pipeline cell 2025-05-21 15:31:54,954 - mindformers./output/log[mindformers/models/modeling_utils.py:1494] - INFO - model built, but weights is unloaded, since the config has no checkpoint_name_or_path attribute or checkpoint_name_or_path is None. 2025-05-21 15:31:54,961 - mindformers./output/log[mindformers/version_control.py:140] - INFO - The Lazy Inline compilation acceleration feature is turned on. 2025-05-21 15:31:54,963 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/models/grpo_models.py:208] - INFO - num_iterations: 1, epsilon_low: 0.2, epsilon_high: 0.2 2025-05-21 15:31:54,965 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/train_worker.py:273] - INFO - pipeline cell 2025-05-21 15:31:55,015 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/train_worker.py:320] - INFO - pipeline cell 2025-05-21 15:31:55,038 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/train_worker.py:320] - INFO - pipeline cell Creating hash value for the group_name hash(4-5-6-7)=ed832c5612cab19d2c5a27f0a350fa5a Creating hash value for the group_name hash(0-4)=cb4ececddcb4517ca0bcddafd23813b9 Creating hash value for the group_name hash(4-5-6-7)=ed832c5612cab19d2c5a27f0a350fa5a Creating hash value for the group_name hash(0-4)=cb4ececddcb4517ca0bcddafd23813b9 2025-05-21 15:31:55,674 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/old_policy_worker.py:51] - INFO - num_iterations 1 <= 1, OldPolicyWorker is not enalbled 2025-05-21 15:31:55,677 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/trainer/spmd/grpo_trainer.py:105] - INFO - config of sft_model_config_train LlamaConfig { "attn_proj_has_bias": false, "batch_size": 1, "block_size": 32, "bos_token_id": 1, "calculate_per_token_loss": false, "checkpoint_name_or_path": null, "chunk_prefill": false, "compute_dtype": "bfloat16", "compute_in_2d": true, "do_sample": false, "emb_dropout_prob": 0.0, "embedding_init_type": "float32", "eos_token_id": 151643, "extend_method": "None", "ffn_dim_multiplier": null, "fine_grain_interleave": 1, "fused_rms_norm": true, "hidden_size": 3584, "ignore_token_id": -100, "init_method_std": 0.01, "input_sliced_sig": false, "intermediate_size": 18944, "is_dynamic": false, "kv_channels": 128, "layernorm_compute_type": "float32", "llm_backend": "", "max_decode_length": 512, "max_position_embedding": 131072, "mindformers_version": "1.6.0", "model_name": "llama", "model_type": "llama", "multiple_of": 256, "n_kv_heads": 4, "num_blocks": 128, "num_heads": 28, "num_layers": 2, "offset": 0, "pad_token_id": 151643, "parallel_config": { "micro_batch_num": 4, "model_parallel": 4, "pipeline_stage": 2, "use_seq_parallel": true, "vocab_emb_dp": false }, "parallel_decoding_params": null, "parallel_optimizer": false, "param_init_type": "float32", "pp_interleave_num": 1, "qkv_concat": false, "qkv_has_bias": true, "quant_config": null, "repetition_penalty": 1, "residual_dtype": "bfloat16", "return_hidden_states": false, "rms_norm_eps": 1e-06, "rmsnorm_compute_2d": false, "rotary_dtype": "float16", "rotary_emb_base": 1000000, "rotary_pct": 1.0, "scaling_factor": 1.0, "seq_length": 8192, "softmax_compute_type": "float16", "stage_num": 0, "start_stage": 0, "theta": 1000000.0, "tie_word_embeddings": false, "top_k": 0, "top_p": 0.8, "type": "LlamaConfig", "use_attn_mask_compression": false, "use_eod_attn_mask_compression": false, "use_flash_attention": true, "use_past": false, "use_ring_attention": false, "use_rope_slice": false, "vocab_size": 152064 } 2025-05-21 15:31:55,678 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/trainer/spmd/grpo_trainer.py:109] - INFO - set packing_sample_length to 8192 2025-05-21 15:31:55,678 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/trainer/spmd/grpo_trainer.py:110] - INFO - GRPOTrainer: finish init workers Creating hash value for the group_name hash(0-1-2-3)=7f1758d36cc7b761c9ccce92808de7ac Creating hash value for the group_name hash(0-4)=cb4ececddcb4517ca0bcddafd23813b9 Creating hash value for the group_name hash(0-1-2-3)=7f1758d36cc7b761c9ccce92808de7ac Creating hash value for the group_name hash(0-4)=cb4ececddcb4517ca0bcddafd23813b9 2025-05-21 15:31:55,678 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/old_policy_worker.py:51] - INFO - num_iterations 1 <= 1, OldPolicyWorker is not enalbled 2025-05-21 15:31:55,681 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/trainer/spmd/grpo_trainer.py:105] - INFO - config of sft_model_config_train LlamaConfig { "attn_proj_has_bias": false, "batch_size": 1, "block_size": 32, "bos_token_id": 1, "calculate_per_token_loss": false, "checkpoint_name_or_path": null, "chunk_prefill": false, "compute_dtype": "bfloat16", "compute_in_2d": true, "do_sample": false, "emb_dropout_prob": 0.0, "embedding_init_type": "float32", "eos_token_id": 151643, "extend_method": "None", "ffn_dim_multiplier": null, "fine_grain_interleave": 1, "fused_rms_norm": true, "hidden_size": 3584, "ignore_token_id": -100, "init_method_std": 0.01, "input_sliced_sig": false, "intermediate_size": 18944, "is_dynamic": false, "kv_channels": 128, "layernorm_compute_type": "float32", "llm_backend": "", "max_decode_length": 512, "max_position_embedding": 131072, "mindformers_version": "1.6.0", "model_name": "llama", "model_type": "llama", "multiple_of": 256, "n_kv_heads": 4, "num_blocks": 128, "num_heads": 28, "num_layers": 2, "offset": 0, "pad_token_id": 151643, "parallel_config": { "micro_batch_num": 4, "model_parallel": 4, "pipeline_stage": 2, "use_seq_parallel": true, "vocab_emb_dp": false }, "parallel_decoding_params": null, "parallel_optimizer": false, "param_init_type": "float32", "pp_interleave_num": 1, "qkv_concat": false, "qkv_has_bias": true, "quant_config": null, "repetition_penalty": 1, "residual_dtype": "bfloat16", "return_hidden_states": false, "rms_norm_eps": 1e-06, "rmsnorm_compute_2d": false, "rotary_dtype": "float16", "rotary_emb_base": 1000000, "rotary_pct": 1.0, "scaling_factor": 1.0, "seq_length": 8192, "softmax_compute_type": "float16", "stage_num": 0, "start_stage": 0, "theta": 1000000.0, "tie_word_embeddings": false, "top_k": 0, "top_p": 0.8, "type": "LlamaConfig", "use_attn_mask_compression": false, "use_eod_attn_mask_compression": false, "use_flash_attention": true, "use_past": false, "use_ring_attention": false, "use_rope_slice": false, "vocab_size": 152064 } 2025-05-21 15:31:55,681 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/trainer/spmd/grpo_trainer.py:109] - INFO - set packing_sample_length to 8192 2025-05-21 15:31:55,681 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/trainer/spmd/grpo_trainer.py:110] - INFO - GRPOTrainer: finish init workers 2025-05-21 15:31:55,697 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config runner_config is empty. 2025-05-21 15:31:55,698 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config context is empty. 2025-05-21 15:31:55,698 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config parallel is empty. 2025-05-21 15:31:55,698 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config trainer is empty. 2025-05-21 15:31:55,698 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config moe_config is empty. 2025-05-21 15:31:55,699 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config parallel_config is empty. 2025-05-21 15:31:55,699 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config recompute_config is empty. 2025-05-21 15:31:55,699 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config swap_config is empty. 2025-05-21 15:31:55,699 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config runner_wrapper is empty. 2025-05-21 15:31:55,699 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config optimizer is empty. 2025-05-21 15:31:55,700 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config lr_schedule is empty. 2025-05-21 15:31:55,700 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config metric is empty. 2025-05-21 15:31:55,700 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config train_dataset is empty. 2025-05-21 15:31:55,700 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config train_dataset_task is empty. 2025-05-21 15:31:55,700 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config callbacks is empty. 2025-05-21 15:31:55,700 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config monitor_config is empty. 2025-05-21 15:31:55,701 - mindformers./output/log[mindformers/tools/register/template.py:683] - WARNING - Some configs in yaml are useless for finetune: ['processor'] 2025-05-21 15:31:55,706 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config runner_config is empty. 2025-05-21 15:31:55,706 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config context is empty. Creating hash value for the group_name hash(0-1-2-3)=7f1758d36cc7b761c9ccce92808de7ac Creating hash value for the group_name hash(3-7)=e30609fbce6a1a756f50a31ec86eae83 Creating hash value for the group_name hash(0-1-2-3)=7f1758d36cc7b761c9ccce92808de7ac Creating hash value for the group_name hash(3-7)=e30609fbce6a1a756f50a31ec86eae83 2025-05-21 15:31:55,706 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/old_policy_worker.py:51] - INFO - num_iterations 1 <= 1, OldPolicyWorker is not enalbled 2025-05-21 15:31:55,706 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config parallel is empty. 2025-05-21 15:31:55,706 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config trainer is empty. 2025-05-21 15:31:55,706 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config moe_config is empty. 2025-05-21 15:31:55,707 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config parallel_config is empty. 2025-05-21 15:31:55,707 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config recompute_config is empty. 2025-05-21 15:31:55,707 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config swap_config is empty. 2025-05-21 15:31:55,707 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config runner_wrapper is empty. 2025-05-21 15:31:55,707 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config optimizer is empty. 2025-05-21 15:31:55,708 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config lr_schedule is empty. 2025-05-21 15:31:55,708 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config metric is empty. 2025-05-21 15:31:55,708 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config train_dataset is empty. 2025-05-21 15:31:55,708 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config train_dataset_task is empty. 2025-05-21 15:31:55,708 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config callbacks is empty. 2025-05-21 15:31:55,709 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config monitor_config is empty. 2025-05-21 15:31:55,709 - mindformers./output/log[mindformers/tools/register/template.py:683] - WARNING - Some configs in yaml are useless for finetune: ['processor'] 2025-05-21 15:31:55,709 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/trainer/spmd/grpo_trainer.py:105] - INFO - config of sft_model_config_train LlamaConfig { "attn_proj_has_bias": false, "batch_size": 1, "block_size": 32, "bos_token_id": 1, "calculate_per_token_loss": false, "checkpoint_name_or_path": null, "chunk_prefill": false, "compute_dtype": "bfloat16", "compute_in_2d": true, "do_sample": false, "emb_dropout_prob": 0.0, "embedding_init_type": "float32", "eos_token_id": 151643, "extend_method": "None", "ffn_dim_multiplier": null, "fine_grain_interleave": 1, "fused_rms_norm": true, "hidden_size": 3584, "ignore_token_id": -100, "init_method_std": 0.01, "input_sliced_sig": false, "intermediate_size": 18944, "is_dynamic": false, "kv_channels": 128, "layernorm_compute_type": "float32", "llm_backend": "", "max_decode_length": 512, "max_position_embedding": 131072, "mindformers_version": "1.6.0", "model_name": "llama", "model_type": "llama", "multiple_of": 256, "n_kv_heads": 4, "num_blocks": 128, "num_heads": 28, "num_layers": 2, "offset": 0, "pad_token_id": 151643, "parallel_config": { "micro_batch_num": 4, "model_parallel": 4, "pipeline_stage": 2, "use_seq_parallel": true, "vocab_emb_dp": false }, "parallel_decoding_params": null, "parallel_optimizer": false, "param_init_type": "float32", "pp_interleave_num": 1, "qkv_concat": false, "qkv_has_bias": true, "quant_config": null, "repetition_penalty": 1, "residual_dtype": "bfloat16", "return_hidden_states": false, "rms_norm_eps": 1e-06, "rmsnorm_compute_2d": false, "rotary_dtype": "float16", "rotary_emb_base": 1000000, "rotary_pct": 1.0, "scaling_factor": 1.0, "seq_length": 8192, "softmax_compute_type": "float16", "stage_num": 0, "start_stage": 0, "theta": 1000000.0, "tie_word_embeddings": false, "top_k": 0, "top_p": 0.8, "type": "LlamaConfig", "use_attn_mask_compression": false, "use_eod_attn_mask_compression": false, "use_flash_attention": true, "use_past": false, "use_ring_attention": false, "use_rope_slice": false, "vocab_size": 152064 } 2025-05-21 15:31:55,709 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/trainer/spmd/grpo_trainer.py:109] - INFO - set packing_sample_length to 8192 2025-05-21 15:31:55,709 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/utils/utils.py:497] - WARNING - The given path contains no 'model.safetensors.index.json' file. 2025-05-21 15:31:55,709 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/trainer/spmd/grpo_trainer.py:110] - INFO - GRPOTrainer: finish init workers 2025-05-21 15:31:55,729 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config runner_config is empty. 2025-05-21 15:31:55,729 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config context is empty. 2025-05-21 15:31:55,730 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config parallel is empty. 2025-05-21 15:31:55,730 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config trainer is empty. 2025-05-21 15:31:55,730 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config moe_config is empty. 2025-05-21 15:31:55,730 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config parallel_config is empty. 2025-05-21 15:31:55,730 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config recompute_config is empty. 2025-05-21 15:31:55,731 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config swap_config is empty. 2025-05-21 15:31:55,731 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config runner_wrapper is empty. 2025-05-21 15:31:55,731 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config optimizer is empty. 2025-05-21 15:31:55,731 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config lr_schedule is empty. 2025-05-21 15:31:55,731 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config metric is empty. 2025-05-21 15:31:55,732 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config train_dataset is empty. 2025-05-21 15:31:55,732 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config train_dataset_task is empty. 2025-05-21 15:31:55,732 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config callbacks is empty. 2025-05-21 15:31:55,732 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config monitor_config is empty. 2025-05-21 15:31:55,732 - mindformers./output/log[mindformers/tools/register/template.py:683] - WARNING - Some configs in yaml are useless for finetune: ['processor'] Creating hash value for the group_name hash(0-1-2-3)=7f1758d36cc7b761c9ccce92808de7ac Creating hash value for the group_name hash(1-5)=12426c956d1bc5017082b12a969b0b7c Creating hash value for the group_name hash(0-1-2-3)=7f1758d36cc7b761c9ccce92808de7ac Creating hash value for the group_name hash(1-5)=12426c956d1bc5017082b12a969b0b7c 2025-05-21 15:31:55,826 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/old_policy_worker.py:51] - INFO - num_iterations 1 <= 1, OldPolicyWorker is not enalbled 2025-05-21 15:31:55,829 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/trainer/spmd/grpo_trainer.py:105] - INFO - config of sft_model_config_train LlamaConfig { "attn_proj_has_bias": false, "batch_size": 1, "block_size": 32, "bos_token_id": 1, "calculate_per_token_loss": false, "checkpoint_name_or_path": null, "chunk_prefill": false, "compute_dtype": "bfloat16", "compute_in_2d": true, "do_sample": false, "emb_dropout_prob": 0.0, "embedding_init_type": "float32", "eos_token_id": 151643, "extend_method": "None", "ffn_dim_multiplier": null, "fine_grain_interleave": 1, "fused_rms_norm": true, "hidden_size": 3584, "ignore_token_id": -100, "init_method_std": 0.01, "input_sliced_sig": false, "intermediate_size": 18944, "is_dynamic": false, "kv_channels": 128, "layernorm_compute_type": "float32", "llm_backend": "", "max_decode_length": 512, "max_position_embedding": 131072, "mindformers_version": "1.6.0", "model_name": "llama", "model_type": "llama", "multiple_of": 256, "n_kv_heads": 4, "num_blocks": 128, "num_heads": 28, "num_layers": 2, "offset": 0, "pad_token_id": 151643, "parallel_config": { "micro_batch_num": 4, "model_parallel": 4, "pipeline_stage": 2, "use_seq_parallel": true, "vocab_emb_dp": false }, "parallel_decoding_params": null, "parallel_optimizer": false, "param_init_type": "float32", "pp_interleave_num": 1, "qkv_concat": false, "qkv_has_bias": true, "quant_config": null, "repetition_penalty": 1, "residual_dtype": "bfloat16", "return_hidden_states": false, "rms_norm_eps": 1e-06, "rmsnorm_compute_2d": false, "rotary_dtype": "float16", "rotary_emb_base": 1000000, "rotary_pct": 1.0, "scaling_factor": 1.0, "seq_length": 8192, "softmax_compute_type": "float16", "stage_num": 0, "start_stage": 0, "theta": 1000000.0, "tie_word_embeddings": false, "top_k": 0, "top_p": 0.8, "type": "LlamaConfig", "use_attn_mask_compression": false, "use_eod_attn_mask_compression": false, "use_flash_attention": true, "use_past": false, "use_ring_attention": false, "use_rope_slice": false, "vocab_size": 152064 } 2025-05-21 15:31:55,829 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/trainer/spmd/grpo_trainer.py:109] - INFO - set packing_sample_length to 8192 2025-05-21 15:31:55,830 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/trainer/spmd/grpo_trainer.py:110] - INFO - GRPOTrainer: finish init workers 2025-05-21 15:31:55,849 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config runner_config is empty. 2025-05-21 15:31:55,849 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config context is empty. 2025-05-21 15:31:55,850 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config parallel is empty. 2025-05-21 15:31:55,850 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config trainer is empty. 2025-05-21 15:31:55,850 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config moe_config is empty. 2025-05-21 15:31:55,850 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config parallel_config is empty. 2025-05-21 15:31:55,850 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config recompute_config is empty. 2025-05-21 15:31:55,851 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config swap_config is empty. 2025-05-21 15:31:55,851 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config runner_wrapper is empty. 2025-05-21 15:31:55,851 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config optimizer is empty. 2025-05-21 15:31:55,851 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config lr_schedule is empty. 2025-05-21 15:31:55,851 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config metric is empty. 2025-05-21 15:31:55,852 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config train_dataset is empty. 2025-05-21 15:31:55,852 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config train_dataset_task is empty. 2025-05-21 15:31:55,852 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config callbacks is empty. 2025-05-21 15:31:55,852 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config monitor_config is empty. 2025-05-21 15:31:55,852 - mindformers./output/log[mindformers/tools/register/template.py:683] - WARNING - Some configs in yaml are useless for finetune: ['processor'] Creating hash value for the group_name hash(0-1-2-3)=7f1758d36cc7b761c9ccce92808de7ac Creating hash value for the group_name hash(2-6)=d9639340c2f0051c1a7a09da5ef07ed4 Creating hash value for the group_name hash(0-1-2-3)=7f1758d36cc7b761c9ccce92808de7ac Creating hash value for the group_name hash(2-6)=d9639340c2f0051c1a7a09da5ef07ed4 2025-05-21 15:31:55,862 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/old_policy_worker.py:51] - INFO - num_iterations 1 <= 1, OldPolicyWorker is not enalbled 2025-05-21 15:31:55,865 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/trainer/spmd/grpo_trainer.py:105] - INFO - config of sft_model_config_train LlamaConfig { "attn_proj_has_bias": false, "batch_size": 1, "block_size": 32, "bos_token_id": 1, "calculate_per_token_loss": false, "checkpoint_name_or_path": null, "chunk_prefill": false, "compute_dtype": "bfloat16", "compute_in_2d": true, "do_sample": false, "emb_dropout_prob": 0.0, "embedding_init_type": "float32", "eos_token_id": 151643, "extend_method": "None", "ffn_dim_multiplier": null, "fine_grain_interleave": 1, "fused_rms_norm": true, "hidden_size": 3584, "ignore_token_id": -100, "init_method_std": 0.01, "input_sliced_sig": false, "intermediate_size": 18944, "is_dynamic": false, "kv_channels": 128, "layernorm_compute_type": "float32", "llm_backend": "", "max_decode_length": 512, "max_position_embedding": 131072, "mindformers_version": "1.6.0", "model_name": "llama", "model_type": "llama", "multiple_of": 256, "n_kv_heads": 4, "num_blocks": 128, "num_heads": 28, "num_layers": 2, "offset": 0, "pad_token_id": 151643, "parallel_config": { "micro_batch_num": 4, "model_parallel": 4, "pipeline_stage": 2, "use_seq_parallel": true, "vocab_emb_dp": false }, "parallel_decoding_params": null, "parallel_optimizer": false, "param_init_type": "float32", "pp_interleave_num": 1, "qkv_concat": false, "qkv_has_bias": true, "quant_config": null, "repetition_penalty": 1, "residual_dtype": "bfloat16", "return_hidden_states": false, "rms_norm_eps": 1e-06, "rmsnorm_compute_2d": false, "rotary_dtype": "float16", "rotary_emb_base": 1000000, "rotary_pct": 1.0, "scaling_factor": 1.0, "seq_length": 8192, "softmax_compute_type": "float16", "stage_num": 0, "start_stage": 0, "theta": 1000000.0, "tie_word_embeddings": false, "top_k": 0, "top_p": 0.8, "type": "LlamaConfig", "use_attn_mask_compression": false, "use_eod_attn_mask_compression": false, "use_flash_attention": true, "use_past": false, "use_ring_attention": false, "use_rope_slice": false, "vocab_size": 152064 } 2025-05-21 15:31:55,865 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/trainer/spmd/grpo_trainer.py:109] - INFO - set packing_sample_length to 8192 2025-05-21 15:31:55,866 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/trainer/spmd/grpo_trainer.py:110] - INFO - GRPOTrainer: finish init workers 2025-05-21 15:31:55,885 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config runner_config is empty. 2025-05-21 15:31:55,886 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config context is empty. 2025-05-21 15:31:55,886 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config parallel is empty. 2025-05-21 15:31:55,886 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config trainer is empty. 2025-05-21 15:31:55,886 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config moe_config is empty. 2025-05-21 15:31:55,886 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config parallel_config is empty. 2025-05-21 15:31:55,887 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config recompute_config is empty. 2025-05-21 15:31:55,887 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config swap_config is empty. 2025-05-21 15:31:55,887 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config runner_wrapper is empty. 2025-05-21 15:31:55,887 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config optimizer is empty. Creating hash value for the group_name hash(4-5-6-7)=ed832c5612cab19d2c5a27f0a350fa5a Creating hash value for the group_name hash(3-7)=e30609fbce6a1a756f50a31ec86eae83 Creating hash value for the group_name hash(4-5-6-7)=ed832c5612cab19d2c5a27f0a350fa5a Creating hash value for the group_name hash(3-7)=e30609fbce6a1a756f50a31ec86eae83 2025-05-21 15:31:55,887 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/old_policy_worker.py:51] - INFO - num_iterations 1 <= 1, OldPolicyWorker is not enalbled 2025-05-21 15:31:55,887 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config lr_schedule is empty. 2025-05-21 15:31:55,888 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config metric is empty. 2025-05-21 15:31:55,888 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config train_dataset is empty. 2025-05-21 15:31:55,888 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config train_dataset_task is empty. 2025-05-21 15:31:55,888 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config callbacks is empty. 2025-05-21 15:31:55,888 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config monitor_config is empty. 2025-05-21 15:31:55,888 - mindformers./output/log[mindformers/tools/register/template.py:683] - WARNING - Some configs in yaml are useless for finetune: ['processor'] 2025-05-21 15:31:55,890 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/trainer/spmd/grpo_trainer.py:105] - INFO - config of sft_model_config_train LlamaConfig { "attn_proj_has_bias": false, "batch_size": 1, "block_size": 32, "bos_token_id": 1, "calculate_per_token_loss": false, "checkpoint_name_or_path": null, "chunk_prefill": false, "compute_dtype": "bfloat16", "compute_in_2d": true, "do_sample": false, "emb_dropout_prob": 0.0, "embedding_init_type": "float32", "eos_token_id": 151643, "extend_method": "None", "ffn_dim_multiplier": null, "fine_grain_interleave": 1, "fused_rms_norm": true, "hidden_size": 3584, "ignore_token_id": -100, "init_method_std": 0.01, "input_sliced_sig": false, "intermediate_size": 18944, "is_dynamic": false, "kv_channels": 128, "layernorm_compute_type": "float32", "llm_backend": "", "max_decode_length": 512, "max_position_embedding": 131072, "mindformers_version": "1.6.0", "model_name": "llama", "model_type": "llama", "multiple_of": 256, "n_kv_heads": 4, "num_blocks": 128, "num_heads": 28, "num_layers": 2, "offset": 0, "pad_token_id": 151643, "parallel_config": { "micro_batch_num": 4, "model_parallel": 4, "pipeline_stage": 2, "use_seq_parallel": true, "vocab_emb_dp": false }, "parallel_decoding_params": null, "parallel_optimizer": false, "param_init_type": "float32", "pp_interleave_num": 1, "qkv_concat": false, "qkv_has_bias": true, "quant_config": null, "repetition_penalty": 1, "residual_dtype": "bfloat16", "return_hidden_states": false, "rms_norm_eps": 1e-06, "rmsnorm_compute_2d": false, "rotary_dtype": "float16", "rotary_emb_base": 1000000, "rotary_pct": 1.0, "scaling_factor": 1.0, "seq_length": 8192, "softmax_compute_type": "float16", "stage_num": 0, "start_stage": 0, "theta": 1000000.0, "tie_word_embeddings": false, "top_k": 0, "top_p": 0.8, "type": "LlamaConfig", "use_attn_mask_compression": false, "use_eod_attn_mask_compression": false, "use_flash_attention": true, "use_past": false, "use_ring_attention": false, "use_rope_slice": false, "vocab_size": 152064 } 2025-05-21 15:31:55,891 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/trainer/spmd/grpo_trainer.py:109] - INFO - set packing_sample_length to 8192 2025-05-21 15:31:55,891 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/trainer/spmd/grpo_trainer.py:110] - INFO - GRPOTrainer: finish init workers 2025-05-21 15:31:55,911 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config runner_config is empty. 2025-05-21 15:31:55,911 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config context is empty. 2025-05-21 15:31:55,911 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config parallel is empty. 2025-05-21 15:31:55,911 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config trainer is empty. 2025-05-21 15:31:55,912 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config moe_config is empty. 2025-05-21 15:31:55,912 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config parallel_config is empty. 2025-05-21 15:31:55,912 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config recompute_config is empty. 2025-05-21 15:31:55,912 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config swap_config is empty. 2025-05-21 15:31:55,912 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config runner_wrapper is empty. 2025-05-21 15:31:55,913 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config optimizer is empty. 2025-05-21 15:31:55,913 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config lr_schedule is empty. 2025-05-21 15:31:55,913 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config metric is empty. 2025-05-21 15:31:55,913 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config train_dataset is empty. 2025-05-21 15:31:55,913 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config train_dataset_task is empty. 2025-05-21 15:31:55,914 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config callbacks is empty. 2025-05-21 15:31:55,914 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config monitor_config is empty. 2025-05-21 15:31:55,914 - mindformers./output/log[mindformers/tools/register/template.py:683] - WARNING - Some configs in yaml are useless for finetune: ['processor'] Creating hash value for the group_name hash(4-5-6-7)=ed832c5612cab19d2c5a27f0a350fa5a Creating hash value for the group_name hash(1-5)=12426c956d1bc5017082b12a969b0b7c Creating hash value for the group_name hash(4-5-6-7)=ed832c5612cab19d2c5a27f0a350fa5a Creating hash value for the group_name hash(1-5)=12426c956d1bc5017082b12a969b0b7c 2025-05-21 15:31:56,006 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/old_policy_worker.py:51] - INFO - num_iterations 1 <= 1, OldPolicyWorker is not enalbled 2025-05-21 15:31:56,010 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/trainer/spmd/grpo_trainer.py:105] - INFO - config of sft_model_config_train LlamaConfig { "attn_proj_has_bias": false, "batch_size": 1, "block_size": 32, "bos_token_id": 1, "calculate_per_token_loss": false, "checkpoint_name_or_path": null, "chunk_prefill": false, "compute_dtype": "bfloat16", "compute_in_2d": true, "do_sample": false, "emb_dropout_prob": 0.0, "embedding_init_type": "float32", "eos_token_id": 151643, "extend_method": "None", "ffn_dim_multiplier": null, "fine_grain_interleave": 1, "fused_rms_norm": true, "hidden_size": 3584, "ignore_token_id": -100, "init_method_std": 0.01, "input_sliced_sig": false, "intermediate_size": 18944, "is_dynamic": false, "kv_channels": 128, "layernorm_compute_type": "float32", "llm_backend": "", "max_decode_length": 512, "max_position_embedding": 131072, "mindformers_version": "1.6.0", "model_name": "llama", "model_type": "llama", "multiple_of": 256, "n_kv_heads": 4, "num_blocks": 128, "num_heads": 28, "num_layers": 2, "offset": 0, "pad_token_id": 151643, "parallel_config": { "micro_batch_num": 4, "model_parallel": 4, "pipeline_stage": 2, "use_seq_parallel": true, "vocab_emb_dp": false }, "parallel_decoding_params": null, "parallel_optimizer": false, "param_init_type": "float32", "pp_interleave_num": 1, "qkv_concat": false, "qkv_has_bias": true, "quant_config": null, "repetition_penalty": 1, "residual_dtype": "bfloat16", "return_hidden_states": false, "rms_norm_eps": 1e-06, "rmsnorm_compute_2d": false, "rotary_dtype": "float16", "rotary_emb_base": 1000000, "rotary_pct": 1.0, "scaling_factor": 1.0, "seq_length": 8192, "softmax_compute_type": "float16", "stage_num": 0, "start_stage": 0, "theta": 1000000.0, "tie_word_embeddings": false, "top_k": 0, "top_p": 0.8, "type": "LlamaConfig", "use_attn_mask_compression": false, "use_eod_attn_mask_compression": false, "use_flash_attention": true, "use_past": false, "use_ring_attention": false, "use_rope_slice": false, "vocab_size": 152064 } 2025-05-21 15:31:56,010 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/trainer/spmd/grpo_trainer.py:109] - INFO - set packing_sample_length to 8192 2025-05-21 15:31:56,010 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/trainer/spmd/grpo_trainer.py:110] - INFO - GRPOTrainer: finish init workers 2025-05-21 15:31:56,030 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config runner_config is empty. 2025-05-21 15:31:56,030 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config context is empty. 2025-05-21 15:31:56,031 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config parallel is empty. 2025-05-21 15:31:56,031 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config trainer is empty. 2025-05-21 15:31:56,031 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config moe_config is empty. 2025-05-21 15:31:56,031 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config parallel_config is empty. 2025-05-21 15:31:56,031 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config recompute_config is empty. 2025-05-21 15:31:56,032 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config swap_config is empty. 2025-05-21 15:31:56,032 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config runner_wrapper is empty. 2025-05-21 15:31:56,032 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config optimizer is empty. 2025-05-21 15:31:56,032 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config lr_schedule is empty. 2025-05-21 15:31:56,032 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config metric is empty. 2025-05-21 15:31:56,033 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config train_dataset is empty. 2025-05-21 15:31:56,033 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config train_dataset_task is empty. 2025-05-21 15:31:56,033 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config callbacks is empty. 2025-05-21 15:31:56,033 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config monitor_config is empty. 2025-05-21 15:31:56,033 - mindformers./output/log[mindformers/tools/register/template.py:683] - WARNING - Some configs in yaml are useless for finetune: ['processor'] Creating hash value for the group_name hash(4-5-6-7)=ed832c5612cab19d2c5a27f0a350fa5a Creating hash value for the group_name hash(2-6)=d9639340c2f0051c1a7a09da5ef07ed4 Creating hash value for the group_name hash(4-5-6-7)=ed832c5612cab19d2c5a27f0a350fa5a Creating hash value for the group_name hash(2-6)=d9639340c2f0051c1a7a09da5ef07ed4 2025-05-21 15:31:56,046 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/old_policy_worker.py:51] - INFO - num_iterations 1 <= 1, OldPolicyWorker is not enalbled 2025-05-21 15:31:56,050 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/trainer/spmd/grpo_trainer.py:105] - INFO - config of sft_model_config_train LlamaConfig { "attn_proj_has_bias": false, "batch_size": 1, "block_size": 32, "bos_token_id": 1, "calculate_per_token_loss": false, "checkpoint_name_or_path": null, "chunk_prefill": false, "compute_dtype": "bfloat16", "compute_in_2d": true, "do_sample": false, "emb_dropout_prob": 0.0, "embedding_init_type": "float32", "eos_token_id": 151643, "extend_method": "None", "ffn_dim_multiplier": null, "fine_grain_interleave": 1, "fused_rms_norm": true, "hidden_size": 3584, "ignore_token_id": -100, "init_method_std": 0.01, "input_sliced_sig": false, "intermediate_size": 18944, "is_dynamic": false, "kv_channels": 128, "layernorm_compute_type": "float32", "llm_backend": "", "max_decode_length": 512, "max_position_embedding": 131072, "mindformers_version": "1.6.0", "model_name": "llama", "model_type": "llama", "multiple_of": 256, "n_kv_heads": 4, "num_blocks": 128, "num_heads": 28, "num_layers": 2, "offset": 0, "pad_token_id": 151643, "parallel_config": { "micro_batch_num": 4, "model_parallel": 4, "pipeline_stage": 2, "use_seq_parallel": true, "vocab_emb_dp": false }, "parallel_decoding_params": null, "parallel_optimizer": false, "param_init_type": "float32", "pp_interleave_num": 1, "qkv_concat": false, "qkv_has_bias": true, "quant_config": null, "repetition_penalty": 1, "residual_dtype": "bfloat16", "return_hidden_states": false, "rms_norm_eps": 1e-06, "rmsnorm_compute_2d": false, "rotary_dtype": "float16", "rotary_emb_base": 1000000, "rotary_pct": 1.0, "scaling_factor": 1.0, "seq_length": 8192, "softmax_compute_type": "float16", "stage_num": 0, "start_stage": 0, "theta": 1000000.0, "tie_word_embeddings": false, "top_k": 0, "top_p": 0.8, "type": "LlamaConfig", "use_attn_mask_compression": false, "use_eod_attn_mask_compression": false, "use_flash_attention": true, "use_past": false, "use_ring_attention": false, "use_rope_slice": false, "vocab_size": 152064 } 2025-05-21 15:31:56,050 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/trainer/spmd/grpo_trainer.py:109] - INFO - set packing_sample_length to 8192 2025-05-21 15:31:56,050 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/trainer/spmd/grpo_trainer.py:110] - INFO - GRPOTrainer: finish init workers 2025-05-21 15:31:56,070 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config runner_config is empty. 2025-05-21 15:31:56,070 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config context is empty. 2025-05-21 15:31:56,071 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config parallel is empty. 2025-05-21 15:31:56,071 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config trainer is empty. 2025-05-21 15:31:56,071 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config moe_config is empty. 2025-05-21 15:31:56,071 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config parallel_config is empty. 2025-05-21 15:31:56,071 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config recompute_config is empty. 2025-05-21 15:31:56,072 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config swap_config is empty. 2025-05-21 15:31:56,072 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config runner_wrapper is empty. 2025-05-21 15:31:56,072 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config optimizer is empty. 2025-05-21 15:31:56,072 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config lr_schedule is empty. 2025-05-21 15:31:56,072 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config metric is empty. 2025-05-21 15:31:56,073 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config train_dataset is empty. 2025-05-21 15:31:56,073 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config train_dataset_task is empty. 2025-05-21 15:31:56,073 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config callbacks is empty. 2025-05-21 15:31:56,073 - mindformers./output/log[mindformers/tools/register/template.py:84] - WARNING - The input config monitor_config is empty. 2025-05-21 15:31:56,073 - mindformers./output/log[mindformers/tools/register/template.py:683] - WARNING - Some configs in yaml are useless for finetune: ['processor'] 2025-05-21 15:32:06,639 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/trainer/spmd/grpo_trainer.py:273] - INFO - enable_reshard_optimizer:False 2025-05-21 15:32:06,639 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/trainer/spmd/grpo_trainer.py:273] - INFO - enable_reshard_optimizer:False 2025-05-21 15:32:06,640 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/trainer/spmd/grpo_trainer.py:273] - INFO - enable_reshard_optimizer:False 2025-05-21 15:32:06,640 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/trainer/spmd/grpo_trainer.py:273] - INFO - enable_reshard_optimizer:False 2025-05-21 15:32:06,640 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/trainer/spmd/grpo_trainer.py:273] - INFO - enable_reshard_optimizer:False 2025-05-21 15:32:06,640 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/trainer/spmd/grpo_trainer.py:273] - INFO - enable_reshard_optimizer:False 2025-05-21 15:32:06,640 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/trainer/spmd/grpo_trainer.py:273] - INFO - enable_reshard_optimizer:False 2025-05-21 15:32:06,641 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/trainer/spmd/grpo_trainer.py:273] - INFO - enable_reshard_optimizer:False ----------------start save front parallel strategy--------------- ----------------end save front parallel strategy--------------- ----------------start save front parallel strategy--------------- ----------------end save front parallel strategy--------------- ----------------start save front parallel strategy--------------- ----------------end save front parallel strategy--------------- ----------------start save front parallel strategy--------------- ----------------end save front parallel strategy--------------- ----------------start save front parallel strategy--------------- ----------------end save front parallel strategy--------------- ----------------start save front parallel strategy--------------- ----------------end save front parallel strategy--------------- ----------------start save front parallel strategy--------------- ----------------end save front parallel strategy--------------- ----------------start save front parallel strategy--------------- ----------------end save front parallel strategy--------------- 2025-05-21 15:32:39,509 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/transform_worker.py:144] - INFO - Start prepare for parameter resharding in sft training. 2025-05-21 15:32:39,930 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/transform_worker.py:144] - INFO - Start prepare for parameter resharding in sft training. 2025-05-21 15:32:40,019 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/transform_worker.py:144] - INFO - Start prepare for parameter resharding in sft training. 2025-05-21 15:32:41,173 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/transform_worker.py:144] - INFO - Start prepare for parameter resharding in sft training. 2025-05-21 15:32:41,903 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/transform_worker.py:144] - INFO - Start prepare for parameter resharding in sft training. 2025-05-21 15:32:42,189 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/transform_worker.py:144] - INFO - Start prepare for parameter resharding in sft training. 2025-05-21 15:32:42,313 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/transform_worker.py:144] - INFO - Start prepare for parameter resharding in sft training. 2025-05-21 15:32:42,439 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/transform_worker.py:144] - INFO - Start prepare for parameter resharding in sft training. Waiting for main worker to merge strategies. 2025-05-21 15:32:52,566 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:453] - INFO - sft_ckpt_path_infer: 2025-05-21 15:32:52,566 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:453] - INFO - sft_ckpt_path_infer: Waiting for main worker to merge strategies. 2025-05-21 15:32:52,566 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:453] - INFO - sft_ckpt_path_infer: Waiting for main worker to merge strategies. 2025-05-21 15:32:52,566 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:453] - INFO - sft_ckpt_path_infer: Waiting for main worker to merge strategies. 2025-05-21 15:32:52,566 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:453] - INFO - sft_ckpt_path_infer: Waiting for main worker to merge strategies. 2025-05-21 15:32:52,566 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:453] - INFO - sft_ckpt_path_infer: Waiting for main worker to merge strategies. 2025-05-21 15:32:52,566 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:453] - INFO - sft_ckpt_path_infer: Waiting for main worker to merge strategies. 2025-05-21 15:32:52,566 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:453] - INFO - sft_ckpt_path_infer: 2025-05-21 15:32:52,567 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:459] - INFO - use_parallel is True 2025-05-21 15:32:52,567 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:459] - INFO - use_parallel is True 2025-05-21 15:32:52,567 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:459] - INFO - use_parallel is True 2025-05-21 15:32:52,567 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:459] - INFO - use_parallel is True 2025-05-21 15:32:52,567 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:459] - INFO - use_parallel is True 2025-05-21 15:32:52,567 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:459] - INFO - use_parallel is True 2025-05-21 15:32:52,567 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:459] - INFO - use_parallel is True 2025-05-21 15:32:52,567 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:459] - INFO - use_parallel is True 2025-05-21 15:32:52,567 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/ref_worker.py:218] - INFO - ref_ckpt_path: 2025-05-21 15:32:52,567 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/ref_worker.py:218] - INFO - ref_ckpt_path: 2025-05-21 15:32:52,567 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/ref_worker.py:218] - INFO - ref_ckpt_path: 2025-05-21 15:32:52,567 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/ref_worker.py:218] - INFO - ref_ckpt_path: 2025-05-21 15:32:52,567 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/ref_worker.py:218] - INFO - ref_ckpt_path: 2025-05-21 15:32:52,567 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/ref_worker.py:218] - INFO - ref_ckpt_path: 2025-05-21 15:32:52,567 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/ref_worker.py:218] - INFO - ref_ckpt_path: 2025-05-21 15:32:52,567 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/ref_worker.py:218] - INFO - ref_ckpt_path: 2025-05-21 15:32:52,567 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/ref_worker.py:226] - INFO - use_parallel is True 2025-05-21 15:32:52,567 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/ref_worker.py:226] - INFO - use_parallel is True 2025-05-21 15:32:52,567 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/ref_worker.py:226] - INFO - use_parallel is True 2025-05-21 15:32:52,567 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/ref_worker.py:226] - INFO - use_parallel is True 2025-05-21 15:32:52,567 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/ref_worker.py:226] - INFO - use_parallel is True 2025-05-21 15:32:52,567 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/ref_worker.py:226] - INFO - use_parallel is True 2025-05-21 15:32:52,567 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/ref_worker.py:226] - INFO - use_parallel is True 2025-05-21 15:32:52,567 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/train_worker.py:232] - INFO - sft_ckpt_path_train: 2025-05-21 15:32:52,567 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/train_worker.py:232] - INFO - sft_ckpt_path_train: 2025-05-21 15:32:52,567 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/ref_worker.py:226] - INFO - use_parallel is True 2025-05-21 15:32:52,567 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/train_worker.py:232] - INFO - sft_ckpt_path_train: 2025-05-21 15:32:52,567 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/train_worker.py:232] - INFO - sft_ckpt_path_train: 2025-05-21 15:32:52,567 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/train_worker.py:232] - INFO - sft_ckpt_path_train: 2025-05-21 15:32:52,567 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/train_worker.py:232] - INFO - sft_ckpt_path_train: 2025-05-21 15:32:52,567 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/train_worker.py:232] - INFO - sft_ckpt_path_train: 2025-05-21 15:32:52,567 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/train_worker.py:242] - INFO - use_parallel is True, 2025-05-21 15:32:52,567 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/train_worker.py:242] - INFO - use_parallel is True, 2025-05-21 15:32:52,567 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/train_worker.py:232] - INFO - sft_ckpt_path_train: 2025-05-21 15:32:52,567 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/train_worker.py:242] - INFO - use_parallel is True, 2025-05-21 15:32:52,568 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/train_worker.py:242] - INFO - use_parallel is True, 2025-05-21 15:32:52,568 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/train_worker.py:242] - INFO - use_parallel is True, 2025-05-21 15:32:52,568 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/train_worker.py:242] - INFO - use_parallel is True, 2025-05-21 15:32:52,568 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/train_worker.py:242] - INFO - use_parallel is True, 2025-05-21 15:32:52,568 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/train_worker.py:242] - INFO - use_parallel is True, 2025-05-21 15:33:11,659 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/trainer/spmd/grpo_trainer.py:956] - INFO - Start training epoch num:10, step num:1, generation num:8 2025-05-21 15:33:11,660 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/trainer/spmd/grpo_trainer.py:966] - INFO - step begin at 15:33:11 ------------------------------- 2025-05-21 15:33:11,661 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/trainer/spmd/grpo_trainer.py:969] - INFO - epoch: 0, step: 0 2025-05-21 15:33:11,659 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/trainer/spmd/grpo_trainer.py:956] - INFO - Start training epoch num:10, step num:1, generation num:8 2025-05-21 15:33:11,661 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/trainer/spmd/grpo_trainer.py:615] - INFO - Make experience begin at 15:33:11 ------------------------------- 2025-05-21 15:33:11,661 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/trainer/spmd/grpo_trainer.py:617] - INFO - Generate 8 times 2025-05-21 15:33:11,662 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/trainer/spmd/grpo_trainer.py:966] - INFO - step begin at 15:33:11 ------------------------------- 2025-05-21 15:33:11,662 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/trainer/spmd/grpo_trainer.py:969] - INFO - epoch: 0, step: 0 2025-05-21 15:33:11,662 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/trainer/spmd/grpo_trainer.py:615] - INFO - Make experience begin at 15:33:11 ------------------------------- 2025-05-21 15:33:11,663 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/trainer/spmd/grpo_trainer.py:617] - INFO - Generate 8 times 2025-05-21 15:33:11,681 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/trainer/spmd/grpo_trainer.py:956] - INFO - Start training epoch num:10, step num:1, generation num:8 2025-05-21 15:33:11,682 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/trainer/spmd/grpo_trainer.py:966] - INFO - step begin at 15:33:11 ------------------------------- 2025-05-21 15:33:11,682 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/trainer/spmd/grpo_trainer.py:969] - INFO - epoch: 0, step: 0 2025-05-21 15:33:11,682 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/trainer/spmd/grpo_trainer.py:615] - INFO - Make experience begin at 15:33:11 ------------------------------- 2025-05-21 15:33:11,682 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/trainer/spmd/grpo_trainer.py:617] - INFO - Generate 8 times 2025-05-21 15:33:11,691 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/trainer/spmd/grpo_trainer.py:956] - INFO - Start training epoch num:10, step num:1, generation num:8 2025-05-21 15:33:11,692 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/trainer/spmd/grpo_trainer.py:966] - INFO - step begin at 15:33:11 ------------------------------- 2025-05-21 15:33:11,692 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/trainer/spmd/grpo_trainer.py:969] - INFO - epoch: 0, step: 0 2025-05-21 15:33:11,693 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/trainer/spmd/grpo_trainer.py:615] - INFO - Make experience begin at 15:33:11 ------------------------------- 2025-05-21 15:33:11,693 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/trainer/spmd/grpo_trainer.py:617] - INFO - Generate 8 times 2025-05-21 15:33:11,695 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/trainer/spmd/grpo_trainer.py:956] - INFO - Start training epoch num:10, step num:1, generation num:8 2025-05-21 15:33:11,695 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/trainer/spmd/grpo_trainer.py:966] - INFO - step begin at 15:33:11 ------------------------------- 2025-05-21 15:33:11,696 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/trainer/spmd/grpo_trainer.py:969] - INFO - epoch: 0, step: 0 2025-05-21 15:33:11,696 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/trainer/spmd/grpo_trainer.py:615] - INFO - Make experience begin at 15:33:11 ------------------------------- 2025-05-21 15:33:11,696 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/trainer/spmd/grpo_trainer.py:617] - INFO - Generate 8 times 2025-05-21 15:33:11,703 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/trainer/spmd/grpo_trainer.py:956] - INFO - Start training epoch num:10, step num:1, generation num:8 2025-05-21 15:33:11,704 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/trainer/spmd/grpo_trainer.py:966] - INFO - step begin at 15:33:11 ------------------------------- 2025-05-21 15:33:11,704 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/trainer/spmd/grpo_trainer.py:969] - INFO - epoch: 0, step: 0 2025-05-21 15:33:11,705 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/trainer/spmd/grpo_trainer.py:615] - INFO - Make experience begin at 15:33:11 ------------------------------- 2025-05-21 15:33:11,705 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/trainer/spmd/grpo_trainer.py:617] - INFO - Generate 8 times 2025-05-21 15:33:11,710 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/trainer/spmd/grpo_trainer.py:956] - INFO - Start training epoch num:10, step num:1, generation num:8 2025-05-21 15:33:11,711 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/trainer/spmd/grpo_trainer.py:966] - INFO - step begin at 15:33:11 ------------------------------- 2025-05-21 15:33:11,711 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/trainer/spmd/grpo_trainer.py:969] - INFO - epoch: 0, step: 0 2025-05-21 15:33:11,711 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/trainer/spmd/grpo_trainer.py:615] - INFO - Make experience begin at 15:33:11 ------------------------------- 2025-05-21 15:33:11,711 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/trainer/spmd/grpo_trainer.py:617] - INFO - Generate 8 times 2025-05-21 15:33:11,721 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/trainer/spmd/grpo_trainer.py:956] - INFO - Start training epoch num:10, step num:1, generation num:8 2025-05-21 15:33:11,722 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/trainer/spmd/grpo_trainer.py:966] - INFO - step begin at 15:33:11 ------------------------------- 2025-05-21 15:33:11,722 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/trainer/spmd/grpo_trainer.py:969] - INFO - epoch: 0, step: 0 2025-05-21 15:33:11,722 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/trainer/spmd/grpo_trainer.py:615] - INFO - Make experience begin at 15:33:11 ------------------------------- 2025-05-21 15:33:11,723 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/trainer/spmd/grpo_trainer.py:617] - INFO - Generate 8 times 2025-05-21 15:33:11,997 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/trainer/spmd/grpo_trainer.py:643] - INFO - solution: ['$Betty is saving money for a new wallet which costs $100. Betty has only half of the money she needs. Her parents decided to give her $15 for that purpose, and her grandparents twice as much as her parents. How much more money does Betty need to buy the wallet?5$', '$Natalia sold clips to 48 of her friends in April, and then she sold half as many clips in May. How many clips did Natalia sell altogether in April and May?72$', '$Julie is reading a 120-page book. Yesterday, she was able to read 12 pages and today, she read twice as many pages as yesterday. If she wants to read half of the remaining pages tomorrow, how many pages should she read?42$', '$Weng earns $12 an hour for babysitting. Yesterday, she just did 50 minutes of babysitting. How much did she earn?10$', '$Betty is saving money for a new wallet which costs $100. Betty has only half of the money she needs. Her parents decided to give her $15 for that purpose, and her grandparents twice as much as her parents. How much more money does Betty need to buy the wallet?5$', '$Natalia sold clips to 48 of her friends in April, and then she sold half as many clips in May. How many clips did Natalia sell altogether in April and May?72$', '$Julie is reading a 120-page book. Yesterday, she was able to read 12 pages and today, she read twice as many pages as yesterday. If she wants to read half of the remaining pages tomorrow, how many pages should she read?42$', '$Weng earns $12 an hour for babysitting. Yesterday, she just did 50 minutes of babysitting. How much did she earn?10$', '$Betty is saving money for a new wallet which costs $100. Betty has only half of the money she needs. Her parents decided to give her $15 for that purpose, and her grandparents twice as much as her parents. How much more money does Betty need to buy the wallet?5$', '$Natalia sold clips to 48 of her friends in April, and then she sold half as many clips in May. How many clips did Natalia sell altogether in April and May?72$', '$Julie is reading a 120-page book. Yesterday, she was able to read 12 pages and today, she read twice as many pages as yesterday. If she wants to read half of the remaining pages tomorrow, how many pages should she read?42$', '$Weng earns $12 an hour for babysitting. Yesterday, she just did 50 minutes of babysitting. How much did she earn?10$', '$Betty is saving money for a new wallet which costs $100. Betty has only half of the money she needs. Her parents decided to give her $15 for that purpose, and her grandparents twice as much as her parents. How much more money does Betty need to buy the wallet?5$', '$Natalia sold clips to 48 of her friends in April, and then she sold half as many clips in May. How many clips did Natalia sell altogether in April and May?72$', '$Julie is reading a 120-page book. Yesterday, she was able to read 12 pages and today, she read twice as many pages as yesterday. If she wants to read half of the remaining pages tomorrow, how many pages should she read?42$', '$Weng earns $12 an hour for babysitting. Yesterday, she just did 50 minutes of babysitting. How much did she earn?10$', '$Betty is saving money for a new wallet which costs $100. Betty has only half of the money she needs. Her parents decided to give her $15 for that purpose, and her grandparents twice as much as her parents. How much more money does Betty need to buy the wallet?5$', '$Natalia sold clips to 48 of her friends in April, and then she sold half as many clips in May. How many clips did Natalia sell altogether in April and May?72$', '$Julie is reading a 120-page book. Yesterday, she was able to read 12 pages and today, she read twice as many pages as yesterday. If she wants to read half of the remaining pages tomorrow, how many pages should she read?42$', '$Weng earns $12 an hour for babysitting. Yesterday, she just did 50 minutes of babysitting. How much did she earn?10$', '$Betty is saving money for a new wallet which costs $100. Betty has only half of the money she needs. Her parents decided to give her $15 for that purpose, and her grandparents twice as much as her parents. How much more money does Betty need to buy the wallet?5$', '$Natalia sold clips to 48 of her friends in April, and then she sold half as many clips in May. How many clips did Natalia sell altogether in April and May?72$', '$Julie is reading a 120-page book. Yesterday, she was able to read 12 pages and today, she read twice as many pages as yesterday. If she wants to read half of the remaining pages tomorrow, how many pages should she read?42$', '$Weng earns $12 an hour for babysitting. Yesterday, she just did 50 minutes of babysitting. How much did she earn?10$', '$Betty is saving money for a new wallet which costs $100. Betty has only half of the money she needs. Her parents decided to give her $15 for that purpose, and her grandparents twice as much as her parents. How much more money does Betty need to buy the wallet?5$', '$Natalia sold clips to 48 of her friends in April, and then she sold half as many clips in May. How many clips did Natalia sell altogether in April and May?72$', '$Julie is reading a 120-page book. Yesterday, she was able to read 12 pages and today, she read twice as many pages as yesterday. If she wants to read half of the remaining pages tomorrow, how many pages should she read?42$', '$Weng earns $12 an hour for babysitting. Yesterday, she just did 50 minutes of babysitting. How much did she earn?10$', '$Betty is saving money for a new wallet which costs $100. Betty has only half of the money she needs. Her parents decided to give her $15 for that purpose, and her grandparents twice as much as her parents. How much more money does Betty need to buy the wallet?5$', '$Natalia sold clips to 48 of her friends in April, and then she sold half as many clips in May. How many clips did Natalia sell altogether in April and May?72$', '$Julie is reading a 120-page book. Yesterday, she was able to read 12 pages and today, she read twice as many pages as yesterday. If she wants to read half of the remaining pages tomorrow, how many pages should she read?42$', '$Weng earns $12 an hour for babysitting. Yesterday, she just did 50 minutes of babysitting. How much did she earn?10$'] 2025-05-21 15:33:11,997 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/trainer/spmd/grpo_trainer.py:650] - INFO - generation start at 15:33:11------------------------------- 2025-05-21 15:33:11,998 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:313] - INFO - input_ids shape (1, 2048) 2025-05-21 15:33:11,998 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:322] - INFO - max_tokens 512 2025-05-21 15:33:11,998 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:323] - INFO - min_tokens 2 2025-05-21 15:33:11,999 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:334] - INFO - infer without vllm, not use vllm model 2025-05-21 15:33:11,999 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/trainer/spmd/grpo_trainer.py:643] - INFO - solution: ['$Betty is saving money for a new wallet which costs $100. Betty has only half of the money she needs. Her parents decided to give her $15 for that purpose, and her grandparents twice as much as her parents. How much more money does Betty need to buy the wallet?5$', '$Natalia sold clips to 48 of her friends in April, and then she sold half as many clips in May. How many clips did Natalia sell altogether in April and May?72$', '$Julie is reading a 120-page book. Yesterday, she was able to read 12 pages and today, she read twice as many pages as yesterday. If she wants to read half of the remaining pages tomorrow, how many pages should she read?42$', '$Weng earns $12 an hour for babysitting. Yesterday, she just did 50 minutes of babysitting. How much did she earn?10$', '$Betty is saving money for a new wallet which costs $100. Betty has only half of the money she needs. Her parents decided to give her $15 for that purpose, and her grandparents twice as much as her parents. How much more money does Betty need to buy the wallet?5$', '$Natalia sold clips to 48 of her friends in April, and then she sold half as many clips in May. How many clips did Natalia sell altogether in April and May?72$', '$Julie is reading a 120-page book. Yesterday, she was able to read 12 pages and today, she read twice as many pages as yesterday. If she wants to read half of the remaining pages tomorrow, how many pages should she read?42$', '$Weng earns $12 an hour for babysitting. Yesterday, she just did 50 minutes of babysitting. How much did she earn?10$', '$Betty is saving money for a new wallet which costs $100. Betty has only half of the money she needs. Her parents decided to give her $15 for that purpose, and her grandparents twice as much as her parents. How much more money does Betty need to buy the wallet?5$', '$Natalia sold clips to 48 of her friends in April, and then she sold half as many clips in May. How many clips did Natalia sell altogether in April and May?72$', '$Julie is reading a 120-page book. Yesterday, she was able to read 12 pages and today, she read twice as many pages as yesterday. If she wants to read half of the remaining pages tomorrow, how many pages should she read?42$', '$Weng earns $12 an hour for babysitting. Yesterday, she just did 50 minutes of babysitting. How much did she earn?10$', '$Betty is saving money for a new wallet which costs $100. Betty has only half of the money she needs. Her parents decided to give her $15 for that purpose, and her grandparents twice as much as her parents. How much more money does Betty need to buy the wallet?5$', '$Natalia sold clips to 48 of her friends in April, and then she sold half as many clips in May. How many clips did Natalia sell altogether in April and May?72$', '$Julie is reading a 120-page book. Yesterday, she was able to read 12 pages and today, she read twice as many pages as yesterday. If she wants to read half of the remaining pages tomorrow, how many pages should she read?42$', '$Weng earns $12 an hour for babysitting. Yesterday, she just did 50 minutes of babysitting. How much did she earn?10$', '$Betty is saving money for a new wallet which costs $100. Betty has only half of the money she needs. Her parents decided to give her $15 for that purpose, and her grandparents twice as much as her parents. How much more money does Betty need to buy the wallet?5$', '$Natalia sold clips to 48 of her friends in April, and then she sold half as many clips in May. How many clips did Natalia sell altogether in April and May?72$', '$Julie is reading a 120-page book. Yesterday, she was able to read 12 pages and today, she read twice as many pages as yesterday. If she wants to read half of the remaining pages tomorrow, how many pages should she read?42$', '$Weng earns $12 an hour for babysitting. Yesterday, she just did 50 minutes of babysitting. How much did she earn?10$', '$Betty is saving money for a new wallet which costs $100. Betty has only half of the money she needs. Her parents decided to give her $15 for that purpose, and her grandparents twice as much as her parents. How much more money does Betty need to buy the wallet?5$', '$Natalia sold clips to 48 of her friends in April, and then she sold half as many clips in May. How many clips did Natalia sell altogether in April and May?72$', '$Julie is reading a 120-page book. Yesterday, she was able to read 12 pages and today, she read twice as many pages as yesterday. If she wants to read half of the remaining pages tomorrow, how many pages should she read?42$', '$Weng earns $12 an hour for babysitting. Yesterday, she just did 50 minutes of babysitting. How much did she earn?10$', '$Betty is saving money for a new wallet which costs $100. Betty has only half of the money she needs. Her parents decided to give her $15 for that purpose, and her grandparents twice as much as her parents. How much more money does Betty need to buy the wallet?5$', '$Natalia sold clips to 48 of her friends in April, and then she sold half as many clips in May. How many clips did Natalia sell altogether in April and May?72$', '$Julie is reading a 120-page book. Yesterday, she was able to read 12 pages and today, she read twice as many pages as yesterday. If she wants to read half of the remaining pages tomorrow, how many pages should she read?42$', '$Weng earns $12 an hour for babysitting. Yesterday, she just did 50 minutes of babysitting. How much did she earn?10$', '$Betty is saving money for a new wallet which costs $100. Betty has only half of the money she needs. Her parents decided to give her $15 for that purpose, and her grandparents twice as much as her parents. How much more money does Betty need to buy the wallet?5$', '$Natalia sold clips to 48 of her friends in April, and then she sold half as many clips in May. How many clips did Natalia sell altogether in April and May?72$', '$Julie is reading a 120-page book. Yesterday, she was able to read 12 pages and today, she read twice as many pages as yesterday. If she wants to read half of the remaining pages tomorrow, how many pages should she read?42$', '$Weng earns $12 an hour for babysitting. Yesterday, she just did 50 minutes of babysitting. How much did she earn?10$'] 2025-05-21 15:33:12,000 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/trainer/spmd/grpo_trainer.py:650] - INFO - generation start at 15:33:12------------------------------- 2025-05-21 15:33:12,000 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:313] - INFO - input_ids shape (1, 2048) 2025-05-21 15:33:12,001 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:322] - INFO - max_tokens 512 2025-05-21 15:33:12,001 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:323] - INFO - min_tokens 2 2025-05-21 15:33:12,001 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:334] - INFO - infer without vllm, not use vllm model 2025-05-21 15:33:12,001 - mindformers./output/log[mindformers/generation/text_generator.py:801] - INFO - Generation Config is: {'max_length': 512, 'max_new_tokens': 512, 'min_length': 0, 'min_new_tokens': 2, 'num_beams': 1, 'do_sample': True, 'use_past': True, 'temperature': 1.2, 'top_k': 50, 'top_p': 1.0, 'repetition_penalty': 1.0, 'encoder_repetition_penalty': 1.0, 'renormalize_logits': False, 'return_dict_in_generate': False, 'output_scores': False, 'output_logits': False, 'pad_token_id': 151643, 'bos_token_id': 151643, 'eos_token_id': [151645, 151643], 'parallel_decoding': False, 'window_size': 5, 'level': 5, 'guess_set_size': 3, '_from_model_config': True} 2025-05-21 15:33:12,003 - mindformers./output/log[mindformers/generation/text_generator.py:859] - INFO - The generation mode will be **SAMPLE**. 2025-05-21 15:33:12,003 - mindformers./output/log[mindformers/modules/block_tables.py:63] - INFO - init cache engine success. 2025-05-21 15:33:12,004 - mindformers./output/log[mindformers/research/qwen2_5/infer/qwen2_5.py:195] - INFO - Set dynamic input for llama. 2025-05-21 15:33:12,004 - mindformers./output/log[mindformers/generation/text_generator.py:801] - INFO - Generation Config is: {'max_length': 512, 'max_new_tokens': 512, 'min_length': 0, 'min_new_tokens': 2, 'num_beams': 1, 'do_sample': True, 'use_past': True, 'temperature': 1.2, 'top_k': 50, 'top_p': 1.0, 'repetition_penalty': 1.0, 'encoder_repetition_penalty': 1.0, 'renormalize_logits': False, 'return_dict_in_generate': False, 'output_scores': False, 'output_logits': False, 'pad_token_id': 151643, 'bos_token_id': 151643, 'eos_token_id': [151645, 151643], 'parallel_decoding': False, 'window_size': 5, 'level': 5, 'guess_set_size': 3, '_from_model_config': True} 2025-05-21 15:33:12,005 - mindformers./output/log[mindformers/generation/text_generator.py:859] - INFO - The generation mode will be **SAMPLE**. 2025-05-21 15:33:12,006 - mindformers./output/log[mindformers/modules/block_tables.py:63] - INFO - init cache engine success. 2025-05-21 15:33:12,007 - mindformers./output/log[mindformers/research/qwen2_5/infer/qwen2_5.py:195] - INFO - Set dynamic input for llama. 2025-05-21 15:33:12,019 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/trainer/spmd/grpo_trainer.py:643] - INFO - solution: ['$Betty is saving money for a new wallet which costs $100. Betty has only half of the money she needs. Her parents decided to give her $15 for that purpose, and her grandparents twice as much as her parents. How much more money does Betty need to buy the wallet?5$', '$Natalia sold clips to 48 of her friends in April, and then she sold half as many clips in May. How many clips did Natalia sell altogether in April and May?72$', '$Julie is reading a 120-page book. Yesterday, she was able to read 12 pages and today, she read twice as many pages as yesterday. If she wants to read half of the remaining pages tomorrow, how many pages should she read?42$', '$Weng earns $12 an hour for babysitting. Yesterday, she just did 50 minutes of babysitting. How much did she earn?10$', '$Betty is saving money for a new wallet which costs $100. Betty has only half of the money she needs. Her parents decided to give her $15 for that purpose, and her grandparents twice as much as her parents. How much more money does Betty need to buy the wallet?5$', '$Natalia sold clips to 48 of her friends in April, and then she sold half as many clips in May. How many clips did Natalia sell altogether in April and May?72$', '$Julie is reading a 120-page book. Yesterday, she was able to read 12 pages and today, she read twice as many pages as yesterday. If she wants to read half of the remaining pages tomorrow, how many pages should she read?42$', '$Weng earns $12 an hour for babysitting. Yesterday, she just did 50 minutes of babysitting. How much did she earn?10$', '$Betty is saving money for a new wallet which costs $100. Betty has only half of the money she needs. Her parents decided to give her $15 for that purpose, and her grandparents twice as much as her parents. How much more money does Betty need to buy the wallet?5$', '$Natalia sold clips to 48 of her friends in April, and then she sold half as many clips in May. How many clips did Natalia sell altogether in April and May?72$', '$Julie is reading a 120-page book. Yesterday, she was able to read 12 pages and today, she read twice as many pages as yesterday. If she wants to read half of the remaining pages tomorrow, how many pages should she read?42$', '$Weng earns $12 an hour for babysitting. Yesterday, she just did 50 minutes of babysitting. How much did she earn?10$', '$Betty is saving money for a new wallet which costs $100. Betty has only half of the money she needs. Her parents decided to give her $15 for that purpose, and her grandparents twice as much as her parents. How much more money does Betty need to buy the wallet?5$', '$Natalia sold clips to 48 of her friends in April, and then she sold half as many clips in May. How many clips did Natalia sell altogether in April and May?72$', '$Julie is reading a 120-page book. Yesterday, she was able to read 12 pages and today, she read twice as many pages as yesterday. If she wants to read half of the remaining pages tomorrow, how many pages should she read?42$', '$Weng earns $12 an hour for babysitting. Yesterday, she just did 50 minutes of babysitting. How much did she earn?10$', '$Betty is saving money for a new wallet which costs $100. Betty has only half of the money she needs. Her parents decided to give her $15 for that purpose, and her grandparents twice as much as her parents. How much more money does Betty need to buy the wallet?5$', '$Natalia sold clips to 48 of her friends in April, and then she sold half as many clips in May. How many clips did Natalia sell altogether in April and May?72$', '$Julie is reading a 120-page book. Yesterday, she was able to read 12 pages and today, she read twice as many pages as yesterday. If she wants to read half of the remaining pages tomorrow, how many pages should she read?42$', '$Weng earns $12 an hour for babysitting. Yesterday, she just did 50 minutes of babysitting. How much did she earn?10$', '$Betty is saving money for a new wallet which costs $100. Betty has only half of the money she needs. Her parents decided to give her $15 for that purpose, and her grandparents twice as much as her parents. How much more money does Betty need to buy the wallet?5$', '$Natalia sold clips to 48 of her friends in April, and then she sold half as many clips in May. How many clips did Natalia sell altogether in April and May?72$', '$Julie is reading a 120-page book. Yesterday, she was able to read 12 pages and today, she read twice as many pages as yesterday. If she wants to read half of the remaining pages tomorrow, how many pages should she read?42$', '$Weng earns $12 an hour for babysitting. Yesterday, she just did 50 minutes of babysitting. How much did she earn?10$', '$Betty is saving money for a new wallet which costs $100. Betty has only half of the money she needs. Her parents decided to give her $15 for that purpose, and her grandparents twice as much as her parents. How much more money does Betty need to buy the wallet?5$', '$Natalia sold clips to 48 of her friends in April, and then she sold half as many clips in May. How many clips did Natalia sell altogether in April and May?72$', '$Julie is reading a 120-page book. Yesterday, she was able to read 12 pages and today, she read twice as many pages as yesterday. If she wants to read half of the remaining pages tomorrow, how many pages should she read?42$', '$Weng earns $12 an hour for babysitting. Yesterday, she just did 50 minutes of babysitting. How much did she earn?10$', '$Betty is saving money for a new wallet which costs $100. Betty has only half of the money she needs. Her parents decided to give her $15 for that purpose, and her grandparents twice as much as her parents. How much more money does Betty need to buy the wallet?5$', '$Natalia sold clips to 48 of her friends in April, and then she sold half as many clips in May. How many clips did Natalia sell altogether in April and May?72$', '$Julie is reading a 120-page book. Yesterday, she was able to read 12 pages and today, she read twice as many pages as yesterday. If she wants to read half of the remaining pages tomorrow, how many pages should she read?42$', '$Weng earns $12 an hour for babysitting. Yesterday, she just did 50 minutes of babysitting. How much did she earn?10$'] 2025-05-21 15:33:12,019 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/trainer/spmd/grpo_trainer.py:650] - INFO - generation start at 15:33:12------------------------------- 2025-05-21 15:33:12,020 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:313] - INFO - input_ids shape (1, 2048) 2025-05-21 15:33:12,020 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:322] - INFO - max_tokens 512 2025-05-21 15:33:12,020 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:323] - INFO - min_tokens 2 2025-05-21 15:33:12,021 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:334] - INFO - infer without vllm, not use vllm model 2025-05-21 15:33:12,024 - mindformers./output/log[mindformers/generation/text_generator.py:801] - INFO - Generation Config is: {'max_length': 512, 'max_new_tokens': 512, 'min_length': 0, 'min_new_tokens': 2, 'num_beams': 1, 'do_sample': True, 'use_past': True, 'temperature': 1.2, 'top_k': 50, 'top_p': 1.0, 'repetition_penalty': 1.0, 'encoder_repetition_penalty': 1.0, 'renormalize_logits': False, 'return_dict_in_generate': False, 'output_scores': False, 'output_logits': False, 'pad_token_id': 151643, 'bos_token_id': 151643, 'eos_token_id': [151645, 151643], 'parallel_decoding': False, 'window_size': 5, 'level': 5, 'guess_set_size': 3, '_from_model_config': True} 2025-05-21 15:33:12,025 - mindformers./output/log[mindformers/generation/text_generator.py:859] - INFO - The generation mode will be **SAMPLE**. 2025-05-21 15:33:12,025 - mindformers./output/log[mindformers/modules/block_tables.py:63] - INFO - init cache engine success. 2025-05-21 15:33:12,026 - mindformers./output/log[mindformers/research/qwen2_5/infer/qwen2_5.py:195] - INFO - Set dynamic input for llama. 2025-05-21 15:33:12,028 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/trainer/spmd/grpo_trainer.py:643] - INFO - solution: ['$Betty is saving money for a new wallet which costs $100. Betty has only half of the money she needs. Her parents decided to give her $15 for that purpose, and her grandparents twice as much as her parents. How much more money does Betty need to buy the wallet?5$', '$Natalia sold clips to 48 of her friends in April, and then she sold half as many clips in May. How many clips did Natalia sell altogether in April and May?72$', '$Julie is reading a 120-page book. Yesterday, she was able to read 12 pages and today, she read twice as many pages as yesterday. If she wants to read half of the remaining pages tomorrow, how many pages should she read?42$', '$Weng earns $12 an hour for babysitting. Yesterday, she just did 50 minutes of babysitting. How much did she earn?10$', '$Betty is saving money for a new wallet which costs $100. Betty has only half of the money she needs. Her parents decided to give her $15 for that purpose, and her grandparents twice as much as her parents. How much more money does Betty need to buy the wallet?5$', '$Natalia sold clips to 48 of her friends in April, and then she sold half as many clips in May. How many clips did Natalia sell altogether in April and May?72$', '$Julie is reading a 120-page book. Yesterday, she was able to read 12 pages and today, she read twice as many pages as yesterday. If she wants to read half of the remaining pages tomorrow, how many pages should she read?42$', '$Weng earns $12 an hour for babysitting. Yesterday, she just did 50 minutes of babysitting. How much did she earn?10$', '$Betty is saving money for a new wallet which costs $100. Betty has only half of the money she needs. Her parents decided to give her $15 for that purpose, and her grandparents twice as much as her parents. How much more money does Betty need to buy the wallet?5$', '$Natalia sold clips to 48 of her friends in April, and then she sold half as many clips in May. How many clips did Natalia sell altogether in April and May?72$', '$Julie is reading a 120-page book. Yesterday, she was able to read 12 pages and today, she read twice as many pages as yesterday. If she wants to read half of the remaining pages tomorrow, how many pages should she read?42$', '$Weng earns $12 an hour for babysitting. Yesterday, she just did 50 minutes of babysitting. How much did she earn?10$', '$Betty is saving money for a new wallet which costs $100. Betty has only half of the money she needs. Her parents decided to give her $15 for that purpose, and her grandparents twice as much as her parents. How much more money does Betty need to buy the wallet?5$', '$Natalia sold clips to 48 of her friends in April, and then she sold half as many clips in May. How many clips did Natalia sell altogether in April and May?72$', '$Julie is reading a 120-page book. Yesterday, she was able to read 12 pages and today, she read twice as many pages as yesterday. If she wants to read half of the remaining pages tomorrow, how many pages should she read?42$', '$Weng earns $12 an hour for babysitting. Yesterday, she just did 50 minutes of babysitting. How much did she earn?10$', '$Betty is saving money for a new wallet which costs $100. Betty has only half of the money she needs. Her parents decided to give her $15 for that purpose, and her grandparents twice as much as her parents. How much more money does Betty need to buy the wallet?5$', '$Natalia sold clips to 48 of her friends in April, and then she sold half as many clips in May. How many clips did Natalia sell altogether in April and May?72$', '$Julie is reading a 120-page book. Yesterday, she was able to read 12 pages and today, she read twice as many pages as yesterday. If she wants to read half of the remaining pages tomorrow, how many pages should she read?42$', '$Weng earns $12 an hour for babysitting. Yesterday, she just did 50 minutes of babysitting. How much did she earn?10$', '$Betty is saving money for a new wallet which costs $100. Betty has only half of the money she needs. Her parents decided to give her $15 for that purpose, and her grandparents twice as much as her parents. How much more money does Betty need to buy the wallet?5$', '$Natalia sold clips to 48 of her friends in April, and then she sold half as many clips in May. How many clips did Natalia sell altogether in April and May?72$', '$Julie is reading a 120-page book. Yesterday, she was able to read 12 pages and today, she read twice as many pages as yesterday. If she wants to read half of the remaining pages tomorrow, how many pages should she read?42$', '$Weng earns $12 an hour for babysitting. Yesterday, she just did 50 minutes of babysitting. How much did she earn?10$', '$Betty is saving money for a new wallet which costs $100. Betty has only half of the money she needs. Her parents decided to give her $15 for that purpose, and her grandparents twice as much as her parents. How much more money does Betty need to buy the wallet?5$', '$Natalia sold clips to 48 of her friends in April, and then she sold half as many clips in May. How many clips did Natalia sell altogether in April and May?72$', '$Julie is reading a 120-page book. Yesterday, she was able to read 12 pages and today, she read twice as many pages as yesterday. If she wants to read half of the remaining pages tomorrow, how many pages should she read?42$', '$Weng earns $12 an hour for babysitting. Yesterday, she just did 50 minutes of babysitting. How much did she earn?10$', '$Betty is saving money for a new wallet which costs $100. Betty has only half of the money she needs. Her parents decided to give her $15 for that purpose, and her grandparents twice as much as her parents. How much more money does Betty need to buy the wallet?5$', '$Natalia sold clips to 48 of her friends in April, and then she sold half as many clips in May. How many clips did Natalia sell altogether in April and May?72$', '$Julie is reading a 120-page book. Yesterday, she was able to read 12 pages and today, she read twice as many pages as yesterday. If she wants to read half of the remaining pages tomorrow, how many pages should she read?42$', '$Weng earns $12 an hour for babysitting. Yesterday, she just did 50 minutes of babysitting. How much did she earn?10$'] 2025-05-21 15:33:12,029 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/trainer/spmd/grpo_trainer.py:650] - INFO - generation start at 15:33:12------------------------------- 2025-05-21 15:33:12,029 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:313] - INFO - input_ids shape (1, 2048) 2025-05-21 15:33:12,030 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:322] - INFO - max_tokens 512 2025-05-21 15:33:12,030 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:323] - INFO - min_tokens 2 2025-05-21 15:33:12,030 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:334] - INFO - infer without vllm, not use vllm model 2025-05-21 15:33:12,033 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/trainer/spmd/grpo_trainer.py:643] - INFO - solution: ['$Betty is saving money for a new wallet which costs $100. Betty has only half of the money she needs. Her parents decided to give her $15 for that purpose, and her grandparents twice as much as her parents. How much more money does Betty need to buy the wallet?5$', '$Natalia sold clips to 48 of her friends in April, and then she sold half as many clips in May. How many clips did Natalia sell altogether in April and May?72$', '$Julie is reading a 120-page book. Yesterday, she was able to read 12 pages and today, she read twice as many pages as yesterday. If she wants to read half of the remaining pages tomorrow, how many pages should she read?42$', '$Weng earns $12 an hour for babysitting. Yesterday, she just did 50 minutes of babysitting. How much did she earn?10$', '$Betty is saving money for a new wallet which costs $100. Betty has only half of the money she needs. Her parents decided to give her $15 for that purpose, and her grandparents twice as much as her parents. How much more money does Betty need to buy the wallet?5$', '$Natalia sold clips to 48 of her friends in April, and then she sold half as many clips in May. How many clips did Natalia sell altogether in April and May?72$', '$Julie is reading a 120-page book. Yesterday, she was able to read 12 pages and today, she read twice as many pages as yesterday. If she wants to read half of the remaining pages tomorrow, how many pages should she read?42$', '$Weng earns $12 an hour for babysitting. Yesterday, she just did 50 minutes of babysitting. How much did she earn?10$', '$Betty is saving money for a new wallet which costs $100. Betty has only half of the money she needs. Her parents decided to give her $15 for that purpose, and her grandparents twice as much as her parents. How much more money does Betty need to buy the wallet?5$', '$Natalia sold clips to 48 of her friends in April, and then she sold half as many clips in May. How many clips did Natalia sell altogether in April and May?72$', '$Julie is reading a 120-page book. Yesterday, she was able to read 12 pages and today, she read twice as many pages as yesterday. If she wants to read half of the remaining pages tomorrow, how many pages should she read?42$', '$Weng earns $12 an hour for babysitting. Yesterday, she just did 50 minutes of babysitting. How much did she earn?10$', '$Betty is saving money for a new wallet which costs $100. Betty has only half of the money she needs. Her parents decided to give her $15 for that purpose, and her grandparents twice as much as her parents. How much more money does Betty need to buy the wallet?5$', '$Natalia sold clips to 48 of her friends in April, and then she sold half as many clips in May. How many clips did Natalia sell altogether in April and May?72$', '$Julie is reading a 120-page book. Yesterday, she was able to read 12 pages and today, she read twice as many pages as yesterday. If she wants to read half of the remaining pages tomorrow, how many pages should she read?42$', '$Weng earns $12 an hour for babysitting. Yesterday, she just did 50 minutes of babysitting. How much did she earn?10$', '$Betty is saving money for a new wallet which costs $100. Betty has only half of the money she needs. Her parents decided to give her $15 for that purpose, and her grandparents twice as much as her parents. How much more money does Betty need to buy the wallet?5$', '$Natalia sold clips to 48 of her friends in April, and then she sold half as many clips in May. How many clips did Natalia sell altogether in April and May?72$', '$Julie is reading a 120-page book. Yesterday, she was able to read 12 pages and today, she read twice as many pages as yesterday. If she wants to read half of the remaining pages tomorrow, how many pages should she read?42$', '$Weng earns $12 an hour for babysitting. Yesterday, she just did 50 minutes of babysitting. How much did she earn?10$', '$Betty is saving money for a new wallet which costs $100. Betty has only half of the money she needs. Her parents decided to give her $15 for that purpose, and her grandparents twice as much as her parents. How much more money does Betty need to buy the wallet?5$', '$Natalia sold clips to 48 of her friends in April, and then she sold half as many clips in May. How many clips did Natalia sell altogether in April and May?72$', '$Julie is reading a 120-page book. Yesterday, she was able to read 12 pages and today, she read twice as many pages as yesterday. If she wants to read half of the remaining pages tomorrow, how many pages should she read?42$', '$Weng earns $12 an hour for babysitting. Yesterday, she just did 50 minutes of babysitting. How much did she earn?10$', '$Betty is saving money for a new wallet which costs $100. Betty has only half of the money she needs. Her parents decided to give her $15 for that purpose, and her grandparents twice as much as her parents. How much more money does Betty need to buy the wallet?5$', '$Natalia sold clips to 48 of her friends in April, and then she sold half as many clips in May. How many clips did Natalia sell altogether in April and May?72$', '$Julie is reading a 120-page book. Yesterday, she was able to read 12 pages and today, she read twice as many pages as yesterday. If she wants to read half of the remaining pages tomorrow, how many pages should she read?42$', '$Weng earns $12 an hour for babysitting. Yesterday, she just did 50 minutes of babysitting. How much did she earn?10$', '$Betty is saving money for a new wallet which costs $100. Betty has only half of the money she needs. Her parents decided to give her $15 for that purpose, and her grandparents twice as much as her parents. How much more money does Betty need to buy the wallet?5$', '$Natalia sold clips to 48 of her friends in April, and then she sold half as many clips in May. How many clips did Natalia sell altogether in April and May?72$', '$Julie is reading a 120-page book. Yesterday, she was able to read 12 pages and today, she read twice as many pages as yesterday. If she wants to read half of the remaining pages tomorrow, how many pages should she read?42$', '$Weng earns $12 an hour for babysitting. Yesterday, she just did 50 minutes of babysitting. How much did she earn?10$'] 2025-05-21 15:33:12,034 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/trainer/spmd/grpo_trainer.py:650] - INFO - generation start at 15:33:12------------------------------- 2025-05-21 15:33:12,033 - mindformers./output/log[mindformers/generation/text_generator.py:801] - INFO - Generation Config is: {'max_length': 512, 'max_new_tokens': 512, 'min_length': 0, 'min_new_tokens': 2, 'num_beams': 1, 'do_sample': True, 'use_past': True, 'temperature': 1.2, 'top_k': 50, 'top_p': 1.0, 'repetition_penalty': 1.0, 'encoder_repetition_penalty': 1.0, 'renormalize_logits': False, 'return_dict_in_generate': False, 'output_scores': False, 'output_logits': False, 'pad_token_id': 151643, 'bos_token_id': 151643, 'eos_token_id': [151645, 151643], 'parallel_decoding': False, 'window_size': 5, 'level': 5, 'guess_set_size': 3, '_from_model_config': True} 2025-05-21 15:33:12,034 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:313] - INFO - input_ids shape (1, 2048) 2025-05-21 15:33:12,034 - mindformers./output/log[mindformers/generation/text_generator.py:859] - INFO - The generation mode will be **SAMPLE**. 2025-05-21 15:33:12,034 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/trainer/spmd/grpo_trainer.py:643] - INFO - solution: ['$Betty is saving money for a new wallet which costs $100. Betty has only half of the money she needs. Her parents decided to give her $15 for that purpose, and her grandparents twice as much as her parents. How much more money does Betty need to buy the wallet?5$', '$Natalia sold clips to 48 of her friends in April, and then she sold half as many clips in May. How many clips did Natalia sell altogether in April and May?72$', '$Julie is reading a 120-page book. Yesterday, she was able to read 12 pages and today, she read twice as many pages as yesterday. If she wants to read half of the remaining pages tomorrow, how many pages should she read?42$', '$Weng earns $12 an hour for babysitting. Yesterday, she just did 50 minutes of babysitting. How much did she earn?10$', '$Betty is saving money for a new wallet which costs $100. Betty has only half of the money she needs. Her parents decided to give her $15 for that purpose, and her grandparents twice as much as her parents. How much more money does Betty need to buy the wallet?5$', '$Natalia sold clips to 48 of her friends in April, and then she sold half as many clips in May. How many clips did Natalia sell altogether in April and May?72$', '$Julie is reading a 120-page book. Yesterday, she was able to read 12 pages and today, she read twice as many pages as yesterday. If she wants to read half of the remaining pages tomorrow, how many pages should she read?42$', '$Weng earns $12 an hour for babysitting. Yesterday, she just did 50 minutes of babysitting. How much did she earn?10$', '$Betty is saving money for a new wallet which costs $100. Betty has only half of the money she needs. Her parents decided to give her $15 for that purpose, and her grandparents twice as much as her parents. How much more money does Betty need to buy the wallet?5$', '$Natalia sold clips to 48 of her friends in April, and then she sold half as many clips in May. How many clips did Natalia sell altogether in April and May?72$', '$Julie is reading a 120-page book. Yesterday, she was able to read 12 pages and today, she read twice as many pages as yesterday. If she wants to read half of the remaining pages tomorrow, how many pages should she read?42$', '$Weng earns $12 an hour for babysitting. Yesterday, she just did 50 minutes of babysitting. How much did she earn?10$', '$Betty is saving money for a new wallet which costs $100. Betty has only half of the money she needs. Her parents decided to give her $15 for that purpose, and her grandparents twice as much as her parents. How much more money does Betty need to buy the wallet?5$', '$Natalia sold clips to 48 of her friends in April, and then she sold half as many clips in May. How many clips did Natalia sell altogether in April and May?72$', '$Julie is reading a 120-page book. Yesterday, she was able to read 12 pages and today, she read twice as many pages as yesterday. If she wants to read half of the remaining pages tomorrow, how many pages should she read?42$', '$Weng earns $12 an hour for babysitting. Yesterday, she just did 50 minutes of babysitting. How much did she earn?10$', '$Betty is saving money for a new wallet which costs $100. Betty has only half of the money she needs. Her parents decided to give her $15 for that purpose, and her grandparents twice as much as her parents. How much more money does Betty need to buy the wallet?5$', '$Natalia sold clips to 48 of her friends in April, and then she sold half as many clips in May. How many clips did Natalia sell altogether in April and May?72$', '$Julie is reading a 120-page book. Yesterday, she was able to read 12 pages and today, she read twice as many pages as yesterday. If she wants to read half of the remaining pages tomorrow, how many pages should she read?42$', '$Weng earns $12 an hour for babysitting. Yesterday, she just did 50 minutes of babysitting. How much did she earn?10$', '$Betty is saving money for a new wallet which costs $100. Betty has only half of the money she needs. Her parents decided to give her $15 for that purpose, and her grandparents twice as much as her parents. How much more money does Betty need to buy the wallet?5$', '$Natalia sold clips to 48 of her friends in April, and then she sold half as many clips in May. How many clips did Natalia sell altogether in April and May?72$', '$Julie is reading a 120-page book. Yesterday, she was able to read 12 pages and today, she read twice as many pages as yesterday. If she wants to read half of the remaining pages tomorrow, how many pages should she read?42$', '$Weng earns $12 an hour for babysitting. Yesterday, she just did 50 minutes of babysitting. How much did she earn?10$', '$Betty is saving money for a new wallet which costs $100. Betty has only half of the money she needs. Her parents decided to give her $15 for that purpose, and her grandparents twice as much as her parents. How much more money does Betty need to buy the wallet?5$', '$Natalia sold clips to 48 of her friends in April, and then she sold half as many clips in May. How many clips did Natalia sell altogether in April and May?72$', '$Julie is reading a 120-page book. Yesterday, she was able to read 12 pages and today, she read twice as many pages as yesterday. If she wants to read half of the remaining pages tomorrow, how many pages should she read?42$', '$Weng earns $12 an hour for babysitting. Yesterday, she just did 50 minutes of babysitting. How much did she earn?10$', '$Betty is saving money for a new wallet which costs $100. Betty has only half of the money she needs. Her parents decided to give her $15 for that purpose, and her grandparents twice as much as her parents. How much more money does Betty need to buy the wallet?5$', '$Natalia sold clips to 48 of her friends in April, and then she sold half as many clips in May. How many clips did Natalia sell altogether in April and May?72$', '$Julie is reading a 120-page book. Yesterday, she was able to read 12 pages and today, she read twice as many pages as yesterday. If she wants to read half of the remaining pages tomorrow, how many pages should she read?42$', '$Weng earns $12 an hour for babysitting. Yesterday, she just did 50 minutes of babysitting. How much did she earn?10$'] 2025-05-21 15:33:12,035 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:322] - INFO - max_tokens 512 2025-05-21 15:33:12,035 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:323] - INFO - min_tokens 2 2025-05-21 15:33:12,035 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/trainer/spmd/grpo_trainer.py:650] - INFO - generation start at 15:33:12------------------------------- 2025-05-21 15:33:12,035 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:334] - INFO - infer without vllm, not use vllm model 2025-05-21 15:33:12,035 - mindformers./output/log[mindformers/modules/block_tables.py:63] - INFO - init cache engine success. 2025-05-21 15:33:12,035 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:313] - INFO - input_ids shape (1, 2048) 2025-05-21 15:33:12,036 - mindformers./output/log[mindformers/research/qwen2_5/infer/qwen2_5.py:195] - INFO - Set dynamic input for llama. 2025-05-21 15:33:12,036 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:322] - INFO - max_tokens 512 2025-05-21 15:33:12,036 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:323] - INFO - min_tokens 2 2025-05-21 15:33:12,037 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:334] - INFO - infer without vllm, not use vllm model 2025-05-21 15:33:12,038 - mindformers./output/log[mindformers/generation/text_generator.py:801] - INFO - Generation Config is: {'max_length': 512, 'max_new_tokens': 512, 'min_length': 0, 'min_new_tokens': 2, 'num_beams': 1, 'do_sample': True, 'use_past': True, 'temperature': 1.2, 'top_k': 50, 'top_p': 1.0, 'repetition_penalty': 1.0, 'encoder_repetition_penalty': 1.0, 'renormalize_logits': False, 'return_dict_in_generate': False, 'output_scores': False, 'output_logits': False, 'pad_token_id': 151643, 'bos_token_id': 151643, 'eos_token_id': [151645, 151643], 'parallel_decoding': False, 'window_size': 5, 'level': 5, 'guess_set_size': 3, '_from_model_config': True} 2025-05-21 15:33:12,040 - mindformers./output/log[mindformers/generation/text_generator.py:859] - INFO - The generation mode will be **SAMPLE**. 2025-05-21 15:33:12,040 - mindformers./output/log[mindformers/modules/block_tables.py:63] - INFO - init cache engine success. 2025-05-21 15:33:12,040 - mindformers./output/log[mindformers/generation/text_generator.py:801] - INFO - Generation Config is: {'max_length': 512, 'max_new_tokens': 512, 'min_length': 0, 'min_new_tokens': 2, 'num_beams': 1, 'do_sample': True, 'use_past': True, 'temperature': 1.2, 'top_k': 50, 'top_p': 1.0, 'repetition_penalty': 1.0, 'encoder_repetition_penalty': 1.0, 'renormalize_logits': False, 'return_dict_in_generate': False, 'output_scores': False, 'output_logits': False, 'pad_token_id': 151643, 'bos_token_id': 151643, 'eos_token_id': [151645, 151643], 'parallel_decoding': False, 'window_size': 5, 'level': 5, 'guess_set_size': 3, '_from_model_config': True} 2025-05-21 15:33:12,041 - mindformers./output/log[mindformers/research/qwen2_5/infer/qwen2_5.py:195] - INFO - Set dynamic input for llama. 2025-05-21 15:33:12,041 - mindformers./output/log[mindformers/generation/text_generator.py:859] - INFO - The generation mode will be **SAMPLE**. 2025-05-21 15:33:12,042 - mindformers./output/log[mindformers/modules/block_tables.py:63] - INFO - init cache engine success. 2025-05-21 15:33:12,043 - mindformers./output/log[mindformers/research/qwen2_5/infer/qwen2_5.py:195] - INFO - Set dynamic input for llama. 2025-05-21 15:33:12,051 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/trainer/spmd/grpo_trainer.py:643] - INFO - solution: ['$Betty is saving money for a new wallet which costs $100. Betty has only half of the money she needs. Her parents decided to give her $15 for that purpose, and her grandparents twice as much as her parents. How much more money does Betty need to buy the wallet?5$', '$Natalia sold clips to 48 of her friends in April, and then she sold half as many clips in May. How many clips did Natalia sell altogether in April and May?72$', '$Julie is reading a 120-page book. Yesterday, she was able to read 12 pages and today, she read twice as many pages as yesterday. If she wants to read half of the remaining pages tomorrow, how many pages should she read?42$', '$Weng earns $12 an hour for babysitting. Yesterday, she just did 50 minutes of babysitting. How much did she earn?10$', '$Betty is saving money for a new wallet which costs $100. Betty has only half of the money she needs. Her parents decided to give her $15 for that purpose, and her grandparents twice as much as her parents. How much more money does Betty need to buy the wallet?5$', '$Natalia sold clips to 48 of her friends in April, and then she sold half as many clips in May. How many clips did Natalia sell altogether in April and May?72$', '$Julie is reading a 120-page book. Yesterday, she was able to read 12 pages and today, she read twice as many pages as yesterday. If she wants to read half of the remaining pages tomorrow, how many pages should she read?42$', '$Weng earns $12 an hour for babysitting. Yesterday, she just did 50 minutes of babysitting. How much did she earn?10$', '$Betty is saving money for a new wallet which costs $100. Betty has only half of the money she needs. Her parents decided to give her $15 for that purpose, and her grandparents twice as much as her parents. How much more money does Betty need to buy the wallet?5$', '$Natalia sold clips to 48 of her friends in April, and then she sold half as many clips in May. How many clips did Natalia sell altogether in April and May?72$', '$Julie is reading a 120-page book. Yesterday, she was able to read 12 pages and today, she read twice as many pages as yesterday. If she wants to read half of the remaining pages tomorrow, how many pages should she read?42$', '$Weng earns $12 an hour for babysitting. Yesterday, she just did 50 minutes of babysitting. How much did she earn?10$', '$Betty is saving money for a new wallet which costs $100. Betty has only half of the money she needs. Her parents decided to give her $15 for that purpose, and her grandparents twice as much as her parents. How much more money does Betty need to buy the wallet?5$', '$Natalia sold clips to 48 of her friends in April, and then she sold half as many clips in May. How many clips did Natalia sell altogether in April and May?72$', '$Julie is reading a 120-page book. Yesterday, she was able to read 12 pages and today, she read twice as many pages as yesterday. If she wants to read half of the remaining pages tomorrow, how many pages should she read?42$', '$Weng earns $12 an hour for babysitting. Yesterday, she just did 50 minutes of babysitting. How much did she earn?10$', '$Betty is saving money for a new wallet which costs $100. Betty has only half of the money she needs. Her parents decided to give her $15 for that purpose, and her grandparents twice as much as her parents. How much more money does Betty need to buy the wallet?5$', '$Natalia sold clips to 48 of her friends in April, and then she sold half as many clips in May. How many clips did Natalia sell altogether in April and May?72$', '$Julie is reading a 120-page book. Yesterday, she was able to read 12 pages and today, she read twice as many pages as yesterday. If she wants to read half of the remaining pages tomorrow, how many pages should she read?42$', '$Weng earns $12 an hour for babysitting. Yesterday, she just did 50 minutes of babysitting. How much did she earn?10$', '$Betty is saving money for a new wallet which costs $100. Betty has only half of the money she needs. Her parents decided to give her $15 for that purpose, and her grandparents twice as much as her parents. How much more money does Betty need to buy the wallet?5$', '$Natalia sold clips to 48 of her friends in April, and then she sold half as many clips in May. How many clips did Natalia sell altogether in April and May?72$', '$Julie is reading a 120-page book. Yesterday, she was able to read 12 pages and today, she read twice as many pages as yesterday. If she wants to read half of the remaining pages tomorrow, how many pages should she read?42$', '$Weng earns $12 an hour for babysitting. Yesterday, she just did 50 minutes of babysitting. How much did she earn?10$', '$Betty is saving money for a new wallet which costs $100. Betty has only half of the money she needs. Her parents decided to give her $15 for that purpose, and her grandparents twice as much as her parents. How much more money does Betty need to buy the wallet?5$', '$Natalia sold clips to 48 of her friends in April, and then she sold half as many clips in May. How many clips did Natalia sell altogether in April and May?72$', '$Julie is reading a 120-page book. Yesterday, she was able to read 12 pages and today, she read twice as many pages as yesterday. If she wants to read half of the remaining pages tomorrow, how many pages should she read?42$', '$Weng earns $12 an hour for babysitting. Yesterday, she just did 50 minutes of babysitting. How much did she earn?10$', '$Betty is saving money for a new wallet which costs $100. Betty has only half of the money she needs. Her parents decided to give her $15 for that purpose, and her grandparents twice as much as her parents. How much more money does Betty need to buy the wallet?5$', '$Natalia sold clips to 48 of her friends in April, and then she sold half as many clips in May. How many clips did Natalia sell altogether in April and May?72$', '$Julie is reading a 120-page book. Yesterday, she was able to read 12 pages and today, she read twice as many pages as yesterday. If she wants to read half of the remaining pages tomorrow, how many pages should she read?42$', '$Weng earns $12 an hour for babysitting. Yesterday, she just did 50 minutes of babysitting. How much did she earn?10$'] 2025-05-21 15:33:12,052 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/trainer/spmd/grpo_trainer.py:650] - INFO - generation start at 15:33:12------------------------------- 2025-05-21 15:33:12,052 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:313] - INFO - input_ids shape (1, 2048) 2025-05-21 15:33:12,053 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:322] - INFO - max_tokens 512 2025-05-21 15:33:12,053 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:323] - INFO - min_tokens 2 2025-05-21 15:33:12,054 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:334] - INFO - infer without vllm, not use vllm model 2025-05-21 15:33:12,055 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/trainer/spmd/grpo_trainer.py:643] - INFO - solution: ['$Betty is saving money for a new wallet which costs $100. Betty has only half of the money she needs. Her parents decided to give her $15 for that purpose, and her grandparents twice as much as her parents. How much more money does Betty need to buy the wallet?5$', '$Natalia sold clips to 48 of her friends in April, and then she sold half as many clips in May. How many clips did Natalia sell altogether in April and May?72$', '$Julie is reading a 120-page book. Yesterday, she was able to read 12 pages and today, she read twice as many pages as yesterday. If she wants to read half of the remaining pages tomorrow, how many pages should she read?42$', '$Weng earns $12 an hour for babysitting. Yesterday, she just did 50 minutes of babysitting. How much did she earn?10$', '$Betty is saving money for a new wallet which costs $100. Betty has only half of the money she needs. Her parents decided to give her $15 for that purpose, and her grandparents twice as much as her parents. How much more money does Betty need to buy the wallet?5$', '$Natalia sold clips to 48 of her friends in April, and then she sold half as many clips in May. How many clips did Natalia sell altogether in April and May?72$', '$Julie is reading a 120-page book. Yesterday, she was able to read 12 pages and today, she read twice as many pages as yesterday. If she wants to read half of the remaining pages tomorrow, how many pages should she read?42$', '$Weng earns $12 an hour for babysitting. Yesterday, she just did 50 minutes of babysitting. How much did she earn?10$', '$Betty is saving money for a new wallet which costs $100. Betty has only half of the money she needs. Her parents decided to give her $15 for that purpose, and her grandparents twice as much as her parents. How much more money does Betty need to buy the wallet?5$', '$Natalia sold clips to 48 of her friends in April, and then she sold half as many clips in May. How many clips did Natalia sell altogether in April and May?72$', '$Julie is reading a 120-page book. Yesterday, she was able to read 12 pages and today, she read twice as many pages as yesterday. If she wants to read half of the remaining pages tomorrow, how many pages should she read?42$', '$Weng earns $12 an hour for babysitting. Yesterday, she just did 50 minutes of babysitting. How much did she earn?10$', '$Betty is saving money for a new wallet which costs $100. Betty has only half of the money she needs. Her parents decided to give her $15 for that purpose, and her grandparents twice as much as her parents. How much more money does Betty need to buy the wallet?5$', '$Natalia sold clips to 48 of her friends in April, and then she sold half as many clips in May. How many clips did Natalia sell altogether in April and May?72$', '$Julie is reading a 120-page book. Yesterday, she was able to read 12 pages and today, she read twice as many pages as yesterday. If she wants to read half of the remaining pages tomorrow, how many pages should she read?42$', '$Weng earns $12 an hour for babysitting. Yesterday, she just did 50 minutes of babysitting. How much did she earn?10$', '$Betty is saving money for a new wallet which costs $100. Betty has only half of the money she needs. Her parents decided to give her $15 for that purpose, and her grandparents twice as much as her parents. How much more money does Betty need to buy the wallet?5$', '$Natalia sold clips to 48 of her friends in April, and then she sold half as many clips in May. How many clips did Natalia sell altogether in April and May?72$', '$Julie is reading a 120-page book. Yesterday, she was able to read 12 pages and today, she read twice as many pages as yesterday. If she wants to read half of the remaining pages tomorrow, how many pages should she read?42$', '$Weng earns $12 an hour for babysitting. Yesterday, she just did 50 minutes of babysitting. How much did she earn?10$', '$Betty is saving money for a new wallet which costs $100. Betty has only half of the money she needs. Her parents decided to give her $15 for that purpose, and her grandparents twice as much as her parents. How much more money does Betty need to buy the wallet?5$', '$Natalia sold clips to 48 of her friends in April, and then she sold half as many clips in May. How many clips did Natalia sell altogether in April and May?72$', '$Julie is reading a 120-page book. Yesterday, she was able to read 12 pages and today, she read twice as many pages as yesterday. If she wants to read half of the remaining pages tomorrow, how many pages should she read?42$', '$Weng earns $12 an hour for babysitting. Yesterday, she just did 50 minutes of babysitting. How much did she earn?10$', '$Betty is saving money for a new wallet which costs $100. Betty has only half of the money she needs. Her parents decided to give her $15 for that purpose, and her grandparents twice as much as her parents. How much more money does Betty need to buy the wallet?5$', '$Natalia sold clips to 48 of her friends in April, and then she sold half as many clips in May. How many clips did Natalia sell altogether in April and May?72$', '$Julie is reading a 120-page book. Yesterday, she was able to read 12 pages and today, she read twice as many pages as yesterday. If she wants to read half of the remaining pages tomorrow, how many pages should she read?42$', '$Weng earns $12 an hour for babysitting. Yesterday, she just did 50 minutes of babysitting. How much did she earn?10$', '$Betty is saving money for a new wallet which costs $100. Betty has only half of the money she needs. Her parents decided to give her $15 for that purpose, and her grandparents twice as much as her parents. How much more money does Betty need to buy the wallet?5$', '$Natalia sold clips to 48 of her friends in April, and then she sold half as many clips in May. How many clips did Natalia sell altogether in April and May?72$', '$Julie is reading a 120-page book. Yesterday, she was able to read 12 pages and today, she read twice as many pages as yesterday. If she wants to read half of the remaining pages tomorrow, how many pages should she read?42$', '$Weng earns $12 an hour for babysitting. Yesterday, she just did 50 minutes of babysitting. How much did she earn?10$'] 2025-05-21 15:33:12,056 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/trainer/spmd/grpo_trainer.py:650] - INFO - generation start at 15:33:12------------------------------- 2025-05-21 15:33:12,056 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:313] - INFO - input_ids shape (1, 2048) 2025-05-21 15:33:12,057 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:322] - INFO - max_tokens 512 2025-05-21 15:33:12,056 - mindformers./output/log[mindformers/generation/text_generator.py:801] - INFO - Generation Config is: {'max_length': 512, 'max_new_tokens': 512, 'min_length': 0, 'min_new_tokens': 2, 'num_beams': 1, 'do_sample': True, 'use_past': True, 'temperature': 1.2, 'top_k': 50, 'top_p': 1.0, 'repetition_penalty': 1.0, 'encoder_repetition_penalty': 1.0, 'renormalize_logits': False, 'return_dict_in_generate': False, 'output_scores': False, 'output_logits': False, 'pad_token_id': 151643, 'bos_token_id': 151643, 'eos_token_id': [151645, 151643], 'parallel_decoding': False, 'window_size': 5, 'level': 5, 'guess_set_size': 3, '_from_model_config': True} 2025-05-21 15:33:12,057 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:323] - INFO - min_tokens 2 2025-05-21 15:33:12,057 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:334] - INFO - infer without vllm, not use vllm model 2025-05-21 15:33:12,058 - mindformers./output/log[mindformers/generation/text_generator.py:859] - INFO - The generation mode will be **SAMPLE**. 2025-05-21 15:33:12,059 - mindformers./output/log[mindformers/modules/block_tables.py:63] - INFO - init cache engine success. 2025-05-21 15:33:12,060 - mindformers./output/log[mindformers/research/qwen2_5/infer/qwen2_5.py:195] - INFO - Set dynamic input for llama. 2025-05-21 15:33:12,060 - mindformers./output/log[mindformers/generation/text_generator.py:801] - INFO - Generation Config is: {'max_length': 512, 'max_new_tokens': 512, 'min_length': 0, 'min_new_tokens': 2, 'num_beams': 1, 'do_sample': True, 'use_past': True, 'temperature': 1.2, 'top_k': 50, 'top_p': 1.0, 'repetition_penalty': 1.0, 'encoder_repetition_penalty': 1.0, 'renormalize_logits': False, 'return_dict_in_generate': False, 'output_scores': False, 'output_logits': False, 'pad_token_id': 151643, 'bos_token_id': 151643, 'eos_token_id': [151645, 151643], 'parallel_decoding': False, 'window_size': 5, 'level': 5, 'guess_set_size': 3, '_from_model_config': True} 2025-05-21 15:33:12,062 - mindformers./output/log[mindformers/generation/text_generator.py:859] - INFO - The generation mode will be **SAMPLE**. 2025-05-21 15:33:12,062 - mindformers./output/log[mindformers/modules/block_tables.py:63] - INFO - init cache engine success. 2025-05-21 15:33:12,063 - mindformers./output/log[mindformers/research/qwen2_5/infer/qwen2_5.py:195] - INFO - Set dynamic input for llama. 2025-05-21 15:33:32,183 - mindformers./output/log[mindformers/generation/text_generator.py:1067] - INFO - total time: 20.15614414215088 s; generated tokens: 512 tokens; generate speed: 25.401683793741913 tokens/s 2025-05-21 15:33:32,183 - mindformers./output/log[mindformers/generation/text_generator.py:1067] - INFO - total time: 20.146610736846924 s; generated tokens: 512 tokens; generate speed: 25.413703907207736 tokens/s 2025-05-21 15:33:32,183 - mindformers./output/log[mindformers/generation/text_generator.py:1067] - INFO - total time: 20.141820907592773 s; generated tokens: 512 tokens; generate speed: 25.41974741752339 tokens/s 2025-05-21 15:33:32,184 - mindformers./output/log[mindformers/tools/debug_info.py:93] - INFO - prefill prepare time: 0.0025625228881835938 s; prefill predict time: 7.886654853820801 s; prefill post time: 0.10864090919494629 s; decode prepare time: 0.0010259230776075749 s; decode predict time: 0.005363848630119772 s; decode post time: 0.013186917612930567 s 2025-05-21 15:33:32,183 - mindformers./output/log[mindformers/generation/text_generator.py:1067] - INFO - total time: 20.17612910270691 s; generated tokens: 512 tokens; generate speed: 25.376522790553917 tokens/s 2025-05-21 15:33:32,184 - mindformers./output/log[mindformers/tools/debug_info.py:93] - INFO - prefill prepare time: 0.0026350021362304688 s; prefill predict time: 7.877476930618286 s; prefill post time: 0.10796594619750977 s; decode prepare time: 0.0010441352709865384 s; decode predict time: 0.005365065032360601 s; decode post time: 0.013179103922237388 s 2025-05-21 15:33:32,184 - mindformers./output/log[mindformers/tools/debug_info.py:93] - INFO - prefill prepare time: 0.0027914047241210938 s; prefill predict time: 7.871877670288086 s; prefill post time: 0.10439157485961914 s; decode prepare time: 0.0010857236828589393 s; decode predict time: 0.004715280906826842 s; decode post time: 0.013790557995701024 s 2025-05-21 15:33:32,184 - mindformers./output/log[mindformers/tools/debug_info.py:93] - INFO - prefill prepare time: 0.0025370121002197266 s; prefill predict time: 7.906947374343872 s; prefill post time: 0.1250457763671875 s; decode prepare time: 0.0011165291362545961 s; decode predict time: 0.004843774963827694 s; decode post time: 0.013628697908554525 s 2025-05-21 15:33:32,184 - mindformers./output/log[mindformers/modules/block_tables.py:126] - INFO - Clear block table cache engines. 2025-05-21 15:33:32,185 - mindformers./output/log[mindformers/modules/block_tables.py:126] - INFO - Clear block table cache engines. 2025-05-21 15:33:32,185 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:341] - INFO - infer without vllm end, not use vllm model 2025-05-21 15:33:32,185 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:355] - INFO - Generating elapsed time: 20.164682149887085 2025-05-21 15:33:32,185 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:341] - INFO - infer without vllm end, not use vllm model 2025-05-21 15:33:32,185 - mindformers./output/log[mindformers/modules/block_tables.py:126] - INFO - Clear block table cache engines. 2025-05-21 15:33:32,185 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:355] - INFO - Generating elapsed time: 20.155436754226685 2025-05-21 15:33:32,185 - mindformers./output/log[mindformers/modules/block_tables.py:126] - INFO - Clear block table cache engines. 2025-05-21 15:33:32,185 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:341] - INFO - infer without vllm end, not use vllm model 2025-05-21 15:33:32,186 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:341] - INFO - infer without vllm end, not use vllm model 2025-05-21 15:33:32,186 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:355] - INFO - Generating elapsed time: 20.151048183441162 2025-05-21 15:33:32,186 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:355] - INFO - Generating elapsed time: 20.184877157211304 2025-05-21 15:33:32,186 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:313] - INFO - input_ids shape (1, 2048) 2025-05-21 15:33:32,186 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:313] - INFO - input_ids shape (1, 2048) 2025-05-21 15:33:32,186 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:322] - INFO - max_tokens 512 2025-05-21 15:33:32,187 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:323] - INFO - min_tokens 2 2025-05-21 15:33:32,187 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:313] - INFO - input_ids shape (1, 2048) 2025-05-21 15:33:32,187 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:322] - INFO - max_tokens 512 2025-05-21 15:33:32,187 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:334] - INFO - infer without vllm, not use vllm model 2025-05-21 15:33:32,187 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:313] - INFO - input_ids shape (1, 2048) 2025-05-21 15:33:32,187 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:323] - INFO - min_tokens 2 2025-05-21 15:33:32,187 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:334] - INFO - infer without vllm, not use vllm model 2025-05-21 15:33:32,187 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:322] - INFO - max_tokens 512 2025-05-21 15:33:32,187 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:322] - INFO - max_tokens 512 2025-05-21 15:33:32,187 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:323] - INFO - min_tokens 2 2025-05-21 15:33:32,188 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:323] - INFO - min_tokens 2 2025-05-21 15:33:32,188 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:334] - INFO - infer without vllm, not use vllm model 2025-05-21 15:33:32,188 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:334] - INFO - infer without vllm, not use vllm model 2025-05-21 15:33:32,189 - mindformers./output/log[mindformers/generation/text_generator.py:801] - INFO - Generation Config is: {'max_length': 512, 'max_new_tokens': 512, 'min_length': 0, 'min_new_tokens': 2, 'num_beams': 1, 'do_sample': True, 'use_past': True, 'temperature': 1.2, 'top_k': 50, 'top_p': 1.0, 'repetition_penalty': 1.0, 'encoder_repetition_penalty': 1.0, 'renormalize_logits': False, 'return_dict_in_generate': False, 'output_scores': False, 'output_logits': False, 'pad_token_id': 151643, 'bos_token_id': 151643, 'eos_token_id': [151645, 151643], 'parallel_decoding': False, 'window_size': 5, 'level': 5, 'guess_set_size': 3, '_from_model_config': True} 2025-05-21 15:33:32,190 - mindformers./output/log[mindformers/generation/text_generator.py:801] - INFO - Generation Config is: {'max_length': 512, 'max_new_tokens': 512, 'min_length': 0, 'min_new_tokens': 2, 'num_beams': 1, 'do_sample': True, 'use_past': True, 'temperature': 1.2, 'top_k': 50, 'top_p': 1.0, 'repetition_penalty': 1.0, 'encoder_repetition_penalty': 1.0, 'renormalize_logits': False, 'return_dict_in_generate': False, 'output_scores': False, 'output_logits': False, 'pad_token_id': 151643, 'bos_token_id': 151643, 'eos_token_id': [151645, 151643], 'parallel_decoding': False, 'window_size': 5, 'level': 5, 'guess_set_size': 3, '_from_model_config': True} 2025-05-21 15:33:32,190 - mindformers./output/log[mindformers/generation/text_generator.py:859] - INFO - The generation mode will be **SAMPLE**. 2025-05-21 15:33:32,190 - mindformers./output/log[mindformers/modules/block_tables.py:63] - INFO - init cache engine success. 2025-05-21 15:33:32,190 - mindformers./output/log[mindformers/generation/text_generator.py:801] - INFO - Generation Config is: {'max_length': 512, 'max_new_tokens': 512, 'min_length': 0, 'min_new_tokens': 2, 'num_beams': 1, 'do_sample': True, 'use_past': True, 'temperature': 1.2, 'top_k': 50, 'top_p': 1.0, 'repetition_penalty': 1.0, 'encoder_repetition_penalty': 1.0, 'renormalize_logits': False, 'return_dict_in_generate': False, 'output_scores': False, 'output_logits': False, 'pad_token_id': 151643, 'bos_token_id': 151643, 'eos_token_id': [151645, 151643], 'parallel_decoding': False, 'window_size': 5, 'level': 5, 'guess_set_size': 3, '_from_model_config': True} 2025-05-21 15:33:32,190 - mindformers./output/log[mindformers/generation/text_generator.py:859] - INFO - The generation mode will be **SAMPLE**. 2025-05-21 15:33:32,190 - mindformers./output/log[mindformers/generation/text_generator.py:801] - INFO - Generation Config is: {'max_length': 512, 'max_new_tokens': 512, 'min_length': 0, 'min_new_tokens': 2, 'num_beams': 1, 'do_sample': True, 'use_past': True, 'temperature': 1.2, 'top_k': 50, 'top_p': 1.0, 'repetition_penalty': 1.0, 'encoder_repetition_penalty': 1.0, 'renormalize_logits': False, 'return_dict_in_generate': False, 'output_scores': False, 'output_logits': False, 'pad_token_id': 151643, 'bos_token_id': 151643, 'eos_token_id': [151645, 151643], 'parallel_decoding': False, 'window_size': 5, 'level': 5, 'guess_set_size': 3, '_from_model_config': True} 2025-05-21 15:33:32,190 - mindformers./output/log[mindformers/modules/block_tables.py:63] - INFO - init cache engine success. 2025-05-21 15:33:32,191 - mindformers./output/log[mindformers/research/qwen2_5/infer/qwen2_5.py:195] - INFO - Set dynamic input for llama. 2025-05-21 15:33:32,191 - mindformers./output/log[mindformers/generation/text_generator.py:859] - INFO - The generation mode will be **SAMPLE**. 2025-05-21 15:33:32,191 - mindformers./output/log[mindformers/generation/text_generator.py:859] - INFO - The generation mode will be **SAMPLE**. 2025-05-21 15:33:32,191 - mindformers./output/log[mindformers/research/qwen2_5/infer/qwen2_5.py:195] - INFO - Set dynamic input for llama. 2025-05-21 15:33:32,191 - mindformers./output/log[mindformers/modules/block_tables.py:63] - INFO - init cache engine success. 2025-05-21 15:33:32,191 - mindformers./output/log[mindformers/modules/block_tables.py:63] - INFO - init cache engine success. 2025-05-21 15:33:32,192 - mindformers./output/log[mindformers/research/qwen2_5/infer/qwen2_5.py:195] - INFO - Set dynamic input for llama. 2025-05-21 15:33:32,192 - mindformers./output/log[mindformers/research/qwen2_5/infer/qwen2_5.py:195] - INFO - Set dynamic input for llama. 2025-05-21 15:33:32,287 - mindformers./output/log[mindformers/generation/text_generator.py:1067] - INFO - total time: 20.243292808532715 s; generated tokens: 512 tokens; generate speed: 25.292327925236936 tokens/s 2025-05-21 15:33:32,287 - mindformers./output/log[mindformers/generation/text_generator.py:1067] - INFO - total time: 20.282588958740234 s; generated tokens: 512 tokens; generate speed: 25.24332574315506 tokens/s 2025-05-21 15:33:32,287 - mindformers./output/log[mindformers/generation/text_generator.py:1067] - INFO - total time: 20.223477125167847 s; generated tokens: 512 tokens; generate speed: 25.317110249197594 tokens/s 2025-05-21 15:33:32,287 - mindformers./output/log[mindformers/generation/text_generator.py:1067] - INFO - total time: 20.22711706161499 s; generated tokens: 512 tokens; generate speed: 25.312554351683794 tokens/s 2025-05-21 15:33:32,288 - mindformers./output/log[mindformers/tools/debug_info.py:93] - INFO - prefill prepare time: 0.0024263858795166016 s; prefill predict time: 8.038694381713867 s; prefill post time: 0.10371971130371094 s; decode prepare time: 0.0011008252369680983 s; decode predict time: 0.004661325847401338 s; decode post time: 0.013571219901516014 s 2025-05-21 15:33:32,288 - mindformers./output/log[mindformers/tools/debug_info.py:93] - INFO - prefill prepare time: 0.0026962757110595703 s; prefill predict time: 7.982559680938721 s; prefill post time: 0.10298943519592285 s; decode prepare time: 0.001141927946803621 s; decode predict time: 0.004431785789190554 s; decode post time: 0.013758552517676307 s 2025-05-21 15:33:32,288 - mindformers./output/log[mindformers/tools/debug_info.py:93] - INFO - prefill prepare time: 0.0025255680084228516 s; prefill predict time: 7.979081392288208 s; prefill post time: 0.1039736270904541 s; decode prepare time: 0.001081859062329197 s; decode predict time: 0.0048699346243166455 s; decode post time: 0.01338025128538128 s 2025-05-21 15:33:32,288 - mindformers./output/log[mindformers/tools/debug_info.py:93] - INFO - prefill prepare time: 0.0030732154846191406 s; prefill predict time: 7.99807596206665 s; prefill post time: 0.10071110725402832 s; decode prepare time: 0.001109119497399974 s; decode predict time: 0.004990737578448127 s; decode post time: 0.013231006853734444 s 2025-05-21 15:33:32,289 - mindformers./output/log[mindformers/modules/block_tables.py:126] - INFO - Clear block table cache engines. 2025-05-21 15:33:32,289 - mindformers./output/log[mindformers/modules/block_tables.py:126] - INFO - Clear block table cache engines. 2025-05-21 15:33:32,289 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:341] - INFO - infer without vllm end, not use vllm model 2025-05-21 15:33:32,289 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:341] - INFO - infer without vllm end, not use vllm model 2025-05-21 15:33:32,289 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:355] - INFO - Generating elapsed time: 20.236451864242554 2025-05-21 15:33:32,289 - mindformers./output/log[mindformers/modules/block_tables.py:126] - INFO - Clear block table cache engines. 2025-05-21 15:33:32,290 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:355] - INFO - Generating elapsed time: 20.232738494873047 2025-05-21 15:33:32,290 - mindformers./output/log[mindformers/modules/block_tables.py:126] - INFO - Clear block table cache engines. 2025-05-21 15:33:32,290 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:341] - INFO - infer without vllm end, not use vllm model 2025-05-21 15:33:32,290 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:341] - INFO - infer without vllm end, not use vllm model 2025-05-21 15:33:32,290 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:355] - INFO - Generating elapsed time: 20.29174280166626 2025-05-21 15:33:32,290 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:355] - INFO - Generating elapsed time: 20.253931522369385 2025-05-21 15:33:32,291 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:313] - INFO - input_ids shape (1, 2048) 2025-05-21 15:33:32,291 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:313] - INFO - input_ids shape (1, 2048) 2025-05-21 15:33:32,291 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:322] - INFO - max_tokens 512 2025-05-21 15:33:32,291 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:322] - INFO - max_tokens 512 2025-05-21 15:33:32,291 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:313] - INFO - input_ids shape (1, 2048) 2025-05-21 15:33:32,291 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:313] - INFO - input_ids shape (1, 2048) 2025-05-21 15:33:32,291 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:323] - INFO - min_tokens 2 2025-05-21 15:33:32,291 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:323] - INFO - min_tokens 2 2025-05-21 15:33:32,291 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:334] - INFO - infer without vllm, not use vllm model 2025-05-21 15:33:32,291 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:334] - INFO - infer without vllm, not use vllm model 2025-05-21 15:33:32,292 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:322] - INFO - max_tokens 512 2025-05-21 15:33:32,292 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:322] - INFO - max_tokens 512 2025-05-21 15:33:32,292 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:323] - INFO - min_tokens 2 2025-05-21 15:33:32,292 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:323] - INFO - min_tokens 2 2025-05-21 15:33:32,292 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:334] - INFO - infer without vllm, not use vllm model 2025-05-21 15:33:32,292 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:334] - INFO - infer without vllm, not use vllm model 2025-05-21 15:33:32,294 - mindformers./output/log[mindformers/generation/text_generator.py:801] - INFO - Generation Config is: {'max_length': 512, 'max_new_tokens': 512, 'min_length': 0, 'min_new_tokens': 2, 'num_beams': 1, 'do_sample': True, 'use_past': True, 'temperature': 1.2, 'top_k': 50, 'top_p': 1.0, 'repetition_penalty': 1.0, 'encoder_repetition_penalty': 1.0, 'renormalize_logits': False, 'return_dict_in_generate': False, 'output_scores': False, 'output_logits': False, 'pad_token_id': 151643, 'bos_token_id': 151643, 'eos_token_id': [151645, 151643], 'parallel_decoding': False, 'window_size': 5, 'level': 5, 'guess_set_size': 3, '_from_model_config': True} 2025-05-21 15:33:32,294 - mindformers./output/log[mindformers/generation/text_generator.py:801] - INFO - Generation Config is: {'max_length': 512, 'max_new_tokens': 512, 'min_length': 0, 'min_new_tokens': 2, 'num_beams': 1, 'do_sample': True, 'use_past': True, 'temperature': 1.2, 'top_k': 50, 'top_p': 1.0, 'repetition_penalty': 1.0, 'encoder_repetition_penalty': 1.0, 'renormalize_logits': False, 'return_dict_in_generate': False, 'output_scores': False, 'output_logits': False, 'pad_token_id': 151643, 'bos_token_id': 151643, 'eos_token_id': [151645, 151643], 'parallel_decoding': False, 'window_size': 5, 'level': 5, 'guess_set_size': 3, '_from_model_config': True} 2025-05-21 15:33:32,294 - mindformers./output/log[mindformers/generation/text_generator.py:859] - INFO - The generation mode will be **SAMPLE**. 2025-05-21 15:33:32,294 - mindformers./output/log[mindformers/generation/text_generator.py:859] - INFO - The generation mode will be **SAMPLE**. 2025-05-21 15:33:32,294 - mindformers./output/log[mindformers/generation/text_generator.py:801] - INFO - Generation Config is: {'max_length': 512, 'max_new_tokens': 512, 'min_length': 0, 'min_new_tokens': 2, 'num_beams': 1, 'do_sample': True, 'use_past': True, 'temperature': 1.2, 'top_k': 50, 'top_p': 1.0, 'repetition_penalty': 1.0, 'encoder_repetition_penalty': 1.0, 'renormalize_logits': False, 'return_dict_in_generate': False, 'output_scores': False, 'output_logits': False, 'pad_token_id': 151643, 'bos_token_id': 151643, 'eos_token_id': [151645, 151643], 'parallel_decoding': False, 'window_size': 5, 'level': 5, 'guess_set_size': 3, '_from_model_config': True} 2025-05-21 15:33:32,295 - mindformers./output/log[mindformers/generation/text_generator.py:801] - INFO - Generation Config is: {'max_length': 512, 'max_new_tokens': 512, 'min_length': 0, 'min_new_tokens': 2, 'num_beams': 1, 'do_sample': True, 'use_past': True, 'temperature': 1.2, 'top_k': 50, 'top_p': 1.0, 'repetition_penalty': 1.0, 'encoder_repetition_penalty': 1.0, 'renormalize_logits': False, 'return_dict_in_generate': False, 'output_scores': False, 'output_logits': False, 'pad_token_id': 151643, 'bos_token_id': 151643, 'eos_token_id': [151645, 151643], 'parallel_decoding': False, 'window_size': 5, 'level': 5, 'guess_set_size': 3, '_from_model_config': True} 2025-05-21 15:33:32,295 - mindformers./output/log[mindformers/modules/block_tables.py:63] - INFO - init cache engine success. 2025-05-21 15:33:32,295 - mindformers./output/log[mindformers/modules/block_tables.py:63] - INFO - init cache engine success. 2025-05-21 15:33:32,295 - mindformers./output/log[mindformers/generation/text_generator.py:859] - INFO - The generation mode will be **SAMPLE**. 2025-05-21 15:33:32,295 - mindformers./output/log[mindformers/generation/text_generator.py:859] - INFO - The generation mode will be **SAMPLE**. 2025-05-21 15:33:32,295 - mindformers./output/log[mindformers/modules/block_tables.py:63] - INFO - init cache engine success. 2025-05-21 15:33:32,295 - mindformers./output/log[mindformers/modules/block_tables.py:63] - INFO - init cache engine success. 2025-05-21 15:33:32,295 - mindformers./output/log[mindformers/research/qwen2_5/infer/qwen2_5.py:195] - INFO - Set dynamic input for llama. 2025-05-21 15:33:32,295 - mindformers./output/log[mindformers/research/qwen2_5/infer/qwen2_5.py:195] - INFO - Set dynamic input for llama. 2025-05-21 15:33:32,296 - mindformers./output/log[mindformers/research/qwen2_5/infer/qwen2_5.py:195] - INFO - Set dynamic input for llama. 2025-05-21 15:33:32,296 - mindformers./output/log[mindformers/research/qwen2_5/infer/qwen2_5.py:195] - INFO - Set dynamic input for llama. 2025-05-21 15:33:41,942 - mindformers./output/log[mindformers/generation/text_generator.py:1067] - INFO - total time: 9.750706195831299 s; generated tokens: 512 tokens; generate speed: 52.50901726675904 tokens/s 2025-05-21 15:33:41,942 - mindformers./output/log[mindformers/generation/text_generator.py:1067] - INFO - total time: 9.750519037246704 s; generated tokens: 512 tokens; generate speed: 52.51002516319127 tokens/s 2025-05-21 15:33:41,943 - mindformers./output/log[mindformers/tools/debug_info.py:93] - INFO - prefill prepare time: 0.0015523433685302734 s; prefill predict time: 0.010296821594238281 s; prefill post time: 0.014588594436645508 s; decode prepare time: 0.0009982184654579237 s; decode predict time: 0.005242821282031489 s; decode post time: 0.012731795208328157 s 2025-05-21 15:33:41,943 - mindformers./output/log[mindformers/generation/text_generator.py:1067] - INFO - total time: 9.750246524810791 s; generated tokens: 512 tokens; generate speed: 52.51149278094132 tokens/s 2025-05-21 15:33:41,943 - mindformers./output/log[mindformers/tools/debug_info.py:93] - INFO - prefill prepare time: 0.0016071796417236328 s; prefill predict time: 0.010195255279541016 s; prefill post time: 0.014484405517578125 s; decode prepare time: 0.0009963097171074257 s; decode predict time: 0.005339447189779843 s; decode post time: 0.012636967601141585 s 2025-05-21 15:33:41,943 - mindformers./output/log[mindformers/generation/text_generator.py:1067] - INFO - total time: 9.750577926635742 s; generated tokens: 512 tokens; generate speed: 52.50970802472794 tokens/s 2025-05-21 15:33:41,943 - mindformers./output/log[mindformers/tools/debug_info.py:93] - INFO - prefill prepare time: 0.0015635490417480469 s; prefill predict time: 0.009233951568603516 s; prefill post time: 0.015708446502685547 s; decode prepare time: 0.0010752481957004496 s; decode predict time: 0.004605623320037244 s; decode post time: 0.013289739940964312 s 2025-05-21 15:33:41,944 - mindformers./output/log[mindformers/modules/block_tables.py:126] - INFO - Clear block table cache engines. 2025-05-21 15:33:41,944 - mindformers./output/log[mindformers/tools/debug_info.py:93] - INFO - prefill prepare time: 0.0015823841094970703 s; prefill predict time: 0.009226560592651367 s; prefill post time: 0.022444486618041992 s; decode prepare time: 0.0010517264065677172 s; decode predict time: 0.004702400226219028 s; decode post time: 0.01322014037876913 s 2025-05-21 15:33:41,944 - mindformers./output/log[mindformers/modules/block_tables.py:126] - INFO - Clear block table cache engines. 2025-05-21 15:33:41,944 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:341] - INFO - infer without vllm end, not use vllm model 2025-05-21 15:33:41,944 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:355] - INFO - Generating elapsed time: 9.757590055465698 2025-05-21 15:33:41,944 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:341] - INFO - infer without vllm end, not use vllm model 2025-05-21 15:33:41,944 - mindformers./output/log[mindformers/modules/block_tables.py:126] - INFO - Clear block table cache engines. 2025-05-21 15:33:41,944 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:355] - INFO - Generating elapsed time: 9.757530450820923 2025-05-21 15:33:41,944 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:341] - INFO - infer without vllm end, not use vllm model 2025-05-21 15:33:41,945 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:355] - INFO - Generating elapsed time: 9.757321834564209 2025-05-21 15:33:41,945 - mindformers./output/log[mindformers/modules/block_tables.py:126] - INFO - Clear block table cache engines. 2025-05-21 15:33:41,945 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:341] - INFO - infer without vllm end, not use vllm model 2025-05-21 15:33:41,945 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:313] - INFO - input_ids shape (1, 2048) 2025-05-21 15:33:41,945 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:355] - INFO - Generating elapsed time: 9.75810170173645 2025-05-21 15:33:41,946 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:313] - INFO - input_ids shape (1, 2048) 2025-05-21 15:33:41,946 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:322] - INFO - max_tokens 512 2025-05-21 15:33:41,946 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:323] - INFO - min_tokens 2 2025-05-21 15:33:41,946 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:313] - INFO - input_ids shape (1, 2048) 2025-05-21 15:33:41,946 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:334] - INFO - infer without vllm, not use vllm model 2025-05-21 15:33:41,946 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:322] - INFO - max_tokens 512 2025-05-21 15:33:41,946 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:323] - INFO - min_tokens 2 2025-05-21 15:33:41,946 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:322] - INFO - max_tokens 512 2025-05-21 15:33:41,946 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:313] - INFO - input_ids shape (1, 2048) 2025-05-21 15:33:41,946 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:334] - INFO - infer without vllm, not use vllm model 2025-05-21 15:33:41,947 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:323] - INFO - min_tokens 2 2025-05-21 15:33:41,947 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:334] - INFO - infer without vllm, not use vllm model 2025-05-21 15:33:41,947 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:322] - INFO - max_tokens 512 2025-05-21 15:33:41,947 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:323] - INFO - min_tokens 2 2025-05-21 15:33:41,947 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:334] - INFO - infer without vllm, not use vllm model 2025-05-21 15:33:41,948 - mindformers./output/log[mindformers/generation/text_generator.py:801] - INFO - Generation Config is: {'max_length': 512, 'max_new_tokens': 512, 'min_length': 0, 'min_new_tokens': 2, 'num_beams': 1, 'do_sample': True, 'use_past': True, 'temperature': 1.2, 'top_k': 50, 'top_p': 1.0, 'repetition_penalty': 1.0, 'encoder_repetition_penalty': 1.0, 'renormalize_logits': False, 'return_dict_in_generate': False, 'output_scores': False, 'output_logits': False, 'pad_token_id': 151643, 'bos_token_id': 151643, 'eos_token_id': [151645, 151643], 'parallel_decoding': False, 'window_size': 5, 'level': 5, 'guess_set_size': 3, '_from_model_config': True} 2025-05-21 15:33:41,949 - mindformers./output/log[mindformers/generation/text_generator.py:801] - INFO - Generation Config is: {'max_length': 512, 'max_new_tokens': 512, 'min_length': 0, 'min_new_tokens': 2, 'num_beams': 1, 'do_sample': True, 'use_past': True, 'temperature': 1.2, 'top_k': 50, 'top_p': 1.0, 'repetition_penalty': 1.0, 'encoder_repetition_penalty': 1.0, 'renormalize_logits': False, 'return_dict_in_generate': False, 'output_scores': False, 'output_logits': False, 'pad_token_id': 151643, 'bos_token_id': 151643, 'eos_token_id': [151645, 151643], 'parallel_decoding': False, 'window_size': 5, 'level': 5, 'guess_set_size': 3, '_from_model_config': True} 2025-05-21 15:33:41,949 - mindformers./output/log[mindformers/generation/text_generator.py:859] - INFO - The generation mode will be **SAMPLE**. 2025-05-21 15:33:41,949 - mindformers./output/log[mindformers/generation/text_generator.py:801] - INFO - Generation Config is: {'max_length': 512, 'max_new_tokens': 512, 'min_length': 0, 'min_new_tokens': 2, 'num_beams': 1, 'do_sample': True, 'use_past': True, 'temperature': 1.2, 'top_k': 50, 'top_p': 1.0, 'repetition_penalty': 1.0, 'encoder_repetition_penalty': 1.0, 'renormalize_logits': False, 'return_dict_in_generate': False, 'output_scores': False, 'output_logits': False, 'pad_token_id': 151643, 'bos_token_id': 151643, 'eos_token_id': [151645, 151643], 'parallel_decoding': False, 'window_size': 5, 'level': 5, 'guess_set_size': 3, '_from_model_config': True} 2025-05-21 15:33:41,949 - mindformers./output/log[mindformers/modules/block_tables.py:63] - INFO - init cache engine success. 2025-05-21 15:33:41,950 - mindformers./output/log[mindformers/generation/text_generator.py:859] - INFO - The generation mode will be **SAMPLE**. 2025-05-21 15:33:41,950 - mindformers./output/log[mindformers/generation/text_generator.py:859] - INFO - The generation mode will be **SAMPLE**. 2025-05-21 15:33:41,950 - mindformers./output/log[mindformers/modules/block_tables.py:63] - INFO - init cache engine success. 2025-05-21 15:33:41,950 - mindformers./output/log[mindformers/generation/text_generator.py:801] - INFO - Generation Config is: {'max_length': 512, 'max_new_tokens': 512, 'min_length': 0, 'min_new_tokens': 2, 'num_beams': 1, 'do_sample': True, 'use_past': True, 'temperature': 1.2, 'top_k': 50, 'top_p': 1.0, 'repetition_penalty': 1.0, 'encoder_repetition_penalty': 1.0, 'renormalize_logits': False, 'return_dict_in_generate': False, 'output_scores': False, 'output_logits': False, 'pad_token_id': 151643, 'bos_token_id': 151643, 'eos_token_id': [151645, 151643], 'parallel_decoding': False, 'window_size': 5, 'level': 5, 'guess_set_size': 3, '_from_model_config': True} 2025-05-21 15:33:41,950 - mindformers./output/log[mindformers/modules/block_tables.py:63] - INFO - init cache engine success. 2025-05-21 15:33:41,950 - mindformers./output/log[mindformers/research/qwen2_5/infer/qwen2_5.py:195] - INFO - Set dynamic input for llama. 2025-05-21 15:33:41,951 - mindformers./output/log[mindformers/generation/text_generator.py:859] - INFO - The generation mode will be **SAMPLE**. 2025-05-21 15:33:41,951 - mindformers./output/log[mindformers/research/qwen2_5/infer/qwen2_5.py:195] - INFO - Set dynamic input for llama. 2025-05-21 15:33:41,951 - mindformers./output/log[mindformers/research/qwen2_5/infer/qwen2_5.py:195] - INFO - Set dynamic input for llama. 2025-05-21 15:33:41,951 - mindformers./output/log[mindformers/modules/block_tables.py:63] - INFO - init cache engine success. 2025-05-21 15:33:41,952 - mindformers./output/log[mindformers/research/qwen2_5/infer/qwen2_5.py:195] - INFO - Set dynamic input for llama. 2025-05-21 15:33:42,017 - mindformers./output/log[mindformers/generation/text_generator.py:1067] - INFO - total time: 9.720329284667969 s; generated tokens: 512 tokens; generate speed: 52.67311271106688 tokens/s 2025-05-21 15:33:42,017 - mindformers./output/log[mindformers/generation/text_generator.py:1067] - INFO - total time: 9.721364974975586 s; generated tokens: 512 tokens; generate speed: 52.667501047226736 tokens/s 2025-05-21 15:33:42,017 - mindformers./output/log[mindformers/generation/text_generator.py:1067] - INFO - total time: 9.721003293991089 s; generated tokens: 512 tokens; generate speed: 52.6694606015087 tokens/s 2025-05-21 15:33:42,018 - mindformers./output/log[mindformers/tools/debug_info.py:93] - INFO - prefill prepare time: 0.0015358924865722656 s; prefill predict time: 0.010785579681396484 s; prefill post time: 0.014342546463012695 s; decode prepare time: 0.0010934855839977527 s; decode predict time: 0.005090443760740991 s; decode post time: 0.012736458601083774 s 2025-05-21 15:33:42,018 - mindformers./output/log[mindformers/generation/text_generator.py:1067] - INFO - total time: 9.721956729888916 s; generated tokens: 512 tokens; generate speed: 52.66429528799704 tokens/s 2025-05-21 15:33:42,018 - mindformers./output/log[mindformers/tools/debug_info.py:93] - INFO - prefill prepare time: 0.001680612564086914 s; prefill predict time: 0.011428594589233398 s; prefill post time: 0.014435768127441406 s; decode prepare time: 0.0010612990758190417 s; decode predict time: 0.004928011987723556 s; decode post time: 0.01293274521127839 s 2025-05-21 15:33:42,018 - mindformers./output/log[mindformers/tools/debug_info.py:93] - INFO - prefill prepare time: 0.0015213489532470703 s; prefill predict time: 0.010779380798339844 s; prefill post time: 0.014731884002685547 s; decode prepare time: 0.0010662760053362166 s; decode predict time: 0.004654241075702742 s; decode post time: 0.01320361064604817 s 2025-05-21 15:33:42,018 - mindformers./output/log[mindformers/tools/debug_info.py:93] - INFO - prefill prepare time: 0.0017044544219970703 s; prefill predict time: 0.01174163818359375 s; prefill post time: 0.014964818954467773 s; decode prepare time: 0.0011080687759907044 s; decode predict time: 0.004351150288301356 s; decode post time: 0.013464483029688407 s 2025-05-21 15:33:42,019 - mindformers./output/log[mindformers/modules/block_tables.py:126] - INFO - Clear block table cache engines. 2025-05-21 15:33:42,019 - mindformers./output/log[mindformers/modules/block_tables.py:126] - INFO - Clear block table cache engines. 2025-05-21 15:33:42,019 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:341] - INFO - infer without vllm end, not use vllm model 2025-05-21 15:33:42,019 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:341] - INFO - infer without vllm end, not use vllm model 2025-05-21 15:33:42,019 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:355] - INFO - Generating elapsed time: 9.727587461471558 2025-05-21 15:33:42,019 - mindformers./output/log[mindformers/modules/block_tables.py:126] - INFO - Clear block table cache engines. 2025-05-21 15:33:42,019 - mindformers./output/log[mindformers/modules/block_tables.py:126] - INFO - Clear block table cache engines. 2025-05-21 15:33:42,019 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:355] - INFO - Generating elapsed time: 9.728330373764038 2025-05-21 15:33:42,019 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:341] - INFO - infer without vllm end, not use vllm model 2025-05-21 15:33:42,019 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:341] - INFO - infer without vllm end, not use vllm model 2025-05-21 15:33:42,020 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:355] - INFO - Generating elapsed time: 9.728147506713867 2025-05-21 15:33:42,020 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:355] - INFO - Generating elapsed time: 9.728695154190063 2025-05-21 15:33:42,020 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:313] - INFO - input_ids shape (1, 2048) 2025-05-21 15:33:42,020 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:313] - INFO - input_ids shape (1, 2048) 2025-05-21 15:33:42,021 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:313] - INFO - input_ids shape (1, 2048) 2025-05-21 15:33:42,021 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:322] - INFO - max_tokens 512 2025-05-21 15:33:42,021 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:313] - INFO - input_ids shape (1, 2048) 2025-05-21 15:33:42,021 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:322] - INFO - max_tokens 512 2025-05-21 15:33:42,021 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:323] - INFO - min_tokens 2 2025-05-21 15:33:42,021 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:323] - INFO - min_tokens 2 2025-05-21 15:33:42,021 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:334] - INFO - infer without vllm, not use vllm model 2025-05-21 15:33:42,021 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:322] - INFO - max_tokens 512 2025-05-21 15:33:42,021 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:322] - INFO - max_tokens 512 2025-05-21 15:33:42,021 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:334] - INFO - infer without vllm, not use vllm model 2025-05-21 15:33:42,021 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:323] - INFO - min_tokens 2 2025-05-21 15:33:42,022 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:323] - INFO - min_tokens 2 2025-05-21 15:33:42,022 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:334] - INFO - infer without vllm, not use vllm model 2025-05-21 15:33:42,022 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:334] - INFO - infer without vllm, not use vllm model 2025-05-21 15:33:42,023 - mindformers./output/log[mindformers/generation/text_generator.py:801] - INFO - Generation Config is: {'max_length': 512, 'max_new_tokens': 512, 'min_length': 0, 'min_new_tokens': 2, 'num_beams': 1, 'do_sample': True, 'use_past': True, 'temperature': 1.2, 'top_k': 50, 'top_p': 1.0, 'repetition_penalty': 1.0, 'encoder_repetition_penalty': 1.0, 'renormalize_logits': False, 'return_dict_in_generate': False, 'output_scores': False, 'output_logits': False, 'pad_token_id': 151643, 'bos_token_id': 151643, 'eos_token_id': [151645, 151643], 'parallel_decoding': False, 'window_size': 5, 'level': 5, 'guess_set_size': 3, '_from_model_config': True} 2025-05-21 15:33:42,023 - mindformers./output/log[mindformers/generation/text_generator.py:801] - INFO - Generation Config is: {'max_length': 512, 'max_new_tokens': 512, 'min_length': 0, 'min_new_tokens': 2, 'num_beams': 1, 'do_sample': True, 'use_past': True, 'temperature': 1.2, 'top_k': 50, 'top_p': 1.0, 'repetition_penalty': 1.0, 'encoder_repetition_penalty': 1.0, 'renormalize_logits': False, 'return_dict_in_generate': False, 'output_scores': False, 'output_logits': False, 'pad_token_id': 151643, 'bos_token_id': 151643, 'eos_token_id': [151645, 151643], 'parallel_decoding': False, 'window_size': 5, 'level': 5, 'guess_set_size': 3, '_from_model_config': True} 2025-05-21 15:33:42,024 - mindformers./output/log[mindformers/generation/text_generator.py:801] - INFO - Generation Config is: {'max_length': 512, 'max_new_tokens': 512, 'min_length': 0, 'min_new_tokens': 2, 'num_beams': 1, 'do_sample': True, 'use_past': True, 'temperature': 1.2, 'top_k': 50, 'top_p': 1.0, 'repetition_penalty': 1.0, 'encoder_repetition_penalty': 1.0, 'renormalize_logits': False, 'return_dict_in_generate': False, 'output_scores': False, 'output_logits': False, 'pad_token_id': 151643, 'bos_token_id': 151643, 'eos_token_id': [151645, 151643], 'parallel_decoding': False, 'window_size': 5, 'level': 5, 'guess_set_size': 3, '_from_model_config': True} 2025-05-21 15:33:42,024 - mindformers./output/log[mindformers/generation/text_generator.py:801] - INFO - Generation Config is: {'max_length': 512, 'max_new_tokens': 512, 'min_length': 0, 'min_new_tokens': 2, 'num_beams': 1, 'do_sample': True, 'use_past': True, 'temperature': 1.2, 'top_k': 50, 'top_p': 1.0, 'repetition_penalty': 1.0, 'encoder_repetition_penalty': 1.0, 'renormalize_logits': False, 'return_dict_in_generate': False, 'output_scores': False, 'output_logits': False, 'pad_token_id': 151643, 'bos_token_id': 151643, 'eos_token_id': [151645, 151643], 'parallel_decoding': False, 'window_size': 5, 'level': 5, 'guess_set_size': 3, '_from_model_config': True} 2025-05-21 15:33:42,024 - mindformers./output/log[mindformers/generation/text_generator.py:859] - INFO - The generation mode will be **SAMPLE**. 2025-05-21 15:33:42,024 - mindformers./output/log[mindformers/generation/text_generator.py:859] - INFO - The generation mode will be **SAMPLE**. 2025-05-21 15:33:42,024 - mindformers./output/log[mindformers/modules/block_tables.py:63] - INFO - init cache engine success. 2025-05-21 15:33:42,024 - mindformers./output/log[mindformers/modules/block_tables.py:63] - INFO - init cache engine success. 2025-05-21 15:33:42,024 - mindformers./output/log[mindformers/generation/text_generator.py:859] - INFO - The generation mode will be **SAMPLE**. 2025-05-21 15:33:42,025 - mindformers./output/log[mindformers/generation/text_generator.py:859] - INFO - The generation mode will be **SAMPLE**. 2025-05-21 15:33:42,025 - mindformers./output/log[mindformers/modules/block_tables.py:63] - INFO - init cache engine success. 2025-05-21 15:33:42,025 - mindformers./output/log[mindformers/modules/block_tables.py:63] - INFO - init cache engine success. 2025-05-21 15:33:42,025 - mindformers./output/log[mindformers/research/qwen2_5/infer/qwen2_5.py:195] - INFO - Set dynamic input for llama. 2025-05-21 15:33:42,025 - mindformers./output/log[mindformers/research/qwen2_5/infer/qwen2_5.py:195] - INFO - Set dynamic input for llama. 2025-05-21 15:33:42,026 - mindformers./output/log[mindformers/research/qwen2_5/infer/qwen2_5.py:195] - INFO - Set dynamic input for llama. 2025-05-21 15:33:42,026 - mindformers./output/log[mindformers/research/qwen2_5/infer/qwen2_5.py:195] - INFO - Set dynamic input for llama. 2025-05-21 15:33:51,559 - mindformers./output/log[mindformers/generation/text_generator.py:1067] - INFO - total time: 9.608166456222534 s; generated tokens: 512 tokens; generate speed: 53.28800269362669 tokens/s 2025-05-21 15:33:51,559 - mindformers./output/log[mindformers/generation/text_generator.py:1067] - INFO - total time: 9.608806848526001 s; generated tokens: 512 tokens; generate speed: 53.28445124053474 tokens/s 2025-05-21 15:33:51,560 - mindformers./output/log[mindformers/generation/text_generator.py:1067] - INFO - total time: 9.608515501022339 s; generated tokens: 512 tokens; generate speed: 53.28606692111009 tokens/s 2025-05-21 15:33:51,560 - mindformers./output/log[mindformers/generation/text_generator.py:1067] - INFO - total time: 9.607727527618408 s; generated tokens: 512 tokens; generate speed: 53.29043715365605 tokens/s 2025-05-21 15:33:51,560 - mindformers./output/log[mindformers/tools/debug_info.py:93] - INFO - prefill prepare time: 0.001374959945678711 s; prefill predict time: 0.014309167861938477 s; prefill post time: 0.014442205429077148 s; decode prepare time: 0.0009950924292935785 s; decode predict time: 0.005125701193716012 s; decode post time: 0.012575205991412795 s 2025-05-21 15:33:51,560 - mindformers./output/log[mindformers/tools/debug_info.py:93] - INFO - prefill prepare time: 0.0013666152954101562 s; prefill predict time: 0.014868021011352539 s; prefill post time: 0.014064311981201172 s; decode prepare time: 0.000991930933614533 s; decode predict time: 0.00503384692996156 s; decode post time: 0.012671993669931669 s 2025-05-21 15:33:51,560 - mindformers./output/log[mindformers/tools/debug_info.py:93] - INFO - prefill prepare time: 0.001539468765258789 s; prefill predict time: 0.015100717544555664 s; prefill post time: 0.014914274215698242 s; decode prepare time: 0.0010638255662414192 s; decode predict time: 0.004522259562623267 s; decode post time: 0.013109641065802827 s 2025-05-21 15:33:51,561 - mindformers./output/log[mindformers/tools/debug_info.py:93] - INFO - prefill prepare time: 0.0015597343444824219 s; prefill predict time: 0.013043403625488281 s; prefill post time: 0.014786481857299805 s; decode prepare time: 0.001049655290965698 s; decode predict time: 0.004453356125775506 s; decode post time: 0.013194374375613934 s 2025-05-21 15:33:51,561 - mindformers./output/log[mindformers/modules/block_tables.py:126] - INFO - Clear block table cache engines. 2025-05-21 15:33:51,561 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:341] - INFO - infer without vllm end, not use vllm model 2025-05-21 15:33:51,561 - mindformers./output/log[mindformers/modules/block_tables.py:126] - INFO - Clear block table cache engines. 2025-05-21 15:33:51,561 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:355] - INFO - Generating elapsed time: 9.615389347076416 2025-05-21 15:33:51,562 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:341] - INFO - infer without vllm end, not use vllm model 2025-05-21 15:33:51,562 - mindformers./output/log[mindformers/modules/block_tables.py:126] - INFO - Clear block table cache engines. 2025-05-21 15:33:51,562 - mindformers./output/log[mindformers/modules/block_tables.py:126] - INFO - Clear block table cache engines. 2025-05-21 15:33:51,562 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:355] - INFO - Generating elapsed time: 9.616045236587524 2025-05-21 15:33:51,562 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:341] - INFO - infer without vllm end, not use vllm model 2025-05-21 15:33:51,562 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:341] - INFO - infer without vllm end, not use vllm model 2025-05-21 15:33:51,562 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:355] - INFO - Generating elapsed time: 9.615651845932007 2025-05-21 15:33:51,562 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:355] - INFO - Generating elapsed time: 9.615144729614258 2025-05-21 15:33:51,562 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:313] - INFO - input_ids shape (1, 2048) 2025-05-21 15:33:51,563 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:313] - INFO - input_ids shape (1, 2048) 2025-05-21 15:33:51,563 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:322] - INFO - max_tokens 512 2025-05-21 15:33:51,563 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:313] - INFO - input_ids shape (1, 2048) 2025-05-21 15:33:51,563 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:323] - INFO - min_tokens 2 2025-05-21 15:33:51,563 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:322] - INFO - max_tokens 512 2025-05-21 15:33:51,563 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:313] - INFO - input_ids shape (1, 2048) 2025-05-21 15:33:51,563 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:334] - INFO - infer without vllm, not use vllm model 2025-05-21 15:33:51,563 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:323] - INFO - min_tokens 2 2025-05-21 15:33:51,563 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:322] - INFO - max_tokens 512 2025-05-21 15:33:51,564 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:334] - INFO - infer without vllm, not use vllm model 2025-05-21 15:33:51,564 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:323] - INFO - min_tokens 2 2025-05-21 15:33:51,564 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:322] - INFO - max_tokens 512 2025-05-21 15:33:51,564 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:334] - INFO - infer without vllm, not use vllm model 2025-05-21 15:33:51,564 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:323] - INFO - min_tokens 2 2025-05-21 15:33:51,564 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:334] - INFO - infer without vllm, not use vllm model 2025-05-21 15:33:51,566 - mindformers./output/log[mindformers/generation/text_generator.py:801] - INFO - Generation Config is: {'max_length': 512, 'max_new_tokens': 512, 'min_length': 0, 'min_new_tokens': 2, 'num_beams': 1, 'do_sample': True, 'use_past': True, 'temperature': 1.2, 'top_k': 50, 'top_p': 1.0, 'repetition_penalty': 1.0, 'encoder_repetition_penalty': 1.0, 'renormalize_logits': False, 'return_dict_in_generate': False, 'output_scores': False, 'output_logits': False, 'pad_token_id': 151643, 'bos_token_id': 151643, 'eos_token_id': [151645, 151643], 'parallel_decoding': False, 'window_size': 5, 'level': 5, 'guess_set_size': 3, '_from_model_config': True} 2025-05-21 15:33:51,566 - mindformers./output/log[mindformers/generation/text_generator.py:801] - INFO - Generation Config is: {'max_length': 512, 'max_new_tokens': 512, 'min_length': 0, 'min_new_tokens': 2, 'num_beams': 1, 'do_sample': True, 'use_past': True, 'temperature': 1.2, 'top_k': 50, 'top_p': 1.0, 'repetition_penalty': 1.0, 'encoder_repetition_penalty': 1.0, 'renormalize_logits': False, 'return_dict_in_generate': False, 'output_scores': False, 'output_logits': False, 'pad_token_id': 151643, 'bos_token_id': 151643, 'eos_token_id': [151645, 151643], 'parallel_decoding': False, 'window_size': 5, 'level': 5, 'guess_set_size': 3, '_from_model_config': True} 2025-05-21 15:33:51,566 - mindformers./output/log[mindformers/generation/text_generator.py:859] - INFO - The generation mode will be **SAMPLE**. 2025-05-21 15:33:51,566 - mindformers./output/log[mindformers/generation/text_generator.py:801] - INFO - Generation Config is: {'max_length': 512, 'max_new_tokens': 512, 'min_length': 0, 'min_new_tokens': 2, 'num_beams': 1, 'do_sample': True, 'use_past': True, 'temperature': 1.2, 'top_k': 50, 'top_p': 1.0, 'repetition_penalty': 1.0, 'encoder_repetition_penalty': 1.0, 'renormalize_logits': False, 'return_dict_in_generate': False, 'output_scores': False, 'output_logits': False, 'pad_token_id': 151643, 'bos_token_id': 151643, 'eos_token_id': [151645, 151643], 'parallel_decoding': False, 'window_size': 5, 'level': 5, 'guess_set_size': 3, '_from_model_config': True} 2025-05-21 15:33:51,567 - mindformers./output/log[mindformers/modules/block_tables.py:63] - INFO - init cache engine success. 2025-05-21 15:33:51,567 - mindformers./output/log[mindformers/generation/text_generator.py:801] - INFO - Generation Config is: {'max_length': 512, 'max_new_tokens': 512, 'min_length': 0, 'min_new_tokens': 2, 'num_beams': 1, 'do_sample': True, 'use_past': True, 'temperature': 1.2, 'top_k': 50, 'top_p': 1.0, 'repetition_penalty': 1.0, 'encoder_repetition_penalty': 1.0, 'renormalize_logits': False, 'return_dict_in_generate': False, 'output_scores': False, 'output_logits': False, 'pad_token_id': 151643, 'bos_token_id': 151643, 'eos_token_id': [151645, 151643], 'parallel_decoding': False, 'window_size': 5, 'level': 5, 'guess_set_size': 3, '_from_model_config': True} 2025-05-21 15:33:51,567 - mindformers./output/log[mindformers/generation/text_generator.py:859] - INFO - The generation mode will be **SAMPLE**. 2025-05-21 15:33:51,567 - mindformers./output/log[mindformers/modules/block_tables.py:63] - INFO - init cache engine success. 2025-05-21 15:33:51,567 - mindformers./output/log[mindformers/generation/text_generator.py:859] - INFO - The generation mode will be **SAMPLE**. 2025-05-21 15:33:51,567 - mindformers./output/log[mindformers/generation/text_generator.py:859] - INFO - The generation mode will be **SAMPLE**. 2025-05-21 15:33:51,567 - mindformers./output/log[mindformers/modules/block_tables.py:63] - INFO - init cache engine success. 2025-05-21 15:33:51,567 - mindformers./output/log[mindformers/research/qwen2_5/infer/qwen2_5.py:195] - INFO - Set dynamic input for llama. 2025-05-21 15:33:51,568 - mindformers./output/log[mindformers/modules/block_tables.py:63] - INFO - init cache engine success. 2025-05-21 15:33:51,568 - mindformers./output/log[mindformers/research/qwen2_5/infer/qwen2_5.py:195] - INFO - Set dynamic input for llama. 2025-05-21 15:33:51,568 - mindformers./output/log[mindformers/research/qwen2_5/infer/qwen2_5.py:195] - INFO - Set dynamic input for llama. 2025-05-21 15:33:51,568 - mindformers./output/log[mindformers/research/qwen2_5/infer/qwen2_5.py:195] - INFO - Set dynamic input for llama. 2025-05-21 15:33:51,702 - mindformers./output/log[mindformers/generation/text_generator.py:1067] - INFO - total time: 9.676124095916748 s; generated tokens: 512 tokens; generate speed: 52.913748823876716 tokens/s 2025-05-21 15:33:51,702 - mindformers./output/log[mindformers/generation/text_generator.py:1067] - INFO - total time: 9.676386594772339 s; generated tokens: 512 tokens; generate speed: 52.91231339150997 tokens/s 2025-05-21 15:33:51,702 - mindformers./output/log[mindformers/generation/text_generator.py:1067] - INFO - total time: 9.676071405410767 s; generated tokens: 512 tokens; generate speed: 52.91403696273826 tokens/s 2025-05-21 15:33:51,702 - mindformers./output/log[mindformers/tools/debug_info.py:93] - INFO - prefill prepare time: 0.001466989517211914 s; prefill predict time: 0.01799750328063965 s; prefill post time: 0.014245748519897461 s; decode prepare time: 0.0010490524792157974 s; decode predict time: 0.004927645477594114 s; decode post time: 0.012841987049976206 s 2025-05-21 15:33:51,702 - mindformers./output/log[mindformers/generation/text_generator.py:1067] - INFO - total time: 9.6764817237854 s; generated tokens: 512 tokens; generate speed: 52.911793213174974 tokens/s 2025-05-21 15:33:51,703 - mindformers./output/log[mindformers/tools/debug_info.py:93] - INFO - prefill prepare time: 0.0016074180603027344 s; prefill predict time: 0.017919540405273438 s; prefill post time: 0.014188289642333984 s; decode prepare time: 0.0010856821579009353 s; decode predict time: 0.005021394000333898 s; decode post time: 0.012711001002391958 s 2025-05-21 15:33:51,703 - mindformers./output/log[mindformers/tools/debug_info.py:93] - INFO - prefill prepare time: 0.001458883285522461 s; prefill predict time: 0.018915176391601562 s; prefill post time: 0.014786720275878906 s; decode prepare time: 0.0010523483477926534 s; decode predict time: 0.004619252447988473 s; decode post time: 0.01315009337348714 s 2025-05-21 15:33:51,703 - mindformers./output/log[mindformers/tools/debug_info.py:93] - INFO - prefill prepare time: 0.0014488697052001953 s; prefill predict time: 0.01776909828186035 s; prefill post time: 0.014760255813598633 s; decode prepare time: 0.0010896153645972683 s; decode predict time: 0.004404326981189205 s; decode post time: 0.013324096478128154 s 2025-05-21 15:33:51,703 - mindformers./output/log[mindformers/modules/block_tables.py:126] - INFO - Clear block table cache engines. 2025-05-21 15:33:51,704 - mindformers./output/log[mindformers/modules/block_tables.py:126] - INFO - Clear block table cache engines. 2025-05-21 15:33:51,704 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:341] - INFO - infer without vllm end, not use vllm model 2025-05-21 15:33:51,704 - mindformers./output/log[mindformers/modules/block_tables.py:126] - INFO - Clear block table cache engines. 2025-05-21 15:33:51,704 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:341] - INFO - infer without vllm end, not use vllm model 2025-05-21 15:33:51,704 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:355] - INFO - Generating elapsed time: 9.68312406539917 2025-05-21 15:33:51,704 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:355] - INFO - Generating elapsed time: 9.683377265930176 2025-05-21 15:33:51,704 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:341] - INFO - infer without vllm end, not use vllm model 2025-05-21 15:33:51,704 - mindformers./output/log[mindformers/modules/block_tables.py:126] - INFO - Clear block table cache engines. 2025-05-21 15:33:51,704 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:355] - INFO - Generating elapsed time: 9.682968854904175 2025-05-21 15:33:51,704 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:341] - INFO - infer without vllm end, not use vllm model 2025-05-21 15:33:51,705 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:355] - INFO - Generating elapsed time: 9.683414220809937 2025-05-21 15:33:51,705 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:313] - INFO - input_ids shape (1, 2048) 2025-05-21 15:33:51,705 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:313] - INFO - input_ids shape (1, 2048) 2025-05-21 15:33:51,705 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:313] - INFO - input_ids shape (1, 2048) 2025-05-21 15:33:51,706 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:322] - INFO - max_tokens 512 2025-05-21 15:33:51,706 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:322] - INFO - max_tokens 512 2025-05-21 15:33:51,706 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:323] - INFO - min_tokens 2 2025-05-21 15:33:51,706 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:313] - INFO - input_ids shape (1, 2048) 2025-05-21 15:33:51,706 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:323] - INFO - min_tokens 2 2025-05-21 15:33:51,706 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:322] - INFO - max_tokens 512 2025-05-21 15:33:51,706 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:334] - INFO - infer without vllm, not use vllm model 2025-05-21 15:33:51,706 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:334] - INFO - infer without vllm, not use vllm model 2025-05-21 15:33:51,706 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:323] - INFO - min_tokens 2 2025-05-21 15:33:51,706 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:322] - INFO - max_tokens 512 2025-05-21 15:33:51,706 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:334] - INFO - infer without vllm, not use vllm model 2025-05-21 15:33:51,706 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:323] - INFO - min_tokens 2 2025-05-21 15:33:51,707 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:334] - INFO - infer without vllm, not use vllm model 2025-05-21 15:33:51,708 - mindformers./output/log[mindformers/generation/text_generator.py:801] - INFO - Generation Config is: {'max_length': 512, 'max_new_tokens': 512, 'min_length': 0, 'min_new_tokens': 2, 'num_beams': 1, 'do_sample': True, 'use_past': True, 'temperature': 1.2, 'top_k': 50, 'top_p': 1.0, 'repetition_penalty': 1.0, 'encoder_repetition_penalty': 1.0, 'renormalize_logits': False, 'return_dict_in_generate': False, 'output_scores': False, 'output_logits': False, 'pad_token_id': 151643, 'bos_token_id': 151643, 'eos_token_id': [151645, 151643], 'parallel_decoding': False, 'window_size': 5, 'level': 5, 'guess_set_size': 3, '_from_model_config': True} 2025-05-21 15:33:51,708 - mindformers./output/log[mindformers/generation/text_generator.py:801] - INFO - Generation Config is: {'max_length': 512, 'max_new_tokens': 512, 'min_length': 0, 'min_new_tokens': 2, 'num_beams': 1, 'do_sample': True, 'use_past': True, 'temperature': 1.2, 'top_k': 50, 'top_p': 1.0, 'repetition_penalty': 1.0, 'encoder_repetition_penalty': 1.0, 'renormalize_logits': False, 'return_dict_in_generate': False, 'output_scores': False, 'output_logits': False, 'pad_token_id': 151643, 'bos_token_id': 151643, 'eos_token_id': [151645, 151643], 'parallel_decoding': False, 'window_size': 5, 'level': 5, 'guess_set_size': 3, '_from_model_config': True} 2025-05-21 15:33:51,708 - mindformers./output/log[mindformers/generation/text_generator.py:801] - INFO - Generation Config is: {'max_length': 512, 'max_new_tokens': 512, 'min_length': 0, 'min_new_tokens': 2, 'num_beams': 1, 'do_sample': True, 'use_past': True, 'temperature': 1.2, 'top_k': 50, 'top_p': 1.0, 'repetition_penalty': 1.0, 'encoder_repetition_penalty': 1.0, 'renormalize_logits': False, 'return_dict_in_generate': False, 'output_scores': False, 'output_logits': False, 'pad_token_id': 151643, 'bos_token_id': 151643, 'eos_token_id': [151645, 151643], 'parallel_decoding': False, 'window_size': 5, 'level': 5, 'guess_set_size': 3, '_from_model_config': True} 2025-05-21 15:33:51,709 - mindformers./output/log[mindformers/generation/text_generator.py:859] - INFO - The generation mode will be **SAMPLE**. 2025-05-21 15:33:51,709 - mindformers./output/log[mindformers/generation/text_generator.py:859] - INFO - The generation mode will be **SAMPLE**. 2025-05-21 15:33:51,709 - mindformers./output/log[mindformers/generation/text_generator.py:801] - INFO - Generation Config is: {'max_length': 512, 'max_new_tokens': 512, 'min_length': 0, 'min_new_tokens': 2, 'num_beams': 1, 'do_sample': True, 'use_past': True, 'temperature': 1.2, 'top_k': 50, 'top_p': 1.0, 'repetition_penalty': 1.0, 'encoder_repetition_penalty': 1.0, 'renormalize_logits': False, 'return_dict_in_generate': False, 'output_scores': False, 'output_logits': False, 'pad_token_id': 151643, 'bos_token_id': 151643, 'eos_token_id': [151645, 151643], 'parallel_decoding': False, 'window_size': 5, 'level': 5, 'guess_set_size': 3, '_from_model_config': True} 2025-05-21 15:33:51,709 - mindformers./output/log[mindformers/modules/block_tables.py:63] - INFO - init cache engine success. 2025-05-21 15:33:51,709 - mindformers./output/log[mindformers/generation/text_generator.py:859] - INFO - The generation mode will be **SAMPLE**. 2025-05-21 15:33:51,709 - mindformers./output/log[mindformers/modules/block_tables.py:63] - INFO - init cache engine success. 2025-05-21 15:33:51,710 - mindformers./output/log[mindformers/modules/block_tables.py:63] - INFO - init cache engine success. 2025-05-21 15:33:51,710 - mindformers./output/log[mindformers/generation/text_generator.py:859] - INFO - The generation mode will be **SAMPLE**. 2025-05-21 15:33:51,710 - mindformers./output/log[mindformers/research/qwen2_5/infer/qwen2_5.py:195] - INFO - Set dynamic input for llama. 2025-05-21 15:33:51,710 - mindformers./output/log[mindformers/modules/block_tables.py:63] - INFO - init cache engine success. 2025-05-21 15:33:51,710 - mindformers./output/log[mindformers/research/qwen2_5/infer/qwen2_5.py:195] - INFO - Set dynamic input for llama. 2025-05-21 15:33:51,710 - mindformers./output/log[mindformers/research/qwen2_5/infer/qwen2_5.py:195] - INFO - Set dynamic input for llama. 2025-05-21 15:33:51,711 - mindformers./output/log[mindformers/research/qwen2_5/infer/qwen2_5.py:195] - INFO - Set dynamic input for llama. 2025-05-21 15:34:01,128 - mindformers./output/log[mindformers/generation/text_generator.py:1067] - INFO - total time: 9.560893297195435 s; generated tokens: 512 tokens; generate speed: 53.55148144475042 tokens/s 2025-05-21 15:34:01,129 - mindformers./output/log[mindformers/generation/text_generator.py:1067] - INFO - total time: 9.560696125030518 s; generated tokens: 512 tokens; generate speed: 53.55258584775549 tokens/s 2025-05-21 15:34:01,129 - mindformers./output/log[mindformers/tools/debug_info.py:93] - INFO - prefill prepare time: 0.001567840576171875 s; prefill predict time: 0.010478496551513672 s; prefill post time: 0.014489412307739258 s; decode prepare time: 0.0009881731582014528 s; decode predict time: 0.005036607443117628 s; decode post time: 0.012585743066149449 s 2025-05-21 15:34:01,129 - mindformers./output/log[mindformers/generation/text_generator.py:1067] - INFO - total time: 9.560733795166016 s; generated tokens: 512 tokens; generate speed: 53.55237484583781 tokens/s 2025-05-21 15:34:01,129 - mindformers./output/log[mindformers/generation/text_generator.py:1067] - INFO - total time: 9.561019659042358 s; generated tokens: 512 tokens; generate speed: 53.550773689265945 tokens/s 2025-05-21 15:34:01,130 - mindformers./output/log[mindformers/tools/debug_info.py:93] - INFO - prefill prepare time: 0.0015647411346435547 s; prefill predict time: 0.010038375854492188 s; prefill post time: 0.015100955963134766 s; decode prepare time: 0.0009828248369250512 s; decode predict time: 0.004947187853794472 s; decode post time: 0.012683242734397695 s 2025-05-21 15:34:01,130 - mindformers./output/log[mindformers/tools/debug_info.py:93] - INFO - prefill prepare time: 0.0014362335205078125 s; prefill predict time: 0.009589195251464844 s; prefill post time: 0.01512908935546875 s; decode prepare time: 0.0010395689253237849 s; decode predict time: 0.004425613552916284 s; decode post time: 0.013148127703284097 s 2025-05-21 15:34:01,130 - mindformers./output/log[mindformers/tools/debug_info.py:93] - INFO - prefill prepare time: 0.0015463829040527344 s; prefill predict time: 0.009722709655761719 s; prefill post time: 0.015627384185791016 s; decode prepare time: 0.0010579056935767605 s; decode predict time: 0.004490486780802409 s; decode post time: 0.013064236090374553 s 2025-05-21 15:34:01,130 - mindformers./output/log[mindformers/modules/block_tables.py:126] - INFO - Clear block table cache engines. 2025-05-21 15:34:01,131 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:341] - INFO - infer without vllm end, not use vllm model 2025-05-21 15:34:01,130 - mindformers./output/log[mindformers/modules/block_tables.py:126] - INFO - Clear block table cache engines. 2025-05-21 15:34:01,131 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:355] - INFO - Generating elapsed time: 9.56780195236206 2025-05-21 15:34:01,131 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:341] - INFO - infer without vllm end, not use vllm model 2025-05-21 15:34:01,131 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:355] - INFO - Generating elapsed time: 9.567797422409058 2025-05-21 15:34:01,131 - mindformers./output/log[mindformers/modules/block_tables.py:126] - INFO - Clear block table cache engines. 2025-05-21 15:34:01,131 - mindformers./output/log[mindformers/modules/block_tables.py:126] - INFO - Clear block table cache engines. 2025-05-21 15:34:01,131 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:341] - INFO - infer without vllm end, not use vllm model 2025-05-21 15:34:01,131 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:341] - INFO - infer without vllm end, not use vllm model 2025-05-21 15:34:01,131 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:355] - INFO - Generating elapsed time: 9.568016529083252 2025-05-21 15:34:01,132 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:355] - INFO - Generating elapsed time: 9.568020105361938 2025-05-21 15:34:01,132 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:313] - INFO - input_ids shape (1, 2048) 2025-05-21 15:34:01,132 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:313] - INFO - input_ids shape (1, 2048) 2025-05-21 15:34:01,132 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:322] - INFO - max_tokens 512 2025-05-21 15:34:01,133 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:323] - INFO - min_tokens 2 2025-05-21 15:34:01,133 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:313] - INFO - input_ids shape (1, 2048) 2025-05-21 15:34:01,133 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:322] - INFO - max_tokens 512 2025-05-21 15:34:01,133 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:334] - INFO - infer without vllm, not use vllm model 2025-05-21 15:34:01,133 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:323] - INFO - min_tokens 2 2025-05-21 15:34:01,133 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:313] - INFO - input_ids shape (1, 2048) 2025-05-21 15:34:01,133 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:334] - INFO - infer without vllm, not use vllm model 2025-05-21 15:34:01,133 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:322] - INFO - max_tokens 512 2025-05-21 15:34:01,133 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:323] - INFO - min_tokens 2 2025-05-21 15:34:01,133 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:322] - INFO - max_tokens 512 2025-05-21 15:34:01,134 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:334] - INFO - infer without vllm, not use vllm model 2025-05-21 15:34:01,134 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:323] - INFO - min_tokens 2 2025-05-21 15:34:01,134 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:334] - INFO - infer without vllm, not use vllm model 2025-05-21 15:34:01,135 - mindformers./output/log[mindformers/generation/text_generator.py:801] - INFO - Generation Config is: {'max_length': 512, 'max_new_tokens': 512, 'min_length': 0, 'min_new_tokens': 2, 'num_beams': 1, 'do_sample': True, 'use_past': True, 'temperature': 1.2, 'top_k': 50, 'top_p': 1.0, 'repetition_penalty': 1.0, 'encoder_repetition_penalty': 1.0, 'renormalize_logits': False, 'return_dict_in_generate': False, 'output_scores': False, 'output_logits': False, 'pad_token_id': 151643, 'bos_token_id': 151643, 'eos_token_id': [151645, 151643], 'parallel_decoding': False, 'window_size': 5, 'level': 5, 'guess_set_size': 3, '_from_model_config': True} 2025-05-21 15:34:01,135 - mindformers./output/log[mindformers/generation/text_generator.py:801] - INFO - Generation Config is: {'max_length': 512, 'max_new_tokens': 512, 'min_length': 0, 'min_new_tokens': 2, 'num_beams': 1, 'do_sample': True, 'use_past': True, 'temperature': 1.2, 'top_k': 50, 'top_p': 1.0, 'repetition_penalty': 1.0, 'encoder_repetition_penalty': 1.0, 'renormalize_logits': False, 'return_dict_in_generate': False, 'output_scores': False, 'output_logits': False, 'pad_token_id': 151643, 'bos_token_id': 151643, 'eos_token_id': [151645, 151643], 'parallel_decoding': False, 'window_size': 5, 'level': 5, 'guess_set_size': 3, '_from_model_config': True} 2025-05-21 15:34:01,136 - mindformers./output/log[mindformers/generation/text_generator.py:859] - INFO - The generation mode will be **SAMPLE**. 2025-05-21 15:34:01,136 - mindformers./output/log[mindformers/generation/text_generator.py:801] - INFO - Generation Config is: {'max_length': 512, 'max_new_tokens': 512, 'min_length': 0, 'min_new_tokens': 2, 'num_beams': 1, 'do_sample': True, 'use_past': True, 'temperature': 1.2, 'top_k': 50, 'top_p': 1.0, 'repetition_penalty': 1.0, 'encoder_repetition_penalty': 1.0, 'renormalize_logits': False, 'return_dict_in_generate': False, 'output_scores': False, 'output_logits': False, 'pad_token_id': 151643, 'bos_token_id': 151643, 'eos_token_id': [151645, 151643], 'parallel_decoding': False, 'window_size': 5, 'level': 5, 'guess_set_size': 3, '_from_model_config': True} 2025-05-21 15:34:01,136 - mindformers./output/log[mindformers/modules/block_tables.py:63] - INFO - init cache engine success. 2025-05-21 15:34:01,136 - mindformers./output/log[mindformers/generation/text_generator.py:859] - INFO - The generation mode will be **SAMPLE**. 2025-05-21 15:34:01,136 - mindformers./output/log[mindformers/modules/block_tables.py:63] - INFO - init cache engine success. 2025-05-21 15:34:01,136 - mindformers./output/log[mindformers/generation/text_generator.py:801] - INFO - Generation Config is: {'max_length': 512, 'max_new_tokens': 512, 'min_length': 0, 'min_new_tokens': 2, 'num_beams': 1, 'do_sample': True, 'use_past': True, 'temperature': 1.2, 'top_k': 50, 'top_p': 1.0, 'repetition_penalty': 1.0, 'encoder_repetition_penalty': 1.0, 'renormalize_logits': False, 'return_dict_in_generate': False, 'output_scores': False, 'output_logits': False, 'pad_token_id': 151643, 'bos_token_id': 151643, 'eos_token_id': [151645, 151643], 'parallel_decoding': False, 'window_size': 5, 'level': 5, 'guess_set_size': 3, '_from_model_config': True} 2025-05-21 15:34:01,136 - mindformers./output/log[mindformers/generation/text_generator.py:859] - INFO - The generation mode will be **SAMPLE**. 2025-05-21 15:34:01,137 - mindformers./output/log[mindformers/modules/block_tables.py:63] - INFO - init cache engine success. 2025-05-21 15:34:01,137 - mindformers./output/log[mindformers/research/qwen2_5/infer/qwen2_5.py:195] - INFO - Set dynamic input for llama. 2025-05-21 15:34:01,137 - mindformers./output/log[mindformers/research/qwen2_5/infer/qwen2_5.py:195] - INFO - Set dynamic input for llama. 2025-05-21 15:34:01,137 - mindformers./output/log[mindformers/generation/text_generator.py:859] - INFO - The generation mode will be **SAMPLE**. 2025-05-21 15:34:01,137 - mindformers./output/log[mindformers/modules/block_tables.py:63] - INFO - init cache engine success. 2025-05-21 15:34:01,138 - mindformers./output/log[mindformers/research/qwen2_5/infer/qwen2_5.py:195] - INFO - Set dynamic input for llama. 2025-05-21 15:34:01,138 - mindformers./output/log[mindformers/research/qwen2_5/infer/qwen2_5.py:195] - INFO - Set dynamic input for llama. 2025-05-21 15:34:01,321 - mindformers./output/log[mindformers/generation/text_generator.py:1067] - INFO - total time: 9.61023235321045 s; generated tokens: 512 tokens; generate speed: 53.276547452982065 tokens/s 2025-05-21 15:34:01,321 - mindformers./output/log[mindformers/generation/text_generator.py:1067] - INFO - total time: 9.610055208206177 s; generated tokens: 512 tokens; generate speed: 53.27752951541789 tokens/s 2025-05-21 15:34:01,321 - mindformers./output/log[mindformers/generation/text_generator.py:1067] - INFO - total time: 9.610572814941406 s; generated tokens: 512 tokens; generate speed: 53.27466009143614 tokens/s 2025-05-21 15:34:01,321 - mindformers./output/log[mindformers/generation/text_generator.py:1067] - INFO - total time: 9.60971212387085 s; generated tokens: 512 tokens; generate speed: 53.279431620867676 tokens/s 2025-05-21 15:34:01,322 - mindformers./output/log[mindformers/tools/debug_info.py:93] - INFO - prefill prepare time: 0.0015959739685058594 s; prefill predict time: 0.008951663970947266 s; prefill post time: 0.015002250671386719 s; decode prepare time: 0.001098869831361416 s; decode predict time: 0.004906171443415623 s; decode post time: 0.012692057223236025 s 2025-05-21 15:34:01,322 - mindformers./output/log[mindformers/tools/debug_info.py:93] - INFO - prefill prepare time: 0.0014443397521972656 s; prefill predict time: 0.008937358856201172 s; prefill post time: 0.01512289047241211 s; decode prepare time: 0.0010719154678912079 s; decode predict time: 0.004469498933530321 s; decode post time: 0.01315885747248879 s 2025-05-21 15:34:01,322 - mindformers./output/log[mindformers/tools/debug_info.py:93] - INFO - prefill prepare time: 0.001611471176147461 s; prefill predict time: 0.008455276489257812 s; prefill post time: 0.014786720275878906 s; decode prepare time: 0.0010832475822732408 s; decode predict time: 0.004312311901765711 s; decode post time: 0.01330490914809494 s 2025-05-21 15:34:01,322 - mindformers./output/log[mindformers/tools/debug_info.py:93] - INFO - prefill prepare time: 0.001477956771850586 s; prefill predict time: 0.009216785430908203 s; prefill post time: 0.01442861557006836 s; decode prepare time: 0.0010412980432379735 s; decode predict time: 0.00483866579392377 s; decode post time: 0.012818211445369833 s 2025-05-21 15:34:01,323 - mindformers./output/log[mindformers/modules/block_tables.py:126] - INFO - Clear block table cache engines. 2025-05-21 15:34:01,323 - mindformers./output/log[mindformers/modules/block_tables.py:126] - INFO - Clear block table cache engines. 2025-05-21 15:34:01,323 - mindformers./output/log[mindformers/modules/block_tables.py:126] - INFO - Clear block table cache engines. 2025-05-21 15:34:01,323 - mindformers./output/log[mindformers/modules/block_tables.py:126] - INFO - Clear block table cache engines. 2025-05-21 15:34:01,323 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:341] - INFO - infer without vllm end, not use vllm model 2025-05-21 15:34:01,323 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:341] - INFO - infer without vllm end, not use vllm model 2025-05-21 15:34:01,323 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:341] - INFO - infer without vllm end, not use vllm model 2025-05-21 15:34:01,323 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:341] - INFO - infer without vllm end, not use vllm model 2025-05-21 15:34:01,323 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:355] - INFO - Generating elapsed time: 9.61734414100647 2025-05-21 15:34:01,323 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:355] - INFO - Generating elapsed time: 9.617210149765015 2025-05-21 15:34:01,323 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:355] - INFO - Generating elapsed time: 9.616931676864624 2025-05-21 15:34:01,323 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:355] - INFO - Generating elapsed time: 9.61765456199646 2025-05-21 15:34:01,324 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:313] - INFO - input_ids shape (1, 2048) 2025-05-21 15:34:01,324 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:313] - INFO - input_ids shape (1, 2048) 2025-05-21 15:34:01,324 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:313] - INFO - input_ids shape (1, 2048) 2025-05-21 15:34:01,324 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:313] - INFO - input_ids shape (1, 2048) 2025-05-21 15:34:01,325 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:322] - INFO - max_tokens 512 2025-05-21 15:34:01,325 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:322] - INFO - max_tokens 512 2025-05-21 15:34:01,325 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:322] - INFO - max_tokens 512 2025-05-21 15:34:01,325 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:322] - INFO - max_tokens 512 2025-05-21 15:34:01,325 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:323] - INFO - min_tokens 2 2025-05-21 15:34:01,325 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:323] - INFO - min_tokens 2 2025-05-21 15:34:01,325 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:323] - INFO - min_tokens 2 2025-05-21 15:34:01,325 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:334] - INFO - infer without vllm, not use vllm model 2025-05-21 15:34:01,325 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:323] - INFO - min_tokens 2 2025-05-21 15:34:01,325 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:334] - INFO - infer without vllm, not use vllm model 2025-05-21 15:34:01,325 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:334] - INFO - infer without vllm, not use vllm model 2025-05-21 15:34:01,325 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:334] - INFO - infer without vllm, not use vllm model 2025-05-21 15:34:01,327 - mindformers./output/log[mindformers/generation/text_generator.py:801] - INFO - Generation Config is: {'max_length': 512, 'max_new_tokens': 512, 'min_length': 0, 'min_new_tokens': 2, 'num_beams': 1, 'do_sample': True, 'use_past': True, 'temperature': 1.2, 'top_k': 50, 'top_p': 1.0, 'repetition_penalty': 1.0, 'encoder_repetition_penalty': 1.0, 'renormalize_logits': False, 'return_dict_in_generate': False, 'output_scores': False, 'output_logits': False, 'pad_token_id': 151643, 'bos_token_id': 151643, 'eos_token_id': [151645, 151643], 'parallel_decoding': False, 'window_size': 5, 'level': 5, 'guess_set_size': 3, '_from_model_config': True} 2025-05-21 15:34:01,328 - mindformers./output/log[mindformers/generation/text_generator.py:801] - INFO - Generation Config is: {'max_length': 512, 'max_new_tokens': 512, 'min_length': 0, 'min_new_tokens': 2, 'num_beams': 1, 'do_sample': True, 'use_past': True, 'temperature': 1.2, 'top_k': 50, 'top_p': 1.0, 'repetition_penalty': 1.0, 'encoder_repetition_penalty': 1.0, 'renormalize_logits': False, 'return_dict_in_generate': False, 'output_scores': False, 'output_logits': False, 'pad_token_id': 151643, 'bos_token_id': 151643, 'eos_token_id': [151645, 151643], 'parallel_decoding': False, 'window_size': 5, 'level': 5, 'guess_set_size': 3, '_from_model_config': True} 2025-05-21 15:34:01,328 - mindformers./output/log[mindformers/generation/text_generator.py:801] - INFO - Generation Config is: {'max_length': 512, 'max_new_tokens': 512, 'min_length': 0, 'min_new_tokens': 2, 'num_beams': 1, 'do_sample': True, 'use_past': True, 'temperature': 1.2, 'top_k': 50, 'top_p': 1.0, 'repetition_penalty': 1.0, 'encoder_repetition_penalty': 1.0, 'renormalize_logits': False, 'return_dict_in_generate': False, 'output_scores': False, 'output_logits': False, 'pad_token_id': 151643, 'bos_token_id': 151643, 'eos_token_id': [151645, 151643], 'parallel_decoding': False, 'window_size': 5, 'level': 5, 'guess_set_size': 3, '_from_model_config': True} 2025-05-21 15:34:01,328 - mindformers./output/log[mindformers/generation/text_generator.py:801] - INFO - Generation Config is: {'max_length': 512, 'max_new_tokens': 512, 'min_length': 0, 'min_new_tokens': 2, 'num_beams': 1, 'do_sample': True, 'use_past': True, 'temperature': 1.2, 'top_k': 50, 'top_p': 1.0, 'repetition_penalty': 1.0, 'encoder_repetition_penalty': 1.0, 'renormalize_logits': False, 'return_dict_in_generate': False, 'output_scores': False, 'output_logits': False, 'pad_token_id': 151643, 'bos_token_id': 151643, 'eos_token_id': [151645, 151643], 'parallel_decoding': False, 'window_size': 5, 'level': 5, 'guess_set_size': 3, '_from_model_config': True} 2025-05-21 15:34:01,328 - mindformers./output/log[mindformers/generation/text_generator.py:859] - INFO - The generation mode will be **SAMPLE**. 2025-05-21 15:34:01,328 - mindformers./output/log[mindformers/generation/text_generator.py:859] - INFO - The generation mode will be **SAMPLE**. 2025-05-21 15:34:01,328 - mindformers./output/log[mindformers/generation/text_generator.py:859] - INFO - The generation mode will be **SAMPLE**. 2025-05-21 15:34:01,328 - mindformers./output/log[mindformers/modules/block_tables.py:63] - INFO - init cache engine success. 2025-05-21 15:34:01,328 - mindformers./output/log[mindformers/modules/block_tables.py:63] - INFO - init cache engine success. 2025-05-21 15:34:01,328 - mindformers./output/log[mindformers/generation/text_generator.py:859] - INFO - The generation mode will be **SAMPLE**. 2025-05-21 15:34:01,329 - mindformers./output/log[mindformers/modules/block_tables.py:63] - INFO - init cache engine success. 2025-05-21 15:34:01,329 - mindformers./output/log[mindformers/modules/block_tables.py:63] - INFO - init cache engine success. 2025-05-21 15:34:01,329 - mindformers./output/log[mindformers/research/qwen2_5/infer/qwen2_5.py:195] - INFO - Set dynamic input for llama. 2025-05-21 15:34:01,329 - mindformers./output/log[mindformers/research/qwen2_5/infer/qwen2_5.py:195] - INFO - Set dynamic input for llama. 2025-05-21 15:34:01,329 - mindformers./output/log[mindformers/research/qwen2_5/infer/qwen2_5.py:195] - INFO - Set dynamic input for llama. 2025-05-21 15:34:01,330 - mindformers./output/log[mindformers/research/qwen2_5/infer/qwen2_5.py:195] - INFO - Set dynamic input for llama. 2025-05-21 15:34:10,779 - mindformers./output/log[mindformers/generation/text_generator.py:1067] - INFO - total time: 9.641974449157715 s; generated tokens: 512 tokens; generate speed: 53.10115710218734 tokens/s 2025-05-21 15:34:10,779 - mindformers./output/log[mindformers/generation/text_generator.py:1067] - INFO - total time: 9.641469955444336 s; generated tokens: 512 tokens; generate speed: 53.10393564115027 tokens/s 2025-05-21 15:34:10,780 - mindformers./output/log[mindformers/tools/debug_info.py:93] - INFO - prefill prepare time: 0.0014917850494384766 s; prefill predict time: 0.0066852569580078125 s; prefill post time: 0.014170408248901367 s; decode prepare time: 0.000997625451731822 s; decode predict time: 0.005181787528243719 s; decode post time: 0.012599913807997732 s 2025-05-21 15:34:10,780 - mindformers./output/log[mindformers/generation/text_generator.py:1067] - INFO - total time: 9.641824960708618 s; generated tokens: 512 tokens; generate speed: 53.101980391310796 tokens/s 2025-05-21 15:34:10,780 - mindformers./output/log[mindformers/tools/debug_info.py:93] - INFO - prefill prepare time: 0.0014743804931640625 s; prefill predict time: 0.006634235382080078 s; prefill post time: 0.01388239860534668 s; decode prepare time: 0.0009888496884161246 s; decode predict time: 0.005086591196995155 s; decode post time: 0.012705685807767446 s 2025-05-21 15:34:10,780 - mindformers./output/log[mindformers/tools/debug_info.py:93] - INFO - prefill prepare time: 0.0014095306396484375 s; prefill predict time: 0.006150245666503906 s; prefill post time: 0.014567375183105469 s; decode prepare time: 0.00105529242065788 s; decode predict time: 0.004568114000208238 s; decode post time: 0.01315706210127082 s 2025-05-21 15:34:10,781 - mindformers./output/log[mindformers/modules/block_tables.py:126] - INFO - Clear block table cache engines. 2025-05-21 15:34:10,781 - mindformers./output/log[mindformers/modules/block_tables.py:126] - INFO - Clear block table cache engines. 2025-05-21 15:34:10,781 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:341] - INFO - infer without vllm end, not use vllm model 2025-05-21 15:34:10,781 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:341] - INFO - infer without vllm end, not use vllm model 2025-05-21 15:34:10,781 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:355] - INFO - Generating elapsed time: 9.649141788482666 2025-05-21 15:34:10,781 - mindformers./output/log[mindformers/modules/block_tables.py:126] - INFO - Clear block table cache engines. 2025-05-21 15:34:10,782 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:355] - INFO - Generating elapsed time: 9.648980140686035 2025-05-21 15:34:10,782 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:341] - INFO - infer without vllm end, not use vllm model 2025-05-21 15:34:10,782 - mindformers./output/log[mindformers/generation/text_generator.py:1067] - INFO - total time: 9.64309811592102 s; generated tokens: 512 tokens; generate speed: 53.09496946366997 tokens/s 2025-05-21 15:34:10,782 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:355] - INFO - Generating elapsed time: 9.648975372314453 2025-05-21 15:34:10,782 - mindformers./output/log[mindformers/tools/debug_info.py:93] - INFO - prefill prepare time: 0.0014600753784179688 s; prefill predict time: 0.005509138107299805 s; prefill post time: 0.014347314834594727 s; decode prepare time: 0.0010508455175709584 s; decode predict time: 0.004520096498377183 s; decode post time: 0.013213690013101655 s 2025-05-21 15:34:10,783 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:313] - INFO - input_ids shape (1, 2048) 2025-05-21 15:34:10,783 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:313] - INFO - input_ids shape (1, 2048) 2025-05-21 15:34:10,783 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:322] - INFO - max_tokens 512 2025-05-21 15:34:10,783 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:313] - INFO - input_ids shape (1, 2048) 2025-05-21 15:34:10,783 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:322] - INFO - max_tokens 512 2025-05-21 15:34:10,783 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:323] - INFO - min_tokens 2 2025-05-21 15:34:10,783 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:323] - INFO - min_tokens 2 2025-05-21 15:34:10,783 - mindformers./output/log[mindformers/modules/block_tables.py:126] - INFO - Clear block table cache engines. 2025-05-21 15:34:10,783 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:334] - INFO - infer without vllm, not use vllm model 2025-05-21 15:34:10,784 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:334] - INFO - infer without vllm, not use vllm model 2025-05-21 15:34:10,784 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:322] - INFO - max_tokens 512 2025-05-21 15:34:10,784 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:341] - INFO - infer without vllm end, not use vllm model 2025-05-21 15:34:10,784 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:323] - INFO - min_tokens 2 2025-05-21 15:34:10,784 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:355] - INFO - Generating elapsed time: 9.650461435317993 2025-05-21 15:34:10,784 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:334] - INFO - infer without vllm, not use vllm model 2025-05-21 15:34:10,785 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:313] - INFO - input_ids shape (1, 2048) 2025-05-21 15:34:10,786 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:322] - INFO - max_tokens 512 2025-05-21 15:34:10,786 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:323] - INFO - min_tokens 2 2025-05-21 15:34:10,786 - mindformers./output/log[mindformers/generation/text_generator.py:801] - INFO - Generation Config is: {'max_length': 512, 'max_new_tokens': 512, 'min_length': 0, 'min_new_tokens': 2, 'num_beams': 1, 'do_sample': True, 'use_past': True, 'temperature': 1.2, 'top_k': 50, 'top_p': 1.0, 'repetition_penalty': 1.0, 'encoder_repetition_penalty': 1.0, 'renormalize_logits': False, 'return_dict_in_generate': False, 'output_scores': False, 'output_logits': False, 'pad_token_id': 151643, 'bos_token_id': 151643, 'eos_token_id': [151645, 151643], 'parallel_decoding': False, 'window_size': 5, 'level': 5, 'guess_set_size': 3, '_from_model_config': True} 2025-05-21 15:34:10,786 - mindformers./output/log[mindformers/generation/text_generator.py:801] - INFO - Generation Config is: {'max_length': 512, 'max_new_tokens': 512, 'min_length': 0, 'min_new_tokens': 2, 'num_beams': 1, 'do_sample': True, 'use_past': True, 'temperature': 1.2, 'top_k': 50, 'top_p': 1.0, 'repetition_penalty': 1.0, 'encoder_repetition_penalty': 1.0, 'renormalize_logits': False, 'return_dict_in_generate': False, 'output_scores': False, 'output_logits': False, 'pad_token_id': 151643, 'bos_token_id': 151643, 'eos_token_id': [151645, 151643], 'parallel_decoding': False, 'window_size': 5, 'level': 5, 'guess_set_size': 3, '_from_model_config': True} 2025-05-21 15:34:10,786 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:334] - INFO - infer without vllm, not use vllm model 2025-05-21 15:34:10,786 - mindformers./output/log[mindformers/generation/text_generator.py:801] - INFO - Generation Config is: {'max_length': 512, 'max_new_tokens': 512, 'min_length': 0, 'min_new_tokens': 2, 'num_beams': 1, 'do_sample': True, 'use_past': True, 'temperature': 1.2, 'top_k': 50, 'top_p': 1.0, 'repetition_penalty': 1.0, 'encoder_repetition_penalty': 1.0, 'renormalize_logits': False, 'return_dict_in_generate': False, 'output_scores': False, 'output_logits': False, 'pad_token_id': 151643, 'bos_token_id': 151643, 'eos_token_id': [151645, 151643], 'parallel_decoding': False, 'window_size': 5, 'level': 5, 'guess_set_size': 3, '_from_model_config': True} 2025-05-21 15:34:10,787 - mindformers./output/log[mindformers/generation/text_generator.py:859] - INFO - The generation mode will be **SAMPLE**. 2025-05-21 15:34:10,787 - mindformers./output/log[mindformers/generation/text_generator.py:859] - INFO - The generation mode will be **SAMPLE**. 2025-05-21 15:34:10,787 - mindformers./output/log[mindformers/generation/text_generator.py:859] - INFO - The generation mode will be **SAMPLE**. 2025-05-21 15:34:10,787 - mindformers./output/log[mindformers/modules/block_tables.py:63] - INFO - init cache engine success. 2025-05-21 15:34:10,787 - mindformers./output/log[mindformers/modules/block_tables.py:63] - INFO - init cache engine success. 2025-05-21 15:34:10,787 - mindformers./output/log[mindformers/modules/block_tables.py:63] - INFO - init cache engine success. 2025-05-21 15:34:10,788 - mindformers./output/log[mindformers/research/qwen2_5/infer/qwen2_5.py:195] - INFO - Set dynamic input for llama. 2025-05-21 15:34:10,788 - mindformers./output/log[mindformers/research/qwen2_5/infer/qwen2_5.py:195] - INFO - Set dynamic input for llama. 2025-05-21 15:34:10,788 - mindformers./output/log[mindformers/research/qwen2_5/infer/qwen2_5.py:195] - INFO - Set dynamic input for llama. 2025-05-21 15:34:10,789 - mindformers./output/log[mindformers/generation/text_generator.py:801] - INFO - Generation Config is: {'max_length': 512, 'max_new_tokens': 512, 'min_length': 0, 'min_new_tokens': 2, 'num_beams': 1, 'do_sample': True, 'use_past': True, 'temperature': 1.2, 'top_k': 50, 'top_p': 1.0, 'repetition_penalty': 1.0, 'encoder_repetition_penalty': 1.0, 'renormalize_logits': False, 'return_dict_in_generate': False, 'output_scores': False, 'output_logits': False, 'pad_token_id': 151643, 'bos_token_id': 151643, 'eos_token_id': [151645, 151643], 'parallel_decoding': False, 'window_size': 5, 'level': 5, 'guess_set_size': 3, '_from_model_config': True} 2025-05-21 15:34:10,789 - mindformers./output/log[mindformers/generation/text_generator.py:859] - INFO - The generation mode will be **SAMPLE**. 2025-05-21 15:34:10,790 - mindformers./output/log[mindformers/modules/block_tables.py:63] - INFO - init cache engine success. 2025-05-21 15:34:10,790 - mindformers./output/log[mindformers/research/qwen2_5/infer/qwen2_5.py:195] - INFO - Set dynamic input for llama. 2025-05-21 15:34:10,977 - mindformers./output/log[mindformers/generation/text_generator.py:1067] - INFO - total time: 9.64749264717102 s; generated tokens: 512 tokens; generate speed: 53.07078416381444 tokens/s 2025-05-21 15:34:10,978 - mindformers./output/log[mindformers/generation/text_generator.py:1067] - INFO - total time: 9.647751331329346 s; generated tokens: 512 tokens; generate speed: 53.069361182369164 tokens/s 2025-05-21 15:34:10,978 - mindformers./output/log[mindformers/tools/debug_info.py:93] - INFO - prefill prepare time: 0.0015530586242675781 s; prefill predict time: 0.00999903678894043 s; prefill post time: 0.014010190963745117 s; decode prepare time: 0.0011310428089358335 s; decode predict time: 0.004851516087849935 s; decode post time: 0.012794259942907643 s 2025-05-21 15:34:10,978 - mindformers./output/log[mindformers/generation/text_generator.py:1067] - INFO - total time: 9.648255348205566 s; generated tokens: 512 tokens; generate speed: 53.066588882851704 tokens/s 2025-05-21 15:34:10,978 - mindformers./output/log[mindformers/generation/text_generator.py:1067] - INFO - total time: 9.648470878601074 s; generated tokens: 512 tokens; generate speed: 53.06540346569762 tokens/s 2025-05-21 15:34:10,978 - mindformers./output/log[mindformers/tools/debug_info.py:93] - INFO - prefill prepare time: 0.001524209976196289 s; prefill predict time: 0.010408878326416016 s; prefill post time: 0.014553308486938477 s; decode prepare time: 0.0010910547409505526 s; decode predict time: 0.004481627894382851 s; decode post time: 0.013208673891489286 s 2025-05-21 15:34:10,979 - mindformers./output/log[mindformers/tools/debug_info.py:93] - INFO - prefill prepare time: 0.0014154911041259766 s; prefill predict time: 0.010284900665283203 s; prefill post time: 0.014176130294799805 s; decode prepare time: 0.001045828462813465 s; decode predict time: 0.004841268296335258 s; decode post time: 0.01289384892308782 s 2025-05-21 15:34:10,979 - mindformers./output/log[mindformers/tools/debug_info.py:93] - INFO - prefill prepare time: 0.0015659332275390625 s; prefill predict time: 0.010155439376831055 s; prefill post time: 0.01485443115234375 s; decode prepare time: 0.0010992127621943704 s; decode predict time: 0.004297399520874023 s; decode post time: 0.013386162293167263 s 2025-05-21 15:34:10,979 - mindformers./output/log[mindformers/modules/block_tables.py:126] - INFO - Clear block table cache engines. 2025-05-21 15:34:10,979 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:341] - INFO - infer without vllm end, not use vllm model 2025-05-21 15:34:10,979 - mindformers./output/log[mindformers/modules/block_tables.py:126] - INFO - Clear block table cache engines. 2025-05-21 15:34:10,979 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:355] - INFO - Generating elapsed time: 9.654701948165894 2025-05-21 15:34:10,980 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:341] - INFO - infer without vllm end, not use vllm model 2025-05-21 15:34:10,980 - mindformers./output/log[mindformers/modules/block_tables.py:126] - INFO - Clear block table cache engines. 2025-05-21 15:34:10,980 - mindformers./output/log[mindformers/modules/block_tables.py:126] - INFO - Clear block table cache engines. 2025-05-21 15:34:10,980 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:355] - INFO - Generating elapsed time: 9.654874801635742 2025-05-21 15:34:10,980 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:341] - INFO - infer without vllm end, not use vllm model 2025-05-21 15:34:10,980 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:341] - INFO - infer without vllm end, not use vllm model 2025-05-21 15:34:10,980 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:355] - INFO - Generating elapsed time: 9.655286073684692 2025-05-21 15:34:10,980 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:355] - INFO - Generating elapsed time: 9.655342102050781 2025-05-21 15:34:10,981 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:313] - INFO - input_ids shape (1, 2048) 2025-05-21 15:34:10,981 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:313] - INFO - input_ids shape (1, 2048) 2025-05-21 15:34:10,981 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:322] - INFO - max_tokens 512 2025-05-21 15:34:10,981 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:323] - INFO - min_tokens 2 2025-05-21 15:34:10,981 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:313] - INFO - input_ids shape (1, 2048) 2025-05-21 15:34:10,981 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:313] - INFO - input_ids shape (1, 2048) 2025-05-21 15:34:10,981 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:322] - INFO - max_tokens 512 2025-05-21 15:34:10,982 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:334] - INFO - infer without vllm, not use vllm model 2025-05-21 15:34:10,982 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:323] - INFO - min_tokens 2 2025-05-21 15:34:10,982 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:334] - INFO - infer without vllm, not use vllm model 2025-05-21 15:34:10,982 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:322] - INFO - max_tokens 512 2025-05-21 15:34:10,982 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:322] - INFO - max_tokens 512 2025-05-21 15:34:10,982 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:323] - INFO - min_tokens 2 2025-05-21 15:34:10,982 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:323] - INFO - min_tokens 2 2025-05-21 15:34:10,982 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:334] - INFO - infer without vllm, not use vllm model 2025-05-21 15:34:10,982 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:334] - INFO - infer without vllm, not use vllm model 2025-05-21 15:34:10,984 - mindformers./output/log[mindformers/generation/text_generator.py:801] - INFO - Generation Config is: {'max_length': 512, 'max_new_tokens': 512, 'min_length': 0, 'min_new_tokens': 2, 'num_beams': 1, 'do_sample': True, 'use_past': True, 'temperature': 1.2, 'top_k': 50, 'top_p': 1.0, 'repetition_penalty': 1.0, 'encoder_repetition_penalty': 1.0, 'renormalize_logits': False, 'return_dict_in_generate': False, 'output_scores': False, 'output_logits': False, 'pad_token_id': 151643, 'bos_token_id': 151643, 'eos_token_id': [151645, 151643], 'parallel_decoding': False, 'window_size': 5, 'level': 5, 'guess_set_size': 3, '_from_model_config': True} 2025-05-21 15:34:10,984 - mindformers./output/log[mindformers/generation/text_generator.py:801] - INFO - Generation Config is: {'max_length': 512, 'max_new_tokens': 512, 'min_length': 0, 'min_new_tokens': 2, 'num_beams': 1, 'do_sample': True, 'use_past': True, 'temperature': 1.2, 'top_k': 50, 'top_p': 1.0, 'repetition_penalty': 1.0, 'encoder_repetition_penalty': 1.0, 'renormalize_logits': False, 'return_dict_in_generate': False, 'output_scores': False, 'output_logits': False, 'pad_token_id': 151643, 'bos_token_id': 151643, 'eos_token_id': [151645, 151643], 'parallel_decoding': False, 'window_size': 5, 'level': 5, 'guess_set_size': 3, '_from_model_config': True} 2025-05-21 15:34:10,984 - mindformers./output/log[mindformers/generation/text_generator.py:859] - INFO - The generation mode will be **SAMPLE**. 2025-05-21 15:34:10,984 - mindformers./output/log[mindformers/generation/text_generator.py:801] - INFO - Generation Config is: {'max_length': 512, 'max_new_tokens': 512, 'min_length': 0, 'min_new_tokens': 2, 'num_beams': 1, 'do_sample': True, 'use_past': True, 'temperature': 1.2, 'top_k': 50, 'top_p': 1.0, 'repetition_penalty': 1.0, 'encoder_repetition_penalty': 1.0, 'renormalize_logits': False, 'return_dict_in_generate': False, 'output_scores': False, 'output_logits': False, 'pad_token_id': 151643, 'bos_token_id': 151643, 'eos_token_id': [151645, 151643], 'parallel_decoding': False, 'window_size': 5, 'level': 5, 'guess_set_size': 3, '_from_model_config': True} 2025-05-21 15:34:10,984 - mindformers./output/log[mindformers/generation/text_generator.py:801] - INFO - Generation Config is: {'max_length': 512, 'max_new_tokens': 512, 'min_length': 0, 'min_new_tokens': 2, 'num_beams': 1, 'do_sample': True, 'use_past': True, 'temperature': 1.2, 'top_k': 50, 'top_p': 1.0, 'repetition_penalty': 1.0, 'encoder_repetition_penalty': 1.0, 'renormalize_logits': False, 'return_dict_in_generate': False, 'output_scores': False, 'output_logits': False, 'pad_token_id': 151643, 'bos_token_id': 151643, 'eos_token_id': [151645, 151643], 'parallel_decoding': False, 'window_size': 5, 'level': 5, 'guess_set_size': 3, '_from_model_config': True} 2025-05-21 15:34:10,985 - mindformers./output/log[mindformers/modules/block_tables.py:63] - INFO - init cache engine success. 2025-05-21 15:34:10,985 - mindformers./output/log[mindformers/generation/text_generator.py:859] - INFO - The generation mode will be **SAMPLE**. 2025-05-21 15:34:10,985 - mindformers./output/log[mindformers/modules/block_tables.py:63] - INFO - init cache engine success. 2025-05-21 15:34:10,985 - mindformers./output/log[mindformers/generation/text_generator.py:859] - INFO - The generation mode will be **SAMPLE**. 2025-05-21 15:34:10,985 - mindformers./output/log[mindformers/generation/text_generator.py:859] - INFO - The generation mode will be **SAMPLE**. 2025-05-21 15:34:10,985 - mindformers./output/log[mindformers/modules/block_tables.py:63] - INFO - init cache engine success. 2025-05-21 15:34:10,986 - mindformers./output/log[mindformers/modules/block_tables.py:63] - INFO - init cache engine success. 2025-05-21 15:34:10,986 - mindformers./output/log[mindformers/research/qwen2_5/infer/qwen2_5.py:195] - INFO - Set dynamic input for llama. 2025-05-21 15:34:10,986 - mindformers./output/log[mindformers/research/qwen2_5/infer/qwen2_5.py:195] - INFO - Set dynamic input for llama. 2025-05-21 15:34:10,986 - mindformers./output/log[mindformers/research/qwen2_5/infer/qwen2_5.py:195] - INFO - Set dynamic input for llama. 2025-05-21 15:34:10,986 - mindformers./output/log[mindformers/research/qwen2_5/infer/qwen2_5.py:195] - INFO - Set dynamic input for llama. 2025-05-21 15:34:20,452 - mindformers./output/log[mindformers/generation/text_generator.py:1067] - INFO - total time: 9.66344666481018 s; generated tokens: 512 tokens; generate speed: 52.98316612689218 tokens/s 2025-05-21 15:34:20,452 - mindformers./output/log[mindformers/generation/text_generator.py:1067] - INFO - total time: 9.661150217056274 s; generated tokens: 512 tokens; generate speed: 52.99576018351208 tokens/s 2025-05-21 15:34:20,452 - mindformers./output/log[mindformers/generation/text_generator.py:1067] - INFO - total time: 9.66378378868103 s; generated tokens: 512 tokens; generate speed: 52.98131779393636 tokens/s 2025-05-21 15:34:20,452 - mindformers./output/log[mindformers/generation/text_generator.py:1067] - INFO - total time: 9.663529396057129 s; generated tokens: 512 tokens; generate speed: 52.982712528292616 tokens/s 2025-05-21 15:34:20,452 - mindformers./output/log[mindformers/tools/debug_info.py:93] - INFO - prefill prepare time: 0.0014948844909667969 s; prefill predict time: 0.010620355606079102 s; prefill post time: 0.014496326446533203 s; decode prepare time: 0.0009890694440927523 s; decode predict time: 0.005079193208731857 s; decode post time: 0.012746256624882468 s 2025-05-21 15:34:20,452 - mindformers./output/log[mindformers/tools/debug_info.py:93] - INFO - prefill prepare time: 0.001470327377319336 s; prefill predict time: 0.008039474487304688 s; prefill post time: 0.014765262603759766 s; decode prepare time: 0.001041714225963137 s; decode predict time: 0.004561552814408845 s; decode post time: 0.01321025743876418 s 2025-05-21 15:34:20,453 - mindformers./output/log[mindformers/tools/debug_info.py:93] - INFO - prefill prepare time: 0.0015196800231933594 s; prefill predict time: 0.010654926300048828 s; prefill post time: 0.013713598251342773 s; decode prepare time: 0.0009864990958495616 s; decode predict time: 0.005221448692620969 s; decode post time: 0.012606723434537824 s 2025-05-21 15:34:20,453 - mindformers./output/log[mindformers/tools/debug_info.py:93] - INFO - prefill prepare time: 0.0013804435729980469 s; prefill predict time: 0.010954856872558594 s; prefill post time: 0.014708280563354492 s; decode prepare time: 0.0010559857474847782 s; decode predict time: 0.004403398551192938 s; decode post time: 0.01335253174290965 s 2025-05-21 15:34:20,453 - mindformers./output/log[mindformers/modules/block_tables.py:126] - INFO - Clear block table cache engines. 2025-05-21 15:34:20,453 - mindformers./output/log[mindformers/modules/block_tables.py:126] - INFO - Clear block table cache engines. 2025-05-21 15:34:20,454 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:341] - INFO - infer without vllm end, not use vllm model 2025-05-21 15:34:20,454 - mindformers./output/log[mindformers/modules/block_tables.py:126] - INFO - Clear block table cache engines. 2025-05-21 15:34:20,454 - mindformers./output/log[mindformers/modules/block_tables.py:126] - INFO - Clear block table cache engines. 2025-05-21 15:34:20,454 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:341] - INFO - infer without vllm end, not use vllm model 2025-05-21 15:34:20,454 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:355] - INFO - Generating elapsed time: 9.670669555664062 2025-05-21 15:34:20,454 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:341] - INFO - infer without vllm end, not use vllm model 2025-05-21 15:34:20,454 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:341] - INFO - infer without vllm end, not use vllm model 2025-05-21 15:34:20,454 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:355] - INFO - Generating elapsed time: 9.668174982070923 2025-05-21 15:34:20,454 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:355] - INFO - Generating elapsed time: 9.67095136642456 2025-05-21 15:34:20,454 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:355] - INFO - Generating elapsed time: 9.670560836791992 2025-05-21 15:34:20,455 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:313] - INFO - input_ids shape (1, 2048) 2025-05-21 15:34:20,455 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:313] - INFO - input_ids shape (1, 2048) 2025-05-21 15:34:20,455 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:313] - INFO - input_ids shape (1, 2048) 2025-05-21 15:34:20,455 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:313] - INFO - input_ids shape (1, 2048) 2025-05-21 15:34:20,455 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:322] - INFO - max_tokens 512 2025-05-21 15:34:20,455 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:323] - INFO - min_tokens 2 2025-05-21 15:34:20,455 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:322] - INFO - max_tokens 512 2025-05-21 15:34:20,456 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:322] - INFO - max_tokens 512 2025-05-21 15:34:20,456 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:322] - INFO - max_tokens 512 2025-05-21 15:34:20,456 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:334] - INFO - infer without vllm, not use vllm model 2025-05-21 15:34:20,456 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:323] - INFO - min_tokens 2 2025-05-21 15:34:20,456 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:323] - INFO - min_tokens 2 2025-05-21 15:34:20,456 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:323] - INFO - min_tokens 2 2025-05-21 15:34:20,456 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:334] - INFO - infer without vllm, not use vllm model 2025-05-21 15:34:20,456 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:334] - INFO - infer without vllm, not use vllm model 2025-05-21 15:34:20,456 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:334] - INFO - infer without vllm, not use vllm model 2025-05-21 15:34:20,458 - mindformers./output/log[mindformers/generation/text_generator.py:801] - INFO - Generation Config is: {'max_length': 512, 'max_new_tokens': 512, 'min_length': 0, 'min_new_tokens': 2, 'num_beams': 1, 'do_sample': True, 'use_past': True, 'temperature': 1.2, 'top_k': 50, 'top_p': 1.0, 'repetition_penalty': 1.0, 'encoder_repetition_penalty': 1.0, 'renormalize_logits': False, 'return_dict_in_generate': False, 'output_scores': False, 'output_logits': False, 'pad_token_id': 151643, 'bos_token_id': 151643, 'eos_token_id': [151645, 151643], 'parallel_decoding': False, 'window_size': 5, 'level': 5, 'guess_set_size': 3, '_from_model_config': True} 2025-05-21 15:34:20,458 - mindformers./output/log[mindformers/generation/text_generator.py:801] - INFO - Generation Config is: {'max_length': 512, 'max_new_tokens': 512, 'min_length': 0, 'min_new_tokens': 2, 'num_beams': 1, 'do_sample': True, 'use_past': True, 'temperature': 1.2, 'top_k': 50, 'top_p': 1.0, 'repetition_penalty': 1.0, 'encoder_repetition_penalty': 1.0, 'renormalize_logits': False, 'return_dict_in_generate': False, 'output_scores': False, 'output_logits': False, 'pad_token_id': 151643, 'bos_token_id': 151643, 'eos_token_id': [151645, 151643], 'parallel_decoding': False, 'window_size': 5, 'level': 5, 'guess_set_size': 3, '_from_model_config': True} 2025-05-21 15:34:20,459 - mindformers./output/log[mindformers/generation/text_generator.py:859] - INFO - The generation mode will be **SAMPLE**. 2025-05-21 15:34:20,459 - mindformers./output/log[mindformers/generation/text_generator.py:801] - INFO - Generation Config is: {'max_length': 512, 'max_new_tokens': 512, 'min_length': 0, 'min_new_tokens': 2, 'num_beams': 1, 'do_sample': True, 'use_past': True, 'temperature': 1.2, 'top_k': 50, 'top_p': 1.0, 'repetition_penalty': 1.0, 'encoder_repetition_penalty': 1.0, 'renormalize_logits': False, 'return_dict_in_generate': False, 'output_scores': False, 'output_logits': False, 'pad_token_id': 151643, 'bos_token_id': 151643, 'eos_token_id': [151645, 151643], 'parallel_decoding': False, 'window_size': 5, 'level': 5, 'guess_set_size': 3, '_from_model_config': True} 2025-05-21 15:34:20,459 - mindformers./output/log[mindformers/generation/text_generator.py:801] - INFO - Generation Config is: {'max_length': 512, 'max_new_tokens': 512, 'min_length': 0, 'min_new_tokens': 2, 'num_beams': 1, 'do_sample': True, 'use_past': True, 'temperature': 1.2, 'top_k': 50, 'top_p': 1.0, 'repetition_penalty': 1.0, 'encoder_repetition_penalty': 1.0, 'renormalize_logits': False, 'return_dict_in_generate': False, 'output_scores': False, 'output_logits': False, 'pad_token_id': 151643, 'bos_token_id': 151643, 'eos_token_id': [151645, 151643], 'parallel_decoding': False, 'window_size': 5, 'level': 5, 'guess_set_size': 3, '_from_model_config': True} 2025-05-21 15:34:20,459 - mindformers./output/log[mindformers/modules/block_tables.py:63] - INFO - init cache engine success. 2025-05-21 15:34:20,459 - mindformers./output/log[mindformers/generation/text_generator.py:859] - INFO - The generation mode will be **SAMPLE**. 2025-05-21 15:34:20,459 - mindformers./output/log[mindformers/modules/block_tables.py:63] - INFO - init cache engine success. 2025-05-21 15:34:20,459 - mindformers./output/log[mindformers/generation/text_generator.py:859] - INFO - The generation mode will be **SAMPLE**. 2025-05-21 15:34:20,459 - mindformers./output/log[mindformers/generation/text_generator.py:859] - INFO - The generation mode will be **SAMPLE**. 2025-05-21 15:34:20,459 - mindformers./output/log[mindformers/modules/block_tables.py:63] - INFO - init cache engine success. 2025-05-21 15:34:20,459 - mindformers./output/log[mindformers/modules/block_tables.py:63] - INFO - init cache engine success. 2025-05-21 15:34:20,460 - mindformers./output/log[mindformers/research/qwen2_5/infer/qwen2_5.py:195] - INFO - Set dynamic input for llama. 2025-05-21 15:34:20,460 - mindformers./output/log[mindformers/research/qwen2_5/infer/qwen2_5.py:195] - INFO - Set dynamic input for llama. 2025-05-21 15:34:20,460 - mindformers./output/log[mindformers/research/qwen2_5/infer/qwen2_5.py:195] - INFO - Set dynamic input for llama. 2025-05-21 15:34:20,460 - mindformers./output/log[mindformers/research/qwen2_5/infer/qwen2_5.py:195] - INFO - Set dynamic input for llama. 2025-05-21 15:34:20,587 - mindformers./output/log[mindformers/generation/text_generator.py:1067] - INFO - total time: 9.600711107254028 s; generated tokens: 512 tokens; generate speed: 53.32938303009109 tokens/s 2025-05-21 15:34:20,588 - mindformers./output/log[mindformers/generation/text_generator.py:1067] - INFO - total time: 9.601932048797607 s; generated tokens: 512 tokens; generate speed: 53.32260188865997 tokens/s 2025-05-21 15:34:20,588 - mindformers./output/log[mindformers/tools/debug_info.py:93] - INFO - prefill prepare time: 0.0014963150024414062 s; prefill predict time: 0.01349639892578125 s; prefill post time: 0.014106273651123047 s; decode prepare time: 0.001053636088063339 s; decode predict time: 0.004753439566668342 s; decode post time: 0.012873654729466149 s 2025-05-21 15:34:20,588 - mindformers./output/log[mindformers/generation/text_generator.py:1067] - INFO - total time: 9.601436614990234 s; generated tokens: 512 tokens; generate speed: 53.325353333129385 tokens/s 2025-05-21 15:34:20,588 - mindformers./output/log[mindformers/generation/text_generator.py:1067] - INFO - total time: 9.60189700126648 s; generated tokens: 512 tokens; generate speed: 53.32279651952815 tokens/s 2025-05-21 15:34:20,588 - mindformers./output/log[mindformers/tools/debug_info.py:93] - INFO - prefill prepare time: 0.0015833377838134766 s; prefill predict time: 0.014130592346191406 s; prefill post time: 0.014081478118896484 s; decode prepare time: 0.0011194428819033032 s; decode predict time: 0.004778071478301404 s; decode post time: 0.01278248971689004 s 2025-05-21 15:34:20,589 - mindformers./output/log[mindformers/tools/debug_info.py:93] - INFO - prefill prepare time: 0.0014994144439697266 s; prefill predict time: 0.014629602432250977 s; prefill post time: 0.014829874038696289 s; decode prepare time: 0.0010965033753277504 s; decode predict time: 0.004221805404214298 s; decode post time: 0.013364254845098506 s 2025-05-21 15:34:20,589 - mindformers./output/log[mindformers/tools/debug_info.py:93] - INFO - prefill prepare time: 0.001425027847290039 s; prefill predict time: 0.014089107513427734 s; prefill post time: 0.014788150787353516 s; decode prepare time: 0.0010868761170866905 s; decode predict time: 0.0044094099717981675 s; decode post time: 0.013185972803492835 s 2025-05-21 15:34:20,589 - mindformers./output/log[mindformers/modules/block_tables.py:126] - INFO - Clear block table cache engines. 2025-05-21 15:34:20,589 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:341] - INFO - infer without vllm end, not use vllm model 2025-05-21 15:34:20,589 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:355] - INFO - Generating elapsed time: 9.607619285583496 2025-05-21 15:34:20,589 - mindformers./output/log[mindformers/modules/block_tables.py:126] - INFO - Clear block table cache engines. 2025-05-21 15:34:20,590 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:341] - INFO - infer without vllm end, not use vllm model 2025-05-21 15:34:20,590 - mindformers./output/log[mindformers/modules/block_tables.py:126] - INFO - Clear block table cache engines. 2025-05-21 15:34:20,590 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:355] - INFO - Generating elapsed time: 9.60889744758606 2025-05-21 15:34:20,590 - mindformers./output/log[mindformers/modules/block_tables.py:126] - INFO - Clear block table cache engines. 2025-05-21 15:34:20,590 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:341] - INFO - infer without vllm end, not use vllm model 2025-05-21 15:34:20,590 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:341] - INFO - infer without vllm end, not use vllm model 2025-05-21 15:34:20,590 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:355] - INFO - Generating elapsed time: 9.608465194702148 2025-05-21 15:34:20,590 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:355] - INFO - Generating elapsed time: 9.609017848968506 2025-05-21 15:34:20,591 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:313] - INFO - input_ids shape (1, 2048) 2025-05-21 15:34:20,591 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:313] - INFO - input_ids shape (1, 2048) 2025-05-21 15:34:20,591 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:322] - INFO - max_tokens 512 2025-05-21 15:34:20,591 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:323] - INFO - min_tokens 2 2025-05-21 15:34:20,591 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:334] - INFO - infer without vllm, not use vllm model 2025-05-21 15:34:20,591 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:313] - INFO - input_ids shape (1, 2048) 2025-05-21 15:34:20,591 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:322] - INFO - max_tokens 512 2025-05-21 15:34:20,592 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:313] - INFO - input_ids shape (1, 2048) 2025-05-21 15:34:20,592 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:323] - INFO - min_tokens 2 2025-05-21 15:34:20,592 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:334] - INFO - infer without vllm, not use vllm model 2025-05-21 15:34:20,592 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:322] - INFO - max_tokens 512 2025-05-21 15:34:20,592 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:323] - INFO - min_tokens 2 2025-05-21 15:34:20,592 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:322] - INFO - max_tokens 512 2025-05-21 15:34:20,592 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:334] - INFO - infer without vllm, not use vllm model 2025-05-21 15:34:20,592 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:323] - INFO - min_tokens 2 2025-05-21 15:34:20,593 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:334] - INFO - infer without vllm, not use vllm model 2025-05-21 15:34:20,594 - mindformers./output/log[mindformers/generation/text_generator.py:801] - INFO - Generation Config is: {'max_length': 512, 'max_new_tokens': 512, 'min_length': 0, 'min_new_tokens': 2, 'num_beams': 1, 'do_sample': True, 'use_past': True, 'temperature': 1.2, 'top_k': 50, 'top_p': 1.0, 'repetition_penalty': 1.0, 'encoder_repetition_penalty': 1.0, 'renormalize_logits': False, 'return_dict_in_generate': False, 'output_scores': False, 'output_logits': False, 'pad_token_id': 151643, 'bos_token_id': 151643, 'eos_token_id': [151645, 151643], 'parallel_decoding': False, 'window_size': 5, 'level': 5, 'guess_set_size': 3, '_from_model_config': True} 2025-05-21 15:34:20,594 - mindformers./output/log[mindformers/generation/text_generator.py:801] - INFO - Generation Config is: {'max_length': 512, 'max_new_tokens': 512, 'min_length': 0, 'min_new_tokens': 2, 'num_beams': 1, 'do_sample': True, 'use_past': True, 'temperature': 1.2, 'top_k': 50, 'top_p': 1.0, 'repetition_penalty': 1.0, 'encoder_repetition_penalty': 1.0, 'renormalize_logits': False, 'return_dict_in_generate': False, 'output_scores': False, 'output_logits': False, 'pad_token_id': 151643, 'bos_token_id': 151643, 'eos_token_id': [151645, 151643], 'parallel_decoding': False, 'window_size': 5, 'level': 5, 'guess_set_size': 3, '_from_model_config': True} 2025-05-21 15:34:20,594 - mindformers./output/log[mindformers/generation/text_generator.py:859] - INFO - The generation mode will be **SAMPLE**. 2025-05-21 15:34:20,595 - mindformers./output/log[mindformers/modules/block_tables.py:63] - INFO - init cache engine success. 2025-05-21 15:34:20,595 - mindformers./output/log[mindformers/generation/text_generator.py:859] - INFO - The generation mode will be **SAMPLE**. 2025-05-21 15:34:20,595 - mindformers./output/log[mindformers/modules/block_tables.py:63] - INFO - init cache engine success. 2025-05-21 15:34:20,595 - mindformers./output/log[mindformers/generation/text_generator.py:801] - INFO - Generation Config is: {'max_length': 512, 'max_new_tokens': 512, 'min_length': 0, 'min_new_tokens': 2, 'num_beams': 1, 'do_sample': True, 'use_past': True, 'temperature': 1.2, 'top_k': 50, 'top_p': 1.0, 'repetition_penalty': 1.0, 'encoder_repetition_penalty': 1.0, 'renormalize_logits': False, 'return_dict_in_generate': False, 'output_scores': False, 'output_logits': False, 'pad_token_id': 151643, 'bos_token_id': 151643, 'eos_token_id': [151645, 151643], 'parallel_decoding': False, 'window_size': 5, 'level': 5, 'guess_set_size': 3, '_from_model_config': True} 2025-05-21 15:34:20,595 - mindformers./output/log[mindformers/generation/text_generator.py:801] - INFO - Generation Config is: {'max_length': 512, 'max_new_tokens': 512, 'min_length': 0, 'min_new_tokens': 2, 'num_beams': 1, 'do_sample': True, 'use_past': True, 'temperature': 1.2, 'top_k': 50, 'top_p': 1.0, 'repetition_penalty': 1.0, 'encoder_repetition_penalty': 1.0, 'renormalize_logits': False, 'return_dict_in_generate': False, 'output_scores': False, 'output_logits': False, 'pad_token_id': 151643, 'bos_token_id': 151643, 'eos_token_id': [151645, 151643], 'parallel_decoding': False, 'window_size': 5, 'level': 5, 'guess_set_size': 3, '_from_model_config': True} 2025-05-21 15:34:20,596 - mindformers./output/log[mindformers/generation/text_generator.py:859] - INFO - The generation mode will be **SAMPLE**. 2025-05-21 15:34:20,596 - mindformers./output/log[mindformers/research/qwen2_5/infer/qwen2_5.py:195] - INFO - Set dynamic input for llama. 2025-05-21 15:34:20,596 - mindformers./output/log[mindformers/generation/text_generator.py:859] - INFO - The generation mode will be **SAMPLE**. 2025-05-21 15:34:20,596 - mindformers./output/log[mindformers/modules/block_tables.py:63] - INFO - init cache engine success. 2025-05-21 15:34:20,596 - mindformers./output/log[mindformers/modules/block_tables.py:63] - INFO - init cache engine success. 2025-05-21 15:34:20,596 - mindformers./output/log[mindformers/research/qwen2_5/infer/qwen2_5.py:195] - INFO - Set dynamic input for llama. 2025-05-21 15:34:20,597 - mindformers./output/log[mindformers/research/qwen2_5/infer/qwen2_5.py:195] - INFO - Set dynamic input for llama. 2025-05-21 15:34:20,597 - mindformers./output/log[mindformers/research/qwen2_5/infer/qwen2_5.py:195] - INFO - Set dynamic input for llama. 2025-05-21 15:34:30,083 - mindformers./output/log[mindformers/generation/text_generator.py:1067] - INFO - total time: 9.622953176498413 s; generated tokens: 512 tokens; generate speed: 53.206119847951484 tokens/s 2025-05-21 15:34:30,084 - mindformers./output/log[mindformers/generation/text_generator.py:1067] - INFO - total time: 9.623153924942017 s; generated tokens: 512 tokens; generate speed: 53.205009916027606 tokens/s 2025-05-21 15:34:30,084 - mindformers./output/log[mindformers/tools/debug_info.py:93] - INFO - prefill prepare time: 0.0014832019805908203 s; prefill predict time: 0.0058956146240234375 s; prefill post time: 0.014546632766723633 s; decode prepare time: 0.0009821501730006268 s; decode predict time: 0.00507351837906183 s; decode post time: 0.012690325305886464 s 2025-05-21 15:34:30,084 - mindformers./output/log[mindformers/generation/text_generator.py:1067] - INFO - total time: 9.623408317565918 s; generated tokens: 512 tokens; generate speed: 53.20360345361527 tokens/s 2025-05-21 15:34:30,084 - mindformers./output/log[mindformers/generation/text_generator.py:1067] - INFO - total time: 9.623714208602905 s; generated tokens: 512 tokens; generate speed: 53.201912369998375 tokens/s 2025-05-21 15:34:30,084 - mindformers./output/log[mindformers/tools/debug_info.py:93] - INFO - prefill prepare time: 0.0014576911926269531 s; prefill predict time: 0.005631446838378906 s; prefill post time: 0.013707399368286133 s; decode prepare time: 0.0009763903347246801 s; decode predict time: 0.005203066620172239 s; decode post time: 0.012565893203078417 s 2025-05-21 15:34:30,085 - mindformers./output/log[mindformers/tools/debug_info.py:93] - INFO - prefill prepare time: 0.0014595985412597656 s; prefill predict time: 0.005599021911621094 s; prefill post time: 0.014514684677124023 s; decode prepare time: 0.0010247865068469262 s; decode predict time: 0.004589001805174585 s; decode post time: 0.013132935634097941 s 2025-05-21 15:34:30,085 - mindformers./output/log[mindformers/modules/block_tables.py:126] - INFO - Clear block table cache engines. 2025-05-21 15:34:30,085 - mindformers./output/log[mindformers/tools/debug_info.py:93] - INFO - prefill prepare time: 0.0014889240264892578 s; prefill predict time: 0.005649566650390625 s; prefill post time: 0.014838695526123047 s; decode prepare time: 0.0010517100765280527 s; decode predict time: 0.004389805419772279 s; decode post time: 0.01330341284988911 s 2025-05-21 15:34:30,085 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:341] - INFO - infer without vllm end, not use vllm model 2025-05-21 15:34:30,085 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:355] - INFO - Generating elapsed time: 9.630080223083496 2025-05-21 15:34:30,085 - mindformers./output/log[mindformers/modules/block_tables.py:126] - INFO - Clear block table cache engines. 2025-05-21 15:34:30,086 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:341] - INFO - infer without vllm end, not use vllm model 2025-05-21 15:34:30,086 - mindformers./output/log[mindformers/modules/block_tables.py:126] - INFO - Clear block table cache engines. 2025-05-21 15:34:30,086 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:355] - INFO - Generating elapsed time: 9.630340576171875 2025-05-21 15:34:30,086 - mindformers./output/log[mindformers/modules/block_tables.py:126] - INFO - Clear block table cache engines. 2025-05-21 15:34:30,086 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:341] - INFO - infer without vllm end, not use vllm model 2025-05-21 15:34:30,086 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:341] - INFO - infer without vllm end, not use vllm model 2025-05-21 15:34:30,086 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:355] - INFO - Generating elapsed time: 9.630769968032837 2025-05-21 15:34:30,086 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:355] - INFO - Generating elapsed time: 9.630850791931152 2025-05-21 15:34:30,087 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:313] - INFO - input_ids shape (1, 2048) 2025-05-21 15:34:30,087 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:322] - INFO - max_tokens 512 2025-05-21 15:34:30,087 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:323] - INFO - min_tokens 2 2025-05-21 15:34:30,087 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:313] - INFO - input_ids shape (1, 2048) 2025-05-21 15:34:30,087 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:334] - INFO - infer without vllm, not use vllm model 2025-05-21 15:34:30,087 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:313] - INFO - input_ids shape (1, 2048) 2025-05-21 15:34:30,087 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:313] - INFO - input_ids shape (1, 2048) 2025-05-21 15:34:30,088 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:322] - INFO - max_tokens 512 2025-05-21 15:34:30,088 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:322] - INFO - max_tokens 512 2025-05-21 15:34:30,088 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:323] - INFO - min_tokens 2 2025-05-21 15:34:30,088 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:322] - INFO - max_tokens 512 2025-05-21 15:34:30,088 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:323] - INFO - min_tokens 2 2025-05-21 15:34:30,088 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:334] - INFO - infer without vllm, not use vllm model 2025-05-21 15:34:30,088 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:323] - INFO - min_tokens 2 2025-05-21 15:34:30,088 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:334] - INFO - infer without vllm, not use vllm model 2025-05-21 15:34:30,088 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:334] - INFO - infer without vllm, not use vllm model 2025-05-21 15:34:30,090 - mindformers./output/log[mindformers/generation/text_generator.py:801] - INFO - Generation Config is: {'max_length': 512, 'max_new_tokens': 512, 'min_length': 0, 'min_new_tokens': 2, 'num_beams': 1, 'do_sample': True, 'use_past': True, 'temperature': 1.2, 'top_k': 50, 'top_p': 1.0, 'repetition_penalty': 1.0, 'encoder_repetition_penalty': 1.0, 'renormalize_logits': False, 'return_dict_in_generate': False, 'output_scores': False, 'output_logits': False, 'pad_token_id': 151643, 'bos_token_id': 151643, 'eos_token_id': [151645, 151643], 'parallel_decoding': False, 'window_size': 5, 'level': 5, 'guess_set_size': 3, '_from_model_config': True} 2025-05-21 15:34:30,090 - mindformers./output/log[mindformers/generation/text_generator.py:859] - INFO - The generation mode will be **SAMPLE**. 2025-05-21 15:34:30,090 - mindformers./output/log[mindformers/generation/text_generator.py:801] - INFO - Generation Config is: {'max_length': 512, 'max_new_tokens': 512, 'min_length': 0, 'min_new_tokens': 2, 'num_beams': 1, 'do_sample': True, 'use_past': True, 'temperature': 1.2, 'top_k': 50, 'top_p': 1.0, 'repetition_penalty': 1.0, 'encoder_repetition_penalty': 1.0, 'renormalize_logits': False, 'return_dict_in_generate': False, 'output_scores': False, 'output_logits': False, 'pad_token_id': 151643, 'bos_token_id': 151643, 'eos_token_id': [151645, 151643], 'parallel_decoding': False, 'window_size': 5, 'level': 5, 'guess_set_size': 3, '_from_model_config': True} 2025-05-21 15:34:30,090 - mindformers./output/log[mindformers/generation/text_generator.py:801] - INFO - Generation Config is: {'max_length': 512, 'max_new_tokens': 512, 'min_length': 0, 'min_new_tokens': 2, 'num_beams': 1, 'do_sample': True, 'use_past': True, 'temperature': 1.2, 'top_k': 50, 'top_p': 1.0, 'repetition_penalty': 1.0, 'encoder_repetition_penalty': 1.0, 'renormalize_logits': False, 'return_dict_in_generate': False, 'output_scores': False, 'output_logits': False, 'pad_token_id': 151643, 'bos_token_id': 151643, 'eos_token_id': [151645, 151643], 'parallel_decoding': False, 'window_size': 5, 'level': 5, 'guess_set_size': 3, '_from_model_config': True} 2025-05-21 15:34:30,091 - mindformers./output/log[mindformers/modules/block_tables.py:63] - INFO - init cache engine success. 2025-05-21 15:34:30,091 - mindformers./output/log[mindformers/generation/text_generator.py:801] - INFO - Generation Config is: {'max_length': 512, 'max_new_tokens': 512, 'min_length': 0, 'min_new_tokens': 2, 'num_beams': 1, 'do_sample': True, 'use_past': True, 'temperature': 1.2, 'top_k': 50, 'top_p': 1.0, 'repetition_penalty': 1.0, 'encoder_repetition_penalty': 1.0, 'renormalize_logits': False, 'return_dict_in_generate': False, 'output_scores': False, 'output_logits': False, 'pad_token_id': 151643, 'bos_token_id': 151643, 'eos_token_id': [151645, 151643], 'parallel_decoding': False, 'window_size': 5, 'level': 5, 'guess_set_size': 3, '_from_model_config': True} 2025-05-21 15:34:30,091 - mindformers./output/log[mindformers/generation/text_generator.py:859] - INFO - The generation mode will be **SAMPLE**. 2025-05-21 15:34:30,091 - mindformers./output/log[mindformers/generation/text_generator.py:859] - INFO - The generation mode will be **SAMPLE**. 2025-05-21 15:34:30,091 - mindformers./output/log[mindformers/modules/block_tables.py:63] - INFO - init cache engine success. 2025-05-21 15:34:30,091 - mindformers./output/log[mindformers/modules/block_tables.py:63] - INFO - init cache engine success. 2025-05-21 15:34:30,091 - mindformers./output/log[mindformers/research/qwen2_5/infer/qwen2_5.py:195] - INFO - Set dynamic input for llama. 2025-05-21 15:34:30,092 - mindformers./output/log[mindformers/generation/text_generator.py:859] - INFO - The generation mode will be **SAMPLE**. 2025-05-21 15:34:30,092 - mindformers./output/log[mindformers/modules/block_tables.py:63] - INFO - init cache engine success. 2025-05-21 15:34:30,092 - mindformers./output/log[mindformers/research/qwen2_5/infer/qwen2_5.py:195] - INFO - Set dynamic input for llama. 2025-05-21 15:34:30,092 - mindformers./output/log[mindformers/research/qwen2_5/infer/qwen2_5.py:195] - INFO - Set dynamic input for llama. 2025-05-21 15:34:30,093 - mindformers./output/log[mindformers/research/qwen2_5/infer/qwen2_5.py:195] - INFO - Set dynamic input for llama. 2025-05-21 15:34:30,155 - mindformers./output/log[mindformers/generation/text_generator.py:1067] - INFO - total time: 9.558561563491821 s; generated tokens: 512 tokens; generate speed: 53.56454489507542 tokens/s 2025-05-21 15:34:30,156 - mindformers./output/log[mindformers/generation/text_generator.py:1067] - INFO - total time: 9.557950973510742 s; generated tokens: 512 tokens; generate speed: 53.567966755529056 tokens/s 2025-05-21 15:34:30,156 - mindformers./output/log[mindformers/tools/debug_info.py:93] - INFO - prefill prepare time: 0.0017380714416503906 s; prefill predict time: 0.006899595260620117 s; prefill post time: 0.013531923294067383 s; decode prepare time: 0.0010957507937854984 s; decode predict time: 0.004834345275280523 s; decode post time: 0.012681686248331388 s 2025-05-21 15:34:30,156 - mindformers./output/log[mindformers/generation/text_generator.py:1067] - INFO - total time: 9.559564590454102 s; generated tokens: 512 tokens; generate speed: 53.55892469320915 tokens/s 2025-05-21 15:34:30,156 - mindformers./output/log[mindformers/generation/text_generator.py:1067] - INFO - total time: 9.558694124221802 s; generated tokens: 512 tokens; generate speed: 53.56380205770872 tokens/s 2025-05-21 15:34:30,156 - mindformers./output/log[mindformers/tools/debug_info.py:93] - INFO - prefill prepare time: 0.0014989376068115234 s; prefill predict time: 0.006128072738647461 s; prefill post time: 0.014192819595336914 s; decode prepare time: 0.0010795929193963276 s; decode predict time: 0.004415368566326066 s; decode post time: 0.013120635381650085 s 2025-05-21 15:34:30,156 - mindformers./output/log[mindformers/tools/debug_info.py:93] - INFO - prefill prepare time: 0.001428365707397461 s; prefill predict time: 0.006517887115478516 s; prefill post time: 0.014324426651000977 s; decode prepare time: 0.0010909964193803225 s; decode predict time: 0.004224030176798502 s; decode post time: 0.013300875161725248 s 2025-05-21 15:34:30,156 - mindformers./output/log[mindformers/tools/debug_info.py:93] - INFO - prefill prepare time: 0.0015289783477783203 s; prefill predict time: 0.007538557052612305 s; prefill post time: 0.013621330261230469 s; decode prepare time: 0.001044032625022933 s; decode predict time: 0.004721841625138825 s; decode post time: 0.012848124345454685 s 2025-05-21 15:34:30,157 - mindformers./output/log[mindformers/modules/block_tables.py:126] - INFO - Clear block table cache engines. 2025-05-21 15:34:30,157 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:341] - INFO - infer without vllm end, not use vllm model 2025-05-21 15:34:30,157 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:355] - INFO - Generating elapsed time: 9.56591010093689 2025-05-21 15:34:30,157 - mindformers./output/log[mindformers/modules/block_tables.py:126] - INFO - Clear block table cache engines. 2025-05-21 15:34:30,157 - mindformers./output/log[mindformers/modules/block_tables.py:126] - INFO - Clear block table cache engines. 2025-05-21 15:34:30,158 - mindformers./output/log[mindformers/modules/block_tables.py:126] - INFO - Clear block table cache engines. 2025-05-21 15:34:30,158 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:341] - INFO - infer without vllm end, not use vllm model 2025-05-21 15:34:30,158 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:341] - INFO - infer without vllm end, not use vllm model 2025-05-21 15:34:30,158 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:355] - INFO - Generating elapsed time: 9.565777063369751 2025-05-21 15:34:30,158 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:341] - INFO - infer without vllm end, not use vllm model 2025-05-21 15:34:30,158 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:355] - INFO - Generating elapsed time: 9.566078424453735 2025-05-21 15:34:30,158 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:355] - INFO - Generating elapsed time: 9.567081212997437 2025-05-21 15:34:30,159 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:313] - INFO - input_ids shape (1, 2048) 2025-05-21 15:34:30,159 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:322] - INFO - max_tokens 512 2025-05-21 15:34:30,159 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:313] - INFO - input_ids shape (1, 2048) 2025-05-21 15:34:30,159 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:313] - INFO - input_ids shape (1, 2048) 2025-05-21 15:34:30,159 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:313] - INFO - input_ids shape (1, 2048) 2025-05-21 15:34:30,159 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:323] - INFO - min_tokens 2 2025-05-21 15:34:30,159 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:334] - INFO - infer without vllm, not use vllm model 2025-05-21 15:34:30,160 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:322] - INFO - max_tokens 512 2025-05-21 15:34:30,160 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:322] - INFO - max_tokens 512 2025-05-21 15:34:30,160 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:322] - INFO - max_tokens 512 2025-05-21 15:34:30,160 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:323] - INFO - min_tokens 2 2025-05-21 15:34:30,160 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:323] - INFO - min_tokens 2 2025-05-21 15:34:30,160 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:323] - INFO - min_tokens 2 2025-05-21 15:34:30,160 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:334] - INFO - infer without vllm, not use vllm model 2025-05-21 15:34:30,160 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:334] - INFO - infer without vllm, not use vllm model 2025-05-21 15:34:30,160 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:334] - INFO - infer without vllm, not use vllm model 2025-05-21 15:34:30,162 - mindformers./output/log[mindformers/generation/text_generator.py:801] - INFO - Generation Config is: {'max_length': 512, 'max_new_tokens': 512, 'min_length': 0, 'min_new_tokens': 2, 'num_beams': 1, 'do_sample': True, 'use_past': True, 'temperature': 1.2, 'top_k': 50, 'top_p': 1.0, 'repetition_penalty': 1.0, 'encoder_repetition_penalty': 1.0, 'renormalize_logits': False, 'return_dict_in_generate': False, 'output_scores': False, 'output_logits': False, 'pad_token_id': 151643, 'bos_token_id': 151643, 'eos_token_id': [151645, 151643], 'parallel_decoding': False, 'window_size': 5, 'level': 5, 'guess_set_size': 3, '_from_model_config': True} 2025-05-21 15:34:30,162 - mindformers./output/log[mindformers/generation/text_generator.py:801] - INFO - Generation Config is: {'max_length': 512, 'max_new_tokens': 512, 'min_length': 0, 'min_new_tokens': 2, 'num_beams': 1, 'do_sample': True, 'use_past': True, 'temperature': 1.2, 'top_k': 50, 'top_p': 1.0, 'repetition_penalty': 1.0, 'encoder_repetition_penalty': 1.0, 'renormalize_logits': False, 'return_dict_in_generate': False, 'output_scores': False, 'output_logits': False, 'pad_token_id': 151643, 'bos_token_id': 151643, 'eos_token_id': [151645, 151643], 'parallel_decoding': False, 'window_size': 5, 'level': 5, 'guess_set_size': 3, '_from_model_config': True} 2025-05-21 15:34:30,162 - mindformers./output/log[mindformers/generation/text_generator.py:801] - INFO - Generation Config is: {'max_length': 512, 'max_new_tokens': 512, 'min_length': 0, 'min_new_tokens': 2, 'num_beams': 1, 'do_sample': True, 'use_past': True, 'temperature': 1.2, 'top_k': 50, 'top_p': 1.0, 'repetition_penalty': 1.0, 'encoder_repetition_penalty': 1.0, 'renormalize_logits': False, 'return_dict_in_generate': False, 'output_scores': False, 'output_logits': False, 'pad_token_id': 151643, 'bos_token_id': 151643, 'eos_token_id': [151645, 151643], 'parallel_decoding': False, 'window_size': 5, 'level': 5, 'guess_set_size': 3, '_from_model_config': True} 2025-05-21 15:34:30,163 - mindformers./output/log[mindformers/generation/text_generator.py:859] - INFO - The generation mode will be **SAMPLE**. 2025-05-21 15:34:30,163 - mindformers./output/log[mindformers/generation/text_generator.py:801] - INFO - Generation Config is: {'max_length': 512, 'max_new_tokens': 512, 'min_length': 0, 'min_new_tokens': 2, 'num_beams': 1, 'do_sample': True, 'use_past': True, 'temperature': 1.2, 'top_k': 50, 'top_p': 1.0, 'repetition_penalty': 1.0, 'encoder_repetition_penalty': 1.0, 'renormalize_logits': False, 'return_dict_in_generate': False, 'output_scores': False, 'output_logits': False, 'pad_token_id': 151643, 'bos_token_id': 151643, 'eos_token_id': [151645, 151643], 'parallel_decoding': False, 'window_size': 5, 'level': 5, 'guess_set_size': 3, '_from_model_config': True} 2025-05-21 15:34:30,163 - mindformers./output/log[mindformers/modules/block_tables.py:63] - INFO - init cache engine success. 2025-05-21 15:34:30,163 - mindformers./output/log[mindformers/generation/text_generator.py:859] - INFO - The generation mode will be **SAMPLE**. 2025-05-21 15:34:30,163 - mindformers./output/log[mindformers/generation/text_generator.py:859] - INFO - The generation mode will be **SAMPLE**. 2025-05-21 15:34:30,163 - mindformers./output/log[mindformers/modules/block_tables.py:63] - INFO - init cache engine success. 2025-05-21 15:34:30,163 - mindformers./output/log[mindformers/modules/block_tables.py:63] - INFO - init cache engine success. 2025-05-21 15:34:30,163 - mindformers./output/log[mindformers/generation/text_generator.py:859] - INFO - The generation mode will be **SAMPLE**. 2025-05-21 15:34:30,164 - mindformers./output/log[mindformers/research/qwen2_5/infer/qwen2_5.py:195] - INFO - Set dynamic input for llama. 2025-05-21 15:34:30,164 - mindformers./output/log[mindformers/modules/block_tables.py:63] - INFO - init cache engine success. 2025-05-21 15:34:30,164 - mindformers./output/log[mindformers/research/qwen2_5/infer/qwen2_5.py:195] - INFO - Set dynamic input for llama. 2025-05-21 15:34:30,164 - mindformers./output/log[mindformers/research/qwen2_5/infer/qwen2_5.py:195] - INFO - Set dynamic input for llama. 2025-05-21 15:34:30,165 - mindformers./output/log[mindformers/research/qwen2_5/infer/qwen2_5.py:195] - INFO - Set dynamic input for llama. 2025-05-21 15:34:39,666 - mindformers./output/log[mindformers/generation/text_generator.py:1067] - INFO - total time: 9.501474142074585 s; generated tokens: 512 tokens; generate speed: 53.88637513970102 tokens/s 2025-05-21 15:34:39,667 - mindformers./output/log[mindformers/generation/text_generator.py:1067] - INFO - total time: 9.502472639083862 s; generated tokens: 512 tokens; generate speed: 53.88071288878367 tokens/s 2025-05-21 15:34:39,667 - mindformers./output/log[mindformers/generation/text_generator.py:1067] - INFO - total time: 9.50291919708252 s; generated tokens: 512 tokens; generate speed: 53.87818094435534 tokens/s 2025-05-21 15:34:39,667 - mindformers./output/log[mindformers/tools/debug_info.py:93] - INFO - prefill prepare time: 0.0013663768768310547 s; prefill predict time: 0.006832122802734375 s; prefill post time: 0.013434886932373047 s; decode prepare time: 0.0010366117884268266 s; decode predict time: 0.004626578910678041 s; decode post time: 0.012837734707647574 s 2025-05-21 15:34:39,667 - mindformers./output/log[mindformers/generation/text_generator.py:1067] - INFO - total time: 9.50286340713501 s; generated tokens: 512 tokens; generate speed: 53.87849725542476 tokens/s 2025-05-21 15:34:39,668 - mindformers./output/log[mindformers/tools/debug_info.py:93] - INFO - prefill prepare time: 0.0015320777893066406 s; prefill predict time: 0.0077419281005859375 s; prefill post time: 0.014332771301269531 s; decode prepare time: 0.0010550451372001036 s; decode predict time: 0.0043959229600195795 s; decode post time: 0.013053312693556694 s 2025-05-21 15:34:39,668 - mindformers./output/log[mindformers/tools/debug_info.py:93] - INFO - prefill prepare time: 0.0015707015991210938 s; prefill predict time: 0.007830619812011719 s; prefill post time: 0.013587236404418945 s; decode prepare time: 0.0010730655692561498 s; decode predict time: 0.004765533933452531 s; decode post time: 0.012664134721699992 s 2025-05-21 15:34:39,668 - mindformers./output/log[mindformers/tools/debug_info.py:93] - INFO - prefill prepare time: 0.0015587806701660156 s; prefill predict time: 0.007330417633056641 s; prefill post time: 0.014171838760375977 s; decode prepare time: 0.0010963391417859818 s; decode predict time: 0.004105621693181056 s; decode post time: 0.013303239751468666 s 2025-05-21 15:34:39,668 - mindformers./output/log[mindformers/modules/block_tables.py:126] - INFO - Clear block table cache engines. 2025-05-21 15:34:39,668 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:341] - INFO - infer without vllm end, not use vllm model 2025-05-21 15:34:39,668 - mindformers./output/log[mindformers/modules/block_tables.py:126] - INFO - Clear block table cache engines. 2025-05-21 15:34:39,668 - mindformers./output/log[mindformers/modules/block_tables.py:126] - INFO - Clear block table cache engines. 2025-05-21 15:34:39,669 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:355] - INFO - Generating elapsed time: 9.50893759727478 2025-05-21 15:34:39,669 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:341] - INFO - infer without vllm end, not use vllm model 2025-05-21 15:34:39,669 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:341] - INFO - infer without vllm end, not use vllm model 2025-05-21 15:34:39,669 - mindformers./output/log[mindformers/modules/block_tables.py:126] - INFO - Clear block table cache engines. 2025-05-21 15:34:39,669 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:355] - INFO - Generating elapsed time: 9.509296417236328 2025-05-21 15:34:39,669 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:355] - INFO - Generating elapsed time: 9.509904861450195 2025-05-21 15:34:39,669 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:341] - INFO - infer without vllm end, not use vllm model 2025-05-21 15:34:39,669 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:355] - INFO - Generating elapsed time: 9.509873390197754 2025-05-21 15:34:39,670 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:313] - INFO - input_ids shape (1, 2048) 2025-05-21 15:34:39,670 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:313] - INFO - input_ids shape (1, 2048) 2025-05-21 15:34:39,670 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:313] - INFO - input_ids shape (1, 2048) 2025-05-21 15:34:39,670 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:322] - INFO - max_tokens 512 2025-05-21 15:34:39,670 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:323] - INFO - min_tokens 2 2025-05-21 15:34:39,671 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:322] - INFO - max_tokens 512 2025-05-21 15:34:39,671 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:322] - INFO - max_tokens 512 2025-05-21 15:34:39,671 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:313] - INFO - input_ids shape (1, 2048) 2025-05-21 15:34:39,671 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:334] - INFO - infer without vllm, not use vllm model 2025-05-21 15:34:39,671 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:323] - INFO - min_tokens 2 2025-05-21 15:34:39,671 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:323] - INFO - min_tokens 2 2025-05-21 15:34:39,671 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:334] - INFO - infer without vllm, not use vllm model 2025-05-21 15:34:39,671 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:334] - INFO - infer without vllm, not use vllm model 2025-05-21 15:34:39,671 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:322] - INFO - max_tokens 512 2025-05-21 15:34:39,671 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:323] - INFO - min_tokens 2 2025-05-21 15:34:39,672 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:334] - INFO - infer without vllm, not use vllm model 2025-05-21 15:34:39,673 - mindformers./output/log[mindformers/generation/text_generator.py:801] - INFO - Generation Config is: {'max_length': 512, 'max_new_tokens': 512, 'min_length': 0, 'min_new_tokens': 2, 'num_beams': 1, 'do_sample': True, 'use_past': True, 'temperature': 1.2, 'top_k': 50, 'top_p': 1.0, 'repetition_penalty': 1.0, 'encoder_repetition_penalty': 1.0, 'renormalize_logits': False, 'return_dict_in_generate': False, 'output_scores': False, 'output_logits': False, 'pad_token_id': 151643, 'bos_token_id': 151643, 'eos_token_id': [151645, 151643], 'parallel_decoding': False, 'window_size': 5, 'level': 5, 'guess_set_size': 3, '_from_model_config': True} 2025-05-21 15:34:39,673 - mindformers./output/log[mindformers/generation/text_generator.py:801] - INFO - Generation Config is: {'max_length': 512, 'max_new_tokens': 512, 'min_length': 0, 'min_new_tokens': 2, 'num_beams': 1, 'do_sample': True, 'use_past': True, 'temperature': 1.2, 'top_k': 50, 'top_p': 1.0, 'repetition_penalty': 1.0, 'encoder_repetition_penalty': 1.0, 'renormalize_logits': False, 'return_dict_in_generate': False, 'output_scores': False, 'output_logits': False, 'pad_token_id': 151643, 'bos_token_id': 151643, 'eos_token_id': [151645, 151643], 'parallel_decoding': False, 'window_size': 5, 'level': 5, 'guess_set_size': 3, '_from_model_config': True} 2025-05-21 15:34:39,673 - mindformers./output/log[mindformers/generation/text_generator.py:801] - INFO - Generation Config is: {'max_length': 512, 'max_new_tokens': 512, 'min_length': 0, 'min_new_tokens': 2, 'num_beams': 1, 'do_sample': True, 'use_past': True, 'temperature': 1.2, 'top_k': 50, 'top_p': 1.0, 'repetition_penalty': 1.0, 'encoder_repetition_penalty': 1.0, 'renormalize_logits': False, 'return_dict_in_generate': False, 'output_scores': False, 'output_logits': False, 'pad_token_id': 151643, 'bos_token_id': 151643, 'eos_token_id': [151645, 151643], 'parallel_decoding': False, 'window_size': 5, 'level': 5, 'guess_set_size': 3, '_from_model_config': True} 2025-05-21 15:34:39,674 - mindformers./output/log[mindformers/generation/text_generator.py:859] - INFO - The generation mode will be **SAMPLE**. 2025-05-21 15:34:39,674 - mindformers./output/log[mindformers/generation/text_generator.py:801] - INFO - Generation Config is: {'max_length': 512, 'max_new_tokens': 512, 'min_length': 0, 'min_new_tokens': 2, 'num_beams': 1, 'do_sample': True, 'use_past': True, 'temperature': 1.2, 'top_k': 50, 'top_p': 1.0, 'repetition_penalty': 1.0, 'encoder_repetition_penalty': 1.0, 'renormalize_logits': False, 'return_dict_in_generate': False, 'output_scores': False, 'output_logits': False, 'pad_token_id': 151643, 'bos_token_id': 151643, 'eos_token_id': [151645, 151643], 'parallel_decoding': False, 'window_size': 5, 'level': 5, 'guess_set_size': 3, '_from_model_config': True} 2025-05-21 15:34:39,674 - mindformers./output/log[mindformers/modules/block_tables.py:63] - INFO - init cache engine success. 2025-05-21 15:34:39,674 - mindformers./output/log[mindformers/generation/text_generator.py:859] - INFO - The generation mode will be **SAMPLE**. 2025-05-21 15:34:39,674 - mindformers./output/log[mindformers/generation/text_generator.py:859] - INFO - The generation mode will be **SAMPLE**. 2025-05-21 15:34:39,674 - mindformers./output/log[mindformers/modules/block_tables.py:63] - INFO - init cache engine success. 2025-05-21 15:34:39,674 - mindformers./output/log[mindformers/modules/block_tables.py:63] - INFO - init cache engine success. 2025-05-21 15:34:39,675 - mindformers./output/log[mindformers/generation/text_generator.py:859] - INFO - The generation mode will be **SAMPLE**. 2025-05-21 15:34:39,675 - mindformers./output/log[mindformers/research/qwen2_5/infer/qwen2_5.py:195] - INFO - Set dynamic input for llama. 2025-05-21 15:34:39,675 - mindformers./output/log[mindformers/modules/block_tables.py:63] - INFO - init cache engine success. 2025-05-21 15:34:39,675 - mindformers./output/log[mindformers/research/qwen2_5/infer/qwen2_5.py:195] - INFO - Set dynamic input for llama. 2025-05-21 15:34:39,675 - mindformers./output/log[mindformers/research/qwen2_5/infer/qwen2_5.py:195] - INFO - Set dynamic input for llama. 2025-05-21 15:34:39,676 - mindformers./output/log[mindformers/research/qwen2_5/infer/qwen2_5.py:195] - INFO - Set dynamic input for llama. 2025-05-21 15:34:39,704 - mindformers./output/log[mindformers/generation/text_generator.py:1067] - INFO - total time: 9.612281799316406 s; generated tokens: 512 tokens; generate speed: 53.26518829654076 tokens/s 2025-05-21 15:34:39,704 - mindformers./output/log[mindformers/generation/text_generator.py:1067] - INFO - total time: 9.611690044403076 s; generated tokens: 512 tokens; generate speed: 53.26846763001264 tokens/s 2025-05-21 15:34:39,705 - mindformers./output/log[mindformers/tools/debug_info.py:93] - INFO - prefill prepare time: 0.0015316009521484375 s; prefill predict time: 0.0069043636322021484 s; prefill post time: 0.013650178909301758 s; decode prepare time: 0.0009812706837215536 s; decode predict time: 0.005083063069511862 s; decode post time: 0.012659681753169767 s 2025-05-21 15:34:39,705 - mindformers./output/log[mindformers/tools/debug_info.py:93] - INFO - prefill prepare time: 0.0015478134155273438 s; prefill predict time: 0.006245136260986328 s; prefill post time: 0.013559579849243164 s; decode prepare time: 0.0009782006362646293 s; decode predict time: 0.005192005868051566 s; decode post time: 0.012552071924078954 s 2025-05-21 15:34:39,705 - mindformers./output/log[mindformers/generation/text_generator.py:1067] - INFO - total time: 9.611713409423828 s; generated tokens: 512 tokens; generate speed: 53.268338140212165 tokens/s 2025-05-21 15:34:39,705 - mindformers./output/log[mindformers/generation/text_generator.py:1067] - INFO - total time: 9.61239743232727 s; generated tokens: 512 tokens; generate speed: 53.264547539212494 tokens/s 2025-05-21 15:34:39,705 - mindformers./output/log[mindformers/tools/debug_info.py:93] - INFO - prefill prepare time: 0.0013546943664550781 s; prefill predict time: 0.005624055862426758 s; prefill post time: 0.014793157577514648 s; decode prepare time: 0.0010187420593082556 s; decode predict time: 0.004616819643506817 s; decode post time: 0.01308867049543825 s 2025-05-21 15:34:39,706 - mindformers./output/log[mindformers/modules/block_tables.py:126] - INFO - Clear block table cache engines. 2025-05-21 15:34:39,706 - mindformers./output/log[mindformers/tools/debug_info.py:93] - INFO - prefill prepare time: 0.0015294551849365234 s; prefill predict time: 0.006220340728759766 s; prefill post time: 0.014608621597290039 s; decode prepare time: 0.0010485775083478417 s; decode predict time: 0.0043778349371517405 s; decode post time: 0.013296317214368608 s 2025-05-21 15:34:39,706 - mindformers./output/log[mindformers/modules/block_tables.py:126] - INFO - Clear block table cache engines. 2025-05-21 15:34:39,706 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:341] - INFO - infer without vllm end, not use vllm model 2025-05-21 15:34:39,706 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:341] - INFO - infer without vllm end, not use vllm model 2025-05-21 15:34:39,706 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:355] - INFO - Generating elapsed time: 9.61843204498291 2025-05-21 15:34:39,706 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:355] - INFO - Generating elapsed time: 9.619205236434937 2025-05-21 15:34:39,706 - mindformers./output/log[mindformers/modules/block_tables.py:126] - INFO - Clear block table cache engines. 2025-05-21 15:34:39,707 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:341] - INFO - infer without vllm end, not use vllm model 2025-05-21 15:34:39,707 - mindformers./output/log[mindformers/modules/block_tables.py:126] - INFO - Clear block table cache engines. 2025-05-21 15:34:39,707 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:355] - INFO - Generating elapsed time: 9.618910551071167 2025-05-21 15:34:39,707 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:341] - INFO - infer without vllm end, not use vllm model 2025-05-21 15:34:39,707 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:355] - INFO - Generating elapsed time: 9.619253396987915 2025-05-21 15:34:39,707 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:313] - INFO - input_ids shape (1, 2048) 2025-05-21 15:34:39,707 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:313] - INFO - input_ids shape (1, 2048) 2025-05-21 15:34:39,708 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:322] - INFO - max_tokens 512 2025-05-21 15:34:39,708 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:322] - INFO - max_tokens 512 2025-05-21 15:34:39,708 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:323] - INFO - min_tokens 2 2025-05-21 15:34:39,708 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:323] - INFO - min_tokens 2 2025-05-21 15:34:39,708 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:313] - INFO - input_ids shape (1, 2048) 2025-05-21 15:34:39,708 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:334] - INFO - infer without vllm, not use vllm model 2025-05-21 15:34:39,708 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:334] - INFO - infer without vllm, not use vllm model 2025-05-21 15:34:39,708 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:313] - INFO - input_ids shape (1, 2048) 2025-05-21 15:34:39,708 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:322] - INFO - max_tokens 512 2025-05-21 15:34:39,709 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:323] - INFO - min_tokens 2 2025-05-21 15:34:39,709 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:322] - INFO - max_tokens 512 2025-05-21 15:34:39,709 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:334] - INFO - infer without vllm, not use vllm model 2025-05-21 15:34:39,709 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:323] - INFO - min_tokens 2 2025-05-21 15:34:39,709 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:334] - INFO - infer without vllm, not use vllm model 2025-05-21 15:34:39,710 - mindformers./output/log[mindformers/generation/text_generator.py:801] - INFO - Generation Config is: {'max_length': 512, 'max_new_tokens': 512, 'min_length': 0, 'min_new_tokens': 2, 'num_beams': 1, 'do_sample': True, 'use_past': True, 'temperature': 1.2, 'top_k': 50, 'top_p': 1.0, 'repetition_penalty': 1.0, 'encoder_repetition_penalty': 1.0, 'renormalize_logits': False, 'return_dict_in_generate': False, 'output_scores': False, 'output_logits': False, 'pad_token_id': 151643, 'bos_token_id': 151643, 'eos_token_id': [151645, 151643], 'parallel_decoding': False, 'window_size': 5, 'level': 5, 'guess_set_size': 3, '_from_model_config': True} 2025-05-21 15:34:39,710 - mindformers./output/log[mindformers/generation/text_generator.py:801] - INFO - Generation Config is: {'max_length': 512, 'max_new_tokens': 512, 'min_length': 0, 'min_new_tokens': 2, 'num_beams': 1, 'do_sample': True, 'use_past': True, 'temperature': 1.2, 'top_k': 50, 'top_p': 1.0, 'repetition_penalty': 1.0, 'encoder_repetition_penalty': 1.0, 'renormalize_logits': False, 'return_dict_in_generate': False, 'output_scores': False, 'output_logits': False, 'pad_token_id': 151643, 'bos_token_id': 151643, 'eos_token_id': [151645, 151643], 'parallel_decoding': False, 'window_size': 5, 'level': 5, 'guess_set_size': 3, '_from_model_config': True} 2025-05-21 15:34:39,711 - mindformers./output/log[mindformers/generation/text_generator.py:859] - INFO - The generation mode will be **SAMPLE**. 2025-05-21 15:34:39,711 - mindformers./output/log[mindformers/generation/text_generator.py:859] - INFO - The generation mode will be **SAMPLE**. 2025-05-21 15:34:39,711 - mindformers./output/log[mindformers/generation/text_generator.py:801] - INFO - Generation Config is: {'max_length': 512, 'max_new_tokens': 512, 'min_length': 0, 'min_new_tokens': 2, 'num_beams': 1, 'do_sample': True, 'use_past': True, 'temperature': 1.2, 'top_k': 50, 'top_p': 1.0, 'repetition_penalty': 1.0, 'encoder_repetition_penalty': 1.0, 'renormalize_logits': False, 'return_dict_in_generate': False, 'output_scores': False, 'output_logits': False, 'pad_token_id': 151643, 'bos_token_id': 151643, 'eos_token_id': [151645, 151643], 'parallel_decoding': False, 'window_size': 5, 'level': 5, 'guess_set_size': 3, '_from_model_config': True} 2025-05-21 15:34:39,711 - mindformers./output/log[mindformers/modules/block_tables.py:63] - INFO - init cache engine success. 2025-05-21 15:34:39,711 - mindformers./output/log[mindformers/modules/block_tables.py:63] - INFO - init cache engine success. 2025-05-21 15:34:39,712 - mindformers./output/log[mindformers/generation/text_generator.py:801] - INFO - Generation Config is: {'max_length': 512, 'max_new_tokens': 512, 'min_length': 0, 'min_new_tokens': 2, 'num_beams': 1, 'do_sample': True, 'use_past': True, 'temperature': 1.2, 'top_k': 50, 'top_p': 1.0, 'repetition_penalty': 1.0, 'encoder_repetition_penalty': 1.0, 'renormalize_logits': False, 'return_dict_in_generate': False, 'output_scores': False, 'output_logits': False, 'pad_token_id': 151643, 'bos_token_id': 151643, 'eos_token_id': [151645, 151643], 'parallel_decoding': False, 'window_size': 5, 'level': 5, 'guess_set_size': 3, '_from_model_config': True} 2025-05-21 15:34:39,712 - mindformers./output/log[mindformers/generation/text_generator.py:859] - INFO - The generation mode will be **SAMPLE**. 2025-05-21 15:34:39,712 - mindformers./output/log[mindformers/research/qwen2_5/infer/qwen2_5.py:195] - INFO - Set dynamic input for llama. 2025-05-21 15:34:39,712 - mindformers./output/log[mindformers/research/qwen2_5/infer/qwen2_5.py:195] - INFO - Set dynamic input for llama. 2025-05-21 15:34:39,712 - mindformers./output/log[mindformers/modules/block_tables.py:63] - INFO - init cache engine success. 2025-05-21 15:34:39,712 - mindformers./output/log[mindformers/generation/text_generator.py:859] - INFO - The generation mode will be **SAMPLE**. 2025-05-21 15:34:39,713 - mindformers./output/log[mindformers/modules/block_tables.py:63] - INFO - init cache engine success. 2025-05-21 15:34:39,713 - mindformers./output/log[mindformers/research/qwen2_5/infer/qwen2_5.py:195] - INFO - Set dynamic input for llama. 2025-05-21 15:34:39,713 - mindformers./output/log[mindformers/research/qwen2_5/infer/qwen2_5.py:195] - INFO - Set dynamic input for llama. 2025-05-21 15:34:49,179 - mindformers./output/log[mindformers/generation/text_generator.py:1067] - INFO - total time: 9.503460168838501 s; generated tokens: 512 tokens; generate speed: 53.87511400098559 tokens/s 2025-05-21 15:34:49,179 - mindformers./output/log[mindformers/tools/debug_info.py:93] - INFO - prefill prepare time: 0.0016024112701416016 s; prefill predict time: 0.007773876190185547 s; prefill post time: 0.013998270034790039 s; decode prepare time: 0.0010840962777632322 s; decode predict time: 0.004742273629880419 s; decode post time: 0.012677601405552455 s 2025-05-21 15:34:49,179 - mindformers./output/log[mindformers/generation/text_generator.py:1067] - INFO - total time: 9.504380464553833 s; generated tokens: 512 tokens; generate speed: 53.869897349909486 tokens/s 2025-05-21 15:34:49,180 - mindformers./output/log[mindformers/generation/text_generator.py:1067] - INFO - total time: 9.503975629806519 s; generated tokens: 512 tokens; generate speed: 53.87219201133655 tokens/s 2025-05-21 15:34:49,180 - mindformers./output/log[mindformers/generation/text_generator.py:1067] - INFO - total time: 9.504472255706787 s; generated tokens: 512 tokens; generate speed: 53.86937709167165 tokens/s 2025-05-21 15:34:49,180 - mindformers./output/log[mindformers/tools/debug_info.py:93] - INFO - prefill prepare time: 0.00141143798828125 s; prefill predict time: 0.008194208145141602 s; prefill post time: 0.013885736465454102 s; decode prepare time: 0.00104222139033786 s; decode predict time: 0.00461122568915872 s; decode post time: 0.012852619305515477 s 2025-05-21 15:34:49,180 - mindformers./output/log[mindformers/tools/debug_info.py:93] - INFO - prefill prepare time: 0.0015864372253417969 s; prefill predict time: 0.007543087005615234 s; prefill post time: 0.014219284057617188 s; decode prepare time: 0.001090783662292122 s; decode predict time: 0.004114408119052064 s; decode post time: 0.013302252483927806 s 2025-05-21 15:34:49,180 - mindformers./output/log[mindformers/modules/block_tables.py:126] - INFO - Clear block table cache engines. 2025-05-21 15:34:49,181 - mindformers./output/log[mindformers/tools/debug_info.py:93] - INFO - prefill prepare time: 0.0016834735870361328 s; prefill predict time: 0.007750034332275391 s; prefill post time: 0.014236927032470703 s; decode prepare time: 0.0010574708479491233 s; decode predict time: 0.004377980325736251 s; decode post time: 0.013071709649670147 s 2025-05-21 15:34:49,181 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:341] - INFO - infer without vllm end, not use vllm model 2025-05-21 15:34:49,181 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:355] - INFO - Generating elapsed time: 9.510274410247803 2025-05-21 15:34:49,181 - mindformers./output/log[mindformers/modules/block_tables.py:126] - INFO - Clear block table cache engines. 2025-05-21 15:34:49,181 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:341] - INFO - infer without vllm end, not use vllm model 2025-05-21 15:34:49,181 - mindformers./output/log[mindformers/modules/block_tables.py:126] - INFO - Clear block table cache engines. 2025-05-21 15:34:49,182 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:355] - INFO - Generating elapsed time: 9.511341094970703 2025-05-21 15:34:49,182 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:341] - INFO - infer without vllm end, not use vllm model 2025-05-21 15:34:49,182 - mindformers./output/log[mindformers/modules/block_tables.py:126] - INFO - Clear block table cache engines. 2025-05-21 15:34:49,182 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:355] - INFO - Generating elapsed time: 9.510715007781982 2025-05-21 15:34:49,182 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:341] - INFO - infer without vllm end, not use vllm model 2025-05-21 15:34:49,182 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:355] - INFO - Generating elapsed time: 9.511594533920288 2025-05-21 15:34:49,182 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:313] - INFO - input_ids shape (1, 2048) 2025-05-21 15:34:49,183 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:322] - INFO - max_tokens 512 2025-05-21 15:34:49,183 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:313] - INFO - input_ids shape (1, 2048) 2025-05-21 15:34:49,183 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:313] - INFO - input_ids shape (1, 2048) 2025-05-21 15:34:49,183 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:323] - INFO - min_tokens 2 2025-05-21 15:34:49,183 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:334] - INFO - infer without vllm, not use vllm model 2025-05-21 15:34:49,183 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:322] - INFO - max_tokens 512 2025-05-21 15:34:49,183 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:322] - INFO - max_tokens 512 2025-05-21 15:34:49,183 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:313] - INFO - input_ids shape (1, 2048) 2025-05-21 15:34:49,183 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:323] - INFO - min_tokens 2 2025-05-21 15:34:49,184 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:323] - INFO - min_tokens 2 2025-05-21 15:34:49,184 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:334] - INFO - infer without vllm, not use vllm model 2025-05-21 15:34:49,184 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:334] - INFO - infer without vllm, not use vllm model 2025-05-21 15:34:49,184 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:322] - INFO - max_tokens 512 2025-05-21 15:34:49,184 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:323] - INFO - min_tokens 2 2025-05-21 15:34:49,184 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:334] - INFO - infer without vllm, not use vllm model 2025-05-21 15:34:49,185 - mindformers./output/log[mindformers/generation/text_generator.py:801] - INFO - Generation Config is: {'max_length': 512, 'max_new_tokens': 512, 'min_length': 0, 'min_new_tokens': 2, 'num_beams': 1, 'do_sample': True, 'use_past': True, 'temperature': 1.2, 'top_k': 50, 'top_p': 1.0, 'repetition_penalty': 1.0, 'encoder_repetition_penalty': 1.0, 'renormalize_logits': False, 'return_dict_in_generate': False, 'output_scores': False, 'output_logits': False, 'pad_token_id': 151643, 'bos_token_id': 151643, 'eos_token_id': [151645, 151643], 'parallel_decoding': False, 'window_size': 5, 'level': 5, 'guess_set_size': 3, '_from_model_config': True} 2025-05-21 15:34:49,186 - mindformers./output/log[mindformers/generation/text_generator.py:801] - INFO - Generation Config is: {'max_length': 512, 'max_new_tokens': 512, 'min_length': 0, 'min_new_tokens': 2, 'num_beams': 1, 'do_sample': True, 'use_past': True, 'temperature': 1.2, 'top_k': 50, 'top_p': 1.0, 'repetition_penalty': 1.0, 'encoder_repetition_penalty': 1.0, 'renormalize_logits': False, 'return_dict_in_generate': False, 'output_scores': False, 'output_logits': False, 'pad_token_id': 151643, 'bos_token_id': 151643, 'eos_token_id': [151645, 151643], 'parallel_decoding': False, 'window_size': 5, 'level': 5, 'guess_set_size': 3, '_from_model_config': True} 2025-05-21 15:34:49,186 - mindformers./output/log[mindformers/generation/text_generator.py:801] - INFO - Generation Config is: {'max_length': 512, 'max_new_tokens': 512, 'min_length': 0, 'min_new_tokens': 2, 'num_beams': 1, 'do_sample': True, 'use_past': True, 'temperature': 1.2, 'top_k': 50, 'top_p': 1.0, 'repetition_penalty': 1.0, 'encoder_repetition_penalty': 1.0, 'renormalize_logits': False, 'return_dict_in_generate': False, 'output_scores': False, 'output_logits': False, 'pad_token_id': 151643, 'bos_token_id': 151643, 'eos_token_id': [151645, 151643], 'parallel_decoding': False, 'window_size': 5, 'level': 5, 'guess_set_size': 3, '_from_model_config': True} 2025-05-21 15:34:49,186 - mindformers./output/log[mindformers/generation/text_generator.py:859] - INFO - The generation mode will be **SAMPLE**. 2025-05-21 15:34:49,186 - mindformers./output/log[mindformers/modules/block_tables.py:63] - INFO - init cache engine success. 2025-05-21 15:34:49,186 - mindformers./output/log[mindformers/generation/text_generator.py:859] - INFO - The generation mode will be **SAMPLE**. 2025-05-21 15:34:49,186 - mindformers./output/log[mindformers/generation/text_generator.py:859] - INFO - The generation mode will be **SAMPLE**. 2025-05-21 15:34:49,187 - mindformers./output/log[mindformers/modules/block_tables.py:63] - INFO - init cache engine success. 2025-05-21 15:34:49,187 - mindformers./output/log[mindformers/modules/block_tables.py:63] - INFO - init cache engine success. 2025-05-21 15:34:49,187 - mindformers./output/log[mindformers/generation/text_generator.py:801] - INFO - Generation Config is: {'max_length': 512, 'max_new_tokens': 512, 'min_length': 0, 'min_new_tokens': 2, 'num_beams': 1, 'do_sample': True, 'use_past': True, 'temperature': 1.2, 'top_k': 50, 'top_p': 1.0, 'repetition_penalty': 1.0, 'encoder_repetition_penalty': 1.0, 'renormalize_logits': False, 'return_dict_in_generate': False, 'output_scores': False, 'output_logits': False, 'pad_token_id': 151643, 'bos_token_id': 151643, 'eos_token_id': [151645, 151643], 'parallel_decoding': False, 'window_size': 5, 'level': 5, 'guess_set_size': 3, '_from_model_config': True} 2025-05-21 15:34:49,187 - mindformers./output/log[mindformers/research/qwen2_5/infer/qwen2_5.py:195] - INFO - Set dynamic input for llama. 2025-05-21 15:34:49,187 - mindformers./output/log[mindformers/generation/text_generator.py:859] - INFO - The generation mode will be **SAMPLE**. 2025-05-21 15:34:49,187 - mindformers./output/log[mindformers/research/qwen2_5/infer/qwen2_5.py:195] - INFO - Set dynamic input for llama. 2025-05-21 15:34:49,187 - mindformers./output/log[mindformers/research/qwen2_5/infer/qwen2_5.py:195] - INFO - Set dynamic input for llama. 2025-05-21 15:34:49,188 - mindformers./output/log[mindformers/modules/block_tables.py:63] - INFO - init cache engine success. 2025-05-21 15:34:49,188 - mindformers./output/log[mindformers/research/qwen2_5/infer/qwen2_5.py:195] - INFO - Set dynamic input for llama. 2025-05-21 15:34:49,308 - mindformers./output/log[mindformers/generation/text_generator.py:1067] - INFO - total time: 9.595924139022827 s; generated tokens: 512 tokens; generate speed: 53.35598662331005 tokens/s 2025-05-21 15:34:49,308 - mindformers./output/log[mindformers/generation/text_generator.py:1067] - INFO - total time: 9.595475435256958 s; generated tokens: 512 tokens; generate speed: 53.35848165675483 tokens/s 2025-05-21 15:34:49,308 - mindformers./output/log[mindformers/generation/text_generator.py:1067] - INFO - total time: 9.596356868743896 s; generated tokens: 512 tokens; generate speed: 53.35358063512884 tokens/s 2025-05-21 15:34:49,309 - mindformers./output/log[mindformers/tools/debug_info.py:93] - INFO - prefill prepare time: 0.001531362533569336 s; prefill predict time: 0.007570743560791016 s; prefill post time: 0.014336824417114258 s; decode prepare time: 0.0009947037743495634 s; decode predict time: 0.004895566491519704 s; decode post time: 0.012799545743460757 s 2025-05-21 15:34:49,309 - mindformers./output/log[mindformers/generation/text_generator.py:1067] - INFO - total time: 9.594861268997192 s; generated tokens: 512 tokens; generate speed: 53.3618971286608 tokens/s 2025-05-21 15:34:49,309 - mindformers./output/log[mindformers/tools/debug_info.py:93] - INFO - prefill prepare time: 0.0013802051544189453 s; prefill predict time: 0.006758928298950195 s; prefill post time: 0.014462471008300781 s; decode prepare time: 0.0010236084111282736 s; decode predict time: 0.004518934324675915 s; decode post time: 0.013148897081438576 s 2025-05-21 15:34:49,309 - mindformers./output/log[mindformers/tools/debug_info.py:93] - INFO - prefill prepare time: 0.0014884471893310547 s; prefill predict time: 0.0076656341552734375 s; prefill post time: 0.013813495635986328 s; decode prepare time: 0.0009700197529652814 s; decode predict time: 0.005158086851531384 s; decode post time: 0.012560934003318593 s 2025-05-21 15:34:49,310 - mindformers./output/log[mindformers/tools/debug_info.py:93] - INFO - prefill prepare time: 0.0014901161193847656 s; prefill predict time: 0.00579380989074707 s; prefill post time: 0.014370918273925781 s; decode prepare time: 0.0010483353571882454 s; decode predict time: 0.004364388596777823 s; decode post time: 0.013276479482184184 s 2025-05-21 15:34:49,310 - mindformers./output/log[mindformers/modules/block_tables.py:126] - INFO - Clear block table cache engines. 2025-05-21 15:34:49,310 - mindformers./output/log[mindformers/modules/block_tables.py:126] - INFO - Clear block table cache engines. 2025-05-21 15:34:49,310 - mindformers./output/log[mindformers/modules/block_tables.py:126] - INFO - Clear block table cache engines. 2025-05-21 15:34:49,310 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:341] - INFO - infer without vllm end, not use vllm model 2025-05-21 15:34:49,310 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:341] - INFO - infer without vllm end, not use vllm model 2025-05-21 15:34:49,311 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:355] - INFO - Generating elapsed time: 9.602306127548218 2025-05-21 15:34:49,311 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:341] - INFO - infer without vllm end, not use vllm model 2025-05-21 15:34:49,311 - mindformers./output/log[mindformers/modules/block_tables.py:126] - INFO - Clear block table cache engines. 2025-05-21 15:34:49,311 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:355] - INFO - Generating elapsed time: 9.603123903274536 2025-05-21 15:34:49,311 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:355] - INFO - Generating elapsed time: 9.603227376937866 2025-05-21 15:34:49,311 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:341] - INFO - infer without vllm end, not use vllm model 2025-05-21 15:34:49,311 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:355] - INFO - Generating elapsed time: 9.602360486984253 2025-05-21 15:34:49,312 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:313] - INFO - input_ids shape (1, 2048) 2025-05-21 15:34:49,312 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:313] - INFO - input_ids shape (1, 2048) 2025-05-21 15:34:49,312 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:313] - INFO - input_ids shape (1, 2048) 2025-05-21 15:34:49,312 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:313] - INFO - input_ids shape (1, 2048) 2025-05-21 15:34:49,312 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:322] - INFO - max_tokens 512 2025-05-21 15:34:49,312 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:322] - INFO - max_tokens 512 2025-05-21 15:34:49,312 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:322] - INFO - max_tokens 512 2025-05-21 15:34:49,312 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:323] - INFO - min_tokens 2 2025-05-21 15:34:49,312 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:323] - INFO - min_tokens 2 2025-05-21 15:34:49,312 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:322] - INFO - max_tokens 512 2025-05-21 15:34:49,313 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:323] - INFO - min_tokens 2 2025-05-21 15:34:49,313 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:334] - INFO - infer without vllm, not use vllm model 2025-05-21 15:34:49,313 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:334] - INFO - infer without vllm, not use vllm model 2025-05-21 15:34:49,313 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:323] - INFO - min_tokens 2 2025-05-21 15:34:49,313 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:334] - INFO - infer without vllm, not use vllm model 2025-05-21 15:34:49,313 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:334] - INFO - infer without vllm, not use vllm model 2025-05-21 15:34:49,315 - mindformers./output/log[mindformers/generation/text_generator.py:801] - INFO - Generation Config is: {'max_length': 512, 'max_new_tokens': 512, 'min_length': 0, 'min_new_tokens': 2, 'num_beams': 1, 'do_sample': True, 'use_past': True, 'temperature': 1.2, 'top_k': 50, 'top_p': 1.0, 'repetition_penalty': 1.0, 'encoder_repetition_penalty': 1.0, 'renormalize_logits': False, 'return_dict_in_generate': False, 'output_scores': False, 'output_logits': False, 'pad_token_id': 151643, 'bos_token_id': 151643, 'eos_token_id': [151645, 151643], 'parallel_decoding': False, 'window_size': 5, 'level': 5, 'guess_set_size': 3, '_from_model_config': True} 2025-05-21 15:34:49,315 - mindformers./output/log[mindformers/generation/text_generator.py:801] - INFO - Generation Config is: {'max_length': 512, 'max_new_tokens': 512, 'min_length': 0, 'min_new_tokens': 2, 'num_beams': 1, 'do_sample': True, 'use_past': True, 'temperature': 1.2, 'top_k': 50, 'top_p': 1.0, 'repetition_penalty': 1.0, 'encoder_repetition_penalty': 1.0, 'renormalize_logits': False, 'return_dict_in_generate': False, 'output_scores': False, 'output_logits': False, 'pad_token_id': 151643, 'bos_token_id': 151643, 'eos_token_id': [151645, 151643], 'parallel_decoding': False, 'window_size': 5, 'level': 5, 'guess_set_size': 3, '_from_model_config': True} 2025-05-21 15:34:49,315 - mindformers./output/log[mindformers/generation/text_generator.py:801] - INFO - Generation Config is: {'max_length': 512, 'max_new_tokens': 512, 'min_length': 0, 'min_new_tokens': 2, 'num_beams': 1, 'do_sample': True, 'use_past': True, 'temperature': 1.2, 'top_k': 50, 'top_p': 1.0, 'repetition_penalty': 1.0, 'encoder_repetition_penalty': 1.0, 'renormalize_logits': False, 'return_dict_in_generate': False, 'output_scores': False, 'output_logits': False, 'pad_token_id': 151643, 'bos_token_id': 151643, 'eos_token_id': [151645, 151643], 'parallel_decoding': False, 'window_size': 5, 'level': 5, 'guess_set_size': 3, '_from_model_config': True} 2025-05-21 15:34:49,315 - mindformers./output/log[mindformers/generation/text_generator.py:801] - INFO - Generation Config is: {'max_length': 512, 'max_new_tokens': 512, 'min_length': 0, 'min_new_tokens': 2, 'num_beams': 1, 'do_sample': True, 'use_past': True, 'temperature': 1.2, 'top_k': 50, 'top_p': 1.0, 'repetition_penalty': 1.0, 'encoder_repetition_penalty': 1.0, 'renormalize_logits': False, 'return_dict_in_generate': False, 'output_scores': False, 'output_logits': False, 'pad_token_id': 151643, 'bos_token_id': 151643, 'eos_token_id': [151645, 151643], 'parallel_decoding': False, 'window_size': 5, 'level': 5, 'guess_set_size': 3, '_from_model_config': True} 2025-05-21 15:34:49,315 - mindformers./output/log[mindformers/generation/text_generator.py:859] - INFO - The generation mode will be **SAMPLE**. 2025-05-21 15:34:49,316 - mindformers./output/log[mindformers/modules/block_tables.py:63] - INFO - init cache engine success. 2025-05-21 15:34:49,316 - mindformers./output/log[mindformers/generation/text_generator.py:859] - INFO - The generation mode will be **SAMPLE**. 2025-05-21 15:34:49,316 - mindformers./output/log[mindformers/generation/text_generator.py:859] - INFO - The generation mode will be **SAMPLE**. 2025-05-21 15:34:49,316 - mindformers./output/log[mindformers/generation/text_generator.py:859] - INFO - The generation mode will be **SAMPLE**. 2025-05-21 15:34:49,316 - mindformers./output/log[mindformers/modules/block_tables.py:63] - INFO - init cache engine success. 2025-05-21 15:34:49,316 - mindformers./output/log[mindformers/modules/block_tables.py:63] - INFO - init cache engine success. 2025-05-21 15:34:49,316 - mindformers./output/log[mindformers/modules/block_tables.py:63] - INFO - init cache engine success. 2025-05-21 15:34:49,316 - mindformers./output/log[mindformers/research/qwen2_5/infer/qwen2_5.py:195] - INFO - Set dynamic input for llama. 2025-05-21 15:34:49,317 - mindformers./output/log[mindformers/research/qwen2_5/infer/qwen2_5.py:195] - INFO - Set dynamic input for llama. 2025-05-21 15:34:49,317 - mindformers./output/log[mindformers/research/qwen2_5/infer/qwen2_5.py:195] - INFO - Set dynamic input for llama. 2025-05-21 15:34:49,317 - mindformers./output/log[mindformers/research/qwen2_5/infer/qwen2_5.py:195] - INFO - Set dynamic input for llama. 2025-05-21 15:34:58,703 - mindformers./output/log[mindformers/generation/text_generator.py:1067] - INFO - total time: 9.515985488891602 s; generated tokens: 512 tokens; generate speed: 53.804201424821265 tokens/s 2025-05-21 15:34:58,703 - mindformers./output/log[mindformers/generation/text_generator.py:1067] - INFO - total time: 9.515701532363892 s; generated tokens: 512 tokens; generate speed: 53.805806987391804 tokens/s 2025-05-21 15:34:58,704 - mindformers./output/log[mindformers/tools/debug_info.py:93] - INFO - prefill prepare time: 0.0015175342559814453 s; prefill predict time: 0.007455587387084961 s; prefill post time: 0.014217615127563477 s; decode prepare time: 0.0010866904212070772 s; decode predict time: 0.004773477479523304 s; decode post time: 0.01266818550468191 s 2025-05-21 15:34:58,704 - mindformers./output/log[mindformers/generation/text_generator.py:1067] - INFO - total time: 9.515196800231934 s; generated tokens: 512 tokens; generate speed: 53.8086611080414 tokens/s 2025-05-21 15:34:58,704 - mindformers./output/log[mindformers/generation/text_generator.py:1067] - INFO - total time: 9.516266345977783 s; generated tokens: 512 tokens; generate speed: 53.80261348153688 tokens/s 2025-05-21 15:34:58,704 - mindformers./output/log[mindformers/tools/debug_info.py:93] - INFO - prefill prepare time: 0.0014758110046386719 s; prefill predict time: 0.0069959163665771484 s; prefill post time: 0.01387476921081543 s; decode prepare time: 0.001044130138688358 s; decode predict time: 0.004664213517132927 s; decode post time: 0.012821557703783367 s 2025-05-21 15:34:58,705 - mindformers./output/log[mindformers/tools/debug_info.py:93] - INFO - prefill prepare time: 0.0015146732330322266 s; prefill predict time: 0.006975412368774414 s; prefill post time: 0.014503717422485352 s; decode prepare time: 0.0010801486073174823 s; decode predict time: 0.004136438930735869 s; decode post time: 0.013317123551191416 s 2025-05-21 15:34:58,705 - mindformers./output/log[mindformers/modules/block_tables.py:126] - INFO - Clear block table cache engines. 2025-05-21 15:34:58,705 - mindformers./output/log[mindformers/tools/debug_info.py:93] - INFO - prefill prepare time: 0.0014183521270751953 s; prefill predict time: 0.005855560302734375 s; prefill post time: 0.014260530471801758 s; decode prepare time: 0.001058923754906701 s; decode predict time: 0.00439770689197615 s; decode post time: 0.013075510816331478 s 2025-05-21 15:34:58,705 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:341] - INFO - infer without vllm end, not use vllm model 2025-05-21 15:34:58,705 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:355] - INFO - Generating elapsed time: 9.522724628448486 2025-05-21 15:34:58,705 - mindformers./output/log[mindformers/modules/block_tables.py:126] - INFO - Clear block table cache engines. 2025-05-21 15:34:58,706 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:341] - INFO - infer without vllm end, not use vllm model 2025-05-21 15:34:58,706 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:355] - INFO - Generating elapsed time: 9.522616863250732 2025-05-21 15:34:58,706 - mindformers./output/log[mindformers/modules/block_tables.py:126] - INFO - Clear block table cache engines. 2025-05-21 15:34:58,706 - mindformers./output/log[mindformers/modules/block_tables.py:126] - INFO - Clear block table cache engines. 2025-05-21 15:34:58,706 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:341] - INFO - infer without vllm end, not use vllm model 2025-05-21 15:34:58,706 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:341] - INFO - infer without vllm end, not use vllm model 2025-05-21 15:34:58,706 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:355] - INFO - Generating elapsed time: 9.523014545440674 2025-05-21 15:34:58,706 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:355] - INFO - Generating elapsed time: 9.522497653961182 2025-05-21 15:34:58,706 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:313] - INFO - input_ids shape (1, 2048) 2025-05-21 15:34:58,707 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:322] - INFO - max_tokens 512 2025-05-21 15:34:58,707 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:313] - INFO - input_ids shape (1, 2048) 2025-05-21 15:34:58,707 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:323] - INFO - min_tokens 2 2025-05-21 15:34:58,707 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:334] - INFO - infer without vllm, not use vllm model 2025-05-21 15:34:58,707 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:313] - INFO - input_ids shape (1, 2048) 2025-05-21 15:34:58,707 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:313] - INFO - input_ids shape (1, 2048) 2025-05-21 15:34:58,707 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:322] - INFO - max_tokens 512 2025-05-21 15:34:58,708 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:322] - INFO - max_tokens 512 2025-05-21 15:34:58,708 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:322] - INFO - max_tokens 512 2025-05-21 15:34:58,708 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:323] - INFO - min_tokens 2 2025-05-21 15:34:58,708 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:323] - INFO - min_tokens 2 2025-05-21 15:34:58,708 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:323] - INFO - min_tokens 2 2025-05-21 15:34:58,708 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:334] - INFO - infer without vllm, not use vllm model 2025-05-21 15:34:58,708 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:334] - INFO - infer without vllm, not use vllm model 2025-05-21 15:34:58,708 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:334] - INFO - infer without vllm, not use vllm model 2025-05-21 15:34:58,710 - mindformers./output/log[mindformers/generation/text_generator.py:801] - INFO - Generation Config is: {'max_length': 512, 'max_new_tokens': 512, 'min_length': 0, 'min_new_tokens': 2, 'num_beams': 1, 'do_sample': True, 'use_past': True, 'temperature': 1.2, 'top_k': 50, 'top_p': 1.0, 'repetition_penalty': 1.0, 'encoder_repetition_penalty': 1.0, 'renormalize_logits': False, 'return_dict_in_generate': False, 'output_scores': False, 'output_logits': False, 'pad_token_id': 151643, 'bos_token_id': 151643, 'eos_token_id': [151645, 151643], 'parallel_decoding': False, 'window_size': 5, 'level': 5, 'guess_set_size': 3, '_from_model_config': True} 2025-05-21 15:34:58,710 - mindformers./output/log[mindformers/generation/text_generator.py:859] - INFO - The generation mode will be **SAMPLE**. 2025-05-21 15:34:58,710 - mindformers./output/log[mindformers/modules/block_tables.py:63] - INFO - init cache engine success. 2025-05-21 15:34:58,711 - mindformers./output/log[mindformers/generation/text_generator.py:801] - INFO - Generation Config is: {'max_length': 512, 'max_new_tokens': 512, 'min_length': 0, 'min_new_tokens': 2, 'num_beams': 1, 'do_sample': True, 'use_past': True, 'temperature': 1.2, 'top_k': 50, 'top_p': 1.0, 'repetition_penalty': 1.0, 'encoder_repetition_penalty': 1.0, 'renormalize_logits': False, 'return_dict_in_generate': False, 'output_scores': False, 'output_logits': False, 'pad_token_id': 151643, 'bos_token_id': 151643, 'eos_token_id': [151645, 151643], 'parallel_decoding': False, 'window_size': 5, 'level': 5, 'guess_set_size': 3, '_from_model_config': True} 2025-05-21 15:34:58,711 - mindformers./output/log[mindformers/generation/text_generator.py:801] - INFO - Generation Config is: {'max_length': 512, 'max_new_tokens': 512, 'min_length': 0, 'min_new_tokens': 2, 'num_beams': 1, 'do_sample': True, 'use_past': True, 'temperature': 1.2, 'top_k': 50, 'top_p': 1.0, 'repetition_penalty': 1.0, 'encoder_repetition_penalty': 1.0, 'renormalize_logits': False, 'return_dict_in_generate': False, 'output_scores': False, 'output_logits': False, 'pad_token_id': 151643, 'bos_token_id': 151643, 'eos_token_id': [151645, 151643], 'parallel_decoding': False, 'window_size': 5, 'level': 5, 'guess_set_size': 3, '_from_model_config': True} 2025-05-21 15:34:58,711 - mindformers./output/log[mindformers/generation/text_generator.py:801] - INFO - Generation Config is: {'max_length': 512, 'max_new_tokens': 512, 'min_length': 0, 'min_new_tokens': 2, 'num_beams': 1, 'do_sample': True, 'use_past': True, 'temperature': 1.2, 'top_k': 50, 'top_p': 1.0, 'repetition_penalty': 1.0, 'encoder_repetition_penalty': 1.0, 'renormalize_logits': False, 'return_dict_in_generate': False, 'output_scores': False, 'output_logits': False, 'pad_token_id': 151643, 'bos_token_id': 151643, 'eos_token_id': [151645, 151643], 'parallel_decoding': False, 'window_size': 5, 'level': 5, 'guess_set_size': 3, '_from_model_config': True} 2025-05-21 15:34:58,711 - mindformers./output/log[mindformers/research/qwen2_5/infer/qwen2_5.py:195] - INFO - Set dynamic input for llama. 2025-05-21 15:34:58,711 - mindformers./output/log[mindformers/generation/text_generator.py:859] - INFO - The generation mode will be **SAMPLE**. 2025-05-21 15:34:58,711 - mindformers./output/log[mindformers/generation/text_generator.py:859] - INFO - The generation mode will be **SAMPLE**. 2025-05-21 15:34:58,711 - mindformers./output/log[mindformers/generation/text_generator.py:859] - INFO - The generation mode will be **SAMPLE**. 2025-05-21 15:34:58,711 - mindformers./output/log[mindformers/modules/block_tables.py:63] - INFO - init cache engine success. 2025-05-21 15:34:58,712 - mindformers./output/log[mindformers/modules/block_tables.py:63] - INFO - init cache engine success. 2025-05-21 15:34:58,712 - mindformers./output/log[mindformers/modules/block_tables.py:63] - INFO - init cache engine success. 2025-05-21 15:34:58,712 - mindformers./output/log[mindformers/research/qwen2_5/infer/qwen2_5.py:195] - INFO - Set dynamic input for llama. 2025-05-21 15:34:58,712 - mindformers./output/log[mindformers/research/qwen2_5/infer/qwen2_5.py:195] - INFO - Set dynamic input for llama. 2025-05-21 15:34:58,712 - mindformers./output/log[mindformers/research/qwen2_5/infer/qwen2_5.py:195] - INFO - Set dynamic input for llama. 2025-05-21 15:34:58,917 - mindformers./output/log[mindformers/generation/text_generator.py:1067] - INFO - total time: 9.600059986114502 s; generated tokens: 512 tokens; generate speed: 53.333000079224014 tokens/s 2025-05-21 15:34:58,918 - mindformers./output/log[mindformers/tools/debug_info.py:93] - INFO - prefill prepare time: 0.0013203620910644531 s; prefill predict time: 0.006285667419433594 s; prefill post time: 0.013374090194702148 s; decode prepare time: 0.0009717656674916963 s; decode predict time: 0.005167104216182933 s; decode post time: 0.012560373183099258 s 2025-05-21 15:34:58,918 - mindformers./output/log[mindformers/generation/text_generator.py:1067] - INFO - total time: 9.601250171661377 s; generated tokens: 512 tokens; generate speed: 53.32638883956971 tokens/s 2025-05-21 15:34:58,918 - mindformers./output/log[mindformers/generation/text_generator.py:1067] - INFO - total time: 9.600813865661621 s; generated tokens: 512 tokens; generate speed: 53.32881224072315 tokens/s 2025-05-21 15:34:58,918 - mindformers./output/log[mindformers/generation/text_generator.py:1067] - INFO - total time: 9.601313352584839 s; generated tokens: 512 tokens; generate speed: 53.32603792815082 tokens/s 2025-05-21 15:34:58,919 - mindformers./output/log[mindformers/tools/debug_info.py:93] - INFO - prefill prepare time: 0.0013191699981689453 s; prefill predict time: 0.006220102310180664 s; prefill post time: 0.014098405838012695 s; decode prepare time: 0.000989408642345212 s; decode predict time: 0.004918620165656595 s; decode post time: 0.012793405181974348 s 2025-05-21 15:34:58,919 - mindformers./output/log[mindformers/tools/debug_info.py:93] - INFO - prefill prepare time: 0.0014472007751464844 s; prefill predict time: 0.006651878356933594 s; prefill post time: 0.01425933837890625 s; decode prepare time: 0.0010183883973063788 s; decode predict time: 0.0045530814750521795 s; decode post time: 0.013131591905119835 s 2025-05-21 15:34:58,919 - mindformers./output/log[mindformers/modules/block_tables.py:126] - INFO - Clear block table cache engines. 2025-05-21 15:34:58,919 - mindformers./output/log[mindformers/tools/debug_info.py:93] - INFO - prefill prepare time: 0.0013954639434814453 s; prefill predict time: 0.006267547607421875 s; prefill post time: 0.01453709602355957 s; decode prepare time: 0.0010505660406064147 s; decode predict time: 0.004385819154627183 s; decode post time: 0.013263685595965899 s 2025-05-21 15:34:58,919 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:341] - INFO - infer without vllm end, not use vllm model 2025-05-21 15:34:58,919 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:355] - INFO - Generating elapsed time: 9.607104301452637 2025-05-21 15:34:58,920 - mindformers./output/log[mindformers/modules/block_tables.py:126] - INFO - Clear block table cache engines. 2025-05-21 15:34:58,920 - mindformers./output/log[mindformers/modules/block_tables.py:126] - INFO - Clear block table cache engines. 2025-05-21 15:34:58,920 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:341] - INFO - infer without vllm end, not use vllm model 2025-05-21 15:34:58,920 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:341] - INFO - infer without vllm end, not use vllm model 2025-05-21 15:34:58,920 - mindformers./output/log[mindformers/modules/block_tables.py:126] - INFO - Clear block table cache engines. 2025-05-21 15:34:58,920 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:355] - INFO - Generating elapsed time: 9.607770681381226 2025-05-21 15:34:58,920 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:355] - INFO - Generating elapsed time: 9.607951641082764 2025-05-21 15:34:58,920 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:341] - INFO - infer without vllm end, not use vllm model 2025-05-21 15:34:58,920 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:355] - INFO - Generating elapsed time: 9.607831478118896 2025-05-21 15:34:58,920 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:313] - INFO - input_ids shape (1, 2048) 2025-05-21 15:34:58,921 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:322] - INFO - max_tokens 512 2025-05-21 15:34:58,921 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:313] - INFO - input_ids shape (1, 2048) 2025-05-21 15:34:58,921 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:313] - INFO - input_ids shape (1, 2048) 2025-05-21 15:34:58,921 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:323] - INFO - min_tokens 2 2025-05-21 15:34:58,921 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:334] - INFO - infer without vllm, not use vllm model 2025-05-21 15:34:58,922 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:322] - INFO - max_tokens 512 2025-05-21 15:34:58,922 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:313] - INFO - input_ids shape (1, 2048) 2025-05-21 15:34:58,922 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:322] - INFO - max_tokens 512 2025-05-21 15:34:58,922 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:323] - INFO - min_tokens 2 2025-05-21 15:34:58,922 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:323] - INFO - min_tokens 2 2025-05-21 15:34:58,922 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:334] - INFO - infer without vllm, not use vllm model 2025-05-21 15:34:58,922 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:322] - INFO - max_tokens 512 2025-05-21 15:34:58,922 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:334] - INFO - infer without vllm, not use vllm model 2025-05-21 15:34:58,922 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:323] - INFO - min_tokens 2 2025-05-21 15:34:58,922 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:334] - INFO - infer without vllm, not use vllm model 2025-05-21 15:34:58,924 - mindformers./output/log[mindformers/generation/text_generator.py:801] - INFO - Generation Config is: {'max_length': 512, 'max_new_tokens': 512, 'min_length': 0, 'min_new_tokens': 2, 'num_beams': 1, 'do_sample': True, 'use_past': True, 'temperature': 1.2, 'top_k': 50, 'top_p': 1.0, 'repetition_penalty': 1.0, 'encoder_repetition_penalty': 1.0, 'renormalize_logits': False, 'return_dict_in_generate': False, 'output_scores': False, 'output_logits': False, 'pad_token_id': 151643, 'bos_token_id': 151643, 'eos_token_id': [151645, 151643], 'parallel_decoding': False, 'window_size': 5, 'level': 5, 'guess_set_size': 3, '_from_model_config': True} 2025-05-21 15:34:58,924 - mindformers./output/log[mindformers/generation/text_generator.py:801] - INFO - Generation Config is: {'max_length': 512, 'max_new_tokens': 512, 'min_length': 0, 'min_new_tokens': 2, 'num_beams': 1, 'do_sample': True, 'use_past': True, 'temperature': 1.2, 'top_k': 50, 'top_p': 1.0, 'repetition_penalty': 1.0, 'encoder_repetition_penalty': 1.0, 'renormalize_logits': False, 'return_dict_in_generate': False, 'output_scores': False, 'output_logits': False, 'pad_token_id': 151643, 'bos_token_id': 151643, 'eos_token_id': [151645, 151643], 'parallel_decoding': False, 'window_size': 5, 'level': 5, 'guess_set_size': 3, '_from_model_config': True} 2025-05-21 15:34:58,924 - mindformers./output/log[mindformers/generation/text_generator.py:801] - INFO - Generation Config is: {'max_length': 512, 'max_new_tokens': 512, 'min_length': 0, 'min_new_tokens': 2, 'num_beams': 1, 'do_sample': True, 'use_past': True, 'temperature': 1.2, 'top_k': 50, 'top_p': 1.0, 'repetition_penalty': 1.0, 'encoder_repetition_penalty': 1.0, 'renormalize_logits': False, 'return_dict_in_generate': False, 'output_scores': False, 'output_logits': False, 'pad_token_id': 151643, 'bos_token_id': 151643, 'eos_token_id': [151645, 151643], 'parallel_decoding': False, 'window_size': 5, 'level': 5, 'guess_set_size': 3, '_from_model_config': True} 2025-05-21 15:34:58,924 - mindformers./output/log[mindformers/generation/text_generator.py:859] - INFO - The generation mode will be **SAMPLE**. 2025-05-21 15:34:58,925 - mindformers./output/log[mindformers/modules/block_tables.py:63] - INFO - init cache engine success. 2025-05-21 15:34:58,925 - mindformers./output/log[mindformers/generation/text_generator.py:801] - INFO - Generation Config is: {'max_length': 512, 'max_new_tokens': 512, 'min_length': 0, 'min_new_tokens': 2, 'num_beams': 1, 'do_sample': True, 'use_past': True, 'temperature': 1.2, 'top_k': 50, 'top_p': 1.0, 'repetition_penalty': 1.0, 'encoder_repetition_penalty': 1.0, 'renormalize_logits': False, 'return_dict_in_generate': False, 'output_scores': False, 'output_logits': False, 'pad_token_id': 151643, 'bos_token_id': 151643, 'eos_token_id': [151645, 151643], 'parallel_decoding': False, 'window_size': 5, 'level': 5, 'guess_set_size': 3, '_from_model_config': True} 2025-05-21 15:34:58,925 - mindformers./output/log[mindformers/generation/text_generator.py:859] - INFO - The generation mode will be **SAMPLE**. 2025-05-21 15:34:58,925 - mindformers./output/log[mindformers/generation/text_generator.py:859] - INFO - The generation mode will be **SAMPLE**. 2025-05-21 15:34:58,925 - mindformers./output/log[mindformers/modules/block_tables.py:63] - INFO - init cache engine success. 2025-05-21 15:34:58,925 - mindformers./output/log[mindformers/modules/block_tables.py:63] - INFO - init cache engine success. 2025-05-21 15:34:58,925 - mindformers./output/log[mindformers/generation/text_generator.py:859] - INFO - The generation mode will be **SAMPLE**. 2025-05-21 15:34:58,925 - mindformers./output/log[mindformers/research/qwen2_5/infer/qwen2_5.py:195] - INFO - Set dynamic input for llama. 2025-05-21 15:34:58,926 - mindformers./output/log[mindformers/modules/block_tables.py:63] - INFO - init cache engine success. 2025-05-21 15:34:58,926 - mindformers./output/log[mindformers/research/qwen2_5/infer/qwen2_5.py:195] - INFO - Set dynamic input for llama. 2025-05-21 15:34:58,926 - mindformers./output/log[mindformers/research/qwen2_5/infer/qwen2_5.py:195] - INFO - Set dynamic input for llama. 2025-05-21 15:34:58,926 - mindformers./output/log[mindformers/research/qwen2_5/infer/qwen2_5.py:195] - INFO - Set dynamic input for llama. 2025-05-21 15:35:08,251 - mindformers./output/log[mindformers/generation/text_generator.py:1067] - INFO - total time: 9.538091659545898 s; generated tokens: 512 tokens; generate speed: 53.67950091857011 tokens/s 2025-05-21 15:35:08,251 - mindformers./output/log[mindformers/generation/text_generator.py:1067] - INFO - total time: 9.538270950317383 s; generated tokens: 512 tokens; generate speed: 53.67849190559672 tokens/s 2025-05-21 15:35:08,251 - mindformers./output/log[mindformers/generation/text_generator.py:1067] - INFO - total time: 9.539293766021729 s; generated tokens: 512 tokens; generate speed: 53.67273642664269 tokens/s 2025-05-21 15:35:08,251 - mindformers./output/log[mindformers/generation/text_generator.py:1067] - INFO - total time: 9.538491010665894 s; generated tokens: 512 tokens; generate speed: 53.67725350136454 tokens/s 2025-05-21 15:35:08,251 - mindformers./output/log[mindformers/tools/debug_info.py:93] - INFO - prefill prepare time: 0.001405477523803711 s; prefill predict time: 0.006157636642456055 s; prefill post time: 0.0142974853515625 s; decode prepare time: 0.0010645450211317804 s; decode predict time: 0.004410504359824985 s; decode post time: 0.013101266554890313 s 2025-05-21 15:35:08,252 - mindformers./output/log[mindformers/tools/debug_info.py:93] - INFO - prefill prepare time: 0.001615285873413086 s; prefill predict time: 0.007089853286743164 s; prefill post time: 0.014042139053344727 s; decode prepare time: 0.0010842843065056549 s; decode predict time: 0.004800106497371898 s; decode post time: 0.01268974442304697 s 2025-05-21 15:35:08,251 - mindformers./output/log[mindformers/tools/debug_info.py:93] - INFO - prefill prepare time: 0.0014264583587646484 s; prefill predict time: 0.00593256950378418 s; prefill post time: 0.014203786849975586 s; decode prepare time: 0.0010385065396004694 s; decode predict time: 0.00472423188826617 s; decode post time: 0.012812800136797583 s 2025-05-21 15:35:08,252 - mindformers./output/log[mindformers/tools/debug_info.py:93] - INFO - prefill prepare time: 0.0014042854309082031 s; prefill predict time: 0.005903482437133789 s; prefill post time: 0.01493382453918457 s; decode prepare time: 0.0010809338489390632 s; decode predict time: 0.004221407572428385 s; decode post time: 0.013275210404815973 s 2025-05-21 15:35:08,252 - mindformers./output/log[mindformers/modules/block_tables.py:126] - INFO - Clear block table cache engines. 2025-05-21 15:35:08,252 - mindformers./output/log[mindformers/modules/block_tables.py:126] - INFO - Clear block table cache engines. 2025-05-21 15:35:08,252 - mindformers./output/log[mindformers/modules/block_tables.py:126] - INFO - Clear block table cache engines. 2025-05-21 15:35:08,253 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:341] - INFO - infer without vllm end, not use vllm model 2025-05-21 15:35:08,253 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:341] - INFO - infer without vllm end, not use vllm model 2025-05-21 15:35:08,253 - mindformers./output/log[mindformers/modules/block_tables.py:126] - INFO - Clear block table cache engines. 2025-05-21 15:35:08,253 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:341] - INFO - infer without vllm end, not use vllm model 2025-05-21 15:35:08,253 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:355] - INFO - Generating elapsed time: 9.5449538230896 2025-05-21 15:35:08,253 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:355] - INFO - Generating elapsed time: 9.545361042022705 2025-05-21 15:35:08,253 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:355] - INFO - Generating elapsed time: 9.546048641204834 2025-05-21 15:35:08,253 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:341] - INFO - infer without vllm end, not use vllm model 2025-05-21 15:35:08,253 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:355] - INFO - Generating elapsed time: 9.54542064666748 2025-05-21 15:35:08,254 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:313] - INFO - input_ids shape (1, 2048) 2025-05-21 15:35:08,254 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:313] - INFO - input_ids shape (1, 2048) 2025-05-21 15:35:08,254 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:313] - INFO - input_ids shape (1, 2048) 2025-05-21 15:35:08,254 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:313] - INFO - input_ids shape (1, 2048) 2025-05-21 15:35:08,255 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:322] - INFO - max_tokens 512 2025-05-21 15:35:08,255 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:322] - INFO - max_tokens 512 2025-05-21 15:35:08,255 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:322] - INFO - max_tokens 512 2025-05-21 15:35:08,255 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:323] - INFO - min_tokens 2 2025-05-21 15:35:08,255 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:323] - INFO - min_tokens 2 2025-05-21 15:35:08,255 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:322] - INFO - max_tokens 512 2025-05-21 15:35:08,255 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:323] - INFO - min_tokens 2 2025-05-21 15:35:08,255 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:334] - INFO - infer without vllm, not use vllm model 2025-05-21 15:35:08,255 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:334] - INFO - infer without vllm, not use vllm model 2025-05-21 15:35:08,255 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:334] - INFO - infer without vllm, not use vllm model 2025-05-21 15:35:08,255 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:323] - INFO - min_tokens 2 2025-05-21 15:35:08,255 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:334] - INFO - infer without vllm, not use vllm model 2025-05-21 15:35:08,257 - mindformers./output/log[mindformers/generation/text_generator.py:801] - INFO - Generation Config is: {'max_length': 512, 'max_new_tokens': 512, 'min_length': 0, 'min_new_tokens': 2, 'num_beams': 1, 'do_sample': True, 'use_past': True, 'temperature': 1.2, 'top_k': 50, 'top_p': 1.0, 'repetition_penalty': 1.0, 'encoder_repetition_penalty': 1.0, 'renormalize_logits': False, 'return_dict_in_generate': False, 'output_scores': False, 'output_logits': False, 'pad_token_id': 151643, 'bos_token_id': 151643, 'eos_token_id': [151645, 151643], 'parallel_decoding': False, 'window_size': 5, 'level': 5, 'guess_set_size': 3, '_from_model_config': True} 2025-05-21 15:35:08,257 - mindformers./output/log[mindformers/generation/text_generator.py:801] - INFO - Generation Config is: {'max_length': 512, 'max_new_tokens': 512, 'min_length': 0, 'min_new_tokens': 2, 'num_beams': 1, 'do_sample': True, 'use_past': True, 'temperature': 1.2, 'top_k': 50, 'top_p': 1.0, 'repetition_penalty': 1.0, 'encoder_repetition_penalty': 1.0, 'renormalize_logits': False, 'return_dict_in_generate': False, 'output_scores': False, 'output_logits': False, 'pad_token_id': 151643, 'bos_token_id': 151643, 'eos_token_id': [151645, 151643], 'parallel_decoding': False, 'window_size': 5, 'level': 5, 'guess_set_size': 3, '_from_model_config': True} 2025-05-21 15:35:08,258 - mindformers./output/log[mindformers/generation/text_generator.py:801] - INFO - Generation Config is: {'max_length': 512, 'max_new_tokens': 512, 'min_length': 0, 'min_new_tokens': 2, 'num_beams': 1, 'do_sample': True, 'use_past': True, 'temperature': 1.2, 'top_k': 50, 'top_p': 1.0, 'repetition_penalty': 1.0, 'encoder_repetition_penalty': 1.0, 'renormalize_logits': False, 'return_dict_in_generate': False, 'output_scores': False, 'output_logits': False, 'pad_token_id': 151643, 'bos_token_id': 151643, 'eos_token_id': [151645, 151643], 'parallel_decoding': False, 'window_size': 5, 'level': 5, 'guess_set_size': 3, '_from_model_config': True} 2025-05-21 15:35:08,258 - mindformers./output/log[mindformers/generation/text_generator.py:801] - INFO - Generation Config is: {'max_length': 512, 'max_new_tokens': 512, 'min_length': 0, 'min_new_tokens': 2, 'num_beams': 1, 'do_sample': True, 'use_past': True, 'temperature': 1.2, 'top_k': 50, 'top_p': 1.0, 'repetition_penalty': 1.0, 'encoder_repetition_penalty': 1.0, 'renormalize_logits': False, 'return_dict_in_generate': False, 'output_scores': False, 'output_logits': False, 'pad_token_id': 151643, 'bos_token_id': 151643, 'eos_token_id': [151645, 151643], 'parallel_decoding': False, 'window_size': 5, 'level': 5, 'guess_set_size': 3, '_from_model_config': True} 2025-05-21 15:35:08,258 - mindformers./output/log[mindformers/generation/text_generator.py:859] - INFO - The generation mode will be **SAMPLE**. 2025-05-21 15:35:08,258 - mindformers./output/log[mindformers/generation/text_generator.py:859] - INFO - The generation mode will be **SAMPLE**. 2025-05-21 15:35:08,258 - mindformers./output/log[mindformers/generation/text_generator.py:859] - INFO - The generation mode will be **SAMPLE**. 2025-05-21 15:35:08,258 - mindformers./output/log[mindformers/modules/block_tables.py:63] - INFO - init cache engine success. 2025-05-21 15:35:08,258 - mindformers./output/log[mindformers/generation/text_generator.py:859] - INFO - The generation mode will be **SAMPLE**. 2025-05-21 15:35:08,258 - mindformers./output/log[mindformers/modules/block_tables.py:63] - INFO - init cache engine success. 2025-05-21 15:35:08,259 - mindformers./output/log[mindformers/modules/block_tables.py:63] - INFO - init cache engine success. 2025-05-21 15:35:08,259 - mindformers./output/log[mindformers/modules/block_tables.py:63] - INFO - init cache engine success. 2025-05-21 15:35:08,259 - mindformers./output/log[mindformers/research/qwen2_5/infer/qwen2_5.py:195] - INFO - Set dynamic input for llama. 2025-05-21 15:35:08,259 - mindformers./output/log[mindformers/research/qwen2_5/infer/qwen2_5.py:195] - INFO - Set dynamic input for llama. 2025-05-21 15:35:08,259 - mindformers./output/log[mindformers/research/qwen2_5/infer/qwen2_5.py:195] - INFO - Set dynamic input for llama. 2025-05-21 15:35:08,259 - mindformers./output/log[mindformers/research/qwen2_5/infer/qwen2_5.py:195] - INFO - Set dynamic input for llama. 2025-05-21 15:35:08,461 - mindformers./output/log[mindformers/generation/text_generator.py:1067] - INFO - total time: 9.534770965576172 s; generated tokens: 512 tokens; generate speed: 53.698195986930095 tokens/s 2025-05-21 15:35:08,461 - mindformers./output/log[mindformers/generation/text_generator.py:1067] - INFO - total time: 9.535644054412842 s; generated tokens: 512 tokens; generate speed: 53.69327935044514 tokens/s 2025-05-21 15:35:08,462 - mindformers./output/log[mindformers/generation/text_generator.py:1067] - INFO - total time: 9.534931182861328 s; generated tokens: 512 tokens; generate speed: 53.697293685800304 tokens/s 2025-05-21 15:35:08,462 - mindformers./output/log[mindformers/tools/debug_info.py:93] - INFO - prefill prepare time: 0.0013988018035888672 s; prefill predict time: 0.009259700775146484 s; prefill post time: 0.014244794845581055 s; decode prepare time: 0.0009942955233579512 s; decode predict time: 0.004792937577939501 s; decode post time: 0.012773505862211761 s 2025-05-21 15:35:08,462 - mindformers./output/log[mindformers/generation/text_generator.py:1067] - INFO - total time: 9.535380363464355 s; generated tokens: 512 tokens; generate speed: 53.69476418180158 tokens/s 2025-05-21 15:35:08,462 - mindformers./output/log[mindformers/tools/debug_info.py:93] - INFO - prefill prepare time: 0.0013856887817382812 s; prefill predict time: 0.009654045104980469 s; prefill post time: 0.013499975204467773 s; decode prepare time: 0.0009770729303826557 s; decode predict time: 0.005041097192203297 s; decode post time: 0.01254377533078427 s 2025-05-21 15:35:08,462 - mindformers./output/log[mindformers/tools/debug_info.py:93] - INFO - prefill prepare time: 0.0013735294342041016 s; prefill predict time: 0.009379863739013672 s; prefill post time: 0.014531135559082031 s; decode prepare time: 0.0010474418707323168 s; decode predict time: 0.004392238691741345 s; decode post time: 0.013122374297588071 s 2025-05-21 15:35:08,462 - mindformers./output/log[mindformers/tools/debug_info.py:93] - INFO - prefill prepare time: 0.0013549327850341797 s; prefill predict time: 0.009157896041870117 s; prefill post time: 0.014457464218139648 s; decode prepare time: 0.0010279078772856532 s; decode predict time: 0.004410543628767425 s; decode post time: 0.01312531090529231 s 2025-05-21 15:35:08,463 - mindformers./output/log[mindformers/modules/block_tables.py:126] - INFO - Clear block table cache engines. 2025-05-21 15:35:08,463 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:341] - INFO - infer without vllm end, not use vllm model 2025-05-21 15:35:08,463 - mindformers./output/log[mindformers/modules/block_tables.py:126] - INFO - Clear block table cache engines. 2025-05-21 15:35:08,463 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:355] - INFO - Generating elapsed time: 9.541407823562622 2025-05-21 15:35:08,463 - mindformers./output/log[mindformers/modules/block_tables.py:126] - INFO - Clear block table cache engines. 2025-05-21 15:35:08,463 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:341] - INFO - infer without vllm end, not use vllm model 2025-05-21 15:35:08,463 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:341] - INFO - infer without vllm end, not use vllm model 2025-05-21 15:35:08,463 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:355] - INFO - Generating elapsed time: 9.5424222946167 2025-05-21 15:35:08,463 - mindformers./output/log[mindformers/modules/block_tables.py:126] - INFO - Clear block table cache engines. 2025-05-21 15:35:08,463 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:355] - INFO - Generating elapsed time: 9.541435718536377 2025-05-21 15:35:08,464 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:341] - INFO - infer without vllm end, not use vllm model 2025-05-21 15:35:08,464 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:355] - INFO - Generating elapsed time: 9.542192459106445 2025-05-21 15:35:08,464 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:313] - INFO - input_ids shape (1, 2048) 2025-05-21 15:35:08,464 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:313] - INFO - input_ids shape (1, 2048) 2025-05-21 15:35:08,465 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:313] - INFO - input_ids shape (1, 2048) 2025-05-21 15:35:08,465 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:322] - INFO - max_tokens 512 2025-05-21 15:35:08,465 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:323] - INFO - min_tokens 2 2025-05-21 15:35:08,465 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:322] - INFO - max_tokens 512 2025-05-21 15:35:08,465 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:313] - INFO - input_ids shape (1, 2048) 2025-05-21 15:35:08,465 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:334] - INFO - infer without vllm, not use vllm model 2025-05-21 15:35:08,465 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:322] - INFO - max_tokens 512 2025-05-21 15:35:08,465 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:323] - INFO - min_tokens 2 2025-05-21 15:35:08,465 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:323] - INFO - min_tokens 2 2025-05-21 15:35:08,466 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:334] - INFO - infer without vllm, not use vllm model 2025-05-21 15:35:08,466 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:334] - INFO - infer without vllm, not use vllm model 2025-05-21 15:35:08,466 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:322] - INFO - max_tokens 512 2025-05-21 15:35:08,466 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:323] - INFO - min_tokens 2 2025-05-21 15:35:08,466 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:334] - INFO - infer without vllm, not use vllm model 2025-05-21 15:35:08,467 - mindformers./output/log[mindformers/generation/text_generator.py:801] - INFO - Generation Config is: {'max_length': 512, 'max_new_tokens': 512, 'min_length': 0, 'min_new_tokens': 2, 'num_beams': 1, 'do_sample': True, 'use_past': True, 'temperature': 1.2, 'top_k': 50, 'top_p': 1.0, 'repetition_penalty': 1.0, 'encoder_repetition_penalty': 1.0, 'renormalize_logits': False, 'return_dict_in_generate': False, 'output_scores': False, 'output_logits': False, 'pad_token_id': 151643, 'bos_token_id': 151643, 'eos_token_id': [151645, 151643], 'parallel_decoding': False, 'window_size': 5, 'level': 5, 'guess_set_size': 3, '_from_model_config': True} 2025-05-21 15:35:08,468 - mindformers./output/log[mindformers/generation/text_generator.py:801] - INFO - Generation Config is: {'max_length': 512, 'max_new_tokens': 512, 'min_length': 0, 'min_new_tokens': 2, 'num_beams': 1, 'do_sample': True, 'use_past': True, 'temperature': 1.2, 'top_k': 50, 'top_p': 1.0, 'repetition_penalty': 1.0, 'encoder_repetition_penalty': 1.0, 'renormalize_logits': False, 'return_dict_in_generate': False, 'output_scores': False, 'output_logits': False, 'pad_token_id': 151643, 'bos_token_id': 151643, 'eos_token_id': [151645, 151643], 'parallel_decoding': False, 'window_size': 5, 'level': 5, 'guess_set_size': 3, '_from_model_config': True} 2025-05-21 15:35:08,468 - mindformers./output/log[mindformers/generation/text_generator.py:801] - INFO - Generation Config is: {'max_length': 512, 'max_new_tokens': 512, 'min_length': 0, 'min_new_tokens': 2, 'num_beams': 1, 'do_sample': True, 'use_past': True, 'temperature': 1.2, 'top_k': 50, 'top_p': 1.0, 'repetition_penalty': 1.0, 'encoder_repetition_penalty': 1.0, 'renormalize_logits': False, 'return_dict_in_generate': False, 'output_scores': False, 'output_logits': False, 'pad_token_id': 151643, 'bos_token_id': 151643, 'eos_token_id': [151645, 151643], 'parallel_decoding': False, 'window_size': 5, 'level': 5, 'guess_set_size': 3, '_from_model_config': True} 2025-05-21 15:35:08,468 - mindformers./output/log[mindformers/generation/text_generator.py:859] - INFO - The generation mode will be **SAMPLE**. 2025-05-21 15:35:08,468 - mindformers./output/log[mindformers/modules/block_tables.py:63] - INFO - init cache engine success. 2025-05-21 15:35:08,468 - mindformers./output/log[mindformers/generation/text_generator.py:859] - INFO - The generation mode will be **SAMPLE**. 2025-05-21 15:35:08,468 - mindformers./output/log[mindformers/generation/text_generator.py:859] - INFO - The generation mode will be **SAMPLE**. 2025-05-21 15:35:08,468 - mindformers./output/log[mindformers/generation/text_generator.py:801] - INFO - Generation Config is: {'max_length': 512, 'max_new_tokens': 512, 'min_length': 0, 'min_new_tokens': 2, 'num_beams': 1, 'do_sample': True, 'use_past': True, 'temperature': 1.2, 'top_k': 50, 'top_p': 1.0, 'repetition_penalty': 1.0, 'encoder_repetition_penalty': 1.0, 'renormalize_logits': False, 'return_dict_in_generate': False, 'output_scores': False, 'output_logits': False, 'pad_token_id': 151643, 'bos_token_id': 151643, 'eos_token_id': [151645, 151643], 'parallel_decoding': False, 'window_size': 5, 'level': 5, 'guess_set_size': 3, '_from_model_config': True} 2025-05-21 15:35:08,469 - mindformers./output/log[mindformers/modules/block_tables.py:63] - INFO - init cache engine success. 2025-05-21 15:35:08,469 - mindformers./output/log[mindformers/modules/block_tables.py:63] - INFO - init cache engine success. 2025-05-21 15:35:08,469 - mindformers./output/log[mindformers/research/qwen2_5/infer/qwen2_5.py:195] - INFO - Set dynamic input for llama. 2025-05-21 15:35:08,469 - mindformers./output/log[mindformers/generation/text_generator.py:859] - INFO - The generation mode will be **SAMPLE**. 2025-05-21 15:35:08,469 - mindformers./output/log[mindformers/research/qwen2_5/infer/qwen2_5.py:195] - INFO - Set dynamic input for llama. 2025-05-21 15:35:08,469 - mindformers./output/log[mindformers/research/qwen2_5/infer/qwen2_5.py:195] - INFO - Set dynamic input for llama. 2025-05-21 15:35:08,469 - mindformers./output/log[mindformers/modules/block_tables.py:63] - INFO - init cache engine success. 2025-05-21 15:35:08,470 - mindformers./output/log[mindformers/research/qwen2_5/infer/qwen2_5.py:195] - INFO - Set dynamic input for llama. 2025-05-21 15:35:17,851 - mindformers./output/log[mindformers/generation/text_generator.py:1067] - INFO - total time: 9.59161639213562 s; generated tokens: 512 tokens; generate speed: 53.379949642252186 tokens/s 2025-05-21 15:35:17,851 - mindformers./output/log[mindformers/generation/text_generator.py:1067] - INFO - total time: 9.591873407363892 s; generated tokens: 512 tokens; generate speed: 53.378519321045914 tokens/s 2025-05-21 15:35:17,851 - mindformers./output/log[mindformers/generation/text_generator.py:1067] - INFO - total time: 9.592073440551758 s; generated tokens: 512 tokens; generate speed: 53.37740616491241 tokens/s 2025-05-21 15:35:17,852 - mindformers./output/log[mindformers/generation/text_generator.py:1067] - INFO - total time: 9.592129707336426 s; generated tokens: 512 tokens; generate speed: 53.377093056654864 tokens/s 2025-05-21 15:35:17,852 - mindformers./output/log[mindformers/tools/debug_info.py:93] - INFO - prefill prepare time: 0.0014615058898925781 s; prefill predict time: 0.00704503059387207 s; prefill post time: 0.013774394989013672 s; decode prepare time: 0.0011034207801296286 s; decode predict time: 0.004829764833637312 s; decode post time: 0.01274340073423143 s 2025-05-21 15:35:17,852 - mindformers./output/log[mindformers/tools/debug_info.py:93] - INFO - prefill prepare time: 0.0016274452209472656 s; prefill predict time: 0.006984710693359375 s; prefill post time: 0.013920068740844727 s; decode prepare time: 0.0010348896691010656 s; decode predict time: 0.004777149125641467 s; decode post time: 0.012866131246906437 s 2025-05-21 15:35:17,852 - mindformers./output/log[mindformers/tools/debug_info.py:93] - INFO - prefill prepare time: 0.001641988754272461 s; prefill predict time: 0.007342338562011719 s; prefill post time: 0.014202356338500977 s; decode prepare time: 0.0010734901502874267 s; decode predict time: 0.004489536846385283 s; decode post time: 0.013116809486642976 s 2025-05-21 15:35:17,852 - mindformers./output/log[mindformers/tools/debug_info.py:93] - INFO - prefill prepare time: 0.0015435218811035156 s; prefill predict time: 0.007005929946899414 s; prefill post time: 0.014847993850708008 s; decode prepare time: 0.0010803557655349402 s; decode predict time: 0.004268479814716414 s; decode post time: 0.013332324018683686 s 2025-05-21 15:35:17,853 - mindformers./output/log[mindformers/modules/block_tables.py:126] - INFO - Clear block table cache engines. 2025-05-21 15:35:17,853 - mindformers./output/log[mindformers/modules/block_tables.py:126] - INFO - Clear block table cache engines. 2025-05-21 15:35:17,853 - mindformers./output/log[mindformers/modules/block_tables.py:126] - INFO - Clear block table cache engines. 2025-05-21 15:35:17,853 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:341] - INFO - infer without vllm end, not use vllm model 2025-05-21 15:35:17,853 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:341] - INFO - infer without vllm end, not use vllm model 2025-05-21 15:35:17,853 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:341] - INFO - infer without vllm end, not use vllm model 2025-05-21 15:35:17,853 - mindformers./output/log[mindformers/modules/block_tables.py:126] - INFO - Clear block table cache engines. 2025-05-21 15:35:17,853 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:355] - INFO - Generating elapsed time: 9.598775625228882 2025-05-21 15:35:17,853 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:355] - INFO - Generating elapsed time: 9.598870754241943 2025-05-21 15:35:17,854 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:355] - INFO - Generating elapsed time: 9.598955392837524 2025-05-21 15:35:17,854 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:341] - INFO - infer without vllm end, not use vllm model 2025-05-21 15:35:17,854 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:355] - INFO - Generating elapsed time: 9.598983764648438 2025-05-21 15:35:17,855 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:313] - INFO - input_ids shape (1, 2048) 2025-05-21 15:35:17,855 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:313] - INFO - input_ids shape (1, 2048) 2025-05-21 15:35:17,855 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:313] - INFO - input_ids shape (1, 2048) 2025-05-21 15:35:17,855 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:313] - INFO - input_ids shape (1, 2048) 2025-05-21 15:35:17,855 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:322] - INFO - max_tokens 512 2025-05-21 15:35:17,855 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:322] - INFO - max_tokens 512 2025-05-21 15:35:17,855 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:322] - INFO - max_tokens 512 2025-05-21 15:35:17,855 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:323] - INFO - min_tokens 2 2025-05-21 15:35:17,855 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:323] - INFO - min_tokens 2 2025-05-21 15:35:17,855 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:323] - INFO - min_tokens 2 2025-05-21 15:35:17,855 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:334] - INFO - infer without vllm, not use vllm model 2025-05-21 15:35:17,855 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:334] - INFO - infer without vllm, not use vllm model 2025-05-21 15:35:17,855 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:322] - INFO - max_tokens 512 2025-05-21 15:35:17,855 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:334] - INFO - infer without vllm, not use vllm model 2025-05-21 15:35:17,856 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:323] - INFO - min_tokens 2 2025-05-21 15:35:17,856 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:334] - INFO - infer without vllm, not use vllm model 2025-05-21 15:35:17,858 - mindformers./output/log[mindformers/generation/text_generator.py:801] - INFO - Generation Config is: {'max_length': 512, 'max_new_tokens': 512, 'min_length': 0, 'min_new_tokens': 2, 'num_beams': 1, 'do_sample': True, 'use_past': True, 'temperature': 1.2, 'top_k': 50, 'top_p': 1.0, 'repetition_penalty': 1.0, 'encoder_repetition_penalty': 1.0, 'renormalize_logits': False, 'return_dict_in_generate': False, 'output_scores': False, 'output_logits': False, 'pad_token_id': 151643, 'bos_token_id': 151643, 'eos_token_id': [151645, 151643], 'parallel_decoding': False, 'window_size': 5, 'level': 5, 'guess_set_size': 3, '_from_model_config': True} 2025-05-21 15:35:17,858 - mindformers./output/log[mindformers/generation/text_generator.py:801] - INFO - Generation Config is: {'max_length': 512, 'max_new_tokens': 512, 'min_length': 0, 'min_new_tokens': 2, 'num_beams': 1, 'do_sample': True, 'use_past': True, 'temperature': 1.2, 'top_k': 50, 'top_p': 1.0, 'repetition_penalty': 1.0, 'encoder_repetition_penalty': 1.0, 'renormalize_logits': False, 'return_dict_in_generate': False, 'output_scores': False, 'output_logits': False, 'pad_token_id': 151643, 'bos_token_id': 151643, 'eos_token_id': [151645, 151643], 'parallel_decoding': False, 'window_size': 5, 'level': 5, 'guess_set_size': 3, '_from_model_config': True} 2025-05-21 15:35:17,858 - mindformers./output/log[mindformers/generation/text_generator.py:801] - INFO - Generation Config is: {'max_length': 512, 'max_new_tokens': 512, 'min_length': 0, 'min_new_tokens': 2, 'num_beams': 1, 'do_sample': True, 'use_past': True, 'temperature': 1.2, 'top_k': 50, 'top_p': 1.0, 'repetition_penalty': 1.0, 'encoder_repetition_penalty': 1.0, 'renormalize_logits': False, 'return_dict_in_generate': False, 'output_scores': False, 'output_logits': False, 'pad_token_id': 151643, 'bos_token_id': 151643, 'eos_token_id': [151645, 151643], 'parallel_decoding': False, 'window_size': 5, 'level': 5, 'guess_set_size': 3, '_from_model_config': True} 2025-05-21 15:35:17,858 - mindformers./output/log[mindformers/generation/text_generator.py:801] - INFO - Generation Config is: {'max_length': 512, 'max_new_tokens': 512, 'min_length': 0, 'min_new_tokens': 2, 'num_beams': 1, 'do_sample': True, 'use_past': True, 'temperature': 1.2, 'top_k': 50, 'top_p': 1.0, 'repetition_penalty': 1.0, 'encoder_repetition_penalty': 1.0, 'renormalize_logits': False, 'return_dict_in_generate': False, 'output_scores': False, 'output_logits': False, 'pad_token_id': 151643, 'bos_token_id': 151643, 'eos_token_id': [151645, 151643], 'parallel_decoding': False, 'window_size': 5, 'level': 5, 'guess_set_size': 3, '_from_model_config': True} 2025-05-21 15:35:17,858 - mindformers./output/log[mindformers/generation/text_generator.py:859] - INFO - The generation mode will be **SAMPLE**. 2025-05-21 15:35:17,858 - mindformers./output/log[mindformers/generation/text_generator.py:859] - INFO - The generation mode will be **SAMPLE**. 2025-05-21 15:35:17,858 - mindformers./output/log[mindformers/generation/text_generator.py:859] - INFO - The generation mode will be **SAMPLE**. 2025-05-21 15:35:17,859 - mindformers./output/log[mindformers/modules/block_tables.py:63] - INFO - init cache engine success. 2025-05-21 15:35:17,859 - mindformers./output/log[mindformers/modules/block_tables.py:63] - INFO - init cache engine success. 2025-05-21 15:35:17,859 - mindformers./output/log[mindformers/modules/block_tables.py:63] - INFO - init cache engine success. 2025-05-21 15:35:17,859 - mindformers./output/log[mindformers/generation/text_generator.py:859] - INFO - The generation mode will be **SAMPLE**. 2025-05-21 15:35:17,859 - mindformers./output/log[mindformers/modules/block_tables.py:63] - INFO - init cache engine success. 2025-05-21 15:35:17,859 - mindformers./output/log[mindformers/research/qwen2_5/infer/qwen2_5.py:195] - INFO - Set dynamic input for llama. 2025-05-21 15:35:17,859 - mindformers./output/log[mindformers/research/qwen2_5/infer/qwen2_5.py:195] - INFO - Set dynamic input for llama. 2025-05-21 15:35:17,859 - mindformers./output/log[mindformers/research/qwen2_5/infer/qwen2_5.py:195] - INFO - Set dynamic input for llama. 2025-05-21 15:35:17,860 - mindformers./output/log[mindformers/research/qwen2_5/infer/qwen2_5.py:195] - INFO - Set dynamic input for llama. 2025-05-21 15:35:18,087 - mindformers./output/log[mindformers/generation/text_generator.py:1067] - INFO - total time: 9.616994857788086 s; generated tokens: 512 tokens; generate speed: 53.23908430556864 tokens/s 2025-05-21 15:35:18,087 - mindformers./output/log[mindformers/tools/debug_info.py:93] - INFO - prefill prepare time: 0.001379251480102539 s; prefill predict time: 0.006345033645629883 s; prefill post time: 0.013382196426391602 s; decode prepare time: 0.000975777258378419 s; decode predict time: 0.005185077237147911 s; decode post time: 0.012572599717082343 s 2025-05-21 15:35:18,087 - mindformers./output/log[mindformers/generation/text_generator.py:1067] - INFO - total time: 9.616695880889893 s; generated tokens: 512 tokens; generate speed: 53.24073947450457 tokens/s 2025-05-21 15:35:18,087 - mindformers./output/log[mindformers/generation/text_generator.py:1067] - INFO - total time: 9.617941617965698 s; generated tokens: 512 tokens; generate speed: 53.233843616145144 tokens/s 2025-05-21 15:35:18,088 - mindformers./output/log[mindformers/tools/debug_info.py:93] - INFO - prefill prepare time: 0.0013632774353027344 s; prefill predict time: 0.00656890869140625 s; prefill post time: 0.013818740844726562 s; decode prepare time: 0.0010007183612442763 s; decode predict time: 0.00491636687634038 s; decode post time: 0.012817594169870515 s 2025-05-21 15:35:18,088 - mindformers./output/log[mindformers/tools/debug_info.py:93] - INFO - prefill prepare time: 0.0013632774353027344 s; prefill predict time: 0.005318880081176758 s; prefill post time: 0.014364480972290039 s; decode prepare time: 0.0010237745109602896 s; decode predict time: 0.004552434939964145 s; decode post time: 0.013159931522526154 s 2025-05-21 15:35:18,088 - mindformers./output/log[mindformers/generation/text_generator.py:1067] - INFO - total time: 9.618187665939331 s; generated tokens: 512 tokens; generate speed: 53.23248181288185 tokens/s 2025-05-21 15:35:18,088 - mindformers./output/log[mindformers/modules/block_tables.py:126] - INFO - Clear block table cache engines. 2025-05-21 15:35:18,088 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:341] - INFO - infer without vllm end, not use vllm model 2025-05-21 15:35:18,089 - mindformers./output/log[mindformers/tools/debug_info.py:93] - INFO - prefill prepare time: 0.0013370513916015625 s; prefill predict time: 0.0062329769134521484 s; prefill post time: 0.014615058898925781 s; decode prepare time: 0.0010488485870062722 s; decode predict time: 0.004442572593688965 s; decode post time: 0.013242585785001692 s 2025-05-21 15:35:18,089 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:355] - INFO - Generating elapsed time: 9.623673677444458 2025-05-21 15:35:18,089 - mindformers./output/log[mindformers/modules/block_tables.py:126] - INFO - Clear block table cache engines. 2025-05-21 15:35:18,089 - mindformers./output/log[mindformers/modules/block_tables.py:126] - INFO - Clear block table cache engines. 2025-05-21 15:35:18,089 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:341] - INFO - infer without vllm end, not use vllm model 2025-05-21 15:35:18,089 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:341] - INFO - infer without vllm end, not use vllm model 2025-05-21 15:35:18,089 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:355] - INFO - Generating elapsed time: 9.62483835220337 2025-05-21 15:35:18,089 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:355] - INFO - Generating elapsed time: 9.6238272190094 2025-05-21 15:35:18,090 - mindformers./output/log[mindformers/modules/block_tables.py:126] - INFO - Clear block table cache engines. 2025-05-21 15:35:18,090 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:313] - INFO - input_ids shape (1, 2048) 2025-05-21 15:35:18,090 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:341] - INFO - infer without vllm end, not use vllm model 2025-05-21 15:35:18,090 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:355] - INFO - Generating elapsed time: 9.624942779541016 2025-05-21 15:35:18,090 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:322] - INFO - max_tokens 512 2025-05-21 15:35:18,090 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:323] - INFO - min_tokens 2 2025-05-21 15:35:18,090 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:313] - INFO - input_ids shape (1, 2048) 2025-05-21 15:35:18,090 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:313] - INFO - input_ids shape (1, 2048) 2025-05-21 15:35:18,091 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:334] - INFO - infer without vllm, not use vllm model 2025-05-21 15:35:18,091 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:322] - INFO - max_tokens 512 2025-05-21 15:35:18,091 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:322] - INFO - max_tokens 512 2025-05-21 15:35:18,091 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:313] - INFO - input_ids shape (1, 2048) 2025-05-21 15:35:18,091 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:323] - INFO - min_tokens 2 2025-05-21 15:35:18,091 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:323] - INFO - min_tokens 2 2025-05-21 15:35:18,091 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:334] - INFO - infer without vllm, not use vllm model 2025-05-21 15:35:18,091 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:334] - INFO - infer without vllm, not use vllm model 2025-05-21 15:35:18,091 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:322] - INFO - max_tokens 512 2025-05-21 15:35:18,092 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:323] - INFO - min_tokens 2 2025-05-21 15:35:18,092 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:334] - INFO - infer without vllm, not use vllm model 2025-05-21 15:35:18,093 - mindformers./output/log[mindformers/generation/text_generator.py:801] - INFO - Generation Config is: {'max_length': 512, 'max_new_tokens': 512, 'min_length': 0, 'min_new_tokens': 2, 'num_beams': 1, 'do_sample': True, 'use_past': True, 'temperature': 1.2, 'top_k': 50, 'top_p': 1.0, 'repetition_penalty': 1.0, 'encoder_repetition_penalty': 1.0, 'renormalize_logits': False, 'return_dict_in_generate': False, 'output_scores': False, 'output_logits': False, 'pad_token_id': 151643, 'bos_token_id': 151643, 'eos_token_id': [151645, 151643], 'parallel_decoding': False, 'window_size': 5, 'level': 5, 'guess_set_size': 3, '_from_model_config': True} 2025-05-21 15:35:18,093 - mindformers./output/log[mindformers/generation/text_generator.py:859] - INFO - The generation mode will be **SAMPLE**. 2025-05-21 15:35:18,093 - mindformers./output/log[mindformers/generation/text_generator.py:801] - INFO - Generation Config is: {'max_length': 512, 'max_new_tokens': 512, 'min_length': 0, 'min_new_tokens': 2, 'num_beams': 1, 'do_sample': True, 'use_past': True, 'temperature': 1.2, 'top_k': 50, 'top_p': 1.0, 'repetition_penalty': 1.0, 'encoder_repetition_penalty': 1.0, 'renormalize_logits': False, 'return_dict_in_generate': False, 'output_scores': False, 'output_logits': False, 'pad_token_id': 151643, 'bos_token_id': 151643, 'eos_token_id': [151645, 151643], 'parallel_decoding': False, 'window_size': 5, 'level': 5, 'guess_set_size': 3, '_from_model_config': True} 2025-05-21 15:35:18,094 - mindformers./output/log[mindformers/modules/block_tables.py:63] - INFO - init cache engine success. 2025-05-21 15:35:18,094 - mindformers./output/log[mindformers/generation/text_generator.py:801] - INFO - Generation Config is: {'max_length': 512, 'max_new_tokens': 512, 'min_length': 0, 'min_new_tokens': 2, 'num_beams': 1, 'do_sample': True, 'use_past': True, 'temperature': 1.2, 'top_k': 50, 'top_p': 1.0, 'repetition_penalty': 1.0, 'encoder_repetition_penalty': 1.0, 'renormalize_logits': False, 'return_dict_in_generate': False, 'output_scores': False, 'output_logits': False, 'pad_token_id': 151643, 'bos_token_id': 151643, 'eos_token_id': [151645, 151643], 'parallel_decoding': False, 'window_size': 5, 'level': 5, 'guess_set_size': 3, '_from_model_config': True} 2025-05-21 15:35:18,094 - mindformers./output/log[mindformers/generation/text_generator.py:801] - INFO - Generation Config is: {'max_length': 512, 'max_new_tokens': 512, 'min_length': 0, 'min_new_tokens': 2, 'num_beams': 1, 'do_sample': True, 'use_past': True, 'temperature': 1.2, 'top_k': 50, 'top_p': 1.0, 'repetition_penalty': 1.0, 'encoder_repetition_penalty': 1.0, 'renormalize_logits': False, 'return_dict_in_generate': False, 'output_scores': False, 'output_logits': False, 'pad_token_id': 151643, 'bos_token_id': 151643, 'eos_token_id': [151645, 151643], 'parallel_decoding': False, 'window_size': 5, 'level': 5, 'guess_set_size': 3, '_from_model_config': True} 2025-05-21 15:35:18,094 - mindformers./output/log[mindformers/generation/text_generator.py:859] - INFO - The generation mode will be **SAMPLE**. 2025-05-21 15:35:18,094 - mindformers./output/log[mindformers/generation/text_generator.py:859] - INFO - The generation mode will be **SAMPLE**. 2025-05-21 15:35:18,094 - mindformers./output/log[mindformers/research/qwen2_5/infer/qwen2_5.py:195] - INFO - Set dynamic input for llama. 2025-05-21 15:35:18,094 - mindformers./output/log[mindformers/modules/block_tables.py:63] - INFO - init cache engine success. 2025-05-21 15:35:18,094 - mindformers./output/log[mindformers/modules/block_tables.py:63] - INFO - init cache engine success. 2025-05-21 15:35:18,095 - mindformers./output/log[mindformers/generation/text_generator.py:859] - INFO - The generation mode will be **SAMPLE**. 2025-05-21 15:35:18,095 - mindformers./output/log[mindformers/modules/block_tables.py:63] - INFO - init cache engine success. 2025-05-21 15:35:18,095 - mindformers./output/log[mindformers/research/qwen2_5/infer/qwen2_5.py:195] - INFO - Set dynamic input for llama. 2025-05-21 15:35:18,095 - mindformers./output/log[mindformers/research/qwen2_5/infer/qwen2_5.py:195] - INFO - Set dynamic input for llama. 2025-05-21 15:35:18,096 - mindformers./output/log[mindformers/research/qwen2_5/infer/qwen2_5.py:195] - INFO - Set dynamic input for llama. 2025-05-21 15:35:27,482 - mindformers./output/log[mindformers/generation/text_generator.py:1067] - INFO - total time: 9.621914625167847 s; generated tokens: 512 tokens; generate speed: 53.2118627056586 tokens/s 2025-05-21 15:35:27,482 - mindformers./output/log[mindformers/generation/text_generator.py:1067] - INFO - total time: 9.621934175491333 s; generated tokens: 512 tokens; generate speed: 53.211754587154545 tokens/s 2025-05-21 15:35:27,482 - mindformers./output/log[mindformers/generation/text_generator.py:1067] - INFO - total time: 9.622245073318481 s; generated tokens: 512 tokens; generate speed: 53.210035298282364 tokens/s 2025-05-21 15:35:27,482 - mindformers./output/log[mindformers/generation/text_generator.py:1067] - INFO - total time: 9.622449398040771 s; generated tokens: 512 tokens; generate speed: 53.2089054273695 tokens/s 2025-05-21 15:35:27,482 - mindformers./output/log[mindformers/tools/debug_info.py:93] - INFO - prefill prepare time: 0.0016040802001953125 s; prefill predict time: 0.007012128829956055 s; prefill post time: 0.013847112655639648 s; decode prepare time: 0.0010308953413991312 s; decode predict time: 0.004902500264784869 s; decode post time: 0.012803743030227095 s 2025-05-21 15:35:27,483 - mindformers./output/log[mindformers/tools/debug_info.py:93] - INFO - prefill prepare time: 0.0016210079193115234 s; prefill predict time: 0.00714564323425293 s; prefill post time: 0.014340639114379883 s; decode prepare time: 0.0010754922131502931 s; decode predict time: 0.004407613885168936 s; decode post time: 0.013256232099290463 s 2025-05-21 15:35:27,483 - mindformers./output/log[mindformers/tools/debug_info.py:93] - INFO - prefill prepare time: 0.0017242431640625 s; prefill predict time: 0.0069844722747802734 s; prefill post time: 0.014178276062011719 s; decode prepare time: 0.0011053747860186253 s; decode predict time: 0.004866984778759526 s; decode post time: 0.012763525408541386 s 2025-05-21 15:35:27,483 - mindformers./output/log[mindformers/tools/debug_info.py:93] - INFO - prefill prepare time: 0.0017230510711669922 s; prefill predict time: 0.006987333297729492 s; prefill post time: 0.014376401901245117 s; decode prepare time: 0.0010716803153200392 s; decode predict time: 0.004552025420992982 s; decode post time: 0.013116146020460035 s 2025-05-21 15:35:27,483 - mindformers./output/log[mindformers/modules/block_tables.py:126] - INFO - Clear block table cache engines. 2025-05-21 15:35:27,483 - mindformers./output/log[mindformers/modules/block_tables.py:126] - INFO - Clear block table cache engines. 2025-05-21 15:35:27,483 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:341] - INFO - infer without vllm end, not use vllm model 2025-05-21 15:35:27,483 - mindformers./output/log[mindformers/modules/block_tables.py:126] - INFO - Clear block table cache engines. 2025-05-21 15:35:27,484 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:341] - INFO - infer without vllm end, not use vllm model 2025-05-21 15:35:27,484 - mindformers./output/log[mindformers/modules/block_tables.py:126] - INFO - Clear block table cache engines. 2025-05-21 15:35:27,484 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:355] - INFO - Generating elapsed time: 9.628633975982666 2025-05-21 15:35:27,484 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:341] - INFO - infer without vllm end, not use vllm model 2025-05-21 15:35:27,484 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:355] - INFO - Generating elapsed time: 9.628497123718262 2025-05-21 15:35:27,484 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:355] - INFO - Generating elapsed time: 9.628957986831665 2025-05-21 15:35:27,484 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:341] - INFO - infer without vllm end, not use vllm model 2025-05-21 15:35:27,484 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:355] - INFO - Generating elapsed time: 9.62917709350586 2025-05-21 15:35:27,485 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:313] - INFO - input_ids shape (1, 2048) 2025-05-21 15:35:27,485 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:313] - INFO - input_ids shape (1, 2048) 2025-05-21 15:35:27,485 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:313] - INFO - input_ids shape (1, 2048) 2025-05-21 15:35:27,485 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:313] - INFO - input_ids shape (1, 2048) 2025-05-21 15:35:27,485 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:322] - INFO - max_tokens 512 2025-05-21 15:35:27,485 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:322] - INFO - max_tokens 512 2025-05-21 15:35:27,486 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:323] - INFO - min_tokens 2 2025-05-21 15:35:27,486 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:323] - INFO - min_tokens 2 2025-05-21 15:35:27,486 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:322] - INFO - max_tokens 512 2025-05-21 15:35:27,486 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:334] - INFO - infer without vllm, not use vllm model 2025-05-21 15:35:27,486 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:322] - INFO - max_tokens 512 2025-05-21 15:35:27,486 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:334] - INFO - infer without vllm, not use vllm model 2025-05-21 15:35:27,486 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:323] - INFO - min_tokens 2 2025-05-21 15:35:27,486 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:323] - INFO - min_tokens 2 2025-05-21 15:35:27,486 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:334] - INFO - infer without vllm, not use vllm model 2025-05-21 15:35:27,486 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:334] - INFO - infer without vllm, not use vllm model 2025-05-21 15:35:27,488 - mindformers./output/log[mindformers/generation/text_generator.py:801] - INFO - Generation Config is: {'max_length': 512, 'max_new_tokens': 512, 'min_length': 0, 'min_new_tokens': 2, 'num_beams': 1, 'do_sample': True, 'use_past': True, 'temperature': 1.2, 'top_k': 50, 'top_p': 1.0, 'repetition_penalty': 1.0, 'encoder_repetition_penalty': 1.0, 'renormalize_logits': False, 'return_dict_in_generate': False, 'output_scores': False, 'output_logits': False, 'pad_token_id': 151643, 'bos_token_id': 151643, 'eos_token_id': [151645, 151643], 'parallel_decoding': False, 'window_size': 5, 'level': 5, 'guess_set_size': 3, '_from_model_config': True} 2025-05-21 15:35:27,488 - mindformers./output/log[mindformers/generation/text_generator.py:801] - INFO - Generation Config is: {'max_length': 512, 'max_new_tokens': 512, 'min_length': 0, 'min_new_tokens': 2, 'num_beams': 1, 'do_sample': True, 'use_past': True, 'temperature': 1.2, 'top_k': 50, 'top_p': 1.0, 'repetition_penalty': 1.0, 'encoder_repetition_penalty': 1.0, 'renormalize_logits': False, 'return_dict_in_generate': False, 'output_scores': False, 'output_logits': False, 'pad_token_id': 151643, 'bos_token_id': 151643, 'eos_token_id': [151645, 151643], 'parallel_decoding': False, 'window_size': 5, 'level': 5, 'guess_set_size': 3, '_from_model_config': True} 2025-05-21 15:35:27,488 - mindformers./output/log[mindformers/generation/text_generator.py:801] - INFO - Generation Config is: {'max_length': 512, 'max_new_tokens': 512, 'min_length': 0, 'min_new_tokens': 2, 'num_beams': 1, 'do_sample': True, 'use_past': True, 'temperature': 1.2, 'top_k': 50, 'top_p': 1.0, 'repetition_penalty': 1.0, 'encoder_repetition_penalty': 1.0, 'renormalize_logits': False, 'return_dict_in_generate': False, 'output_scores': False, 'output_logits': False, 'pad_token_id': 151643, 'bos_token_id': 151643, 'eos_token_id': [151645, 151643], 'parallel_decoding': False, 'window_size': 5, 'level': 5, 'guess_set_size': 3, '_from_model_config': True} 2025-05-21 15:35:27,488 - mindformers./output/log[mindformers/generation/text_generator.py:801] - INFO - Generation Config is: {'max_length': 512, 'max_new_tokens': 512, 'min_length': 0, 'min_new_tokens': 2, 'num_beams': 1, 'do_sample': True, 'use_past': True, 'temperature': 1.2, 'top_k': 50, 'top_p': 1.0, 'repetition_penalty': 1.0, 'encoder_repetition_penalty': 1.0, 'renormalize_logits': False, 'return_dict_in_generate': False, 'output_scores': False, 'output_logits': False, 'pad_token_id': 151643, 'bos_token_id': 151643, 'eos_token_id': [151645, 151643], 'parallel_decoding': False, 'window_size': 5, 'level': 5, 'guess_set_size': 3, '_from_model_config': True} 2025-05-21 15:35:27,489 - mindformers./output/log[mindformers/generation/text_generator.py:859] - INFO - The generation mode will be **SAMPLE**. 2025-05-21 15:35:27,489 - mindformers./output/log[mindformers/generation/text_generator.py:859] - INFO - The generation mode will be **SAMPLE**. 2025-05-21 15:35:27,489 - mindformers./output/log[mindformers/modules/block_tables.py:63] - INFO - init cache engine success. 2025-05-21 15:35:27,489 - mindformers./output/log[mindformers/modules/block_tables.py:63] - INFO - init cache engine success. 2025-05-21 15:35:27,489 - mindformers./output/log[mindformers/generation/text_generator.py:859] - INFO - The generation mode will be **SAMPLE**. 2025-05-21 15:35:27,489 - mindformers./output/log[mindformers/generation/text_generator.py:859] - INFO - The generation mode will be **SAMPLE**. 2025-05-21 15:35:27,489 - mindformers./output/log[mindformers/modules/block_tables.py:63] - INFO - init cache engine success. 2025-05-21 15:35:27,489 - mindformers./output/log[mindformers/modules/block_tables.py:63] - INFO - init cache engine success. 2025-05-21 15:35:27,490 - mindformers./output/log[mindformers/research/qwen2_5/infer/qwen2_5.py:195] - INFO - Set dynamic input for llama. 2025-05-21 15:35:27,490 - mindformers./output/log[mindformers/research/qwen2_5/infer/qwen2_5.py:195] - INFO - Set dynamic input for llama. 2025-05-21 15:35:27,490 - mindformers./output/log[mindformers/research/qwen2_5/infer/qwen2_5.py:195] - INFO - Set dynamic input for llama. 2025-05-21 15:35:27,490 - mindformers./output/log[mindformers/research/qwen2_5/infer/qwen2_5.py:195] - INFO - Set dynamic input for llama. 2025-05-21 15:35:27,619 - mindformers./output/log[mindformers/generation/text_generator.py:1067] - INFO - total time: 9.52396035194397 s; generated tokens: 512 tokens; generate speed: 53.75914861882996 tokens/s 2025-05-21 15:35:27,619 - mindformers./output/log[mindformers/generation/text_generator.py:1067] - INFO - total time: 9.523478031158447 s; generated tokens: 512 tokens; generate speed: 53.76187127484975 tokens/s 2025-05-21 15:35:27,619 - mindformers./output/log[mindformers/tools/debug_info.py:93] - INFO - prefill prepare time: 0.0013332366943359375 s; prefill predict time: 0.0070421695709228516 s; prefill post time: 0.013811349868774414 s; decode prepare time: 0.0009868644221654843 s; decode predict time: 0.005068620513467227 s; decode post time: 0.01249431210721309 s 2025-05-21 15:35:27,619 - mindformers./output/log[mindformers/generation/text_generator.py:1067] - INFO - total time: 9.523378372192383 s; generated tokens: 512 tokens; generate speed: 53.76243387483219 tokens/s 2025-05-21 15:35:27,620 - mindformers./output/log[mindformers/generation/text_generator.py:1067] - INFO - total time: 9.524134635925293 s; generated tokens: 512 tokens; generate speed: 53.758164869774326 tokens/s 2025-05-21 15:35:27,620 - mindformers./output/log[mindformers/tools/debug_info.py:93] - INFO - prefill prepare time: 0.0015294551849365234 s; prefill predict time: 0.006070137023925781 s; prefill post time: 0.013446807861328125 s; decode prepare time: 0.0010025030012933242 s; decode predict time: 0.004757928380779192 s; decode post time: 0.01279041538499806 s 2025-05-21 15:35:27,620 - mindformers./output/log[mindformers/tools/debug_info.py:93] - INFO - prefill prepare time: 0.0015330314636230469 s; prefill predict time: 0.005724668502807617 s; prefill post time: 0.014256954193115234 s; decode prepare time: 0.0010509817567590164 s; decode predict time: 0.00439603188458611 s; decode post time: 0.013102125514976666 s 2025-05-21 15:35:27,620 - mindformers./output/log[mindformers/tools/debug_info.py:93] - INFO - prefill prepare time: 0.0015025138854980469 s; prefill predict time: 0.0060346126556396484 s; prefill post time: 0.014345407485961914 s; decode prepare time: 0.0010284407031512773 s; decode predict time: 0.004439543742759555 s; decode post time: 0.013085204794448883 s 2025-05-21 15:35:27,620 - mindformers./output/log[mindformers/modules/block_tables.py:126] - INFO - Clear block table cache engines. 2025-05-21 15:35:27,621 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:341] - INFO - infer without vllm end, not use vllm model 2025-05-21 15:35:27,621 - mindformers./output/log[mindformers/modules/block_tables.py:126] - INFO - Clear block table cache engines. 2025-05-21 15:35:27,621 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:355] - INFO - Generating elapsed time: 9.530639410018921 2025-05-21 15:35:27,621 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:341] - INFO - infer without vllm end, not use vllm model 2025-05-21 15:35:27,621 - mindformers./output/log[mindformers/modules/block_tables.py:126] - INFO - Clear block table cache engines. 2025-05-21 15:35:27,621 - mindformers./output/log[mindformers/modules/block_tables.py:126] - INFO - Clear block table cache engines. 2025-05-21 15:35:27,621 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:355] - INFO - Generating elapsed time: 9.530492544174194 2025-05-21 15:35:27,622 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:341] - INFO - infer without vllm end, not use vllm model 2025-05-21 15:35:27,622 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:341] - INFO - infer without vllm end, not use vllm model 2025-05-21 15:35:27,622 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:355] - INFO - Generating elapsed time: 9.530337810516357 2025-05-21 15:35:27,622 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:355] - INFO - Generating elapsed time: 9.53091835975647 2025-05-21 15:35:27,622 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:313] - INFO - input_ids shape (1, 2048) 2025-05-21 15:35:27,622 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:322] - INFO - max_tokens 512 2025-05-21 15:35:27,622 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:313] - INFO - input_ids shape (1, 2048) 2025-05-21 15:35:27,623 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:323] - INFO - min_tokens 2 2025-05-21 15:35:27,623 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:313] - INFO - input_ids shape (1, 2048) 2025-05-21 15:35:27,623 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:334] - INFO - infer without vllm, not use vllm model 2025-05-21 15:35:27,623 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:322] - INFO - max_tokens 512 2025-05-21 15:35:27,623 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:313] - INFO - input_ids shape (1, 2048) 2025-05-21 15:35:27,623 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:323] - INFO - min_tokens 2 2025-05-21 15:35:27,623 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:322] - INFO - max_tokens 512 2025-05-21 15:35:27,623 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:334] - INFO - infer without vllm, not use vllm model 2025-05-21 15:35:27,623 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:322] - INFO - max_tokens 512 2025-05-21 15:35:27,623 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:323] - INFO - min_tokens 2 2025-05-21 15:35:27,624 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:323] - INFO - min_tokens 2 2025-05-21 15:35:27,624 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:334] - INFO - infer without vllm, not use vllm model 2025-05-21 15:35:27,624 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:334] - INFO - infer without vllm, not use vllm model 2025-05-21 15:35:27,625 - mindformers./output/log[mindformers/generation/text_generator.py:801] - INFO - Generation Config is: {'max_length': 512, 'max_new_tokens': 512, 'min_length': 0, 'min_new_tokens': 2, 'num_beams': 1, 'do_sample': True, 'use_past': True, 'temperature': 1.2, 'top_k': 50, 'top_p': 1.0, 'repetition_penalty': 1.0, 'encoder_repetition_penalty': 1.0, 'renormalize_logits': False, 'return_dict_in_generate': False, 'output_scores': False, 'output_logits': False, 'pad_token_id': 151643, 'bos_token_id': 151643, 'eos_token_id': [151645, 151643], 'parallel_decoding': False, 'window_size': 5, 'level': 5, 'guess_set_size': 3, '_from_model_config': True} 2025-05-21 15:35:27,626 - mindformers./output/log[mindformers/generation/text_generator.py:859] - INFO - The generation mode will be **SAMPLE**. 2025-05-21 15:35:27,626 - mindformers./output/log[mindformers/generation/text_generator.py:801] - INFO - Generation Config is: {'max_length': 512, 'max_new_tokens': 512, 'min_length': 0, 'min_new_tokens': 2, 'num_beams': 1, 'do_sample': True, 'use_past': True, 'temperature': 1.2, 'top_k': 50, 'top_p': 1.0, 'repetition_penalty': 1.0, 'encoder_repetition_penalty': 1.0, 'renormalize_logits': False, 'return_dict_in_generate': False, 'output_scores': False, 'output_logits': False, 'pad_token_id': 151643, 'bos_token_id': 151643, 'eos_token_id': [151645, 151643], 'parallel_decoding': False, 'window_size': 5, 'level': 5, 'guess_set_size': 3, '_from_model_config': True} 2025-05-21 15:35:27,626 - mindformers./output/log[mindformers/modules/block_tables.py:63] - INFO - init cache engine success. 2025-05-21 15:35:27,626 - mindformers./output/log[mindformers/generation/text_generator.py:801] - INFO - Generation Config is: {'max_length': 512, 'max_new_tokens': 512, 'min_length': 0, 'min_new_tokens': 2, 'num_beams': 1, 'do_sample': True, 'use_past': True, 'temperature': 1.2, 'top_k': 50, 'top_p': 1.0, 'repetition_penalty': 1.0, 'encoder_repetition_penalty': 1.0, 'renormalize_logits': False, 'return_dict_in_generate': False, 'output_scores': False, 'output_logits': False, 'pad_token_id': 151643, 'bos_token_id': 151643, 'eos_token_id': [151645, 151643], 'parallel_decoding': False, 'window_size': 5, 'level': 5, 'guess_set_size': 3, '_from_model_config': True} 2025-05-21 15:35:27,626 - mindformers./output/log[mindformers/generation/text_generator.py:801] - INFO - Generation Config is: {'max_length': 512, 'max_new_tokens': 512, 'min_length': 0, 'min_new_tokens': 2, 'num_beams': 1, 'do_sample': True, 'use_past': True, 'temperature': 1.2, 'top_k': 50, 'top_p': 1.0, 'repetition_penalty': 1.0, 'encoder_repetition_penalty': 1.0, 'renormalize_logits': False, 'return_dict_in_generate': False, 'output_scores': False, 'output_logits': False, 'pad_token_id': 151643, 'bos_token_id': 151643, 'eos_token_id': [151645, 151643], 'parallel_decoding': False, 'window_size': 5, 'level': 5, 'guess_set_size': 3, '_from_model_config': True} 2025-05-21 15:35:27,626 - mindformers./output/log[mindformers/generation/text_generator.py:859] - INFO - The generation mode will be **SAMPLE**. 2025-05-21 15:35:27,627 - mindformers./output/log[mindformers/generation/text_generator.py:859] - INFO - The generation mode will be **SAMPLE**. 2025-05-21 15:35:27,627 - mindformers./output/log[mindformers/research/qwen2_5/infer/qwen2_5.py:195] - INFO - Set dynamic input for llama. 2025-05-21 15:35:27,627 - mindformers./output/log[mindformers/modules/block_tables.py:63] - INFO - init cache engine success. 2025-05-21 15:35:27,627 - mindformers./output/log[mindformers/modules/block_tables.py:63] - INFO - init cache engine success. 2025-05-21 15:35:27,627 - mindformers./output/log[mindformers/generation/text_generator.py:859] - INFO - The generation mode will be **SAMPLE**. 2025-05-21 15:35:27,627 - mindformers./output/log[mindformers/modules/block_tables.py:63] - INFO - init cache engine success. 2025-05-21 15:35:27,628 - mindformers./output/log[mindformers/research/qwen2_5/infer/qwen2_5.py:195] - INFO - Set dynamic input for llama. 2025-05-21 15:35:27,628 - mindformers./output/log[mindformers/research/qwen2_5/infer/qwen2_5.py:195] - INFO - Set dynamic input for llama. 2025-05-21 15:35:27,628 - mindformers./output/log[mindformers/research/qwen2_5/infer/qwen2_5.py:195] - INFO - Set dynamic input for llama. 2025-05-21 15:35:37,000 - mindformers./output/log[mindformers/generation/text_generator.py:1067] - INFO - total time: 9.510075807571411 s; generated tokens: 512 tokens; generate speed: 53.83763603570574 tokens/s 2025-05-21 15:35:37,000 - mindformers./output/log[mindformers/generation/text_generator.py:1067] - INFO - total time: 9.509766340255737 s; generated tokens: 512 tokens; generate speed: 53.83938802288503 tokens/s 2025-05-21 15:35:37,000 - mindformers./output/log[mindformers/generation/text_generator.py:1067] - INFO - total time: 9.510316371917725 s; generated tokens: 512 tokens; generate speed: 53.836274207643086 tokens/s 2025-05-21 15:35:37,000 - mindformers./output/log[mindformers/generation/text_generator.py:1067] - INFO - total time: 9.509909391403198 s; generated tokens: 512 tokens; generate speed: 53.83857815331444 tokens/s 2025-05-21 15:35:37,001 - mindformers./output/log[mindformers/tools/debug_info.py:93] - INFO - prefill prepare time: 0.0013508796691894531 s; prefill predict time: 0.007538795471191406 s; prefill post time: 0.013661861419677734 s; decode prepare time: 0.0010265571497191188 s; decode predict time: 0.004759782903334673 s; decode post time: 0.01273252259495207 s 2025-05-21 15:35:37,001 - mindformers./output/log[mindformers/tools/debug_info.py:93] - INFO - prefill prepare time: 0.0014927387237548828 s; prefill predict time: 0.007066011428833008 s; prefill post time: 0.013717889785766602 s; decode prepare time: 0.0010847392147534515 s; decode predict time: 0.0046738021513995 s; decode post time: 0.012758576473377923 s 2025-05-21 15:35:37,001 - mindformers./output/log[mindformers/tools/debug_info.py:93] - INFO - prefill prepare time: 0.0013616085052490234 s; prefill predict time: 0.007515668869018555 s; prefill post time: 0.01430201530456543 s; decode prepare time: 0.0010634966325853202 s; decode predict time: 0.004251768074783624 s; decode post time: 0.01320386866067487 s 2025-05-21 15:35:37,001 - mindformers./output/log[mindformers/tools/debug_info.py:93] - INFO - prefill prepare time: 0.0013606548309326172 s; prefill predict time: 0.00705265998840332 s; prefill post time: 0.01422119140625 s; decode prepare time: 0.0010552532285626854 s; decode predict time: 0.004373967881296195 s; decode post time: 0.013089821529948314 s 2025-05-21 15:35:37,002 - mindformers./output/log[mindformers/modules/block_tables.py:126] - INFO - Clear block table cache engines. 2025-05-21 15:35:37,002 - mindformers./output/log[mindformers/modules/block_tables.py:126] - INFO - Clear block table cache engines. 2025-05-21 15:35:37,002 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:341] - INFO - infer without vllm end, not use vllm model 2025-05-21 15:35:37,002 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:341] - INFO - infer without vllm end, not use vllm model 2025-05-21 15:35:37,002 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:355] - INFO - Generating elapsed time: 9.51667308807373 2025-05-21 15:35:37,002 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:355] - INFO - Generating elapsed time: 9.516379117965698 2025-05-21 15:35:37,002 - mindformers./output/log[mindformers/modules/block_tables.py:126] - INFO - Clear block table cache engines. 2025-05-21 15:35:37,002 - mindformers./output/log[mindformers/modules/block_tables.py:126] - INFO - Clear block table cache engines. 2025-05-21 15:35:37,002 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:341] - INFO - infer without vllm end, not use vllm model 2025-05-21 15:35:37,002 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:341] - INFO - infer without vllm end, not use vllm model 2025-05-21 15:35:37,002 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:355] - INFO - Generating elapsed time: 9.517055988311768 2025-05-21 15:35:37,002 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:355] - INFO - Generating elapsed time: 9.516766548156738 2025-05-21 15:35:37,003 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:313] - INFO - input_ids shape (1, 2048) 2025-05-21 15:35:37,003 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:313] - INFO - input_ids shape (1, 2048) 2025-05-21 15:35:37,003 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:322] - INFO - max_tokens 512 2025-05-21 15:35:37,003 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:322] - INFO - max_tokens 512 2025-05-21 15:35:37,003 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:313] - INFO - input_ids shape (1, 2048) 2025-05-21 15:35:37,004 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:313] - INFO - input_ids shape (1, 2048) 2025-05-21 15:35:37,004 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:323] - INFO - min_tokens 2 2025-05-21 15:35:37,004 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:323] - INFO - min_tokens 2 2025-05-21 15:35:37,004 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:334] - INFO - infer without vllm, not use vllm model 2025-05-21 15:35:37,004 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:334] - INFO - infer without vllm, not use vllm model 2025-05-21 15:35:37,004 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:322] - INFO - max_tokens 512 2025-05-21 15:35:37,004 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:322] - INFO - max_tokens 512 2025-05-21 15:35:37,004 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:323] - INFO - min_tokens 2 2025-05-21 15:35:37,004 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:323] - INFO - min_tokens 2 2025-05-21 15:35:37,004 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:334] - INFO - infer without vllm, not use vllm model 2025-05-21 15:35:37,004 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:334] - INFO - infer without vllm, not use vllm model 2025-05-21 15:35:37,006 - mindformers./output/log[mindformers/generation/text_generator.py:801] - INFO - Generation Config is: {'max_length': 512, 'max_new_tokens': 512, 'min_length': 0, 'min_new_tokens': 2, 'num_beams': 1, 'do_sample': True, 'use_past': True, 'temperature': 1.2, 'top_k': 50, 'top_p': 1.0, 'repetition_penalty': 1.0, 'encoder_repetition_penalty': 1.0, 'renormalize_logits': False, 'return_dict_in_generate': False, 'output_scores': False, 'output_logits': False, 'pad_token_id': 151643, 'bos_token_id': 151643, 'eos_token_id': [151645, 151643], 'parallel_decoding': False, 'window_size': 5, 'level': 5, 'guess_set_size': 3, '_from_model_config': True} 2025-05-21 15:35:37,006 - mindformers./output/log[mindformers/generation/text_generator.py:801] - INFO - Generation Config is: {'max_length': 512, 'max_new_tokens': 512, 'min_length': 0, 'min_new_tokens': 2, 'num_beams': 1, 'do_sample': True, 'use_past': True, 'temperature': 1.2, 'top_k': 50, 'top_p': 1.0, 'repetition_penalty': 1.0, 'encoder_repetition_penalty': 1.0, 'renormalize_logits': False, 'return_dict_in_generate': False, 'output_scores': False, 'output_logits': False, 'pad_token_id': 151643, 'bos_token_id': 151643, 'eos_token_id': [151645, 151643], 'parallel_decoding': False, 'window_size': 5, 'level': 5, 'guess_set_size': 3, '_from_model_config': True} 2025-05-21 15:35:37,007 - mindformers./output/log[mindformers/generation/text_generator.py:859] - INFO - The generation mode will be **SAMPLE**. 2025-05-21 15:35:37,007 - mindformers./output/log[mindformers/generation/text_generator.py:859] - INFO - The generation mode will be **SAMPLE**. 2025-05-21 15:35:37,007 - mindformers./output/log[mindformers/generation/text_generator.py:801] - INFO - Generation Config is: {'max_length': 512, 'max_new_tokens': 512, 'min_length': 0, 'min_new_tokens': 2, 'num_beams': 1, 'do_sample': True, 'use_past': True, 'temperature': 1.2, 'top_k': 50, 'top_p': 1.0, 'repetition_penalty': 1.0, 'encoder_repetition_penalty': 1.0, 'renormalize_logits': False, 'return_dict_in_generate': False, 'output_scores': False, 'output_logits': False, 'pad_token_id': 151643, 'bos_token_id': 151643, 'eos_token_id': [151645, 151643], 'parallel_decoding': False, 'window_size': 5, 'level': 5, 'guess_set_size': 3, '_from_model_config': True} 2025-05-21 15:35:37,007 - mindformers./output/log[mindformers/modules/block_tables.py:63] - INFO - init cache engine success. 2025-05-21 15:35:37,007 - mindformers./output/log[mindformers/modules/block_tables.py:63] - INFO - init cache engine success. 2025-05-21 15:35:37,007 - mindformers./output/log[mindformers/generation/text_generator.py:801] - INFO - Generation Config is: {'max_length': 512, 'max_new_tokens': 512, 'min_length': 0, 'min_new_tokens': 2, 'num_beams': 1, 'do_sample': True, 'use_past': True, 'temperature': 1.2, 'top_k': 50, 'top_p': 1.0, 'repetition_penalty': 1.0, 'encoder_repetition_penalty': 1.0, 'renormalize_logits': False, 'return_dict_in_generate': False, 'output_scores': False, 'output_logits': False, 'pad_token_id': 151643, 'bos_token_id': 151643, 'eos_token_id': [151645, 151643], 'parallel_decoding': False, 'window_size': 5, 'level': 5, 'guess_set_size': 3, '_from_model_config': True} 2025-05-21 15:35:37,007 - mindformers./output/log[mindformers/generation/text_generator.py:859] - INFO - The generation mode will be **SAMPLE**. 2025-05-21 15:35:37,008 - mindformers./output/log[mindformers/generation/text_generator.py:859] - INFO - The generation mode will be **SAMPLE**. 2025-05-21 15:35:37,008 - mindformers./output/log[mindformers/modules/block_tables.py:63] - INFO - init cache engine success. 2025-05-21 15:35:37,008 - mindformers./output/log[mindformers/research/qwen2_5/infer/qwen2_5.py:195] - INFO - Set dynamic input for llama. 2025-05-21 15:35:37,008 - mindformers./output/log[mindformers/research/qwen2_5/infer/qwen2_5.py:195] - INFO - Set dynamic input for llama. 2025-05-21 15:35:37,008 - mindformers./output/log[mindformers/modules/block_tables.py:63] - INFO - init cache engine success. 2025-05-21 15:35:37,008 - mindformers./output/log[mindformers/research/qwen2_5/infer/qwen2_5.py:195] - INFO - Set dynamic input for llama. 2025-05-21 15:35:37,009 - mindformers./output/log[mindformers/research/qwen2_5/infer/qwen2_5.py:195] - INFO - Set dynamic input for llama. 2025-05-21 15:35:37,141 - mindformers./output/log[mindformers/generation/text_generator.py:1067] - INFO - total time: 9.514123439788818 s; generated tokens: 512 tokens; generate speed: 53.814731671314604 tokens/s 2025-05-21 15:35:37,141 - mindformers./output/log[mindformers/generation/text_generator.py:1067] - INFO - total time: 9.51359224319458 s; generated tokens: 512 tokens; generate speed: 53.817736446109755 tokens/s 2025-05-21 15:35:37,142 - mindformers./output/log[mindformers/generation/text_generator.py:1067] - INFO - total time: 9.513895750045776 s; generated tokens: 512 tokens; generate speed: 53.81601958351672 tokens/s 2025-05-21 15:35:37,142 - mindformers./output/log[mindformers/generation/text_generator.py:1067] - INFO - total time: 9.513255834579468 s; generated tokens: 512 tokens; generate speed: 53.81963955378405 tokens/s 2025-05-21 15:35:37,142 - mindformers./output/log[mindformers/tools/debug_info.py:93] - INFO - prefill prepare time: 0.001352071762084961 s; prefill predict time: 0.007469892501831055 s; prefill post time: 0.013811826705932617 s; decode prepare time: 0.0009761976402566391 s; decode predict time: 0.005040232340494792 s; decode post time: 0.01251279211324255 s 2025-05-21 15:35:37,142 - mindformers./output/log[mindformers/tools/debug_info.py:93] - INFO - prefill prepare time: 0.0016148090362548828 s; prefill predict time: 0.006294727325439453 s; prefill post time: 0.014082908630371094 s; decode prepare time: 0.0010061884579593188 s; decode predict time: 0.004783294715133368 s; decode post time: 0.012740478123704048 s 2025-05-21 15:35:37,142 - mindformers./output/log[mindformers/tools/debug_info.py:93] - INFO - prefill prepare time: 0.0016255378723144531 s; prefill predict time: 0.006304025650024414 s; prefill post time: 0.014411449432373047 s; decode prepare time: 0.0010237269205589817 s; decode predict time: 0.0043605944689582374 s; decode post time: 0.01314735879170218 s 2025-05-21 15:35:37,142 - mindformers./output/log[mindformers/tools/debug_info.py:93] - INFO - prefill prepare time: 0.0014204978942871094 s; prefill predict time: 0.005692243576049805 s; prefill post time: 0.014266014099121094 s; decode prepare time: 0.0010443298317448267 s; decode predict time: 0.004414438733867571 s; decode post time: 0.01306948456512272 s 2025-05-21 15:35:37,143 - mindformers./output/log[mindformers/modules/block_tables.py:126] - INFO - Clear block table cache engines. 2025-05-21 15:35:37,143 - mindformers./output/log[mindformers/modules/block_tables.py:126] - INFO - Clear block table cache engines. 2025-05-21 15:35:37,143 - mindformers./output/log[mindformers/modules/block_tables.py:126] - INFO - Clear block table cache engines. 2025-05-21 15:35:37,143 - mindformers./output/log[mindformers/modules/block_tables.py:126] - INFO - Clear block table cache engines. 2025-05-21 15:35:37,143 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:341] - INFO - infer without vllm end, not use vllm model 2025-05-21 15:35:37,143 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:341] - INFO - infer without vllm end, not use vllm model 2025-05-21 15:35:37,143 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:341] - INFO - infer without vllm end, not use vllm model 2025-05-21 15:35:37,143 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:355] - INFO - Generating elapsed time: 9.521097660064697 2025-05-21 15:35:37,144 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:355] - INFO - Generating elapsed time: 9.520623683929443 2025-05-21 15:35:37,144 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:341] - INFO - infer without vllm end, not use vllm model 2025-05-21 15:35:37,144 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:355] - INFO - Generating elapsed time: 9.520280599594116 2025-05-21 15:35:37,144 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:355] - INFO - Generating elapsed time: 9.520488262176514 2025-05-21 15:35:37,145 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:313] - INFO - input_ids shape (1, 2048) 2025-05-21 15:35:37,145 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:313] - INFO - input_ids shape (1, 2048) 2025-05-21 15:35:37,145 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:313] - INFO - input_ids shape (1, 2048) 2025-05-21 15:35:37,145 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:313] - INFO - input_ids shape (1, 2048) 2025-05-21 15:35:37,145 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:322] - INFO - max_tokens 512 2025-05-21 15:35:37,145 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:322] - INFO - max_tokens 512 2025-05-21 15:35:37,145 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:322] - INFO - max_tokens 512 2025-05-21 15:35:37,145 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:322] - INFO - max_tokens 512 2025-05-21 15:35:37,145 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:323] - INFO - min_tokens 2 2025-05-21 15:35:37,145 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:323] - INFO - min_tokens 2 2025-05-21 15:35:37,145 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:323] - INFO - min_tokens 2 2025-05-21 15:35:37,146 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:334] - INFO - infer without vllm, not use vllm model 2025-05-21 15:35:37,146 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:334] - INFO - infer without vllm, not use vllm model 2025-05-21 15:35:37,146 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:323] - INFO - min_tokens 2 2025-05-21 15:35:37,146 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:334] - INFO - infer without vllm, not use vllm model 2025-05-21 15:35:37,146 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:334] - INFO - infer without vllm, not use vllm model 2025-05-21 15:35:37,148 - mindformers./output/log[mindformers/generation/text_generator.py:801] - INFO - Generation Config is: {'max_length': 512, 'max_new_tokens': 512, 'min_length': 0, 'min_new_tokens': 2, 'num_beams': 1, 'do_sample': True, 'use_past': True, 'temperature': 1.2, 'top_k': 50, 'top_p': 1.0, 'repetition_penalty': 1.0, 'encoder_repetition_penalty': 1.0, 'renormalize_logits': False, 'return_dict_in_generate': False, 'output_scores': False, 'output_logits': False, 'pad_token_id': 151643, 'bos_token_id': 151643, 'eos_token_id': [151645, 151643], 'parallel_decoding': False, 'window_size': 5, 'level': 5, 'guess_set_size': 3, '_from_model_config': True} 2025-05-21 15:35:37,148 - mindformers./output/log[mindformers/generation/text_generator.py:801] - INFO - Generation Config is: {'max_length': 512, 'max_new_tokens': 512, 'min_length': 0, 'min_new_tokens': 2, 'num_beams': 1, 'do_sample': True, 'use_past': True, 'temperature': 1.2, 'top_k': 50, 'top_p': 1.0, 'repetition_penalty': 1.0, 'encoder_repetition_penalty': 1.0, 'renormalize_logits': False, 'return_dict_in_generate': False, 'output_scores': False, 'output_logits': False, 'pad_token_id': 151643, 'bos_token_id': 151643, 'eos_token_id': [151645, 151643], 'parallel_decoding': False, 'window_size': 5, 'level': 5, 'guess_set_size': 3, '_from_model_config': True} 2025-05-21 15:35:37,148 - mindformers./output/log[mindformers/generation/text_generator.py:801] - INFO - Generation Config is: {'max_length': 512, 'max_new_tokens': 512, 'min_length': 0, 'min_new_tokens': 2, 'num_beams': 1, 'do_sample': True, 'use_past': True, 'temperature': 1.2, 'top_k': 50, 'top_p': 1.0, 'repetition_penalty': 1.0, 'encoder_repetition_penalty': 1.0, 'renormalize_logits': False, 'return_dict_in_generate': False, 'output_scores': False, 'output_logits': False, 'pad_token_id': 151643, 'bos_token_id': 151643, 'eos_token_id': [151645, 151643], 'parallel_decoding': False, 'window_size': 5, 'level': 5, 'guess_set_size': 3, '_from_model_config': True} 2025-05-21 15:35:37,148 - mindformers./output/log[mindformers/generation/text_generator.py:801] - INFO - Generation Config is: {'max_length': 512, 'max_new_tokens': 512, 'min_length': 0, 'min_new_tokens': 2, 'num_beams': 1, 'do_sample': True, 'use_past': True, 'temperature': 1.2, 'top_k': 50, 'top_p': 1.0, 'repetition_penalty': 1.0, 'encoder_repetition_penalty': 1.0, 'renormalize_logits': False, 'return_dict_in_generate': False, 'output_scores': False, 'output_logits': False, 'pad_token_id': 151643, 'bos_token_id': 151643, 'eos_token_id': [151645, 151643], 'parallel_decoding': False, 'window_size': 5, 'level': 5, 'guess_set_size': 3, '_from_model_config': True} 2025-05-21 15:35:37,148 - mindformers./output/log[mindformers/generation/text_generator.py:859] - INFO - The generation mode will be **SAMPLE**. 2025-05-21 15:35:37,148 - mindformers./output/log[mindformers/generation/text_generator.py:859] - INFO - The generation mode will be **SAMPLE**. 2025-05-21 15:35:37,149 - mindformers./output/log[mindformers/generation/text_generator.py:859] - INFO - The generation mode will be **SAMPLE**. 2025-05-21 15:35:37,149 - mindformers./output/log[mindformers/generation/text_generator.py:859] - INFO - The generation mode will be **SAMPLE**. 2025-05-21 15:35:37,149 - mindformers./output/log[mindformers/modules/block_tables.py:63] - INFO - init cache engine success. 2025-05-21 15:35:37,149 - mindformers./output/log[mindformers/modules/block_tables.py:63] - INFO - init cache engine success. 2025-05-21 15:35:37,149 - mindformers./output/log[mindformers/modules/block_tables.py:63] - INFO - init cache engine success. 2025-05-21 15:35:37,149 - mindformers./output/log[mindformers/modules/block_tables.py:63] - INFO - init cache engine success. 2025-05-21 15:35:37,149 - mindformers./output/log[mindformers/research/qwen2_5/infer/qwen2_5.py:195] - INFO - Set dynamic input for llama. 2025-05-21 15:35:37,149 - mindformers./output/log[mindformers/research/qwen2_5/infer/qwen2_5.py:195] - INFO - Set dynamic input for llama. 2025-05-21 15:35:37,150 - mindformers./output/log[mindformers/research/qwen2_5/infer/qwen2_5.py:195] - INFO - Set dynamic input for llama. 2025-05-21 15:35:37,150 - mindformers./output/log[mindformers/research/qwen2_5/infer/qwen2_5.py:195] - INFO - Set dynamic input for llama. 2025-05-21 15:35:46,498 - mindformers./output/log[mindformers/generation/text_generator.py:1067] - INFO - total time: 9.489741086959839 s; generated tokens: 512 tokens; generate speed: 53.952999908875896 tokens/s 2025-05-21 15:35:46,498 - mindformers./output/log[mindformers/generation/text_generator.py:1067] - INFO - total time: 9.489887952804565 s; generated tokens: 512 tokens; generate speed: 53.95216493032329 tokens/s 2025-05-21 15:35:46,498 - mindformers./output/log[mindformers/generation/text_generator.py:1067] - INFO - total time: 9.489341020584106 s; generated tokens: 512 tokens; generate speed: 53.95527454323529 tokens/s 2025-05-21 15:35:46,499 - mindformers./output/log[mindformers/tools/debug_info.py:93] - INFO - prefill prepare time: 0.0014181137084960938 s; prefill predict time: 0.006796360015869141 s; prefill post time: 0.014142751693725586 s; decode prepare time: 0.0010206158147166152 s; decode predict time: 0.004676002614638385 s; decode post time: 0.01278255783648407 s 2025-05-21 15:35:46,498 - mindformers./output/log[mindformers/generation/text_generator.py:1067] - INFO - total time: 9.48956298828125 s; generated tokens: 512 tokens; generate speed: 53.95401249059346 tokens/s 2025-05-21 15:35:46,499 - mindformers./output/log[mindformers/tools/debug_info.py:93] - INFO - prefill prepare time: 0.0014998912811279297 s; prefill predict time: 0.006726980209350586 s; prefill post time: 0.014185667037963867 s; decode prepare time: 0.0010651543648975469 s; decode predict time: 0.004707787551131903 s; decode post time: 0.012703626823052035 s 2025-05-21 15:35:46,499 - mindformers./output/log[mindformers/tools/debug_info.py:93] - INFO - prefill prepare time: 0.0014231204986572266 s; prefill predict time: 0.006018638610839844 s; prefill post time: 0.014371395111083984 s; decode prepare time: 0.001052619426451084 s; decode predict time: 0.004225540628620223 s; decode post time: 0.013203009700588518 s 2025-05-21 15:35:46,499 - mindformers./output/log[mindformers/tools/debug_info.py:93] - INFO - prefill prepare time: 0.0013813972473144531 s; prefill predict time: 0.006172895431518555 s; prefill post time: 0.014203548431396484 s; decode prepare time: 0.001039714029390518 s; decode predict time: 0.004425844491696826 s; decode post time: 0.013015546910683469 s 2025-05-21 15:35:46,500 - mindformers./output/log[mindformers/modules/block_tables.py:126] - INFO - Clear block table cache engines. 2025-05-21 15:35:46,500 - mindformers./output/log[mindformers/modules/block_tables.py:126] - INFO - Clear block table cache engines. 2025-05-21 15:35:46,500 - mindformers./output/log[mindformers/modules/block_tables.py:126] - INFO - Clear block table cache engines. 2025-05-21 15:35:46,500 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:341] - INFO - infer without vllm end, not use vllm model 2025-05-21 15:35:46,500 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:341] - INFO - infer without vllm end, not use vllm model 2025-05-21 15:35:46,500 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:341] - INFO - infer without vllm end, not use vllm model 2025-05-21 15:35:46,500 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:355] - INFO - Generating elapsed time: 9.49659013748169 2025-05-21 15:35:46,500 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:355] - INFO - Generating elapsed time: 9.496123313903809 2025-05-21 15:35:46,500 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:355] - INFO - Generating elapsed time: 9.496701955795288 2025-05-21 15:35:46,500 - mindformers./output/log[mindformers/modules/block_tables.py:126] - INFO - Clear block table cache engines. 2025-05-21 15:35:46,500 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:341] - INFO - infer without vllm end, not use vllm model 2025-05-21 15:35:46,501 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:355] - INFO - Generating elapsed time: 9.496563911437988 2025-05-21 15:35:46,501 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:313] - INFO - input_ids shape (1, 2048) 2025-05-21 15:35:46,501 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:313] - INFO - input_ids shape (1, 2048) 2025-05-21 15:35:46,501 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:313] - INFO - input_ids shape (1, 2048) 2025-05-21 15:35:46,502 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:322] - INFO - max_tokens 512 2025-05-21 15:35:46,502 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:322] - INFO - max_tokens 512 2025-05-21 15:35:46,502 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:313] - INFO - input_ids shape (1, 2048) 2025-05-21 15:35:46,502 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:322] - INFO - max_tokens 512 2025-05-21 15:35:46,502 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:323] - INFO - min_tokens 2 2025-05-21 15:35:46,502 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:323] - INFO - min_tokens 2 2025-05-21 15:35:46,502 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:323] - INFO - min_tokens 2 2025-05-21 15:35:46,502 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:334] - INFO - infer without vllm, not use vllm model 2025-05-21 15:35:46,502 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:334] - INFO - infer without vllm, not use vllm model 2025-05-21 15:35:46,502 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:322] - INFO - max_tokens 512 2025-05-21 15:35:46,502 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:334] - INFO - infer without vllm, not use vllm model 2025-05-21 15:35:46,502 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:323] - INFO - min_tokens 2 2025-05-21 15:35:46,503 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:334] - INFO - infer without vllm, not use vllm model 2025-05-21 15:35:46,504 - mindformers./output/log[mindformers/generation/text_generator.py:801] - INFO - Generation Config is: {'max_length': 512, 'max_new_tokens': 512, 'min_length': 0, 'min_new_tokens': 2, 'num_beams': 1, 'do_sample': True, 'use_past': True, 'temperature': 1.2, 'top_k': 50, 'top_p': 1.0, 'repetition_penalty': 1.0, 'encoder_repetition_penalty': 1.0, 'renormalize_logits': False, 'return_dict_in_generate': False, 'output_scores': False, 'output_logits': False, 'pad_token_id': 151643, 'bos_token_id': 151643, 'eos_token_id': [151645, 151643], 'parallel_decoding': False, 'window_size': 5, 'level': 5, 'guess_set_size': 3, '_from_model_config': True} 2025-05-21 15:35:46,504 - mindformers./output/log[mindformers/generation/text_generator.py:801] - INFO - Generation Config is: {'max_length': 512, 'max_new_tokens': 512, 'min_length': 0, 'min_new_tokens': 2, 'num_beams': 1, 'do_sample': True, 'use_past': True, 'temperature': 1.2, 'top_k': 50, 'top_p': 1.0, 'repetition_penalty': 1.0, 'encoder_repetition_penalty': 1.0, 'renormalize_logits': False, 'return_dict_in_generate': False, 'output_scores': False, 'output_logits': False, 'pad_token_id': 151643, 'bos_token_id': 151643, 'eos_token_id': [151645, 151643], 'parallel_decoding': False, 'window_size': 5, 'level': 5, 'guess_set_size': 3, '_from_model_config': True} 2025-05-21 15:35:46,505 - mindformers./output/log[mindformers/generation/text_generator.py:801] - INFO - Generation Config is: {'max_length': 512, 'max_new_tokens': 512, 'min_length': 0, 'min_new_tokens': 2, 'num_beams': 1, 'do_sample': True, 'use_past': True, 'temperature': 1.2, 'top_k': 50, 'top_p': 1.0, 'repetition_penalty': 1.0, 'encoder_repetition_penalty': 1.0, 'renormalize_logits': False, 'return_dict_in_generate': False, 'output_scores': False, 'output_logits': False, 'pad_token_id': 151643, 'bos_token_id': 151643, 'eos_token_id': [151645, 151643], 'parallel_decoding': False, 'window_size': 5, 'level': 5, 'guess_set_size': 3, '_from_model_config': True} 2025-05-21 15:35:46,505 - mindformers./output/log[mindformers/generation/text_generator.py:801] - INFO - Generation Config is: {'max_length': 512, 'max_new_tokens': 512, 'min_length': 0, 'min_new_tokens': 2, 'num_beams': 1, 'do_sample': True, 'use_past': True, 'temperature': 1.2, 'top_k': 50, 'top_p': 1.0, 'repetition_penalty': 1.0, 'encoder_repetition_penalty': 1.0, 'renormalize_logits': False, 'return_dict_in_generate': False, 'output_scores': False, 'output_logits': False, 'pad_token_id': 151643, 'bos_token_id': 151643, 'eos_token_id': [151645, 151643], 'parallel_decoding': False, 'window_size': 5, 'level': 5, 'guess_set_size': 3, '_from_model_config': True} 2025-05-21 15:35:46,505 - mindformers./output/log[mindformers/generation/text_generator.py:859] - INFO - The generation mode will be **SAMPLE**. 2025-05-21 15:35:46,505 - mindformers./output/log[mindformers/generation/text_generator.py:859] - INFO - The generation mode will be **SAMPLE**. 2025-05-21 15:35:46,505 - mindformers./output/log[mindformers/modules/block_tables.py:63] - INFO - init cache engine success. 2025-05-21 15:35:46,505 - mindformers./output/log[mindformers/generation/text_generator.py:859] - INFO - The generation mode will be **SAMPLE**. 2025-05-21 15:35:46,506 - mindformers./output/log[mindformers/modules/block_tables.py:63] - INFO - init cache engine success. 2025-05-21 15:35:46,506 - mindformers./output/log[mindformers/generation/text_generator.py:859] - INFO - The generation mode will be **SAMPLE**. 2025-05-21 15:35:46,506 - mindformers./output/log[mindformers/modules/block_tables.py:63] - INFO - init cache engine success. 2025-05-21 15:35:46,506 - mindformers./output/log[mindformers/modules/block_tables.py:63] - INFO - init cache engine success. 2025-05-21 15:35:46,506 - mindformers./output/log[mindformers/research/qwen2_5/infer/qwen2_5.py:195] - INFO - Set dynamic input for llama. 2025-05-21 15:35:46,506 - mindformers./output/log[mindformers/research/qwen2_5/infer/qwen2_5.py:195] - INFO - Set dynamic input for llama. 2025-05-21 15:35:46,506 - mindformers./output/log[mindformers/research/qwen2_5/infer/qwen2_5.py:195] - INFO - Set dynamic input for llama. 2025-05-21 15:35:46,507 - mindformers./output/log[mindformers/research/qwen2_5/infer/qwen2_5.py:195] - INFO - Set dynamic input for llama. 2025-05-21 15:35:46,667 - mindformers./output/log[mindformers/generation/text_generator.py:1067] - INFO - total time: 9.516643285751343 s; generated tokens: 512 tokens; generate speed: 53.80048244180641 tokens/s 2025-05-21 15:35:46,667 - mindformers./output/log[mindformers/generation/text_generator.py:1067] - INFO - total time: 9.517185926437378 s; generated tokens: 512 tokens; generate speed: 53.797414903678344 tokens/s 2025-05-21 15:35:46,667 - mindformers./output/log[mindformers/generation/text_generator.py:1067] - INFO - total time: 9.517392873764038 s; generated tokens: 512 tokens; generate speed: 53.79624512626732 tokens/s 2025-05-21 15:35:46,667 - mindformers./output/log[mindformers/tools/debug_info.py:93] - INFO - prefill prepare time: 0.0012600421905517578 s; prefill predict time: 0.006799221038818359 s; prefill post time: 0.013695001602172852 s; decode prepare time: 0.0009730604063508095 s; decode predict time: 0.004967378635032505 s; decode post time: 0.01259329454306287 s 2025-05-21 15:35:46,668 - mindformers./output/log[mindformers/generation/text_generator.py:1067] - INFO - total time: 9.517983675003052 s; generated tokens: 512 tokens; generate speed: 53.79290588033456 tokens/s 2025-05-21 15:35:46,668 - mindformers./output/log[mindformers/tools/debug_info.py:93] - INFO - prefill prepare time: 0.0013837814331054688 s; prefill predict time: 0.006787776947021484 s; prefill post time: 0.01476144790649414 s; decode prepare time: 0.0010227013940680516 s; decode predict time: 0.0043815729664821255 s; decode post time: 0.013131834989424555 s 2025-05-21 15:35:46,668 - mindformers./output/log[mindformers/tools/debug_info.py:93] - INFO - prefill prepare time: 0.0014083385467529297 s; prefill predict time: 0.007452726364135742 s; prefill post time: 0.014293670654296875 s; decode prepare time: 0.0009908624824479136 s; decode predict time: 0.004813673449497597 s; decode post time: 0.012731378092457871 s 2025-05-21 15:35:46,668 - mindformers./output/log[mindformers/modules/block_tables.py:126] - INFO - Clear block table cache engines. 2025-05-21 15:35:46,668 - mindformers./output/log[mindformers/tools/debug_info.py:93] - INFO - prefill prepare time: 0.0015113353729248047 s; prefill predict time: 0.0068912506103515625 s; prefill post time: 0.014936208724975586 s; decode prepare time: 0.0010410736218357272 s; decode predict time: 0.00439451769286511 s; decode post time: 0.013098706937816045 s 2025-05-21 15:35:46,669 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:341] - INFO - infer without vllm end, not use vllm model 2025-05-21 15:35:46,669 - mindformers./output/log[mindformers/modules/block_tables.py:126] - INFO - Clear block table cache engines. 2025-05-21 15:35:46,669 - mindformers./output/log[mindformers/modules/block_tables.py:126] - INFO - Clear block table cache engines. 2025-05-21 15:35:46,669 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:355] - INFO - Generating elapsed time: 9.52373743057251 2025-05-21 15:35:46,669 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:341] - INFO - infer without vllm end, not use vllm model 2025-05-21 15:35:46,669 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:341] - INFO - infer without vllm end, not use vllm model 2025-05-21 15:35:46,669 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:355] - INFO - Generating elapsed time: 9.524069786071777 2025-05-21 15:35:46,669 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:355] - INFO - Generating elapsed time: 9.523810625076294 2025-05-21 15:35:46,669 - mindformers./output/log[mindformers/modules/block_tables.py:126] - INFO - Clear block table cache engines. 2025-05-21 15:35:46,670 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:341] - INFO - infer without vllm end, not use vllm model 2025-05-21 15:35:46,670 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:355] - INFO - Generating elapsed time: 9.52473497390747 2025-05-21 15:35:46,670 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:313] - INFO - input_ids shape (1, 2048) 2025-05-21 15:35:46,670 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:313] - INFO - input_ids shape (1, 2048) 2025-05-21 15:35:46,670 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:313] - INFO - input_ids shape (1, 2048) 2025-05-21 15:35:46,670 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:322] - INFO - max_tokens 512 2025-05-21 15:35:46,671 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:323] - INFO - min_tokens 2 2025-05-21 15:35:46,671 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:322] - INFO - max_tokens 512 2025-05-21 15:35:46,671 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:322] - INFO - max_tokens 512 2025-05-21 15:35:46,671 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:334] - INFO - infer without vllm, not use vllm model 2025-05-21 15:35:46,671 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:313] - INFO - input_ids shape (1, 2048) 2025-05-21 15:35:46,671 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:323] - INFO - min_tokens 2 2025-05-21 15:35:46,671 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:323] - INFO - min_tokens 2 2025-05-21 15:35:46,671 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:334] - INFO - infer without vllm, not use vllm model 2025-05-21 15:35:46,671 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:334] - INFO - infer without vllm, not use vllm model 2025-05-21 15:35:46,671 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:322] - INFO - max_tokens 512 2025-05-21 15:35:46,671 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:323] - INFO - min_tokens 2 2025-05-21 15:35:46,672 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:334] - INFO - infer without vllm, not use vllm model 2025-05-21 15:35:46,673 - mindformers./output/log[mindformers/generation/text_generator.py:801] - INFO - Generation Config is: {'max_length': 512, 'max_new_tokens': 512, 'min_length': 0, 'min_new_tokens': 2, 'num_beams': 1, 'do_sample': True, 'use_past': True, 'temperature': 1.2, 'top_k': 50, 'top_p': 1.0, 'repetition_penalty': 1.0, 'encoder_repetition_penalty': 1.0, 'renormalize_logits': False, 'return_dict_in_generate': False, 'output_scores': False, 'output_logits': False, 'pad_token_id': 151643, 'bos_token_id': 151643, 'eos_token_id': [151645, 151643], 'parallel_decoding': False, 'window_size': 5, 'level': 5, 'guess_set_size': 3, '_from_model_config': True} 2025-05-21 15:35:46,673 - mindformers./output/log[mindformers/generation/text_generator.py:801] - INFO - Generation Config is: {'max_length': 512, 'max_new_tokens': 512, 'min_length': 0, 'min_new_tokens': 2, 'num_beams': 1, 'do_sample': True, 'use_past': True, 'temperature': 1.2, 'top_k': 50, 'top_p': 1.0, 'repetition_penalty': 1.0, 'encoder_repetition_penalty': 1.0, 'renormalize_logits': False, 'return_dict_in_generate': False, 'output_scores': False, 'output_logits': False, 'pad_token_id': 151643, 'bos_token_id': 151643, 'eos_token_id': [151645, 151643], 'parallel_decoding': False, 'window_size': 5, 'level': 5, 'guess_set_size': 3, '_from_model_config': True} 2025-05-21 15:35:46,673 - mindformers./output/log[mindformers/generation/text_generator.py:801] - INFO - Generation Config is: {'max_length': 512, 'max_new_tokens': 512, 'min_length': 0, 'min_new_tokens': 2, 'num_beams': 1, 'do_sample': True, 'use_past': True, 'temperature': 1.2, 'top_k': 50, 'top_p': 1.0, 'repetition_penalty': 1.0, 'encoder_repetition_penalty': 1.0, 'renormalize_logits': False, 'return_dict_in_generate': False, 'output_scores': False, 'output_logits': False, 'pad_token_id': 151643, 'bos_token_id': 151643, 'eos_token_id': [151645, 151643], 'parallel_decoding': False, 'window_size': 5, 'level': 5, 'guess_set_size': 3, '_from_model_config': True} 2025-05-21 15:35:46,674 - mindformers./output/log[mindformers/generation/text_generator.py:859] - INFO - The generation mode will be **SAMPLE**. 2025-05-21 15:35:46,674 - mindformers./output/log[mindformers/generation/text_generator.py:801] - INFO - Generation Config is: {'max_length': 512, 'max_new_tokens': 512, 'min_length': 0, 'min_new_tokens': 2, 'num_beams': 1, 'do_sample': True, 'use_past': True, 'temperature': 1.2, 'top_k': 50, 'top_p': 1.0, 'repetition_penalty': 1.0, 'encoder_repetition_penalty': 1.0, 'renormalize_logits': False, 'return_dict_in_generate': False, 'output_scores': False, 'output_logits': False, 'pad_token_id': 151643, 'bos_token_id': 151643, 'eos_token_id': [151645, 151643], 'parallel_decoding': False, 'window_size': 5, 'level': 5, 'guess_set_size': 3, '_from_model_config': True} 2025-05-21 15:35:46,674 - mindformers./output/log[mindformers/modules/block_tables.py:63] - INFO - init cache engine success. 2025-05-21 15:35:46,674 - mindformers./output/log[mindformers/generation/text_generator.py:859] - INFO - The generation mode will be **SAMPLE**. 2025-05-21 15:35:46,674 - mindformers./output/log[mindformers/generation/text_generator.py:859] - INFO - The generation mode will be **SAMPLE**. 2025-05-21 15:35:46,674 - mindformers./output/log[mindformers/modules/block_tables.py:63] - INFO - init cache engine success. 2025-05-21 15:35:46,674 - mindformers./output/log[mindformers/modules/block_tables.py:63] - INFO - init cache engine success. 2025-05-21 15:35:46,674 - mindformers./output/log[mindformers/generation/text_generator.py:859] - INFO - The generation mode will be **SAMPLE**. 2025-05-21 15:35:46,675 - mindformers./output/log[mindformers/research/qwen2_5/infer/qwen2_5.py:195] - INFO - Set dynamic input for llama. 2025-05-21 15:35:46,675 - mindformers./output/log[mindformers/modules/block_tables.py:63] - INFO - init cache engine success. 2025-05-21 15:35:46,675 - mindformers./output/log[mindformers/research/qwen2_5/infer/qwen2_5.py:195] - INFO - Set dynamic input for llama. 2025-05-21 15:35:46,675 - mindformers./output/log[mindformers/research/qwen2_5/infer/qwen2_5.py:195] - INFO - Set dynamic input for llama. 2025-05-21 15:35:46,675 - mindformers./output/log[mindformers/research/qwen2_5/infer/qwen2_5.py:195] - INFO - Set dynamic input for llama. 2025-05-21 15:35:56,021 - mindformers./output/log[mindformers/generation/text_generator.py:1067] - INFO - total time: 9.514469861984253 s; generated tokens: 512 tokens; generate speed: 53.81277227496749 tokens/s 2025-05-21 15:35:56,022 - mindformers./output/log[mindformers/generation/text_generator.py:1067] - INFO - total time: 9.514953851699829 s; generated tokens: 512 tokens; generate speed: 53.81003502276915 tokens/s 2025-05-21 15:35:56,022 - mindformers./output/log[mindformers/generation/text_generator.py:1067] - INFO - total time: 9.514926671981812 s; generated tokens: 512 tokens; generate speed: 53.81018873300033 tokens/s 2025-05-21 15:35:56,022 - mindformers./output/log[mindformers/tools/debug_info.py:93] - INFO - prefill prepare time: 0.0014324188232421875 s; prefill predict time: 0.006108760833740234 s; prefill post time: 0.014226198196411133 s; decode prepare time: 0.0010565316373821341 s; decode predict time: 0.004781844101700129 s; decode post time: 0.012687431622857917 s 2025-05-21 15:35:56,022 - mindformers./output/log[mindformers/generation/text_generator.py:1067] - INFO - total time: 9.515567064285278 s; generated tokens: 512 tokens; generate speed: 53.806567337608975 tokens/s 2025-05-21 15:35:56,022 - mindformers./output/log[mindformers/tools/debug_info.py:93] - INFO - prefill prepare time: 0.0013415813446044922 s; prefill predict time: 0.006316423416137695 s; prefill post time: 0.014043092727661133 s; decode prepare time: 0.0010332394019498285 s; decode predict time: 0.0046779931760301775 s; decode post time: 0.01281894536400961 s 2025-05-21 15:35:56,022 - mindformers./output/log[mindformers/tools/debug_info.py:93] - INFO - prefill prepare time: 0.0015087127685546875 s; prefill predict time: 0.006077766418457031 s; prefill post time: 0.014251470565795898 s; decode prepare time: 0.001027774904105528 s; decode predict time: 0.004392778639699899 s; decode post time: 0.013109926608210674 s 2025-05-21 15:35:56,023 - mindformers./output/log[mindformers/tools/debug_info.py:93] - INFO - prefill prepare time: 0.0014286041259765625 s; prefill predict time: 0.006417989730834961 s; prefill post time: 0.014217376708984375 s; decode prepare time: 0.0010602236260639944 s; decode predict time: 0.00418402634414972 s; decode post time: 0.013286726348787371 s 2025-05-21 15:35:56,023 - mindformers./output/log[mindformers/modules/block_tables.py:126] - INFO - Clear block table cache engines. 2025-05-21 15:35:56,023 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:341] - INFO - infer without vllm end, not use vllm model 2025-05-21 15:35:56,023 - mindformers./output/log[mindformers/modules/block_tables.py:126] - INFO - Clear block table cache engines. 2025-05-21 15:35:56,023 - mindformers./output/log[mindformers/modules/block_tables.py:126] - INFO - Clear block table cache engines. 2025-05-21 15:35:56,023 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:355] - INFO - Generating elapsed time: 9.521604776382446 2025-05-21 15:35:56,023 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:341] - INFO - infer without vllm end, not use vllm model 2025-05-21 15:35:56,023 - mindformers./output/log[mindformers/modules/block_tables.py:126] - INFO - Clear block table cache engines. 2025-05-21 15:35:56,024 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:341] - INFO - infer without vllm end, not use vllm model 2025-05-21 15:35:56,024 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:355] - INFO - Generating elapsed time: 9.521968841552734 2025-05-21 15:35:56,024 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:341] - INFO - infer without vllm end, not use vllm model 2025-05-21 15:35:56,024 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:355] - INFO - Generating elapsed time: 9.52161979675293 2025-05-21 15:35:56,024 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:355] - INFO - Generating elapsed time: 9.52227520942688 2025-05-21 15:35:56,024 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:313] - INFO - input_ids shape (1, 2048) 2025-05-21 15:35:56,025 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:313] - INFO - input_ids shape (1, 2048) 2025-05-21 15:35:56,025 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:313] - INFO - input_ids shape (1, 2048) 2025-05-21 15:35:56,025 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:322] - INFO - max_tokens 512 2025-05-21 15:35:56,025 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:313] - INFO - input_ids shape (1, 2048) 2025-05-21 15:35:56,025 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:323] - INFO - min_tokens 2 2025-05-21 15:35:56,025 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:322] - INFO - max_tokens 512 2025-05-21 15:35:56,025 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:322] - INFO - max_tokens 512 2025-05-21 15:35:56,025 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:334] - INFO - infer without vllm, not use vllm model 2025-05-21 15:35:56,025 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:323] - INFO - min_tokens 2 2025-05-21 15:35:56,026 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:322] - INFO - max_tokens 512 2025-05-21 15:35:56,026 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:323] - INFO - min_tokens 2 2025-05-21 15:35:56,026 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:334] - INFO - infer without vllm, not use vllm model 2025-05-21 15:35:56,026 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:334] - INFO - infer without vllm, not use vllm model 2025-05-21 15:35:56,026 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:323] - INFO - min_tokens 2 2025-05-21 15:35:56,026 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:334] - INFO - infer without vllm, not use vllm model 2025-05-21 15:35:56,028 - mindformers./output/log[mindformers/generation/text_generator.py:801] - INFO - Generation Config is: {'max_length': 512, 'max_new_tokens': 512, 'min_length': 0, 'min_new_tokens': 2, 'num_beams': 1, 'do_sample': True, 'use_past': True, 'temperature': 1.2, 'top_k': 50, 'top_p': 1.0, 'repetition_penalty': 1.0, 'encoder_repetition_penalty': 1.0, 'renormalize_logits': False, 'return_dict_in_generate': False, 'output_scores': False, 'output_logits': False, 'pad_token_id': 151643, 'bos_token_id': 151643, 'eos_token_id': [151645, 151643], 'parallel_decoding': False, 'window_size': 5, 'level': 5, 'guess_set_size': 3, '_from_model_config': True} 2025-05-21 15:35:56,028 - mindformers./output/log[mindformers/generation/text_generator.py:801] - INFO - Generation Config is: {'max_length': 512, 'max_new_tokens': 512, 'min_length': 0, 'min_new_tokens': 2, 'num_beams': 1, 'do_sample': True, 'use_past': True, 'temperature': 1.2, 'top_k': 50, 'top_p': 1.0, 'repetition_penalty': 1.0, 'encoder_repetition_penalty': 1.0, 'renormalize_logits': False, 'return_dict_in_generate': False, 'output_scores': False, 'output_logits': False, 'pad_token_id': 151643, 'bos_token_id': 151643, 'eos_token_id': [151645, 151643], 'parallel_decoding': False, 'window_size': 5, 'level': 5, 'guess_set_size': 3, '_from_model_config': True} 2025-05-21 15:35:56,028 - mindformers./output/log[mindformers/generation/text_generator.py:801] - INFO - Generation Config is: {'max_length': 512, 'max_new_tokens': 512, 'min_length': 0, 'min_new_tokens': 2, 'num_beams': 1, 'do_sample': True, 'use_past': True, 'temperature': 1.2, 'top_k': 50, 'top_p': 1.0, 'repetition_penalty': 1.0, 'encoder_repetition_penalty': 1.0, 'renormalize_logits': False, 'return_dict_in_generate': False, 'output_scores': False, 'output_logits': False, 'pad_token_id': 151643, 'bos_token_id': 151643, 'eos_token_id': [151645, 151643], 'parallel_decoding': False, 'window_size': 5, 'level': 5, 'guess_set_size': 3, '_from_model_config': True} 2025-05-21 15:35:56,028 - mindformers./output/log[mindformers/generation/text_generator.py:801] - INFO - Generation Config is: {'max_length': 512, 'max_new_tokens': 512, 'min_length': 0, 'min_new_tokens': 2, 'num_beams': 1, 'do_sample': True, 'use_past': True, 'temperature': 1.2, 'top_k': 50, 'top_p': 1.0, 'repetition_penalty': 1.0, 'encoder_repetition_penalty': 1.0, 'renormalize_logits': False, 'return_dict_in_generate': False, 'output_scores': False, 'output_logits': False, 'pad_token_id': 151643, 'bos_token_id': 151643, 'eos_token_id': [151645, 151643], 'parallel_decoding': False, 'window_size': 5, 'level': 5, 'guess_set_size': 3, '_from_model_config': True} 2025-05-21 15:35:56,028 - mindformers./output/log[mindformers/generation/text_generator.py:859] - INFO - The generation mode will be **SAMPLE**. 2025-05-21 15:35:56,028 - mindformers./output/log[mindformers/generation/text_generator.py:859] - INFO - The generation mode will be **SAMPLE**. 2025-05-21 15:35:56,029 - mindformers./output/log[mindformers/modules/block_tables.py:63] - INFO - init cache engine success. 2025-05-21 15:35:56,029 - mindformers./output/log[mindformers/generation/text_generator.py:859] - INFO - The generation mode will be **SAMPLE**. 2025-05-21 15:35:56,029 - mindformers./output/log[mindformers/modules/block_tables.py:63] - INFO - init cache engine success. 2025-05-21 15:35:56,029 - mindformers./output/log[mindformers/generation/text_generator.py:859] - INFO - The generation mode will be **SAMPLE**. 2025-05-21 15:35:56,029 - mindformers./output/log[mindformers/modules/block_tables.py:63] - INFO - init cache engine success. 2025-05-21 15:35:56,029 - mindformers./output/log[mindformers/modules/block_tables.py:63] - INFO - init cache engine success. 2025-05-21 15:35:56,029 - mindformers./output/log[mindformers/research/qwen2_5/infer/qwen2_5.py:195] - INFO - Set dynamic input for llama. 2025-05-21 15:35:56,029 - mindformers./output/log[mindformers/research/qwen2_5/infer/qwen2_5.py:195] - INFO - Set dynamic input for llama. 2025-05-21 15:35:56,030 - mindformers./output/log[mindformers/research/qwen2_5/infer/qwen2_5.py:195] - INFO - Set dynamic input for llama. 2025-05-21 15:35:56,030 - mindformers./output/log[mindformers/research/qwen2_5/infer/qwen2_5.py:195] - INFO - Set dynamic input for llama. 2025-05-21 15:35:56,205 - mindformers./output/log[mindformers/generation/text_generator.py:1067] - INFO - total time: 9.529994249343872 s; generated tokens: 512 tokens; generate speed: 53.72511111801044 tokens/s 2025-05-21 15:35:56,205 - mindformers./output/log[mindformers/generation/text_generator.py:1067] - INFO - total time: 9.529939651489258 s; generated tokens: 512 tokens; generate speed: 53.72541891385314 tokens/s 2025-05-21 15:35:56,205 - mindformers./output/log[mindformers/generation/text_generator.py:1067] - INFO - total time: 9.529889583587646 s; generated tokens: 512 tokens; generate speed: 53.725701175149524 tokens/s 2025-05-21 15:35:56,205 - mindformers./output/log[mindformers/generation/text_generator.py:1067] - INFO - total time: 9.529677391052246 s; generated tokens: 512 tokens; generate speed: 53.726897458326874 tokens/s 2025-05-21 15:35:56,206 - mindformers./output/log[mindformers/tools/debug_info.py:93] - INFO - prefill prepare time: 0.0013589859008789062 s; prefill predict time: 0.006608486175537109 s; prefill post time: 0.013606786727905273 s; decode prepare time: 0.0009692550405364214 s; decode predict time: 0.004978799819946289 s; decode post time: 0.012611830537799754 s 2025-05-21 15:35:56,206 - mindformers./output/log[mindformers/tools/debug_info.py:93] - INFO - prefill prepare time: 0.0016345977783203125 s; prefill predict time: 0.0060117244720458984 s; prefill post time: 0.014433860778808594 s; decode prepare time: 0.001013520645768675 s; decode predict time: 0.0044312523860557404 s; decode post time: 0.013118034706190374 s 2025-05-21 15:35:56,206 - mindformers./output/log[mindformers/tools/debug_info.py:93] - INFO - prefill prepare time: 0.001504659652709961 s; prefill predict time: 0.006162881851196289 s; prefill post time: 0.014447450637817383 s; decode prepare time: 0.0009830763195358843 s; decode predict time: 0.004868520007413976 s; decode post time: 0.012710779846997644 s 2025-05-21 15:35:56,206 - mindformers./output/log[mindformers/tools/debug_info.py:93] - INFO - prefill prepare time: 0.0014564990997314453 s; prefill predict time: 0.005615234375 s; prefill post time: 0.014588594436645508 s; decode prepare time: 0.0010378720009163634 s; decode predict time: 0.004365291782453948 s; decode post time: 0.013157197873886318 s 2025-05-21 15:35:56,206 - mindformers./output/log[mindformers/modules/block_tables.py:126] - INFO - Clear block table cache engines. 2025-05-21 15:35:56,207 - mindformers./output/log[mindformers/modules/block_tables.py:126] - INFO - Clear block table cache engines. 2025-05-21 15:35:56,207 - mindformers./output/log[mindformers/modules/block_tables.py:126] - INFO - Clear block table cache engines. 2025-05-21 15:35:56,207 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:341] - INFO - infer without vllm end, not use vllm model 2025-05-21 15:35:56,207 - mindformers./output/log[mindformers/modules/block_tables.py:126] - INFO - Clear block table cache engines. 2025-05-21 15:35:56,207 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:341] - INFO - infer without vllm end, not use vllm model 2025-05-21 15:35:56,207 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:341] - INFO - infer without vllm end, not use vllm model 2025-05-21 15:35:56,207 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:355] - INFO - Generating elapsed time: 9.53664517402649 2025-05-21 15:35:56,207 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:355] - INFO - Generating elapsed time: 9.53642225265503 2025-05-21 15:35:56,207 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:355] - INFO - Generating elapsed time: 9.536467790603638 2025-05-21 15:35:56,207 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:341] - INFO - infer without vllm end, not use vllm model 2025-05-21 15:35:56,207 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:355] - INFO - Generating elapsed time: 9.536104679107666 2025-05-21 15:35:56,208 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:313] - INFO - input_ids shape (1, 2048) 2025-05-21 15:35:56,208 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:313] - INFO - input_ids shape (1, 2048) 2025-05-21 15:35:56,208 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:313] - INFO - input_ids shape (1, 2048) 2025-05-21 15:35:56,208 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:313] - INFO - input_ids shape (1, 2048) 2025-05-21 15:35:56,208 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:322] - INFO - max_tokens 512 2025-05-21 15:35:56,209 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:322] - INFO - max_tokens 512 2025-05-21 15:35:56,209 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:322] - INFO - max_tokens 512 2025-05-21 15:35:56,209 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:323] - INFO - min_tokens 2 2025-05-21 15:35:56,209 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:323] - INFO - min_tokens 2 2025-05-21 15:35:56,209 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:322] - INFO - max_tokens 512 2025-05-21 15:35:56,209 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:334] - INFO - infer without vllm, not use vllm model 2025-05-21 15:35:56,209 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:323] - INFO - min_tokens 2 2025-05-21 15:35:56,209 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:334] - INFO - infer without vllm, not use vllm model 2025-05-21 15:35:56,209 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:323] - INFO - min_tokens 2 2025-05-21 15:35:56,209 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:334] - INFO - infer without vllm, not use vllm model 2025-05-21 15:35:56,209 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:334] - INFO - infer without vllm, not use vllm model 2025-05-21 15:35:56,211 - mindformers./output/log[mindformers/generation/text_generator.py:801] - INFO - Generation Config is: {'max_length': 512, 'max_new_tokens': 512, 'min_length': 0, 'min_new_tokens': 2, 'num_beams': 1, 'do_sample': True, 'use_past': True, 'temperature': 1.2, 'top_k': 50, 'top_p': 1.0, 'repetition_penalty': 1.0, 'encoder_repetition_penalty': 1.0, 'renormalize_logits': False, 'return_dict_in_generate': False, 'output_scores': False, 'output_logits': False, 'pad_token_id': 151643, 'bos_token_id': 151643, 'eos_token_id': [151645, 151643], 'parallel_decoding': False, 'window_size': 5, 'level': 5, 'guess_set_size': 3, '_from_model_config': True} 2025-05-21 15:35:56,211 - mindformers./output/log[mindformers/generation/text_generator.py:801] - INFO - Generation Config is: {'max_length': 512, 'max_new_tokens': 512, 'min_length': 0, 'min_new_tokens': 2, 'num_beams': 1, 'do_sample': True, 'use_past': True, 'temperature': 1.2, 'top_k': 50, 'top_p': 1.0, 'repetition_penalty': 1.0, 'encoder_repetition_penalty': 1.0, 'renormalize_logits': False, 'return_dict_in_generate': False, 'output_scores': False, 'output_logits': False, 'pad_token_id': 151643, 'bos_token_id': 151643, 'eos_token_id': [151645, 151643], 'parallel_decoding': False, 'window_size': 5, 'level': 5, 'guess_set_size': 3, '_from_model_config': True} 2025-05-21 15:35:56,211 - mindformers./output/log[mindformers/generation/text_generator.py:801] - INFO - Generation Config is: {'max_length': 512, 'max_new_tokens': 512, 'min_length': 0, 'min_new_tokens': 2, 'num_beams': 1, 'do_sample': True, 'use_past': True, 'temperature': 1.2, 'top_k': 50, 'top_p': 1.0, 'repetition_penalty': 1.0, 'encoder_repetition_penalty': 1.0, 'renormalize_logits': False, 'return_dict_in_generate': False, 'output_scores': False, 'output_logits': False, 'pad_token_id': 151643, 'bos_token_id': 151643, 'eos_token_id': [151645, 151643], 'parallel_decoding': False, 'window_size': 5, 'level': 5, 'guess_set_size': 3, '_from_model_config': True} 2025-05-21 15:35:56,211 - mindformers./output/log[mindformers/generation/text_generator.py:801] - INFO - Generation Config is: {'max_length': 512, 'max_new_tokens': 512, 'min_length': 0, 'min_new_tokens': 2, 'num_beams': 1, 'do_sample': True, 'use_past': True, 'temperature': 1.2, 'top_k': 50, 'top_p': 1.0, 'repetition_penalty': 1.0, 'encoder_repetition_penalty': 1.0, 'renormalize_logits': False, 'return_dict_in_generate': False, 'output_scores': False, 'output_logits': False, 'pad_token_id': 151643, 'bos_token_id': 151643, 'eos_token_id': [151645, 151643], 'parallel_decoding': False, 'window_size': 5, 'level': 5, 'guess_set_size': 3, '_from_model_config': True} 2025-05-21 15:35:56,212 - mindformers./output/log[mindformers/generation/text_generator.py:859] - INFO - The generation mode will be **SAMPLE**. 2025-05-21 15:35:56,212 - mindformers./output/log[mindformers/generation/text_generator.py:859] - INFO - The generation mode will be **SAMPLE**. 2025-05-21 15:35:56,212 - mindformers./output/log[mindformers/generation/text_generator.py:859] - INFO - The generation mode will be **SAMPLE**. 2025-05-21 15:35:56,212 - mindformers./output/log[mindformers/modules/block_tables.py:63] - INFO - init cache engine success. 2025-05-21 15:35:56,212 - mindformers./output/log[mindformers/modules/block_tables.py:63] - INFO - init cache engine success. 2025-05-21 15:35:56,212 - mindformers./output/log[mindformers/modules/block_tables.py:63] - INFO - init cache engine success. 2025-05-21 15:35:56,212 - mindformers./output/log[mindformers/generation/text_generator.py:859] - INFO - The generation mode will be **SAMPLE**. 2025-05-21 15:35:56,212 - mindformers./output/log[mindformers/modules/block_tables.py:63] - INFO - init cache engine success. 2025-05-21 15:35:56,213 - mindformers./output/log[mindformers/research/qwen2_5/infer/qwen2_5.py:195] - INFO - Set dynamic input for llama. 2025-05-21 15:35:56,213 - mindformers./output/log[mindformers/research/qwen2_5/infer/qwen2_5.py:195] - INFO - Set dynamic input for llama. 2025-05-21 15:35:56,213 - mindformers./output/log[mindformers/research/qwen2_5/infer/qwen2_5.py:195] - INFO - Set dynamic input for llama. 2025-05-21 15:35:56,213 - mindformers./output/log[mindformers/research/qwen2_5/infer/qwen2_5.py:195] - INFO - Set dynamic input for llama. 2025-05-21 15:36:05,527 - mindformers./output/log[mindformers/generation/text_generator.py:1067] - INFO - total time: 9.496818780899048 s; generated tokens: 512 tokens; generate speed: 53.91279035773386 tokens/s 2025-05-21 15:36:05,527 - mindformers./output/log[mindformers/generation/text_generator.py:1067] - INFO - total time: 9.497150659561157 s; generated tokens: 512 tokens; generate speed: 53.910906371117676 tokens/s 2025-05-21 15:36:05,527 - mindformers./output/log[mindformers/generation/text_generator.py:1067] - INFO - total time: 9.497422218322754 s; generated tokens: 512 tokens; generate speed: 53.9093649024292 tokens/s 2025-05-21 15:36:05,527 - mindformers./output/log[mindformers/generation/text_generator.py:1067] - INFO - total time: 9.497209787368774 s; generated tokens: 512 tokens; generate speed: 53.910570732148784 tokens/s 2025-05-21 15:36:05,528 - mindformers./output/log[mindformers/tools/debug_info.py:93] - INFO - prefill prepare time: 0.001352548599243164 s; prefill predict time: 0.007007598876953125 s; prefill post time: 0.014203786849975586 s; decode prepare time: 0.0010678777489410221 s; decode predict time: 0.004217684970182531 s; decode post time: 0.013207681958222809 s 2025-05-21 15:36:05,528 - mindformers./output/log[mindformers/tools/debug_info.py:93] - INFO - prefill prepare time: 0.001378774642944336 s; prefill predict time: 0.007311820983886719 s; prefill post time: 0.013520956039428711 s; decode prepare time: 0.0010288727493435435 s; decode predict time: 0.004695604829227223 s; decode post time: 0.012768408090401069 s 2025-05-21 15:36:05,528 - mindformers./output/log[mindformers/tools/debug_info.py:93] - INFO - prefill prepare time: 0.0014309883117675781 s; prefill predict time: 0.007306337356567383 s; prefill post time: 0.01341700553894043 s; decode prepare time: 0.0010780994206258695 s; decode predict time: 0.0046861377416872515 s; decode post time: 0.012729130145854913 s 2025-05-21 15:36:05,528 - mindformers./output/log[mindformers/tools/debug_info.py:93] - INFO - prefill prepare time: 0.0013272762298583984 s; prefill predict time: 0.007414102554321289 s; prefill post time: 0.0147552490234375 s; decode prepare time: 0.0010432011927177295 s; decode predict time: 0.004287440169091318 s; decode post time: 0.013163312307309265 s 2025-05-21 15:36:05,528 - mindformers./output/log[mindformers/modules/block_tables.py:126] - INFO - Clear block table cache engines. 2025-05-21 15:36:05,528 - mindformers./output/log[mindformers/modules/block_tables.py:126] - INFO - Clear block table cache engines. 2025-05-21 15:36:05,529 - mindformers./output/log[mindformers/modules/block_tables.py:126] - INFO - Clear block table cache engines. 2025-05-21 15:36:05,529 - mindformers./output/log[mindformers/modules/block_tables.py:126] - INFO - Clear block table cache engines. 2025-05-21 15:36:05,529 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:341] - INFO - infer without vllm end, not use vllm model 2025-05-21 15:36:05,529 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:341] - INFO - infer without vllm end, not use vllm model 2025-05-21 15:36:05,529 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:341] - INFO - infer without vllm end, not use vllm model 2025-05-21 15:36:05,529 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:355] - INFO - Generating elapsed time: 9.503834009170532 2025-05-21 15:36:05,529 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:355] - INFO - Generating elapsed time: 9.503362655639648 2025-05-21 15:36:05,529 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:341] - INFO - infer without vllm end, not use vllm model 2025-05-21 15:36:05,529 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:355] - INFO - Generating elapsed time: 9.504209041595459 2025-05-21 15:36:05,529 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:355] - INFO - Generating elapsed time: 9.503896236419678 2025-05-21 15:36:05,530 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:313] - INFO - input_ids shape (1, 2048) 2025-05-21 15:36:05,530 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:313] - INFO - input_ids shape (1, 2048) 2025-05-21 15:36:05,530 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:313] - INFO - input_ids shape (1, 2048) 2025-05-21 15:36:05,530 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:313] - INFO - input_ids shape (1, 2048) 2025-05-21 15:36:05,530 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:322] - INFO - max_tokens 512 2025-05-21 15:36:05,531 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:322] - INFO - max_tokens 512 2025-05-21 15:36:05,531 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:323] - INFO - min_tokens 2 2025-05-21 15:36:05,531 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:322] - INFO - max_tokens 512 2025-05-21 15:36:05,531 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:323] - INFO - min_tokens 2 2025-05-21 15:36:05,531 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:334] - INFO - infer without vllm, not use vllm model 2025-05-21 15:36:05,531 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:322] - INFO - max_tokens 512 2025-05-21 15:36:05,531 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:323] - INFO - min_tokens 2 2025-05-21 15:36:05,531 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:334] - INFO - infer without vllm, not use vllm model 2025-05-21 15:36:05,531 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:323] - INFO - min_tokens 2 2025-05-21 15:36:05,531 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:334] - INFO - infer without vllm, not use vllm model 2025-05-21 15:36:05,531 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:334] - INFO - infer without vllm, not use vllm model 2025-05-21 15:36:05,533 - mindformers./output/log[mindformers/generation/text_generator.py:801] - INFO - Generation Config is: {'max_length': 512, 'max_new_tokens': 512, 'min_length': 0, 'min_new_tokens': 2, 'num_beams': 1, 'do_sample': True, 'use_past': True, 'temperature': 1.2, 'top_k': 50, 'top_p': 1.0, 'repetition_penalty': 1.0, 'encoder_repetition_penalty': 1.0, 'renormalize_logits': False, 'return_dict_in_generate': False, 'output_scores': False, 'output_logits': False, 'pad_token_id': 151643, 'bos_token_id': 151643, 'eos_token_id': [151645, 151643], 'parallel_decoding': False, 'window_size': 5, 'level': 5, 'guess_set_size': 3, '_from_model_config': True} 2025-05-21 15:36:05,533 - mindformers./output/log[mindformers/generation/text_generator.py:801] - INFO - Generation Config is: {'max_length': 512, 'max_new_tokens': 512, 'min_length': 0, 'min_new_tokens': 2, 'num_beams': 1, 'do_sample': True, 'use_past': True, 'temperature': 1.2, 'top_k': 50, 'top_p': 1.0, 'repetition_penalty': 1.0, 'encoder_repetition_penalty': 1.0, 'renormalize_logits': False, 'return_dict_in_generate': False, 'output_scores': False, 'output_logits': False, 'pad_token_id': 151643, 'bos_token_id': 151643, 'eos_token_id': [151645, 151643], 'parallel_decoding': False, 'window_size': 5, 'level': 5, 'guess_set_size': 3, '_from_model_config': True} 2025-05-21 15:36:05,533 - mindformers./output/log[mindformers/generation/text_generator.py:801] - INFO - Generation Config is: {'max_length': 512, 'max_new_tokens': 512, 'min_length': 0, 'min_new_tokens': 2, 'num_beams': 1, 'do_sample': True, 'use_past': True, 'temperature': 1.2, 'top_k': 50, 'top_p': 1.0, 'repetition_penalty': 1.0, 'encoder_repetition_penalty': 1.0, 'renormalize_logits': False, 'return_dict_in_generate': False, 'output_scores': False, 'output_logits': False, 'pad_token_id': 151643, 'bos_token_id': 151643, 'eos_token_id': [151645, 151643], 'parallel_decoding': False, 'window_size': 5, 'level': 5, 'guess_set_size': 3, '_from_model_config': True} 2025-05-21 15:36:05,533 - mindformers./output/log[mindformers/generation/text_generator.py:801] - INFO - Generation Config is: {'max_length': 512, 'max_new_tokens': 512, 'min_length': 0, 'min_new_tokens': 2, 'num_beams': 1, 'do_sample': True, 'use_past': True, 'temperature': 1.2, 'top_k': 50, 'top_p': 1.0, 'repetition_penalty': 1.0, 'encoder_repetition_penalty': 1.0, 'renormalize_logits': False, 'return_dict_in_generate': False, 'output_scores': False, 'output_logits': False, 'pad_token_id': 151643, 'bos_token_id': 151643, 'eos_token_id': [151645, 151643], 'parallel_decoding': False, 'window_size': 5, 'level': 5, 'guess_set_size': 3, '_from_model_config': True} 2025-05-21 15:36:05,534 - mindformers./output/log[mindformers/generation/text_generator.py:859] - INFO - The generation mode will be **SAMPLE**. 2025-05-21 15:36:05,534 - mindformers./output/log[mindformers/generation/text_generator.py:859] - INFO - The generation mode will be **SAMPLE**. 2025-05-21 15:36:05,534 - mindformers./output/log[mindformers/generation/text_generator.py:859] - INFO - The generation mode will be **SAMPLE**. 2025-05-21 15:36:05,534 - mindformers./output/log[mindformers/modules/block_tables.py:63] - INFO - init cache engine success. 2025-05-21 15:36:05,534 - mindformers./output/log[mindformers/generation/text_generator.py:859] - INFO - The generation mode will be **SAMPLE**. 2025-05-21 15:36:05,534 - mindformers./output/log[mindformers/modules/block_tables.py:63] - INFO - init cache engine success. 2025-05-21 15:36:05,534 - mindformers./output/log[mindformers/modules/block_tables.py:63] - INFO - init cache engine success. 2025-05-21 15:36:05,534 - mindformers./output/log[mindformers/modules/block_tables.py:63] - INFO - init cache engine success. 2025-05-21 15:36:05,535 - mindformers./output/log[mindformers/research/qwen2_5/infer/qwen2_5.py:195] - INFO - Set dynamic input for llama. 2025-05-21 15:36:05,535 - mindformers./output/log[mindformers/research/qwen2_5/infer/qwen2_5.py:195] - INFO - Set dynamic input for llama. 2025-05-21 15:36:05,535 - mindformers./output/log[mindformers/research/qwen2_5/infer/qwen2_5.py:195] - INFO - Set dynamic input for llama. 2025-05-21 15:36:05,535 - mindformers./output/log[mindformers/research/qwen2_5/infer/qwen2_5.py:195] - INFO - Set dynamic input for llama. 2025-05-21 15:36:05,706 - mindformers./output/log[mindformers/generation/text_generator.py:1067] - INFO - total time: 9.492834329605103 s; generated tokens: 512 tokens; generate speed: 53.935419309197925 tokens/s 2025-05-21 15:36:05,706 - mindformers./output/log[mindformers/generation/text_generator.py:1067] - INFO - total time: 9.492897272109985 s; generated tokens: 512 tokens; generate speed: 53.93506169125517 tokens/s 2025-05-21 15:36:05,706 - mindformers./output/log[mindformers/tools/debug_info.py:93] - INFO - prefill prepare time: 0.0013003349304199219 s; prefill predict time: 0.007216930389404297 s; prefill post time: 0.013464689254760742 s; decode prepare time: 0.0009721006665910993 s; decode predict time: 0.0049139504339180745 s; decode post time: 0.012600868415459262 s 2025-05-21 15:36:05,706 - mindformers./output/log[mindformers/generation/text_generator.py:1067] - INFO - total time: 9.493391990661621 s; generated tokens: 512 tokens; generate speed: 53.93225103352309 tokens/s 2025-05-21 15:36:05,707 - mindformers./output/log[mindformers/generation/text_generator.py:1067] - INFO - total time: 9.493183612823486 s; generated tokens: 512 tokens; generate speed: 53.9334348603966 tokens/s 2025-05-21 15:36:05,707 - mindformers./output/log[mindformers/tools/debug_info.py:93] - INFO - prefill prepare time: 0.0013327598571777344 s; prefill predict time: 0.0070912837982177734 s; prefill post time: 0.013633012771606445 s; decode prepare time: 0.0009910509777628978 s; decode predict time: 0.004766371670891257 s; decode post time: 0.012730837801431256 s 2025-05-21 15:36:05,707 - mindformers./output/log[mindformers/tools/debug_info.py:93] - INFO - prefill prepare time: 0.0013303756713867188 s; prefill predict time: 0.007080793380737305 s; prefill post time: 0.014254093170166016 s; decode prepare time: 0.0010159892811933842 s; decode predict time: 0.004324262282427619 s; decode post time: 0.013146980401354527 s 2025-05-21 15:36:05,707 - mindformers./output/log[mindformers/tools/debug_info.py:93] - INFO - prefill prepare time: 0.0013129711151123047 s; prefill predict time: 0.0073125362396240234 s; prefill post time: 0.014266729354858398 s; decode prepare time: 0.0010331740817911704 s; decode predict time: 0.004346101424273323 s; decode post time: 0.013108380386740727 s 2025-05-21 15:36:05,707 - mindformers./output/log[mindformers/modules/block_tables.py:126] - INFO - Clear block table cache engines. 2025-05-21 15:36:05,708 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:341] - INFO - infer without vllm end, not use vllm model 2025-05-21 15:36:05,708 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:355] - INFO - Generating elapsed time: 9.499388217926025 2025-05-21 15:36:05,708 - mindformers./output/log[mindformers/modules/block_tables.py:126] - INFO - Clear block table cache engines. 2025-05-21 15:36:05,708 - mindformers./output/log[mindformers/modules/block_tables.py:126] - INFO - Clear block table cache engines. 2025-05-21 15:36:05,708 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:341] - INFO - infer without vllm end, not use vllm model 2025-05-21 15:36:05,708 - mindformers./output/log[mindformers/modules/block_tables.py:126] - INFO - Clear block table cache engines. 2025-05-21 15:36:05,708 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:355] - INFO - Generating elapsed time: 9.499720096588135 2025-05-21 15:36:05,708 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:341] - INFO - infer without vllm end, not use vllm model 2025-05-21 15:36:05,708 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:341] - INFO - infer without vllm end, not use vllm model 2025-05-21 15:36:05,709 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:355] - INFO - Generating elapsed time: 9.499768018722534 2025-05-21 15:36:05,709 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:355] - INFO - Generating elapsed time: 9.500152349472046 2025-05-21 15:36:05,709 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:313] - INFO - input_ids shape (1, 2048) 2025-05-21 15:36:05,709 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:322] - INFO - max_tokens 512 2025-05-21 15:36:05,710 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:313] - INFO - input_ids shape (1, 2048) 2025-05-21 15:36:05,710 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:323] - INFO - min_tokens 2 2025-05-21 15:36:05,710 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:313] - INFO - input_ids shape (1, 2048) 2025-05-21 15:36:05,710 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:334] - INFO - infer without vllm, not use vllm model 2025-05-21 15:36:05,710 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:313] - INFO - input_ids shape (1, 2048) 2025-05-21 15:36:05,710 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:322] - INFO - max_tokens 512 2025-05-21 15:36:05,710 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:322] - INFO - max_tokens 512 2025-05-21 15:36:05,710 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:323] - INFO - min_tokens 2 2025-05-21 15:36:05,710 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:322] - INFO - max_tokens 512 2025-05-21 15:36:05,710 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:323] - INFO - min_tokens 2 2025-05-21 15:36:05,710 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:334] - INFO - infer without vllm, not use vllm model 2025-05-21 15:36:05,711 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:334] - INFO - infer without vllm, not use vllm model 2025-05-21 15:36:05,711 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:323] - INFO - min_tokens 2 2025-05-21 15:36:05,711 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:334] - INFO - infer without vllm, not use vllm model 2025-05-21 15:36:05,712 - mindformers./output/log[mindformers/generation/text_generator.py:801] - INFO - Generation Config is: {'max_length': 512, 'max_new_tokens': 512, 'min_length': 0, 'min_new_tokens': 2, 'num_beams': 1, 'do_sample': True, 'use_past': True, 'temperature': 1.2, 'top_k': 50, 'top_p': 1.0, 'repetition_penalty': 1.0, 'encoder_repetition_penalty': 1.0, 'renormalize_logits': False, 'return_dict_in_generate': False, 'output_scores': False, 'output_logits': False, 'pad_token_id': 151643, 'bos_token_id': 151643, 'eos_token_id': [151645, 151643], 'parallel_decoding': False, 'window_size': 5, 'level': 5, 'guess_set_size': 3, '_from_model_config': True} 2025-05-21 15:36:05,713 - mindformers./output/log[mindformers/generation/text_generator.py:859] - INFO - The generation mode will be **SAMPLE**. 2025-05-21 15:36:05,713 - mindformers./output/log[mindformers/generation/text_generator.py:801] - INFO - Generation Config is: {'max_length': 512, 'max_new_tokens': 512, 'min_length': 0, 'min_new_tokens': 2, 'num_beams': 1, 'do_sample': True, 'use_past': True, 'temperature': 1.2, 'top_k': 50, 'top_p': 1.0, 'repetition_penalty': 1.0, 'encoder_repetition_penalty': 1.0, 'renormalize_logits': False, 'return_dict_in_generate': False, 'output_scores': False, 'output_logits': False, 'pad_token_id': 151643, 'bos_token_id': 151643, 'eos_token_id': [151645, 151643], 'parallel_decoding': False, 'window_size': 5, 'level': 5, 'guess_set_size': 3, '_from_model_config': True} 2025-05-21 15:36:05,713 - mindformers./output/log[mindformers/modules/block_tables.py:63] - INFO - init cache engine success. 2025-05-21 15:36:05,713 - mindformers./output/log[mindformers/generation/text_generator.py:801] - INFO - Generation Config is: {'max_length': 512, 'max_new_tokens': 512, 'min_length': 0, 'min_new_tokens': 2, 'num_beams': 1, 'do_sample': True, 'use_past': True, 'temperature': 1.2, 'top_k': 50, 'top_p': 1.0, 'repetition_penalty': 1.0, 'encoder_repetition_penalty': 1.0, 'renormalize_logits': False, 'return_dict_in_generate': False, 'output_scores': False, 'output_logits': False, 'pad_token_id': 151643, 'bos_token_id': 151643, 'eos_token_id': [151645, 151643], 'parallel_decoding': False, 'window_size': 5, 'level': 5, 'guess_set_size': 3, '_from_model_config': True} 2025-05-21 15:36:05,713 - mindformers./output/log[mindformers/generation/text_generator.py:801] - INFO - Generation Config is: {'max_length': 512, 'max_new_tokens': 512, 'min_length': 0, 'min_new_tokens': 2, 'num_beams': 1, 'do_sample': True, 'use_past': True, 'temperature': 1.2, 'top_k': 50, 'top_p': 1.0, 'repetition_penalty': 1.0, 'encoder_repetition_penalty': 1.0, 'renormalize_logits': False, 'return_dict_in_generate': False, 'output_scores': False, 'output_logits': False, 'pad_token_id': 151643, 'bos_token_id': 151643, 'eos_token_id': [151645, 151643], 'parallel_decoding': False, 'window_size': 5, 'level': 5, 'guess_set_size': 3, '_from_model_config': True} 2025-05-21 15:36:05,713 - mindformers./output/log[mindformers/generation/text_generator.py:859] - INFO - The generation mode will be **SAMPLE**. 2025-05-21 15:36:05,714 - mindformers./output/log[mindformers/modules/block_tables.py:63] - INFO - init cache engine success. 2025-05-21 15:36:05,714 - mindformers./output/log[mindformers/generation/text_generator.py:859] - INFO - The generation mode will be **SAMPLE**. 2025-05-21 15:36:05,714 - mindformers./output/log[mindformers/research/qwen2_5/infer/qwen2_5.py:195] - INFO - Set dynamic input for llama. 2025-05-21 15:36:05,714 - mindformers./output/log[mindformers/generation/text_generator.py:859] - INFO - The generation mode will be **SAMPLE**. 2025-05-21 15:36:05,714 - mindformers./output/log[mindformers/modules/block_tables.py:63] - INFO - init cache engine success. 2025-05-21 15:36:05,714 - mindformers./output/log[mindformers/modules/block_tables.py:63] - INFO - init cache engine success. 2025-05-21 15:36:05,714 - mindformers./output/log[mindformers/research/qwen2_5/infer/qwen2_5.py:195] - INFO - Set dynamic input for llama. 2025-05-21 15:36:05,715 - mindformers./output/log[mindformers/research/qwen2_5/infer/qwen2_5.py:195] - INFO - Set dynamic input for llama. 2025-05-21 15:36:05,715 - mindformers./output/log[mindformers/research/qwen2_5/infer/qwen2_5.py:195] - INFO - Set dynamic input for llama. 2025-05-21 15:36:15,031 - mindformers./output/log[mindformers/generation/text_generator.py:1067] - INFO - total time: 9.495905637741089 s; generated tokens: 512 tokens; generate speed: 53.91797470744412 tokens/s 2025-05-21 15:36:15,032 - mindformers./output/log[mindformers/tools/debug_info.py:93] - INFO - prefill prepare time: 0.001485586166381836 s; prefill predict time: 0.007117748260498047 s; prefill post time: 0.013718366622924805 s; decode prepare time: 0.0010211243788090236 s; decode predict time: 0.004761942695168888 s; decode post time: 0.012706788785303642 s 2025-05-21 15:36:15,032 - mindformers./output/log[mindformers/generation/text_generator.py:1067] - INFO - total time: 9.496453762054443 s; generated tokens: 512 tokens; generate speed: 53.91486262438611 tokens/s 2025-05-21 15:36:15,032 - mindformers./output/log[mindformers/generation/text_generator.py:1067] - INFO - total time: 9.496339559555054 s; generated tokens: 512 tokens; generate speed: 53.915511001797995 tokens/s 2025-05-21 15:36:15,032 - mindformers./output/log[mindformers/generation/text_generator.py:1067] - INFO - total time: 9.496742725372314 s; generated tokens: 512 tokens; generate speed: 53.9132221232125 tokens/s 2025-05-21 15:36:15,032 - mindformers./output/log[mindformers/tools/debug_info.py:93] - INFO - prefill prepare time: 0.0014774799346923828 s; prefill predict time: 0.0071048736572265625 s; prefill post time: 0.014693498611450195 s; decode prepare time: 0.001068053180224275 s; decode predict time: 0.0042531527724920535 s; decode post time: 0.013171566442500822 s 2025-05-21 15:36:15,032 - mindformers./output/log[mindformers/tools/debug_info.py:93] - INFO - prefill prepare time: 0.0015528202056884766 s; prefill predict time: 0.0068302154541015625 s; prefill post time: 0.014010190963745117 s; decode prepare time: 0.0010741284215520274 s; decode predict time: 0.004721828535491345 s; decode post time: 0.012694622439181034 s 2025-05-21 15:36:15,033 - mindformers./output/log[mindformers/modules/block_tables.py:126] - INFO - Clear block table cache engines. 2025-05-21 15:36:15,033 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:341] - INFO - infer without vllm end, not use vllm model 2025-05-21 15:36:15,033 - mindformers./output/log[mindformers/tools/debug_info.py:93] - INFO - prefill prepare time: 0.0015611648559570312 s; prefill predict time: 0.006836414337158203 s; prefill post time: 0.014631271362304688 s; decode prepare time: 0.0010458816520855152 s; decode predict time: 0.004255843162536621 s; decode post time: 0.013191817557975037 s 2025-05-21 15:36:15,033 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:355] - INFO - Generating elapsed time: 9.502511978149414 2025-05-21 15:36:15,033 - mindformers./output/log[mindformers/modules/block_tables.py:126] - INFO - Clear block table cache engines. 2025-05-21 15:36:15,034 - mindformers./output/log[mindformers/modules/block_tables.py:126] - INFO - Clear block table cache engines. 2025-05-21 15:36:15,034 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:341] - INFO - infer without vllm end, not use vllm model 2025-05-21 15:36:15,034 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:341] - INFO - infer without vllm end, not use vllm model 2025-05-21 15:36:15,034 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:355] - INFO - Generating elapsed time: 9.50305986404419 2025-05-21 15:36:15,034 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:355] - INFO - Generating elapsed time: 9.503541946411133 2025-05-21 15:36:15,034 - mindformers./output/log[mindformers/modules/block_tables.py:126] - INFO - Clear block table cache engines. 2025-05-21 15:36:15,034 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:313] - INFO - input_ids shape (1, 2048) 2025-05-21 15:36:15,034 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:341] - INFO - infer without vllm end, not use vllm model 2025-05-21 15:36:15,035 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:355] - INFO - Generating elapsed time: 9.503883838653564 2025-05-21 15:36:15,035 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:322] - INFO - max_tokens 512 2025-05-21 15:36:15,035 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:323] - INFO - min_tokens 2 2025-05-21 15:36:15,035 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:313] - INFO - input_ids shape (1, 2048) 2025-05-21 15:36:15,035 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:313] - INFO - input_ids shape (1, 2048) 2025-05-21 15:36:15,035 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:334] - INFO - infer without vllm, not use vllm model 2025-05-21 15:36:15,035 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:322] - INFO - max_tokens 512 2025-05-21 15:36:15,036 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:322] - INFO - max_tokens 512 2025-05-21 15:36:15,036 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:323] - INFO - min_tokens 2 2025-05-21 15:36:15,036 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:323] - INFO - min_tokens 2 2025-05-21 15:36:15,036 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:313] - INFO - input_ids shape (1, 2048) 2025-05-21 15:36:15,036 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:334] - INFO - infer without vllm, not use vllm model 2025-05-21 15:36:15,036 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:334] - INFO - infer without vllm, not use vllm model 2025-05-21 15:36:15,036 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:322] - INFO - max_tokens 512 2025-05-21 15:36:15,036 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:323] - INFO - min_tokens 2 2025-05-21 15:36:15,037 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:334] - INFO - infer without vllm, not use vllm model 2025-05-21 15:36:15,037 - mindformers./output/log[mindformers/generation/text_generator.py:801] - INFO - Generation Config is: {'max_length': 512, 'max_new_tokens': 512, 'min_length': 0, 'min_new_tokens': 2, 'num_beams': 1, 'do_sample': True, 'use_past': True, 'temperature': 1.2, 'top_k': 50, 'top_p': 1.0, 'repetition_penalty': 1.0, 'encoder_repetition_penalty': 1.0, 'renormalize_logits': False, 'return_dict_in_generate': False, 'output_scores': False, 'output_logits': False, 'pad_token_id': 151643, 'bos_token_id': 151643, 'eos_token_id': [151645, 151643], 'parallel_decoding': False, 'window_size': 5, 'level': 5, 'guess_set_size': 3, '_from_model_config': True} 2025-05-21 15:36:15,038 - mindformers./output/log[mindformers/generation/text_generator.py:859] - INFO - The generation mode will be **SAMPLE**. 2025-05-21 15:36:15,038 - mindformers./output/log[mindformers/generation/text_generator.py:801] - INFO - Generation Config is: {'max_length': 512, 'max_new_tokens': 512, 'min_length': 0, 'min_new_tokens': 2, 'num_beams': 1, 'do_sample': True, 'use_past': True, 'temperature': 1.2, 'top_k': 50, 'top_p': 1.0, 'repetition_penalty': 1.0, 'encoder_repetition_penalty': 1.0, 'renormalize_logits': False, 'return_dict_in_generate': False, 'output_scores': False, 'output_logits': False, 'pad_token_id': 151643, 'bos_token_id': 151643, 'eos_token_id': [151645, 151643], 'parallel_decoding': False, 'window_size': 5, 'level': 5, 'guess_set_size': 3, '_from_model_config': True} 2025-05-21 15:36:15,038 - mindformers./output/log[mindformers/modules/block_tables.py:63] - INFO - init cache engine success. 2025-05-21 15:36:15,038 - mindformers./output/log[mindformers/generation/text_generator.py:801] - INFO - Generation Config is: {'max_length': 512, 'max_new_tokens': 512, 'min_length': 0, 'min_new_tokens': 2, 'num_beams': 1, 'do_sample': True, 'use_past': True, 'temperature': 1.2, 'top_k': 50, 'top_p': 1.0, 'repetition_penalty': 1.0, 'encoder_repetition_penalty': 1.0, 'renormalize_logits': False, 'return_dict_in_generate': False, 'output_scores': False, 'output_logits': False, 'pad_token_id': 151643, 'bos_token_id': 151643, 'eos_token_id': [151645, 151643], 'parallel_decoding': False, 'window_size': 5, 'level': 5, 'guess_set_size': 3, '_from_model_config': True} 2025-05-21 15:36:15,039 - mindformers./output/log[mindformers/generation/text_generator.py:859] - INFO - The generation mode will be **SAMPLE**. 2025-05-21 15:36:15,039 - mindformers./output/log[mindformers/research/qwen2_5/infer/qwen2_5.py:195] - INFO - Set dynamic input for llama. 2025-05-21 15:36:15,039 - mindformers./output/log[mindformers/generation/text_generator.py:859] - INFO - The generation mode will be **SAMPLE**. 2025-05-21 15:36:15,039 - mindformers./output/log[mindformers/modules/block_tables.py:63] - INFO - init cache engine success. 2025-05-21 15:36:15,039 - mindformers./output/log[mindformers/generation/text_generator.py:801] - INFO - Generation Config is: {'max_length': 512, 'max_new_tokens': 512, 'min_length': 0, 'min_new_tokens': 2, 'num_beams': 1, 'do_sample': True, 'use_past': True, 'temperature': 1.2, 'top_k': 50, 'top_p': 1.0, 'repetition_penalty': 1.0, 'encoder_repetition_penalty': 1.0, 'renormalize_logits': False, 'return_dict_in_generate': False, 'output_scores': False, 'output_logits': False, 'pad_token_id': 151643, 'bos_token_id': 151643, 'eos_token_id': [151645, 151643], 'parallel_decoding': False, 'window_size': 5, 'level': 5, 'guess_set_size': 3, '_from_model_config': True} 2025-05-21 15:36:15,039 - mindformers./output/log[mindformers/modules/block_tables.py:63] - INFO - init cache engine success. 2025-05-21 15:36:15,040 - mindformers./output/log[mindformers/research/qwen2_5/infer/qwen2_5.py:195] - INFO - Set dynamic input for llama. 2025-05-21 15:36:15,040 - mindformers./output/log[mindformers/generation/text_generator.py:859] - INFO - The generation mode will be **SAMPLE**. 2025-05-21 15:36:15,040 - mindformers./output/log[mindformers/research/qwen2_5/infer/qwen2_5.py:195] - INFO - Set dynamic input for llama. 2025-05-21 15:36:15,040 - mindformers./output/log[mindformers/modules/block_tables.py:63] - INFO - init cache engine success. 2025-05-21 15:36:15,041 - mindformers./output/log[mindformers/research/qwen2_5/infer/qwen2_5.py:195] - INFO - Set dynamic input for llama. 2025-05-21 15:36:15,276 - mindformers./output/log[mindformers/generation/text_generator.py:1067] - INFO - total time: 9.561625957489014 s; generated tokens: 512 tokens; generate speed: 53.54737805853856 tokens/s 2025-05-21 15:36:15,276 - mindformers./output/log[mindformers/generation/text_generator.py:1067] - INFO - total time: 9.561108350753784 s; generated tokens: 512 tokens; generate speed: 53.55027693621259 tokens/s 2025-05-21 15:36:15,276 - mindformers./output/log[mindformers/generation/text_generator.py:1067] - INFO - total time: 9.561158895492554 s; generated tokens: 512 tokens; generate speed: 53.54999384450913 tokens/s 2025-05-21 15:36:15,276 - mindformers./output/log[mindformers/tools/debug_info.py:93] - INFO - prefill prepare time: 0.0013301372528076172 s; prefill predict time: 0.0066070556640625 s; prefill post time: 0.013605594635009766 s; decode prepare time: 0.0009636738995516603 s; decode predict time: 0.005134151963626637 s; decode post time: 0.012525083967384294 s 2025-05-21 15:36:15,276 - mindformers./output/log[mindformers/generation/text_generator.py:1067] - INFO - total time: 9.561808824539185 s; generated tokens: 512 tokens; generate speed: 53.54635397917768 tokens/s 2025-05-21 15:36:15,277 - mindformers./output/log[mindformers/tools/debug_info.py:93] - INFO - prefill prepare time: 0.0014035701751708984 s; prefill predict time: 0.005732297897338867 s; prefill post time: 0.014396429061889648 s; decode prepare time: 0.0009837449180170048 s; decode predict time: 0.004909936587015788 s; decode post time: 0.012731130809000093 s 2025-05-21 15:36:15,277 - mindformers./output/log[mindformers/tools/debug_info.py:93] - INFO - prefill prepare time: 0.0013270378112792969 s; prefill predict time: 0.005442380905151367 s; prefill post time: 0.014600992202758789 s; decode prepare time: 0.0010118587143033919 s; decode predict time: 0.004498168533923579 s; decode post time: 0.013115279128640132 s 2025-05-21 15:36:15,277 - mindformers./output/log[mindformers/tools/debug_info.py:93] - INFO - prefill prepare time: 0.0013577938079833984 s; prefill predict time: 0.00590062141418457 s; prefill post time: 0.014495849609375 s; decode prepare time: 0.001034109093205103 s; decode predict time: 0.004498291015625 s; decode post time: 0.01309031189771081 s 2025-05-21 15:36:15,278 - mindformers./output/log[mindformers/modules/block_tables.py:126] - INFO - Clear block table cache engines. 2025-05-21 15:36:15,278 - mindformers./output/log[mindformers/modules/block_tables.py:126] - INFO - Clear block table cache engines. 2025-05-21 15:36:15,278 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:341] - INFO - infer without vllm end, not use vllm model 2025-05-21 15:36:15,278 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:355] - INFO - Generating elapsed time: 9.5685555934906 2025-05-21 15:36:15,278 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:341] - INFO - infer without vllm end, not use vllm model 2025-05-21 15:36:15,278 - mindformers./output/log[mindformers/modules/block_tables.py:126] - INFO - Clear block table cache engines. 2025-05-21 15:36:15,278 - mindformers./output/log[mindformers/modules/block_tables.py:126] - INFO - Clear block table cache engines. 2025-05-21 15:36:15,278 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:355] - INFO - Generating elapsed time: 9.568155527114868 2025-05-21 15:36:15,278 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:341] - INFO - infer without vllm end, not use vllm model 2025-05-21 15:36:15,278 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:341] - INFO - infer without vllm end, not use vllm model 2025-05-21 15:36:15,278 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:355] - INFO - Generating elapsed time: 9.568313598632812 2025-05-21 15:36:15,278 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:355] - INFO - Generating elapsed time: 9.568131685256958 2025-05-21 15:36:15,279 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:313] - INFO - input_ids shape (1, 2048) 2025-05-21 15:36:15,279 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:313] - INFO - input_ids shape (1, 2048) 2025-05-21 15:36:15,279 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:313] - INFO - input_ids shape (1, 2048) 2025-05-21 15:36:15,279 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:313] - INFO - input_ids shape (1, 2048) 2025-05-21 15:36:15,280 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:322] - INFO - max_tokens 512 2025-05-21 15:36:15,280 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:322] - INFO - max_tokens 512 2025-05-21 15:36:15,280 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:323] - INFO - min_tokens 2 2025-05-21 15:36:15,280 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:323] - INFO - min_tokens 2 2025-05-21 15:36:15,280 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:322] - INFO - max_tokens 512 2025-05-21 15:36:15,280 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:322] - INFO - max_tokens 512 2025-05-21 15:36:15,280 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:334] - INFO - infer without vllm, not use vllm model 2025-05-21 15:36:15,280 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:334] - INFO - infer without vllm, not use vllm model 2025-05-21 15:36:15,280 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:323] - INFO - min_tokens 2 2025-05-21 15:36:15,280 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:323] - INFO - min_tokens 2 2025-05-21 15:36:15,280 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:334] - INFO - infer without vllm, not use vllm model 2025-05-21 15:36:15,280 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:334] - INFO - infer without vllm, not use vllm model 2025-05-21 15:36:15,282 - mindformers./output/log[mindformers/generation/text_generator.py:801] - INFO - Generation Config is: {'max_length': 512, 'max_new_tokens': 512, 'min_length': 0, 'min_new_tokens': 2, 'num_beams': 1, 'do_sample': True, 'use_past': True, 'temperature': 1.2, 'top_k': 50, 'top_p': 1.0, 'repetition_penalty': 1.0, 'encoder_repetition_penalty': 1.0, 'renormalize_logits': False, 'return_dict_in_generate': False, 'output_scores': False, 'output_logits': False, 'pad_token_id': 151643, 'bos_token_id': 151643, 'eos_token_id': [151645, 151643], 'parallel_decoding': False, 'window_size': 5, 'level': 5, 'guess_set_size': 3, '_from_model_config': True} 2025-05-21 15:36:15,282 - mindformers./output/log[mindformers/generation/text_generator.py:801] - INFO - Generation Config is: {'max_length': 512, 'max_new_tokens': 512, 'min_length': 0, 'min_new_tokens': 2, 'num_beams': 1, 'do_sample': True, 'use_past': True, 'temperature': 1.2, 'top_k': 50, 'top_p': 1.0, 'repetition_penalty': 1.0, 'encoder_repetition_penalty': 1.0, 'renormalize_logits': False, 'return_dict_in_generate': False, 'output_scores': False, 'output_logits': False, 'pad_token_id': 151643, 'bos_token_id': 151643, 'eos_token_id': [151645, 151643], 'parallel_decoding': False, 'window_size': 5, 'level': 5, 'guess_set_size': 3, '_from_model_config': True} 2025-05-21 15:36:15,282 - mindformers./output/log[mindformers/generation/text_generator.py:801] - INFO - Generation Config is: {'max_length': 512, 'max_new_tokens': 512, 'min_length': 0, 'min_new_tokens': 2, 'num_beams': 1, 'do_sample': True, 'use_past': True, 'temperature': 1.2, 'top_k': 50, 'top_p': 1.0, 'repetition_penalty': 1.0, 'encoder_repetition_penalty': 1.0, 'renormalize_logits': False, 'return_dict_in_generate': False, 'output_scores': False, 'output_logits': False, 'pad_token_id': 151643, 'bos_token_id': 151643, 'eos_token_id': [151645, 151643], 'parallel_decoding': False, 'window_size': 5, 'level': 5, 'guess_set_size': 3, '_from_model_config': True} 2025-05-21 15:36:15,283 - mindformers./output/log[mindformers/generation/text_generator.py:801] - INFO - Generation Config is: {'max_length': 512, 'max_new_tokens': 512, 'min_length': 0, 'min_new_tokens': 2, 'num_beams': 1, 'do_sample': True, 'use_past': True, 'temperature': 1.2, 'top_k': 50, 'top_p': 1.0, 'repetition_penalty': 1.0, 'encoder_repetition_penalty': 1.0, 'renormalize_logits': False, 'return_dict_in_generate': False, 'output_scores': False, 'output_logits': False, 'pad_token_id': 151643, 'bos_token_id': 151643, 'eos_token_id': [151645, 151643], 'parallel_decoding': False, 'window_size': 5, 'level': 5, 'guess_set_size': 3, '_from_model_config': True} 2025-05-21 15:36:15,283 - mindformers./output/log[mindformers/generation/text_generator.py:859] - INFO - The generation mode will be **SAMPLE**. 2025-05-21 15:36:15,283 - mindformers./output/log[mindformers/generation/text_generator.py:859] - INFO - The generation mode will be **SAMPLE**. 2025-05-21 15:36:15,283 - mindformers./output/log[mindformers/generation/text_generator.py:859] - INFO - The generation mode will be **SAMPLE**. 2025-05-21 15:36:15,283 - mindformers./output/log[mindformers/modules/block_tables.py:63] - INFO - init cache engine success. 2025-05-21 15:36:15,283 - mindformers./output/log[mindformers/modules/block_tables.py:63] - INFO - init cache engine success. 2025-05-21 15:36:15,283 - mindformers./output/log[mindformers/generation/text_generator.py:859] - INFO - The generation mode will be **SAMPLE**. 2025-05-21 15:36:15,283 - mindformers./output/log[mindformers/modules/block_tables.py:63] - INFO - init cache engine success. 2025-05-21 15:36:15,284 - mindformers./output/log[mindformers/modules/block_tables.py:63] - INFO - init cache engine success. 2025-05-21 15:36:15,284 - mindformers./output/log[mindformers/research/qwen2_5/infer/qwen2_5.py:195] - INFO - Set dynamic input for llama. 2025-05-21 15:36:15,284 - mindformers./output/log[mindformers/research/qwen2_5/infer/qwen2_5.py:195] - INFO - Set dynamic input for llama. 2025-05-21 15:36:15,284 - mindformers./output/log[mindformers/research/qwen2_5/infer/qwen2_5.py:195] - INFO - Set dynamic input for llama. 2025-05-21 15:36:15,284 - mindformers./output/log[mindformers/research/qwen2_5/infer/qwen2_5.py:195] - INFO - Set dynamic input for llama. 2025-05-21 15:36:24,666 - mindformers./output/log[mindformers/generation/text_generator.py:1067] - INFO - total time: 9.625726222991943 s; generated tokens: 512 tokens; generate speed: 53.19079185703831 tokens/s 2025-05-21 15:36:24,666 - mindformers./output/log[mindformers/generation/text_generator.py:1067] - INFO - total time: 9.624872207641602 s; generated tokens: 512 tokens; generate speed: 53.195511478427846 tokens/s 2025-05-21 15:36:24,666 - mindformers./output/log[mindformers/generation/text_generator.py:1067] - INFO - total time: 9.626971960067749 s; generated tokens: 512 tokens; generate speed: 53.18390893042518 tokens/s 2025-05-21 15:36:24,666 - mindformers./output/log[mindformers/generation/text_generator.py:1067] - INFO - total time: 9.626075267791748 s; generated tokens: 512 tokens; generate speed: 53.18886314063223 tokens/s 2025-05-21 15:36:24,667 - mindformers./output/log[mindformers/tools/debug_info.py:93] - INFO - prefill prepare time: 0.0016558170318603516 s; prefill predict time: 0.007471561431884766 s; prefill post time: 0.01444864273071289 s; decode prepare time: 0.0010913631454139306 s; decode predict time: 0.004921505030463723 s; decode post time: 0.012731078086300605 s 2025-05-21 15:36:24,667 - mindformers./output/log[mindformers/tools/debug_info.py:93] - INFO - prefill prepare time: 0.0014803409576416016 s; prefill predict time: 0.006483554840087891 s; prefill post time: 0.014435291290283203 s; decode prepare time: 0.001059664672134907 s; decode predict time: 0.00448769073860318 s; decode post time: 0.013197687507375579 s 2025-05-21 15:36:24,667 - mindformers./output/log[mindformers/tools/debug_info.py:93] - INFO - prefill prepare time: 0.0015592575073242188 s; prefill predict time: 0.007958412170410156 s; prefill post time: 0.013797998428344727 s; decode prepare time: 0.0010241636324768663 s; decode predict time: 0.004973414364983054 s; decode post time: 0.012745851173326227 s 2025-05-21 15:36:24,667 - mindformers./output/log[mindformers/tools/debug_info.py:93] - INFO - prefill prepare time: 0.0014929771423339844 s; prefill predict time: 0.007105827331542969 s; prefill post time: 0.014859199523925781 s; decode prepare time: 0.0010795793887920342 s; decode predict time: 0.00444637233135747 s; decode post time: 0.013219591222863841 s 2025-05-21 15:36:24,668 - mindformers./output/log[mindformers/modules/block_tables.py:126] - INFO - Clear block table cache engines. 2025-05-21 15:36:24,668 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:341] - INFO - infer without vllm end, not use vllm model 2025-05-21 15:36:24,668 - mindformers./output/log[mindformers/modules/block_tables.py:126] - INFO - Clear block table cache engines. 2025-05-21 15:36:24,668 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:355] - INFO - Generating elapsed time: 9.632550477981567 2025-05-21 15:36:24,668 - mindformers./output/log[mindformers/modules/block_tables.py:126] - INFO - Clear block table cache engines. 2025-05-21 15:36:24,668 - mindformers./output/log[mindformers/modules/block_tables.py:126] - INFO - Clear block table cache engines. 2025-05-21 15:36:24,668 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:341] - INFO - infer without vllm end, not use vllm model 2025-05-21 15:36:24,668 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:341] - INFO - infer without vllm end, not use vllm model 2025-05-21 15:36:24,668 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:341] - INFO - infer without vllm end, not use vllm model 2025-05-21 15:36:24,668 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:355] - INFO - Generating elapsed time: 9.632070779800415 2025-05-21 15:36:24,668 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:355] - INFO - Generating elapsed time: 9.632965087890625 2025-05-21 15:36:24,668 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:355] - INFO - Generating elapsed time: 9.633808851242065 2025-05-21 15:36:24,669 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:313] - INFO - input_ids shape (1, 2048) 2025-05-21 15:36:24,669 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:313] - INFO - input_ids shape (1, 2048) 2025-05-21 15:36:24,670 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:313] - INFO - input_ids shape (1, 2048) 2025-05-21 15:36:24,670 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:322] - INFO - max_tokens 512 2025-05-21 15:36:24,670 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:313] - INFO - input_ids shape (1, 2048) 2025-05-21 15:36:24,670 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:323] - INFO - min_tokens 2 2025-05-21 15:36:24,670 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:322] - INFO - max_tokens 512 2025-05-21 15:36:24,670 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:334] - INFO - infer without vllm, not use vllm model 2025-05-21 15:36:24,670 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:323] - INFO - min_tokens 2 2025-05-21 15:36:24,670 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:322] - INFO - max_tokens 512 2025-05-21 15:36:24,670 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:322] - INFO - max_tokens 512 2025-05-21 15:36:24,670 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:334] - INFO - infer without vllm, not use vllm model 2025-05-21 15:36:24,670 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:323] - INFO - min_tokens 2 2025-05-21 15:36:24,670 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:323] - INFO - min_tokens 2 2025-05-21 15:36:24,671 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:334] - INFO - infer without vllm, not use vllm model 2025-05-21 15:36:24,671 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:334] - INFO - infer without vllm, not use vllm model 2025-05-21 15:36:24,672 - mindformers./output/log[mindformers/generation/text_generator.py:801] - INFO - Generation Config is: {'max_length': 512, 'max_new_tokens': 512, 'min_length': 0, 'min_new_tokens': 2, 'num_beams': 1, 'do_sample': True, 'use_past': True, 'temperature': 1.2, 'top_k': 50, 'top_p': 1.0, 'repetition_penalty': 1.0, 'encoder_repetition_penalty': 1.0, 'renormalize_logits': False, 'return_dict_in_generate': False, 'output_scores': False, 'output_logits': False, 'pad_token_id': 151643, 'bos_token_id': 151643, 'eos_token_id': [151645, 151643], 'parallel_decoding': False, 'window_size': 5, 'level': 5, 'guess_set_size': 3, '_from_model_config': True} 2025-05-21 15:36:24,672 - mindformers./output/log[mindformers/generation/text_generator.py:801] - INFO - Generation Config is: {'max_length': 512, 'max_new_tokens': 512, 'min_length': 0, 'min_new_tokens': 2, 'num_beams': 1, 'do_sample': True, 'use_past': True, 'temperature': 1.2, 'top_k': 50, 'top_p': 1.0, 'repetition_penalty': 1.0, 'encoder_repetition_penalty': 1.0, 'renormalize_logits': False, 'return_dict_in_generate': False, 'output_scores': False, 'output_logits': False, 'pad_token_id': 151643, 'bos_token_id': 151643, 'eos_token_id': [151645, 151643], 'parallel_decoding': False, 'window_size': 5, 'level': 5, 'guess_set_size': 3, '_from_model_config': True} 2025-05-21 15:36:24,673 - mindformers./output/log[mindformers/generation/text_generator.py:801] - INFO - Generation Config is: {'max_length': 512, 'max_new_tokens': 512, 'min_length': 0, 'min_new_tokens': 2, 'num_beams': 1, 'do_sample': True, 'use_past': True, 'temperature': 1.2, 'top_k': 50, 'top_p': 1.0, 'repetition_penalty': 1.0, 'encoder_repetition_penalty': 1.0, 'renormalize_logits': False, 'return_dict_in_generate': False, 'output_scores': False, 'output_logits': False, 'pad_token_id': 151643, 'bos_token_id': 151643, 'eos_token_id': [151645, 151643], 'parallel_decoding': False, 'window_size': 5, 'level': 5, 'guess_set_size': 3, '_from_model_config': True} 2025-05-21 15:36:24,673 - mindformers./output/log[mindformers/generation/text_generator.py:801] - INFO - Generation Config is: {'max_length': 512, 'max_new_tokens': 512, 'min_length': 0, 'min_new_tokens': 2, 'num_beams': 1, 'do_sample': True, 'use_past': True, 'temperature': 1.2, 'top_k': 50, 'top_p': 1.0, 'repetition_penalty': 1.0, 'encoder_repetition_penalty': 1.0, 'renormalize_logits': False, 'return_dict_in_generate': False, 'output_scores': False, 'output_logits': False, 'pad_token_id': 151643, 'bos_token_id': 151643, 'eos_token_id': [151645, 151643], 'parallel_decoding': False, 'window_size': 5, 'level': 5, 'guess_set_size': 3, '_from_model_config': True} 2025-05-21 15:36:24,673 - mindformers./output/log[mindformers/generation/text_generator.py:859] - INFO - The generation mode will be **SAMPLE**. 2025-05-21 15:36:24,673 - mindformers./output/log[mindformers/modules/block_tables.py:63] - INFO - init cache engine success. 2025-05-21 15:36:24,674 - mindformers./output/log[mindformers/generation/text_generator.py:859] - INFO - The generation mode will be **SAMPLE**. 2025-05-21 15:36:24,674 - mindformers./output/log[mindformers/generation/text_generator.py:859] - INFO - The generation mode will be **SAMPLE**. 2025-05-21 15:36:24,674 - mindformers./output/log[mindformers/generation/text_generator.py:859] - INFO - The generation mode will be **SAMPLE**. 2025-05-21 15:36:24,674 - mindformers./output/log[mindformers/modules/block_tables.py:63] - INFO - init cache engine success. 2025-05-21 15:36:24,674 - mindformers./output/log[mindformers/modules/block_tables.py:63] - INFO - init cache engine success. 2025-05-21 15:36:24,674 - mindformers./output/log[mindformers/modules/block_tables.py:63] - INFO - init cache engine success. 2025-05-21 15:36:24,674 - mindformers./output/log[mindformers/research/qwen2_5/infer/qwen2_5.py:195] - INFO - Set dynamic input for llama. 2025-05-21 15:36:24,675 - mindformers./output/log[mindformers/research/qwen2_5/infer/qwen2_5.py:195] - INFO - Set dynamic input for llama. 2025-05-21 15:36:24,675 - mindformers./output/log[mindformers/research/qwen2_5/infer/qwen2_5.py:195] - INFO - Set dynamic input for llama. 2025-05-21 15:36:24,675 - mindformers./output/log[mindformers/research/qwen2_5/infer/qwen2_5.py:195] - INFO - Set dynamic input for llama. 2025-05-21 15:36:24,789 - mindformers./output/log[mindformers/generation/text_generator.py:1067] - INFO - total time: 9.504405736923218 s; generated tokens: 512 tokens; generate speed: 53.86975410897657 tokens/s 2025-05-21 15:36:24,789 - mindformers./output/log[mindformers/generation/text_generator.py:1067] - INFO - total time: 9.504273414611816 s; generated tokens: 512 tokens; generate speed: 53.870504105327406 tokens/s 2025-05-21 15:36:24,789 - mindformers./output/log[mindformers/generation/text_generator.py:1067] - INFO - total time: 9.504350900650024 s; generated tokens: 512 tokens; generate speed: 53.87006491574118 tokens/s 2025-05-21 15:36:24,789 - mindformers./output/log[mindformers/generation/text_generator.py:1067] - INFO - total time: 9.504610300064087 s; generated tokens: 512 tokens; generate speed: 53.868594696254696 tokens/s 2025-05-21 15:36:24,789 - mindformers./output/log[mindformers/tools/debug_info.py:93] - INFO - prefill prepare time: 0.0014595985412597656 s; prefill predict time: 0.00574493408203125 s; prefill post time: 0.01359248161315918 s; decode prepare time: 0.0009809296191788466 s; decode predict time: 0.0048913221733242855 s; decode post time: 0.012642241270808091 s 2025-05-21 15:36:24,789 - mindformers./output/log[mindformers/tools/debug_info.py:93] - INFO - prefill prepare time: 0.0014078617095947266 s; prefill predict time: 0.005480766296386719 s; prefill post time: 0.013283252716064453 s; decode prepare time: 0.0009680601482055425 s; decode predict time: 0.005028481109469545 s; decode post time: 0.01251641831519319 s 2025-05-21 15:36:24,790 - mindformers./output/log[mindformers/tools/debug_info.py:93] - INFO - prefill prepare time: 0.0013432502746582031 s; prefill predict time: 0.005273103713989258 s; prefill post time: 0.014126777648925781 s; decode prepare time: 0.001008289434205296 s; decode predict time: 0.004376710629930683 s; decode post time: 0.013130231846102063 s 2025-05-21 15:36:24,790 - mindformers./output/log[mindformers/tools/debug_info.py:93] - INFO - prefill prepare time: 0.0015265941619873047 s; prefill predict time: 0.005549907684326172 s; prefill post time: 0.014309167861938477 s; decode prepare time: 0.001042778944549262 s; decode predict time: 0.004378614238664215 s; decode post time: 0.013091738210032363 s 2025-05-21 15:36:24,790 - mindformers./output/log[mindformers/modules/block_tables.py:126] - INFO - Clear block table cache engines. 2025-05-21 15:36:24,790 - mindformers./output/log[mindformers/modules/block_tables.py:126] - INFO - Clear block table cache engines. 2025-05-21 15:36:24,790 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:341] - INFO - infer without vllm end, not use vllm model 2025-05-21 15:36:24,791 - mindformers./output/log[mindformers/modules/block_tables.py:126] - INFO - Clear block table cache engines. 2025-05-21 15:36:24,791 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:341] - INFO - infer without vllm end, not use vllm model 2025-05-21 15:36:24,791 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:355] - INFO - Generating elapsed time: 9.51110053062439 2025-05-21 15:36:24,791 - mindformers./output/log[mindformers/modules/block_tables.py:126] - INFO - Clear block table cache engines. 2025-05-21 15:36:24,791 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:355] - INFO - Generating elapsed time: 9.511284589767456 2025-05-21 15:36:24,791 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:341] - INFO - infer without vllm end, not use vllm model 2025-05-21 15:36:24,791 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:341] - INFO - infer without vllm end, not use vllm model 2025-05-21 15:36:24,791 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:355] - INFO - Generating elapsed time: 9.511157751083374 2025-05-21 15:36:24,791 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:355] - INFO - Generating elapsed time: 9.51136827468872 2025-05-21 15:36:24,792 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:313] - INFO - input_ids shape (1, 2048) 2025-05-21 15:36:24,792 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:313] - INFO - input_ids shape (1, 2048) 2025-05-21 15:36:24,792 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:313] - INFO - input_ids shape (1, 2048) 2025-05-21 15:36:24,792 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:313] - INFO - input_ids shape (1, 2048) 2025-05-21 15:36:24,792 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:322] - INFO - max_tokens 512 2025-05-21 15:36:24,792 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:322] - INFO - max_tokens 512 2025-05-21 15:36:24,792 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:323] - INFO - min_tokens 2 2025-05-21 15:36:24,792 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:323] - INFO - min_tokens 2 2025-05-21 15:36:24,792 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:322] - INFO - max_tokens 512 2025-05-21 15:36:24,793 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:334] - INFO - infer without vllm, not use vllm model 2025-05-21 15:36:24,793 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:322] - INFO - max_tokens 512 2025-05-21 15:36:24,793 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:334] - INFO - infer without vllm, not use vllm model 2025-05-21 15:36:24,793 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:323] - INFO - min_tokens 2 2025-05-21 15:36:24,793 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:323] - INFO - min_tokens 2 2025-05-21 15:36:24,793 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:334] - INFO - infer without vllm, not use vllm model 2025-05-21 15:36:24,793 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:334] - INFO - infer without vllm, not use vllm model 2025-05-21 15:36:24,795 - mindformers./output/log[mindformers/generation/text_generator.py:801] - INFO - Generation Config is: {'max_length': 512, 'max_new_tokens': 512, 'min_length': 0, 'min_new_tokens': 2, 'num_beams': 1, 'do_sample': True, 'use_past': True, 'temperature': 1.2, 'top_k': 50, 'top_p': 1.0, 'repetition_penalty': 1.0, 'encoder_repetition_penalty': 1.0, 'renormalize_logits': False, 'return_dict_in_generate': False, 'output_scores': False, 'output_logits': False, 'pad_token_id': 151643, 'bos_token_id': 151643, 'eos_token_id': [151645, 151643], 'parallel_decoding': False, 'window_size': 5, 'level': 5, 'guess_set_size': 3, '_from_model_config': True} 2025-05-21 15:36:24,795 - mindformers./output/log[mindformers/generation/text_generator.py:801] - INFO - Generation Config is: {'max_length': 512, 'max_new_tokens': 512, 'min_length': 0, 'min_new_tokens': 2, 'num_beams': 1, 'do_sample': True, 'use_past': True, 'temperature': 1.2, 'top_k': 50, 'top_p': 1.0, 'repetition_penalty': 1.0, 'encoder_repetition_penalty': 1.0, 'renormalize_logits': False, 'return_dict_in_generate': False, 'output_scores': False, 'output_logits': False, 'pad_token_id': 151643, 'bos_token_id': 151643, 'eos_token_id': [151645, 151643], 'parallel_decoding': False, 'window_size': 5, 'level': 5, 'guess_set_size': 3, '_from_model_config': True} 2025-05-21 15:36:24,795 - mindformers./output/log[mindformers/generation/text_generator.py:801] - INFO - Generation Config is: {'max_length': 512, 'max_new_tokens': 512, 'min_length': 0, 'min_new_tokens': 2, 'num_beams': 1, 'do_sample': True, 'use_past': True, 'temperature': 1.2, 'top_k': 50, 'top_p': 1.0, 'repetition_penalty': 1.0, 'encoder_repetition_penalty': 1.0, 'renormalize_logits': False, 'return_dict_in_generate': False, 'output_scores': False, 'output_logits': False, 'pad_token_id': 151643, 'bos_token_id': 151643, 'eos_token_id': [151645, 151643], 'parallel_decoding': False, 'window_size': 5, 'level': 5, 'guess_set_size': 3, '_from_model_config': True} 2025-05-21 15:36:24,795 - mindformers./output/log[mindformers/generation/text_generator.py:801] - INFO - Generation Config is: {'max_length': 512, 'max_new_tokens': 512, 'min_length': 0, 'min_new_tokens': 2, 'num_beams': 1, 'do_sample': True, 'use_past': True, 'temperature': 1.2, 'top_k': 50, 'top_p': 1.0, 'repetition_penalty': 1.0, 'encoder_repetition_penalty': 1.0, 'renormalize_logits': False, 'return_dict_in_generate': False, 'output_scores': False, 'output_logits': False, 'pad_token_id': 151643, 'bos_token_id': 151643, 'eos_token_id': [151645, 151643], 'parallel_decoding': False, 'window_size': 5, 'level': 5, 'guess_set_size': 3, '_from_model_config': True} 2025-05-21 15:36:24,796 - mindformers./output/log[mindformers/generation/text_generator.py:859] - INFO - The generation mode will be **SAMPLE**. 2025-05-21 15:36:24,796 - mindformers./output/log[mindformers/generation/text_generator.py:859] - INFO - The generation mode will be **SAMPLE**. 2025-05-21 15:36:24,796 - mindformers./output/log[mindformers/modules/block_tables.py:63] - INFO - init cache engine success. 2025-05-21 15:36:24,796 - mindformers./output/log[mindformers/modules/block_tables.py:63] - INFO - init cache engine success. 2025-05-21 15:36:24,796 - mindformers./output/log[mindformers/generation/text_generator.py:859] - INFO - The generation mode will be **SAMPLE**. 2025-05-21 15:36:24,796 - mindformers./output/log[mindformers/modules/block_tables.py:63] - INFO - init cache engine success. 2025-05-21 15:36:24,796 - mindformers./output/log[mindformers/generation/text_generator.py:859] - INFO - The generation mode will be **SAMPLE**. 2025-05-21 15:36:24,796 - mindformers./output/log[mindformers/modules/block_tables.py:63] - INFO - init cache engine success. 2025-05-21 15:36:24,796 - mindformers./output/log[mindformers/research/qwen2_5/infer/qwen2_5.py:195] - INFO - Set dynamic input for llama. 2025-05-21 15:36:24,797 - mindformers./output/log[mindformers/research/qwen2_5/infer/qwen2_5.py:195] - INFO - Set dynamic input for llama. 2025-05-21 15:36:24,797 - mindformers./output/log[mindformers/research/qwen2_5/infer/qwen2_5.py:195] - INFO - Set dynamic input for llama. 2025-05-21 15:36:24,797 - mindformers./output/log[mindformers/research/qwen2_5/infer/qwen2_5.py:195] - INFO - Set dynamic input for llama. 2025-05-21 15:36:34,370 - mindformers./output/log[mindformers/generation/text_generator.py:1067] - INFO - total time: 9.57279920578003 s; generated tokens: 512 tokens; generate speed: 53.48487824656928 tokens/s 2025-05-21 15:36:34,370 - mindformers./output/log[mindformers/generation/text_generator.py:1067] - INFO - total time: 9.573092222213745 s; generated tokens: 512 tokens; generate speed: 53.48324116338678 tokens/s 2025-05-21 15:36:34,370 - mindformers./output/log[mindformers/tools/debug_info.py:93] - INFO - prefill prepare time: 0.0013315677642822266 s; prefill predict time: 0.0066509246826171875 s; prefill post time: 0.013692378997802734 s; decode prepare time: 0.0009696646912457192 s; decode predict time: 0.005098427043241613 s; decode post time: 0.012575825133202361 s 2025-05-21 15:36:34,370 - mindformers./output/log[mindformers/generation/text_generator.py:1067] - INFO - total time: 9.573236465454102 s; generated tokens: 512 tokens; generate speed: 53.482435313031154 tokens/s 2025-05-21 15:36:34,371 - mindformers./output/log[mindformers/tools/debug_info.py:93] - INFO - prefill prepare time: 0.0013763904571533203 s; prefill predict time: 0.0064394474029541016 s; prefill post time: 0.014905452728271484 s; decode prepare time: 0.0009961921408218413 s; decode predict time: 0.0048993676316504385 s; decode post time: 0.0127474851104378 s 2025-05-21 15:36:34,371 - mindformers./output/log[mindformers/generation/text_generator.py:1067] - INFO - total time: 9.573046684265137 s; generated tokens: 512 tokens; generate speed: 53.48349557738557 tokens/s 2025-05-21 15:36:34,371 - mindformers./output/log[mindformers/tools/debug_info.py:93] - INFO - prefill prepare time: 0.0013146400451660156 s; prefill predict time: 0.006216526031494141 s; prefill post time: 0.014759302139282227 s; decode prepare time: 0.0010246096758459878 s; decode predict time: 0.0044090130749870745 s; decode post time: 0.01321061436677399 s 2025-05-21 15:36:34,371 - mindformers./output/log[mindformers/modules/block_tables.py:126] - INFO - Clear block table cache engines. 2025-05-21 15:36:34,371 - mindformers./output/log[mindformers/tools/debug_info.py:93] - INFO - prefill prepare time: 0.0013494491577148438 s; prefill predict time: 0.005869388580322266 s; prefill post time: 0.01480245590209961 s; decode prepare time: 0.0010415224646402198 s; decode predict time: 0.004450631609150007 s; decode post time: 0.013152743505638406 s 2025-05-21 15:36:34,371 - mindformers./output/log[mindformers/modules/block_tables.py:126] - INFO - Clear block table cache engines. 2025-05-21 15:36:34,371 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:341] - INFO - infer without vllm end, not use vllm model 2025-05-21 15:36:34,372 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:355] - INFO - Generating elapsed time: 9.579408645629883 2025-05-21 15:36:34,372 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:341] - INFO - infer without vllm end, not use vllm model 2025-05-21 15:36:34,372 - mindformers./output/log[mindformers/modules/block_tables.py:126] - INFO - Clear block table cache engines. 2025-05-21 15:36:34,372 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:355] - INFO - Generating elapsed time: 9.57968521118164 2025-05-21 15:36:34,372 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:341] - INFO - infer without vllm end, not use vllm model 2025-05-21 15:36:34,372 - mindformers./output/log[mindformers/modules/block_tables.py:126] - INFO - Clear block table cache engines. 2025-05-21 15:36:34,372 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:355] - INFO - Generating elapsed time: 9.579818725585938 2025-05-21 15:36:34,372 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:341] - INFO - infer without vllm end, not use vllm model 2025-05-21 15:36:34,373 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:355] - INFO - Generating elapsed time: 9.579975128173828 2025-05-21 15:36:34,373 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:313] - INFO - input_ids shape (1, 2048) 2025-05-21 15:36:34,373 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:313] - INFO - input_ids shape (1, 2048) 2025-05-21 15:36:34,373 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:322] - INFO - max_tokens 512 2025-05-21 15:36:34,373 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:322] - INFO - max_tokens 512 2025-05-21 15:36:34,374 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:323] - INFO - min_tokens 2 2025-05-21 15:36:34,374 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:313] - INFO - input_ids shape (1, 2048) 2025-05-21 15:36:34,374 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:323] - INFO - min_tokens 2 2025-05-21 15:36:34,374 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:334] - INFO - infer without vllm, not use vllm model 2025-05-21 15:36:34,374 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:313] - INFO - input_ids shape (1, 2048) 2025-05-21 15:36:34,374 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:334] - INFO - infer without vllm, not use vllm model 2025-05-21 15:36:34,374 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:322] - INFO - max_tokens 512 2025-05-21 15:36:34,374 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:323] - INFO - min_tokens 2 2025-05-21 15:36:34,374 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:322] - INFO - max_tokens 512 2025-05-21 15:36:34,374 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:334] - INFO - infer without vllm, not use vllm model 2025-05-21 15:36:34,374 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:323] - INFO - min_tokens 2 2025-05-21 15:36:34,375 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:334] - INFO - infer without vllm, not use vllm model 2025-05-21 15:36:34,375 - mindformers./output/log[mindformers/generation/text_generator.py:1067] - INFO - total time: 9.700334310531616 s; generated tokens: 512 tokens; generate speed: 52.78168603365799 tokens/s 2025-05-21 15:36:34,375 - mindformers./output/log[mindformers/generation/text_generator.py:1067] - INFO - total time: 9.699952602386475 s; generated tokens: 512 tokens; generate speed: 52.783763074680685 tokens/s 2025-05-21 15:36:34,375 - mindformers./output/log[mindformers/generation/text_generator.py:1067] - INFO - total time: 9.700248718261719 s; generated tokens: 512 tokens; generate speed: 52.782151764429216 tokens/s 2025-05-21 15:36:34,375 - mindformers./output/log[mindformers/generation/text_generator.py:1067] - INFO - total time: 9.700587034225464 s; generated tokens: 512 tokens; generate speed: 52.78031094340676 tokens/s 2025-05-21 15:36:34,376 - mindformers./output/log[mindformers/tools/debug_info.py:93] - INFO - prefill prepare time: 0.0014259815216064453 s; prefill predict time: 0.007970094680786133 s; prefill post time: 0.014180898666381836 s; decode prepare time: 0.0010978247088228887 s; decode predict time: 0.0049755395627489275 s; decode post time: 0.01281554684946915 s 2025-05-21 15:36:34,376 - mindformers./output/log[mindformers/tools/debug_info.py:93] - INFO - prefill prepare time: 0.001329183578491211 s; prefill predict time: 0.007451057434082031 s; prefill post time: 0.013789892196655273 s; decode prepare time: 0.0010316395246352701 s; decode predict time: 0.0050619751799340345 s; decode post time: 0.012796038517513387 s 2025-05-21 15:36:34,376 - mindformers./output/log[mindformers/tools/debug_info.py:93] - INFO - prefill prepare time: 0.001453399658203125 s; prefill predict time: 0.007544517517089844 s; prefill post time: 0.014325618743896484 s; decode prepare time: 0.0010748096174923174 s; decode predict time: 0.004509178329916561 s; decode post time: 0.013307706250603185 s 2025-05-21 15:36:34,376 - mindformers./output/log[mindformers/generation/text_generator.py:801] - INFO - Generation Config is: {'max_length': 512, 'max_new_tokens': 512, 'min_length': 0, 'min_new_tokens': 2, 'num_beams': 1, 'do_sample': True, 'use_past': True, 'temperature': 1.2, 'top_k': 50, 'top_p': 1.0, 'repetition_penalty': 1.0, 'encoder_repetition_penalty': 1.0, 'renormalize_logits': False, 'return_dict_in_generate': False, 'output_scores': False, 'output_logits': False, 'pad_token_id': 151643, 'bos_token_id': 151643, 'eos_token_id': [151645, 151643], 'parallel_decoding': False, 'window_size': 5, 'level': 5, 'guess_set_size': 3, '_from_model_config': True} 2025-05-21 15:36:34,376 - mindformers./output/log[mindformers/generation/text_generator.py:801] - INFO - Generation Config is: {'max_length': 512, 'max_new_tokens': 512, 'min_length': 0, 'min_new_tokens': 2, 'num_beams': 1, 'do_sample': True, 'use_past': True, 'temperature': 1.2, 'top_k': 50, 'top_p': 1.0, 'repetition_penalty': 1.0, 'encoder_repetition_penalty': 1.0, 'renormalize_logits': False, 'return_dict_in_generate': False, 'output_scores': False, 'output_logits': False, 'pad_token_id': 151643, 'bos_token_id': 151643, 'eos_token_id': [151645, 151643], 'parallel_decoding': False, 'window_size': 5, 'level': 5, 'guess_set_size': 3, '_from_model_config': True} 2025-05-21 15:36:34,376 - mindformers./output/log[mindformers/tools/debug_info.py:93] - INFO - prefill prepare time: 0.001356363296508789 s; prefill predict time: 0.007914543151855469 s; prefill post time: 0.014420270919799805 s; decode prepare time: 0.001067317861866811 s; decode predict time: 0.004540652854769837 s; decode post time: 0.013283744483544635 s 2025-05-21 15:36:34,376 - mindformers./output/log[mindformers/generation/text_generator.py:859] - INFO - The generation mode will be **SAMPLE**. 2025-05-21 15:36:34,376 - mindformers./output/log[mindformers/modules/block_tables.py:126] - INFO - Clear block table cache engines. 2025-05-21 15:36:34,377 - mindformers./output/log[mindformers/generation/text_generator.py:801] - INFO - Generation Config is: {'max_length': 512, 'max_new_tokens': 512, 'min_length': 0, 'min_new_tokens': 2, 'num_beams': 1, 'do_sample': True, 'use_past': True, 'temperature': 1.2, 'top_k': 50, 'top_p': 1.0, 'repetition_penalty': 1.0, 'encoder_repetition_penalty': 1.0, 'renormalize_logits': False, 'return_dict_in_generate': False, 'output_scores': False, 'output_logits': False, 'pad_token_id': 151643, 'bos_token_id': 151643, 'eos_token_id': [151645, 151643], 'parallel_decoding': False, 'window_size': 5, 'level': 5, 'guess_set_size': 3, '_from_model_config': True} 2025-05-21 15:36:34,377 - mindformers./output/log[mindformers/modules/block_tables.py:126] - INFO - Clear block table cache engines. 2025-05-21 15:36:34,377 - mindformers./output/log[mindformers/modules/block_tables.py:126] - INFO - Clear block table cache engines. 2025-05-21 15:36:34,377 - mindformers./output/log[mindformers/generation/text_generator.py:859] - INFO - The generation mode will be **SAMPLE**. 2025-05-21 15:36:34,377 - mindformers./output/log[mindformers/modules/block_tables.py:63] - INFO - init cache engine success. 2025-05-21 15:36:34,377 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:341] - INFO - infer without vllm end, not use vllm model 2025-05-21 15:36:34,377 - mindformers./output/log[mindformers/generation/text_generator.py:801] - INFO - Generation Config is: {'max_length': 512, 'max_new_tokens': 512, 'min_length': 0, 'min_new_tokens': 2, 'num_beams': 1, 'do_sample': True, 'use_past': True, 'temperature': 1.2, 'top_k': 50, 'top_p': 1.0, 'repetition_penalty': 1.0, 'encoder_repetition_penalty': 1.0, 'renormalize_logits': False, 'return_dict_in_generate': False, 'output_scores': False, 'output_logits': False, 'pad_token_id': 151643, 'bos_token_id': 151643, 'eos_token_id': [151645, 151643], 'parallel_decoding': False, 'window_size': 5, 'level': 5, 'guess_set_size': 3, '_from_model_config': True} 2025-05-21 15:36:34,377 - mindformers./output/log[mindformers/modules/block_tables.py:63] - INFO - init cache engine success. 2025-05-21 15:36:34,377 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:341] - INFO - infer without vllm end, not use vllm model 2025-05-21 15:36:34,377 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:341] - INFO - infer without vllm end, not use vllm model 2025-05-21 15:36:34,377 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:355] - INFO - Generating elapsed time: 9.707274198532104 2025-05-21 15:36:34,377 - mindformers./output/log[mindformers/modules/block_tables.py:126] - INFO - Clear block table cache engines. 2025-05-21 15:36:34,377 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:355] - INFO - Generating elapsed time: 9.706913471221924 2025-05-21 15:36:34,377 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:355] - INFO - Generating elapsed time: 9.70697832107544 2025-05-21 15:36:34,377 - mindformers./output/log[mindformers/generation/text_generator.py:859] - INFO - The generation mode will be **SAMPLE**. 2025-05-21 15:36:34,377 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:341] - INFO - infer without vllm end, not use vllm model 2025-05-21 15:36:34,377 - mindformers./output/log[mindformers/research/qwen2_5/infer/qwen2_5.py:195] - INFO - Set dynamic input for llama. 2025-05-21 15:36:34,378 - mindformers./output/log[mindformers/modules/block_tables.py:63] - INFO - init cache engine success. 2025-05-21 15:36:34,378 - mindformers./output/log[mindformers/generation/text_generator.py:859] - INFO - The generation mode will be **SAMPLE**. 2025-05-21 15:36:34,378 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:355] - INFO - Generating elapsed time: 9.707718849182129 2025-05-21 15:36:34,378 - mindformers./output/log[mindformers/research/qwen2_5/infer/qwen2_5.py:195] - INFO - Set dynamic input for llama. 2025-05-21 15:36:34,378 - mindformers./output/log[mindformers/modules/block_tables.py:63] - INFO - init cache engine success. 2025-05-21 15:36:34,378 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:313] - INFO - input_ids shape (1, 2048) 2025-05-21 15:36:34,378 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:313] - INFO - input_ids shape (1, 2048) 2025-05-21 15:36:34,378 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:313] - INFO - input_ids shape (1, 2048) 2025-05-21 15:36:34,378 - mindformers./output/log[mindformers/research/qwen2_5/infer/qwen2_5.py:195] - INFO - Set dynamic input for llama. 2025-05-21 15:36:34,379 - mindformers./output/log[mindformers/research/qwen2_5/infer/qwen2_5.py:195] - INFO - Set dynamic input for llama. 2025-05-21 15:36:34,379 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:322] - INFO - max_tokens 512 2025-05-21 15:36:34,379 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:322] - INFO - max_tokens 512 2025-05-21 15:36:34,379 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:313] - INFO - input_ids shape (1, 2048) 2025-05-21 15:36:34,379 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:322] - INFO - max_tokens 512 2025-05-21 15:36:34,379 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:323] - INFO - min_tokens 2 2025-05-21 15:36:34,379 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:323] - INFO - min_tokens 2 2025-05-21 15:36:34,379 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:323] - INFO - min_tokens 2 2025-05-21 15:36:34,379 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:334] - INFO - infer without vllm, not use vllm model 2025-05-21 15:36:34,379 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:334] - INFO - infer without vllm, not use vllm model 2025-05-21 15:36:34,379 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:334] - INFO - infer without vllm, not use vllm model 2025-05-21 15:36:34,379 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:322] - INFO - max_tokens 512 2025-05-21 15:36:34,379 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:323] - INFO - min_tokens 2 2025-05-21 15:36:34,380 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:334] - INFO - infer without vllm, not use vllm model 2025-05-21 15:36:34,381 - mindformers./output/log[mindformers/generation/text_generator.py:801] - INFO - Generation Config is: {'max_length': 512, 'max_new_tokens': 512, 'min_length': 0, 'min_new_tokens': 2, 'num_beams': 1, 'do_sample': True, 'use_past': True, 'temperature': 1.2, 'top_k': 50, 'top_p': 1.0, 'repetition_penalty': 1.0, 'encoder_repetition_penalty': 1.0, 'renormalize_logits': False, 'return_dict_in_generate': False, 'output_scores': False, 'output_logits': False, 'pad_token_id': 151643, 'bos_token_id': 151643, 'eos_token_id': [151645, 151643], 'parallel_decoding': False, 'window_size': 5, 'level': 5, 'guess_set_size': 3, '_from_model_config': True} 2025-05-21 15:36:34,381 - mindformers./output/log[mindformers/generation/text_generator.py:801] - INFO - Generation Config is: {'max_length': 512, 'max_new_tokens': 512, 'min_length': 0, 'min_new_tokens': 2, 'num_beams': 1, 'do_sample': True, 'use_past': True, 'temperature': 1.2, 'top_k': 50, 'top_p': 1.0, 'repetition_penalty': 1.0, 'encoder_repetition_penalty': 1.0, 'renormalize_logits': False, 'return_dict_in_generate': False, 'output_scores': False, 'output_logits': False, 'pad_token_id': 151643, 'bos_token_id': 151643, 'eos_token_id': [151645, 151643], 'parallel_decoding': False, 'window_size': 5, 'level': 5, 'guess_set_size': 3, '_from_model_config': True} 2025-05-21 15:36:34,381 - mindformers./output/log[mindformers/generation/text_generator.py:801] - INFO - Generation Config is: {'max_length': 512, 'max_new_tokens': 512, 'min_length': 0, 'min_new_tokens': 2, 'num_beams': 1, 'do_sample': True, 'use_past': True, 'temperature': 1.2, 'top_k': 50, 'top_p': 1.0, 'repetition_penalty': 1.0, 'encoder_repetition_penalty': 1.0, 'renormalize_logits': False, 'return_dict_in_generate': False, 'output_scores': False, 'output_logits': False, 'pad_token_id': 151643, 'bos_token_id': 151643, 'eos_token_id': [151645, 151643], 'parallel_decoding': False, 'window_size': 5, 'level': 5, 'guess_set_size': 3, '_from_model_config': True} 2025-05-21 15:36:34,382 - mindformers./output/log[mindformers/generation/text_generator.py:801] - INFO - Generation Config is: {'max_length': 512, 'max_new_tokens': 512, 'min_length': 0, 'min_new_tokens': 2, 'num_beams': 1, 'do_sample': True, 'use_past': True, 'temperature': 1.2, 'top_k': 50, 'top_p': 1.0, 'repetition_penalty': 1.0, 'encoder_repetition_penalty': 1.0, 'renormalize_logits': False, 'return_dict_in_generate': False, 'output_scores': False, 'output_logits': False, 'pad_token_id': 151643, 'bos_token_id': 151643, 'eos_token_id': [151645, 151643], 'parallel_decoding': False, 'window_size': 5, 'level': 5, 'guess_set_size': 3, '_from_model_config': True} 2025-05-21 15:36:34,382 - mindformers./output/log[mindformers/generation/text_generator.py:859] - INFO - The generation mode will be **SAMPLE**. 2025-05-21 15:36:34,382 - mindformers./output/log[mindformers/generation/text_generator.py:859] - INFO - The generation mode will be **SAMPLE**. 2025-05-21 15:36:34,382 - mindformers./output/log[mindformers/generation/text_generator.py:859] - INFO - The generation mode will be **SAMPLE**. 2025-05-21 15:36:34,382 - mindformers./output/log[mindformers/modules/block_tables.py:63] - INFO - init cache engine success. 2025-05-21 15:36:34,382 - mindformers./output/log[mindformers/modules/block_tables.py:63] - INFO - init cache engine success. 2025-05-21 15:36:34,382 - mindformers./output/log[mindformers/modules/block_tables.py:63] - INFO - init cache engine success. 2025-05-21 15:36:34,382 - mindformers./output/log[mindformers/generation/text_generator.py:859] - INFO - The generation mode will be **SAMPLE**. 2025-05-21 15:36:34,383 - mindformers./output/log[mindformers/modules/block_tables.py:63] - INFO - init cache engine success. 2025-05-21 15:36:34,383 - mindformers./output/log[mindformers/research/qwen2_5/infer/qwen2_5.py:195] - INFO - Set dynamic input for llama. 2025-05-21 15:36:34,383 - mindformers./output/log[mindformers/research/qwen2_5/infer/qwen2_5.py:195] - INFO - Set dynamic input for llama. 2025-05-21 15:36:34,383 - mindformers./output/log[mindformers/research/qwen2_5/infer/qwen2_5.py:195] - INFO - Set dynamic input for llama. 2025-05-21 15:36:34,383 - mindformers./output/log[mindformers/research/qwen2_5/infer/qwen2_5.py:195] - INFO - Set dynamic input for llama. 2025-05-21 15:36:43,900 - mindformers./output/log[mindformers/generation/text_generator.py:1067] - INFO - total time: 9.522722721099854 s; generated tokens: 512 tokens; generate speed: 53.76613548408192 tokens/s 2025-05-21 15:36:43,901 - mindformers./output/log[mindformers/generation/text_generator.py:1067] - INFO - total time: 9.522724151611328 s; generated tokens: 512 tokens; generate speed: 53.766127407288714 tokens/s 2025-05-21 15:36:43,901 - mindformers./output/log[mindformers/generation/text_generator.py:1067] - INFO - total time: 9.522438049316406 s; generated tokens: 512 tokens; generate speed: 53.76774281422133 tokens/s 2025-05-21 15:36:43,901 - mindformers./output/log[mindformers/generation/text_generator.py:1067] - INFO - total time: 9.522231578826904 s; generated tokens: 512 tokens; generate speed: 53.76890865986228 tokens/s 2025-05-21 15:36:43,901 - mindformers./output/log[mindformers/tools/debug_info.py:93] - INFO - prefill prepare time: 0.0013434886932373047 s; prefill predict time: 0.007233381271362305 s; prefill post time: 0.0144195556640625 s; decode prepare time: 0.0009790535309076775 s; decode predict time: 0.0050347762949326455 s; decode post time: 0.01252953973534989 s 2025-05-21 15:36:43,902 - mindformers./output/log[mindformers/tools/debug_info.py:93] - INFO - prefill prepare time: 0.0013511180877685547 s; prefill predict time: 0.007125377655029297 s; prefill post time: 0.013886690139770508 s; decode prepare time: 0.000997319846703815 s; decode predict time: 0.00481557705823113 s; decode post time: 0.012729901390299638 s 2025-05-21 15:36:43,902 - mindformers./output/log[mindformers/tools/debug_info.py:93] - INFO - prefill prepare time: 0.00133514404296875 s; prefill predict time: 0.006545543670654297 s; prefill post time: 0.014506816864013672 s; decode prepare time: 0.0010235687524605172 s; decode predict time: 0.004364620470533184 s; decode post time: 0.013155774361000136 s 2025-05-21 15:36:43,902 - mindformers./output/log[mindformers/tools/debug_info.py:93] - INFO - prefill prepare time: 0.0013692378997802734 s; prefill predict time: 0.006195068359375 s; prefill post time: 0.01505732536315918 s; decode prepare time: 0.001048090406593278 s; decode predict time: 0.0043853801839491904 s; decode post time: 0.013110187422272743 s 2025-05-21 15:36:43,902 - mindformers./output/log[mindformers/modules/block_tables.py:126] - INFO - Clear block table cache engines. 2025-05-21 15:36:43,903 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:341] - INFO - infer without vllm end, not use vllm model 2025-05-21 15:36:43,902 - mindformers./output/log[mindformers/modules/block_tables.py:126] - INFO - Clear block table cache engines. 2025-05-21 15:36:43,903 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:355] - INFO - Generating elapsed time: 9.529398679733276 2025-05-21 15:36:43,903 - mindformers./output/log[mindformers/modules/block_tables.py:126] - INFO - Clear block table cache engines. 2025-05-21 15:36:43,903 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:341] - INFO - infer without vllm end, not use vllm model 2025-05-21 15:36:43,903 - mindformers./output/log[mindformers/modules/block_tables.py:126] - INFO - Clear block table cache engines. 2025-05-21 15:36:43,903 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:355] - INFO - Generating elapsed time: 9.529499530792236 2025-05-21 15:36:43,903 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:341] - INFO - infer without vllm end, not use vllm model 2025-05-21 15:36:43,903 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:341] - INFO - infer without vllm end, not use vllm model 2025-05-21 15:36:43,903 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:355] - INFO - Generating elapsed time: 9.529147386550903 2025-05-21 15:36:43,903 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:355] - INFO - Generating elapsed time: 9.529006481170654 2025-05-21 15:36:43,904 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:313] - INFO - input_ids shape (1, 2048) 2025-05-21 15:36:43,904 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:313] - INFO - input_ids shape (1, 2048) 2025-05-21 15:36:43,904 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:313] - INFO - input_ids shape (1, 2048) 2025-05-21 15:36:43,904 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:322] - INFO - max_tokens 512 2025-05-21 15:36:43,904 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:313] - INFO - input_ids shape (1, 2048) 2025-05-21 15:36:43,904 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:323] - INFO - min_tokens 2 2025-05-21 15:36:43,904 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:322] - INFO - max_tokens 512 2025-05-21 15:36:43,905 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:334] - INFO - infer without vllm, not use vllm model 2025-05-21 15:36:43,905 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:323] - INFO - min_tokens 2 2025-05-21 15:36:43,905 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:322] - INFO - max_tokens 512 2025-05-21 15:36:43,905 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:322] - INFO - max_tokens 512 2025-05-21 15:36:43,905 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:334] - INFO - infer without vllm, not use vllm model 2025-05-21 15:36:43,905 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:323] - INFO - min_tokens 2 2025-05-21 15:36:43,905 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:323] - INFO - min_tokens 2 2025-05-21 15:36:43,905 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:334] - INFO - infer without vllm, not use vllm model 2025-05-21 15:36:43,905 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:334] - INFO - infer without vllm, not use vllm model 2025-05-21 15:36:43,907 - mindformers./output/log[mindformers/generation/text_generator.py:801] - INFO - Generation Config is: {'max_length': 512, 'max_new_tokens': 512, 'min_length': 0, 'min_new_tokens': 2, 'num_beams': 1, 'do_sample': True, 'use_past': True, 'temperature': 1.2, 'top_k': 50, 'top_p': 1.0, 'repetition_penalty': 1.0, 'encoder_repetition_penalty': 1.0, 'renormalize_logits': False, 'return_dict_in_generate': False, 'output_scores': False, 'output_logits': False, 'pad_token_id': 151643, 'bos_token_id': 151643, 'eos_token_id': [151645, 151643], 'parallel_decoding': False, 'window_size': 5, 'level': 5, 'guess_set_size': 3, '_from_model_config': True} 2025-05-21 15:36:43,907 - mindformers./output/log[mindformers/generation/text_generator.py:801] - INFO - Generation Config is: {'max_length': 512, 'max_new_tokens': 512, 'min_length': 0, 'min_new_tokens': 2, 'num_beams': 1, 'do_sample': True, 'use_past': True, 'temperature': 1.2, 'top_k': 50, 'top_p': 1.0, 'repetition_penalty': 1.0, 'encoder_repetition_penalty': 1.0, 'renormalize_logits': False, 'return_dict_in_generate': False, 'output_scores': False, 'output_logits': False, 'pad_token_id': 151643, 'bos_token_id': 151643, 'eos_token_id': [151645, 151643], 'parallel_decoding': False, 'window_size': 5, 'level': 5, 'guess_set_size': 3, '_from_model_config': True} 2025-05-21 15:36:43,907 - mindformers./output/log[mindformers/generation/text_generator.py:801] - INFO - Generation Config is: {'max_length': 512, 'max_new_tokens': 512, 'min_length': 0, 'min_new_tokens': 2, 'num_beams': 1, 'do_sample': True, 'use_past': True, 'temperature': 1.2, 'top_k': 50, 'top_p': 1.0, 'repetition_penalty': 1.0, 'encoder_repetition_penalty': 1.0, 'renormalize_logits': False, 'return_dict_in_generate': False, 'output_scores': False, 'output_logits': False, 'pad_token_id': 151643, 'bos_token_id': 151643, 'eos_token_id': [151645, 151643], 'parallel_decoding': False, 'window_size': 5, 'level': 5, 'guess_set_size': 3, '_from_model_config': True} 2025-05-21 15:36:43,907 - mindformers./output/log[mindformers/generation/text_generator.py:801] - INFO - Generation Config is: {'max_length': 512, 'max_new_tokens': 512, 'min_length': 0, 'min_new_tokens': 2, 'num_beams': 1, 'do_sample': True, 'use_past': True, 'temperature': 1.2, 'top_k': 50, 'top_p': 1.0, 'repetition_penalty': 1.0, 'encoder_repetition_penalty': 1.0, 'renormalize_logits': False, 'return_dict_in_generate': False, 'output_scores': False, 'output_logits': False, 'pad_token_id': 151643, 'bos_token_id': 151643, 'eos_token_id': [151645, 151643], 'parallel_decoding': False, 'window_size': 5, 'level': 5, 'guess_set_size': 3, '_from_model_config': True} 2025-05-21 15:36:43,907 - mindformers./output/log[mindformers/generation/text_generator.py:859] - INFO - The generation mode will be **SAMPLE**. 2025-05-21 15:36:43,908 - mindformers./output/log[mindformers/modules/block_tables.py:63] - INFO - init cache engine success. 2025-05-21 15:36:43,908 - mindformers./output/log[mindformers/generation/text_generator.py:859] - INFO - The generation mode will be **SAMPLE**. 2025-05-21 15:36:43,908 - mindformers./output/log[mindformers/modules/block_tables.py:63] - INFO - init cache engine success. 2025-05-21 15:36:43,908 - mindformers./output/log[mindformers/generation/text_generator.py:859] - INFO - The generation mode will be **SAMPLE**. 2025-05-21 15:36:43,908 - mindformers./output/log[mindformers/generation/text_generator.py:859] - INFO - The generation mode will be **SAMPLE**. 2025-05-21 15:36:43,908 - mindformers./output/log[mindformers/modules/block_tables.py:63] - INFO - init cache engine success. 2025-05-21 15:36:43,908 - mindformers./output/log[mindformers/modules/block_tables.py:63] - INFO - init cache engine success. 2025-05-21 15:36:43,908 - mindformers./output/log[mindformers/research/qwen2_5/infer/qwen2_5.py:195] - INFO - Set dynamic input for llama. 2025-05-21 15:36:43,909 - mindformers./output/log[mindformers/research/qwen2_5/infer/qwen2_5.py:195] - INFO - Set dynamic input for llama. 2025-05-21 15:36:43,909 - mindformers./output/log[mindformers/research/qwen2_5/infer/qwen2_5.py:195] - INFO - Set dynamic input for llama. 2025-05-21 15:36:43,909 - mindformers./output/log[mindformers/research/qwen2_5/infer/qwen2_5.py:195] - INFO - Set dynamic input for llama. 2025-05-21 15:36:43,969 - mindformers./output/log[mindformers/generation/text_generator.py:1067] - INFO - total time: 9.58533263206482 s; generated tokens: 512 tokens; generate speed: 53.41494339875692 tokens/s 2025-05-21 15:36:43,969 - mindformers./output/log[mindformers/generation/text_generator.py:1067] - INFO - total time: 9.58529782295227 s; generated tokens: 512 tokens; generate speed: 53.41513737570066 tokens/s 2025-05-21 15:36:43,969 - mindformers./output/log[mindformers/generation/text_generator.py:1067] - INFO - total time: 9.584950923919678 s; generated tokens: 512 tokens; generate speed: 53.417070579076295 tokens/s 2025-05-21 15:36:43,969 - mindformers./output/log[mindformers/generation/text_generator.py:1067] - INFO - total time: 9.585565328598022 s; generated tokens: 512 tokens; generate speed: 53.4136467123619 tokens/s 2025-05-21 15:36:43,969 - mindformers./output/log[mindformers/tools/debug_info.py:93] - INFO - prefill prepare time: 0.0016865730285644531 s; prefill predict time: 0.00628972053527832 s; prefill post time: 0.014292240142822266 s; decode prepare time: 0.0010917410691890232 s; decode predict time: 0.004742586846445121 s; decode post time: 0.012829815571555419 s 2025-05-21 15:36:43,969 - mindformers./output/log[mindformers/tools/debug_info.py:93] - INFO - prefill prepare time: 0.001583099365234375 s; prefill predict time: 0.006018161773681641 s; prefill post time: 0.015517234802246094 s; decode prepare time: 0.0010661183038103137 s; decode predict time: 0.0043309856863582834 s; decode post time: 0.013268990059421486 s 2025-05-21 15:36:43,969 - mindformers./output/log[mindformers/tools/debug_info.py:93] - INFO - prefill prepare time: 0.0015637874603271484 s; prefill predict time: 0.0065517425537109375 s; prefill post time: 0.01394033432006836 s; decode prepare time: 0.0010283282591638733 s; decode predict time: 0.004834112466550341 s; decode post time: 0.012802313451897609 s 2025-05-21 15:36:43,970 - mindformers./output/log[mindformers/tools/debug_info.py:93] - INFO - prefill prepare time: 0.0016028881072998047 s; prefill predict time: 0.006625652313232422 s; prefill post time: 0.014597177505493164 s; decode prepare time: 0.0010616074802824195 s; decode predict time: 0.004319528037426518 s; decode post time: 0.013283788341365448 s 2025-05-21 15:36:43,970 - mindformers./output/log[mindformers/modules/block_tables.py:126] - INFO - Clear block table cache engines. 2025-05-21 15:36:43,970 - mindformers./output/log[mindformers/modules/block_tables.py:126] - INFO - Clear block table cache engines. 2025-05-21 15:36:43,970 - mindformers./output/log[mindformers/modules/block_tables.py:126] - INFO - Clear block table cache engines. 2025-05-21 15:36:43,970 - mindformers./output/log[mindformers/modules/block_tables.py:126] - INFO - Clear block table cache engines. 2025-05-21 15:36:43,971 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:341] - INFO - infer without vllm end, not use vllm model 2025-05-21 15:36:43,971 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:341] - INFO - infer without vllm end, not use vllm model 2025-05-21 15:36:43,971 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:341] - INFO - infer without vllm end, not use vllm model 2025-05-21 15:36:43,971 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:341] - INFO - infer without vllm end, not use vllm model 2025-05-21 15:36:43,971 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:355] - INFO - Generating elapsed time: 9.592235326766968 2025-05-21 15:36:43,971 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:355] - INFO - Generating elapsed time: 9.592180490493774 2025-05-21 15:36:43,971 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:355] - INFO - Generating elapsed time: 9.591745138168335 2025-05-21 15:36:43,971 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:355] - INFO - Generating elapsed time: 9.59219741821289 2025-05-21 15:36:43,972 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:313] - INFO - input_ids shape (1, 2048) 2025-05-21 15:36:43,972 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:313] - INFO - input_ids shape (1, 2048) 2025-05-21 15:36:43,972 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:313] - INFO - input_ids shape (1, 2048) 2025-05-21 15:36:43,972 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:313] - INFO - input_ids shape (1, 2048) 2025-05-21 15:36:43,972 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:322] - INFO - max_tokens 512 2025-05-21 15:36:43,972 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:322] - INFO - max_tokens 512 2025-05-21 15:36:43,972 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:322] - INFO - max_tokens 512 2025-05-21 15:36:43,973 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:323] - INFO - min_tokens 2 2025-05-21 15:36:43,973 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:323] - INFO - min_tokens 2 2025-05-21 15:36:43,973 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:322] - INFO - max_tokens 512 2025-05-21 15:36:43,973 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:323] - INFO - min_tokens 2 2025-05-21 15:36:43,973 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:334] - INFO - infer without vllm, not use vllm model 2025-05-21 15:36:43,973 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:334] - INFO - infer without vllm, not use vllm model 2025-05-21 15:36:43,973 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:334] - INFO - infer without vllm, not use vllm model 2025-05-21 15:36:43,973 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:323] - INFO - min_tokens 2 2025-05-21 15:36:43,973 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:334] - INFO - infer without vllm, not use vllm model 2025-05-21 15:36:43,975 - mindformers./output/log[mindformers/generation/text_generator.py:801] - INFO - Generation Config is: {'max_length': 512, 'max_new_tokens': 512, 'min_length': 0, 'min_new_tokens': 2, 'num_beams': 1, 'do_sample': True, 'use_past': True, 'temperature': 1.2, 'top_k': 50, 'top_p': 1.0, 'repetition_penalty': 1.0, 'encoder_repetition_penalty': 1.0, 'renormalize_logits': False, 'return_dict_in_generate': False, 'output_scores': False, 'output_logits': False, 'pad_token_id': 151643, 'bos_token_id': 151643, 'eos_token_id': [151645, 151643], 'parallel_decoding': False, 'window_size': 5, 'level': 5, 'guess_set_size': 3, '_from_model_config': True} 2025-05-21 15:36:43,975 - mindformers./output/log[mindformers/generation/text_generator.py:801] - INFO - Generation Config is: {'max_length': 512, 'max_new_tokens': 512, 'min_length': 0, 'min_new_tokens': 2, 'num_beams': 1, 'do_sample': True, 'use_past': True, 'temperature': 1.2, 'top_k': 50, 'top_p': 1.0, 'repetition_penalty': 1.0, 'encoder_repetition_penalty': 1.0, 'renormalize_logits': False, 'return_dict_in_generate': False, 'output_scores': False, 'output_logits': False, 'pad_token_id': 151643, 'bos_token_id': 151643, 'eos_token_id': [151645, 151643], 'parallel_decoding': False, 'window_size': 5, 'level': 5, 'guess_set_size': 3, '_from_model_config': True} 2025-05-21 15:36:43,975 - mindformers./output/log[mindformers/generation/text_generator.py:801] - INFO - Generation Config is: {'max_length': 512, 'max_new_tokens': 512, 'min_length': 0, 'min_new_tokens': 2, 'num_beams': 1, 'do_sample': True, 'use_past': True, 'temperature': 1.2, 'top_k': 50, 'top_p': 1.0, 'repetition_penalty': 1.0, 'encoder_repetition_penalty': 1.0, 'renormalize_logits': False, 'return_dict_in_generate': False, 'output_scores': False, 'output_logits': False, 'pad_token_id': 151643, 'bos_token_id': 151643, 'eos_token_id': [151645, 151643], 'parallel_decoding': False, 'window_size': 5, 'level': 5, 'guess_set_size': 3, '_from_model_config': True} 2025-05-21 15:36:43,975 - mindformers./output/log[mindformers/generation/text_generator.py:801] - INFO - Generation Config is: {'max_length': 512, 'max_new_tokens': 512, 'min_length': 0, 'min_new_tokens': 2, 'num_beams': 1, 'do_sample': True, 'use_past': True, 'temperature': 1.2, 'top_k': 50, 'top_p': 1.0, 'repetition_penalty': 1.0, 'encoder_repetition_penalty': 1.0, 'renormalize_logits': False, 'return_dict_in_generate': False, 'output_scores': False, 'output_logits': False, 'pad_token_id': 151643, 'bos_token_id': 151643, 'eos_token_id': [151645, 151643], 'parallel_decoding': False, 'window_size': 5, 'level': 5, 'guess_set_size': 3, '_from_model_config': True} 2025-05-21 15:36:43,976 - mindformers./output/log[mindformers/generation/text_generator.py:859] - INFO - The generation mode will be **SAMPLE**. 2025-05-21 15:36:43,976 - mindformers./output/log[mindformers/generation/text_generator.py:859] - INFO - The generation mode will be **SAMPLE**. 2025-05-21 15:36:43,976 - mindformers./output/log[mindformers/generation/text_generator.py:859] - INFO - The generation mode will be **SAMPLE**. 2025-05-21 15:36:43,976 - mindformers./output/log[mindformers/modules/block_tables.py:63] - INFO - init cache engine success. 2025-05-21 15:36:43,976 - mindformers./output/log[mindformers/modules/block_tables.py:63] - INFO - init cache engine success. 2025-05-21 15:36:43,976 - mindformers./output/log[mindformers/generation/text_generator.py:859] - INFO - The generation mode will be **SAMPLE**. 2025-05-21 15:36:43,976 - mindformers./output/log[mindformers/modules/block_tables.py:63] - INFO - init cache engine success. 2025-05-21 15:36:43,976 - mindformers./output/log[mindformers/modules/block_tables.py:63] - INFO - init cache engine success. 2025-05-21 15:36:43,977 - mindformers./output/log[mindformers/research/qwen2_5/infer/qwen2_5.py:195] - INFO - Set dynamic input for llama. 2025-05-21 15:36:43,977 - mindformers./output/log[mindformers/research/qwen2_5/infer/qwen2_5.py:195] - INFO - Set dynamic input for llama. 2025-05-21 15:36:43,977 - mindformers./output/log[mindformers/research/qwen2_5/infer/qwen2_5.py:195] - INFO - Set dynamic input for llama. 2025-05-21 15:36:43,977 - mindformers./output/log[mindformers/research/qwen2_5/infer/qwen2_5.py:195] - INFO - Set dynamic input for llama. 2025-05-21 15:36:53,552 - mindformers./output/log[mindformers/generation/text_generator.py:1067] - INFO - total time: 9.643210887908936 s; generated tokens: 512 tokens; generate speed: 53.094348547532775 tokens/s 2025-05-21 15:36:53,553 - mindformers./output/log[mindformers/generation/text_generator.py:1067] - INFO - total time: 9.643872737884521 s; generated tokens: 512 tokens; generate speed: 53.0907047319988 tokens/s 2025-05-21 15:36:53,553 - mindformers./output/log[mindformers/generation/text_generator.py:1067] - INFO - total time: 9.643573760986328 s; generated tokens: 512 tokens; generate speed: 53.09235068759753 tokens/s 2025-05-21 15:36:53,553 - mindformers./output/log[mindformers/generation/text_generator.py:1067] - INFO - total time: 9.643418788909912 s; generated tokens: 512 tokens; generate speed: 53.093203894536686 tokens/s 2025-05-21 15:36:53,553 - mindformers./output/log[mindformers/tools/debug_info.py:93] - INFO - prefill prepare time: 0.001377105712890625 s; prefill predict time: 0.005930662155151367 s; prefill post time: 0.01370549201965332 s; decode prepare time: 0.0010262650752954053 s; decode predict time: 0.004936467432508282 s; decode post time: 0.012819892972882713 s 2025-05-21 15:36:53,553 - mindformers./output/log[mindformers/tools/debug_info.py:93] - INFO - prefill prepare time: 0.0013322830200195312 s; prefill predict time: 0.005749225616455078 s; prefill post time: 0.014551401138305664 s; decode prepare time: 0.0010396286466117007 s; decode predict time: 0.004471398334877164 s; decode post time: 0.013272587800445856 s 2025-05-21 15:36:53,553 - mindformers./output/log[mindformers/tools/debug_info.py:93] - INFO - prefill prepare time: 0.0013349056243896484 s; prefill predict time: 0.0061800479888916016 s; prefill post time: 0.014221429824829102 s; decode prepare time: 0.0009932895929146187 s; decode predict time: 0.005130726683373545 s; decode post time: 0.012659466663218757 s 2025-05-21 15:36:53,553 - mindformers./output/log[mindformers/tools/debug_info.py:93] - INFO - prefill prepare time: 0.0013308525085449219 s; prefill predict time: 0.005835533142089844 s; prefill post time: 0.014677762985229492 s; decode prepare time: 0.0010646569985466228 s; decode predict time: 0.004491257667541504 s; decode post time: 0.01322785328978895 s 2025-05-21 15:36:53,554 - mindformers./output/log[mindformers/modules/block_tables.py:126] - INFO - Clear block table cache engines. 2025-05-21 15:36:53,554 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:341] - INFO - infer without vllm end, not use vllm model 2025-05-21 15:36:53,554 - mindformers./output/log[mindformers/modules/block_tables.py:126] - INFO - Clear block table cache engines. 2025-05-21 15:36:53,554 - mindformers./output/log[mindformers/modules/block_tables.py:126] - INFO - Clear block table cache engines. 2025-05-21 15:36:53,554 - mindformers./output/log[mindformers/modules/block_tables.py:126] - INFO - Clear block table cache engines. 2025-05-21 15:36:53,554 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:355] - INFO - Generating elapsed time: 9.650150299072266 2025-05-21 15:36:53,555 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:341] - INFO - infer without vllm end, not use vllm model 2025-05-21 15:36:53,555 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:341] - INFO - infer without vllm end, not use vllm model 2025-05-21 15:36:53,555 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:341] - INFO - infer without vllm end, not use vllm model 2025-05-21 15:36:53,555 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:355] - INFO - Generating elapsed time: 9.650218486785889 2025-05-21 15:36:53,555 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:355] - INFO - Generating elapsed time: 9.650154829025269 2025-05-21 15:36:53,555 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:355] - INFO - Generating elapsed time: 9.650758743286133 2025-05-21 15:36:53,556 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:313] - INFO - input_ids shape (1, 2048) 2025-05-21 15:36:53,556 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:313] - INFO - input_ids shape (1, 2048) 2025-05-21 15:36:53,556 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:313] - INFO - input_ids shape (1, 2048) 2025-05-21 15:36:53,556 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:313] - INFO - input_ids shape (1, 2048) 2025-05-21 15:36:53,556 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:322] - INFO - max_tokens 512 2025-05-21 15:36:53,556 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:322] - INFO - max_tokens 512 2025-05-21 15:36:53,556 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:323] - INFO - min_tokens 2 2025-05-21 15:36:53,556 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:322] - INFO - max_tokens 512 2025-05-21 15:36:53,556 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:322] - INFO - max_tokens 512 2025-05-21 15:36:53,556 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:323] - INFO - min_tokens 2 2025-05-21 15:36:53,556 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:334] - INFO - infer without vllm, not use vllm model 2025-05-21 15:36:53,557 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:323] - INFO - min_tokens 2 2025-05-21 15:36:53,557 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:334] - INFO - infer without vllm, not use vllm model 2025-05-21 15:36:53,557 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:323] - INFO - min_tokens 2 2025-05-21 15:36:53,557 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:334] - INFO - infer without vllm, not use vllm model 2025-05-21 15:36:53,557 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:334] - INFO - infer without vllm, not use vllm model 2025-05-21 15:36:53,559 - mindformers./output/log[mindformers/generation/text_generator.py:801] - INFO - Generation Config is: {'max_length': 512, 'max_new_tokens': 512, 'min_length': 0, 'min_new_tokens': 2, 'num_beams': 1, 'do_sample': True, 'use_past': True, 'temperature': 1.2, 'top_k': 50, 'top_p': 1.0, 'repetition_penalty': 1.0, 'encoder_repetition_penalty': 1.0, 'renormalize_logits': False, 'return_dict_in_generate': False, 'output_scores': False, 'output_logits': False, 'pad_token_id': 151643, 'bos_token_id': 151643, 'eos_token_id': [151645, 151643], 'parallel_decoding': False, 'window_size': 5, 'level': 5, 'guess_set_size': 3, '_from_model_config': True} 2025-05-21 15:36:53,559 - mindformers./output/log[mindformers/generation/text_generator.py:801] - INFO - Generation Config is: {'max_length': 512, 'max_new_tokens': 512, 'min_length': 0, 'min_new_tokens': 2, 'num_beams': 1, 'do_sample': True, 'use_past': True, 'temperature': 1.2, 'top_k': 50, 'top_p': 1.0, 'repetition_penalty': 1.0, 'encoder_repetition_penalty': 1.0, 'renormalize_logits': False, 'return_dict_in_generate': False, 'output_scores': False, 'output_logits': False, 'pad_token_id': 151643, 'bos_token_id': 151643, 'eos_token_id': [151645, 151643], 'parallel_decoding': False, 'window_size': 5, 'level': 5, 'guess_set_size': 3, '_from_model_config': True} 2025-05-21 15:36:53,559 - mindformers./output/log[mindformers/generation/text_generator.py:801] - INFO - Generation Config is: {'max_length': 512, 'max_new_tokens': 512, 'min_length': 0, 'min_new_tokens': 2, 'num_beams': 1, 'do_sample': True, 'use_past': True, 'temperature': 1.2, 'top_k': 50, 'top_p': 1.0, 'repetition_penalty': 1.0, 'encoder_repetition_penalty': 1.0, 'renormalize_logits': False, 'return_dict_in_generate': False, 'output_scores': False, 'output_logits': False, 'pad_token_id': 151643, 'bos_token_id': 151643, 'eos_token_id': [151645, 151643], 'parallel_decoding': False, 'window_size': 5, 'level': 5, 'guess_set_size': 3, '_from_model_config': True} 2025-05-21 15:36:53,559 - mindformers./output/log[mindformers/generation/text_generator.py:801] - INFO - Generation Config is: {'max_length': 512, 'max_new_tokens': 512, 'min_length': 0, 'min_new_tokens': 2, 'num_beams': 1, 'do_sample': True, 'use_past': True, 'temperature': 1.2, 'top_k': 50, 'top_p': 1.0, 'repetition_penalty': 1.0, 'encoder_repetition_penalty': 1.0, 'renormalize_logits': False, 'return_dict_in_generate': False, 'output_scores': False, 'output_logits': False, 'pad_token_id': 151643, 'bos_token_id': 151643, 'eos_token_id': [151645, 151643], 'parallel_decoding': False, 'window_size': 5, 'level': 5, 'guess_set_size': 3, '_from_model_config': True} 2025-05-21 15:36:53,559 - mindformers./output/log[mindformers/generation/text_generator.py:859] - INFO - The generation mode will be **SAMPLE**. 2025-05-21 15:36:53,560 - mindformers./output/log[mindformers/generation/text_generator.py:859] - INFO - The generation mode will be **SAMPLE**. 2025-05-21 15:36:53,560 - mindformers./output/log[mindformers/generation/text_generator.py:859] - INFO - The generation mode will be **SAMPLE**. 2025-05-21 15:36:53,560 - mindformers./output/log[mindformers/modules/block_tables.py:63] - INFO - init cache engine success. 2025-05-21 15:36:53,560 - mindformers./output/log[mindformers/modules/block_tables.py:63] - INFO - init cache engine success. 2025-05-21 15:36:53,560 - mindformers./output/log[mindformers/modules/block_tables.py:63] - INFO - init cache engine success. 2025-05-21 15:36:53,560 - mindformers./output/log[mindformers/generation/text_generator.py:859] - INFO - The generation mode will be **SAMPLE**. 2025-05-21 15:36:53,560 - mindformers./output/log[mindformers/modules/block_tables.py:63] - INFO - init cache engine success. 2025-05-21 15:36:53,560 - mindformers./output/log[mindformers/research/qwen2_5/infer/qwen2_5.py:195] - INFO - Set dynamic input for llama. 2025-05-21 15:36:53,561 - mindformers./output/log[mindformers/research/qwen2_5/infer/qwen2_5.py:195] - INFO - Set dynamic input for llama. 2025-05-21 15:36:53,561 - mindformers./output/log[mindformers/research/qwen2_5/infer/qwen2_5.py:195] - INFO - Set dynamic input for llama. 2025-05-21 15:36:53,561 - mindformers./output/log[mindformers/research/qwen2_5/infer/qwen2_5.py:195] - INFO - Set dynamic input for llama. 2025-05-21 15:36:53,594 - mindformers./output/log[mindformers/generation/text_generator.py:1067] - INFO - total time: 9.61683964729309 s; generated tokens: 512 tokens; generate speed: 53.23994355506548 tokens/s 2025-05-21 15:36:53,594 - mindformers./output/log[mindformers/generation/text_generator.py:1067] - INFO - total time: 9.616934776306152 s; generated tokens: 512 tokens; generate speed: 53.23941691498695 tokens/s 2025-05-21 15:36:53,595 - mindformers./output/log[mindformers/tools/debug_info.py:93] - INFO - prefill prepare time: 0.0013058185577392578 s; prefill predict time: 0.007100820541381836 s; prefill post time: 0.01416468620300293 s; decode prepare time: 0.0010263364608973672 s; decode predict time: 0.004899498995612649 s; decode post time: 0.012801567402371221 s 2025-05-21 15:36:53,595 - mindformers./output/log[mindformers/tools/debug_info.py:93] - INFO - prefill prepare time: 0.0015423297882080078 s; prefill predict time: 0.006680727005004883 s; prefill post time: 0.013963937759399414 s; decode prepare time: 0.00109604240163665 s; decode predict time: 0.004793571023380055 s; decode post time: 0.012836467496569609 s 2025-05-21 15:36:53,594 - mindformers./output/log[mindformers/generation/text_generator.py:1067] - INFO - total time: 9.616966962814331 s; generated tokens: 512 tokens; generate speed: 53.23923873085316 tokens/s 2025-05-21 15:36:53,595 - mindformers./output/log[mindformers/generation/text_generator.py:1067] - INFO - total time: 9.617586135864258 s; generated tokens: 512 tokens; generate speed: 53.235811228218395 tokens/s 2025-05-21 15:36:53,595 - mindformers./output/log[mindformers/tools/debug_info.py:93] - INFO - prefill prepare time: 0.001436471939086914 s; prefill predict time: 0.006751537322998047 s; prefill post time: 0.014885902404785156 s; decode prepare time: 0.0010669096108751987 s; decode predict time: 0.004395667711893717 s; decode post time: 0.013266147232802181 s 2025-05-21 15:36:53,595 - mindformers./output/log[mindformers/tools/debug_info.py:93] - INFO - prefill prepare time: 0.0014107227325439453 s; prefill predict time: 0.006403923034667969 s; prefill post time: 0.015084981918334961 s; decode prepare time: 0.0010713672451310428 s; decode predict time: 0.00436087168899237 s; decode post time: 0.013295582829156266 s 2025-05-21 15:36:53,595 - mindformers./output/log[mindformers/modules/block_tables.py:126] - INFO - Clear block table cache engines. 2025-05-21 15:36:53,595 - mindformers./output/log[mindformers/modules/block_tables.py:126] - INFO - Clear block table cache engines. 2025-05-21 15:36:53,596 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:341] - INFO - infer without vllm end, not use vllm model 2025-05-21 15:36:53,596 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:341] - INFO - infer without vllm end, not use vllm model 2025-05-21 15:36:53,596 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:355] - INFO - Generating elapsed time: 9.623602867126465 2025-05-21 15:36:53,596 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:355] - INFO - Generating elapsed time: 9.623498916625977 2025-05-21 15:36:53,596 - mindformers./output/log[mindformers/modules/block_tables.py:126] - INFO - Clear block table cache engines. 2025-05-21 15:36:53,596 - mindformers./output/log[mindformers/modules/block_tables.py:126] - INFO - Clear block table cache engines. 2025-05-21 15:36:53,596 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:341] - INFO - infer without vllm end, not use vllm model 2025-05-21 15:36:53,596 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:341] - INFO - infer without vllm end, not use vllm model 2025-05-21 15:36:53,597 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:355] - INFO - Generating elapsed time: 9.624168872833252 2025-05-21 15:36:53,597 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:355] - INFO - Generating elapsed time: 9.624033689498901 2025-05-21 15:36:53,597 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:313] - INFO - input_ids shape (1, 2048) 2025-05-21 15:36:53,597 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:313] - INFO - input_ids shape (1, 2048) 2025-05-21 15:36:53,597 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:322] - INFO - max_tokens 512 2025-05-21 15:36:53,598 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:322] - INFO - max_tokens 512 2025-05-21 15:36:53,598 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:313] - INFO - input_ids shape (1, 2048) 2025-05-21 15:36:53,598 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:323] - INFO - min_tokens 2 2025-05-21 15:36:53,598 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:323] - INFO - min_tokens 2 2025-05-21 15:36:53,598 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:334] - INFO - infer without vllm, not use vllm model 2025-05-21 15:36:53,598 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:313] - INFO - input_ids shape (1, 2048) 2025-05-21 15:36:53,598 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:334] - INFO - infer without vllm, not use vllm model 2025-05-21 15:36:53,598 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:322] - INFO - max_tokens 512 2025-05-21 15:36:53,598 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:323] - INFO - min_tokens 2 2025-05-21 15:36:53,598 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:322] - INFO - max_tokens 512 2025-05-21 15:36:53,599 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:334] - INFO - infer without vllm, not use vllm model 2025-05-21 15:36:53,599 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:323] - INFO - min_tokens 2 2025-05-21 15:36:53,599 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:334] - INFO - infer without vllm, not use vllm model 2025-05-21 15:36:53,600 - mindformers./output/log[mindformers/generation/text_generator.py:801] - INFO - Generation Config is: {'max_length': 512, 'max_new_tokens': 512, 'min_length': 0, 'min_new_tokens': 2, 'num_beams': 1, 'do_sample': True, 'use_past': True, 'temperature': 1.2, 'top_k': 50, 'top_p': 1.0, 'repetition_penalty': 1.0, 'encoder_repetition_penalty': 1.0, 'renormalize_logits': False, 'return_dict_in_generate': False, 'output_scores': False, 'output_logits': False, 'pad_token_id': 151643, 'bos_token_id': 151643, 'eos_token_id': [151645, 151643], 'parallel_decoding': False, 'window_size': 5, 'level': 5, 'guess_set_size': 3, '_from_model_config': True} 2025-05-21 15:36:53,600 - mindformers./output/log[mindformers/generation/text_generator.py:801] - INFO - Generation Config is: {'max_length': 512, 'max_new_tokens': 512, 'min_length': 0, 'min_new_tokens': 2, 'num_beams': 1, 'do_sample': True, 'use_past': True, 'temperature': 1.2, 'top_k': 50, 'top_p': 1.0, 'repetition_penalty': 1.0, 'encoder_repetition_penalty': 1.0, 'renormalize_logits': False, 'return_dict_in_generate': False, 'output_scores': False, 'output_logits': False, 'pad_token_id': 151643, 'bos_token_id': 151643, 'eos_token_id': [151645, 151643], 'parallel_decoding': False, 'window_size': 5, 'level': 5, 'guess_set_size': 3, '_from_model_config': True} 2025-05-21 15:36:53,601 - mindformers./output/log[mindformers/generation/text_generator.py:801] - INFO - Generation Config is: {'max_length': 512, 'max_new_tokens': 512, 'min_length': 0, 'min_new_tokens': 2, 'num_beams': 1, 'do_sample': True, 'use_past': True, 'temperature': 1.2, 'top_k': 50, 'top_p': 1.0, 'repetition_penalty': 1.0, 'encoder_repetition_penalty': 1.0, 'renormalize_logits': False, 'return_dict_in_generate': False, 'output_scores': False, 'output_logits': False, 'pad_token_id': 151643, 'bos_token_id': 151643, 'eos_token_id': [151645, 151643], 'parallel_decoding': False, 'window_size': 5, 'level': 5, 'guess_set_size': 3, '_from_model_config': True} 2025-05-21 15:36:53,601 - mindformers./output/log[mindformers/generation/text_generator.py:859] - INFO - The generation mode will be **SAMPLE**. 2025-05-21 15:36:53,601 - mindformers./output/log[mindformers/modules/block_tables.py:63] - INFO - init cache engine success. 2025-05-21 15:36:53,601 - mindformers./output/log[mindformers/generation/text_generator.py:859] - INFO - The generation mode will be **SAMPLE**. 2025-05-21 15:36:53,601 - mindformers./output/log[mindformers/generation/text_generator.py:801] - INFO - Generation Config is: {'max_length': 512, 'max_new_tokens': 512, 'min_length': 0, 'min_new_tokens': 2, 'num_beams': 1, 'do_sample': True, 'use_past': True, 'temperature': 1.2, 'top_k': 50, 'top_p': 1.0, 'repetition_penalty': 1.0, 'encoder_repetition_penalty': 1.0, 'renormalize_logits': False, 'return_dict_in_generate': False, 'output_scores': False, 'output_logits': False, 'pad_token_id': 151643, 'bos_token_id': 151643, 'eos_token_id': [151645, 151643], 'parallel_decoding': False, 'window_size': 5, 'level': 5, 'guess_set_size': 3, '_from_model_config': True} 2025-05-21 15:36:53,601 - mindformers./output/log[mindformers/generation/text_generator.py:859] - INFO - The generation mode will be **SAMPLE**. 2025-05-21 15:36:53,601 - mindformers./output/log[mindformers/modules/block_tables.py:63] - INFO - init cache engine success. 2025-05-21 15:36:53,602 - mindformers./output/log[mindformers/modules/block_tables.py:63] - INFO - init cache engine success. 2025-05-21 15:36:53,602 - mindformers./output/log[mindformers/research/qwen2_5/infer/qwen2_5.py:195] - INFO - Set dynamic input for llama. 2025-05-21 15:36:53,602 - mindformers./output/log[mindformers/generation/text_generator.py:859] - INFO - The generation mode will be **SAMPLE**. 2025-05-21 15:36:53,602 - mindformers./output/log[mindformers/research/qwen2_5/infer/qwen2_5.py:195] - INFO - Set dynamic input for llama. 2025-05-21 15:36:53,602 - mindformers./output/log[mindformers/modules/block_tables.py:63] - INFO - init cache engine success. 2025-05-21 15:36:53,602 - mindformers./output/log[mindformers/research/qwen2_5/infer/qwen2_5.py:195] - INFO - Set dynamic input for llama. 2025-05-21 15:36:53,603 - mindformers./output/log[mindformers/research/qwen2_5/infer/qwen2_5.py:195] - INFO - Set dynamic input for llama. 2025-05-21 15:37:03,117 - mindformers./output/log[mindformers/generation/text_generator.py:1067] - INFO - total time: 9.555298089981079 s; generated tokens: 512 tokens; generate speed: 53.58283908869805 tokens/s 2025-05-21 15:37:03,117 - mindformers./output/log[mindformers/generation/text_generator.py:1067] - INFO - total time: 9.555880308151245 s; generated tokens: 512 tokens; generate speed: 53.579574407525776 tokens/s 2025-05-21 15:37:03,117 - mindformers./output/log[mindformers/tools/debug_info.py:93] - INFO - prefill prepare time: 0.0012323856353759766 s; prefill predict time: 0.006377458572387695 s; prefill post time: 0.014149665832519531 s; decode prepare time: 0.0009786214847154113 s; decode predict time: 0.005059727967954149 s; decode post time: 0.012569306181368296 s 2025-05-21 15:37:03,117 - mindformers./output/log[mindformers/generation/text_generator.py:1067] - INFO - total time: 9.556591510772705 s; generated tokens: 512 tokens; generate speed: 53.5755870095364 tokens/s 2025-05-21 15:37:03,118 - mindformers./output/log[mindformers/tools/debug_info.py:93] - INFO - prefill prepare time: 0.001294851303100586 s; prefill predict time: 0.0066492557525634766 s; prefill post time: 0.014181137084960938 s; decode prepare time: 0.0010118479831344694 s; decode predict time: 0.004879205367144417 s; decode post time: 0.012718967262312857 s 2025-05-21 15:37:03,117 - mindformers./output/log[mindformers/generation/text_generator.py:1067] - INFO - total time: 9.556677341461182 s; generated tokens: 512 tokens; generate speed: 53.57510583502833 tokens/s 2025-05-21 15:37:03,118 - mindformers./output/log[mindformers/tools/debug_info.py:93] - INFO - prefill prepare time: 0.0014660358428955078 s; prefill predict time: 0.007179975509643555 s; prefill post time: 0.014455795288085938 s; decode prepare time: 0.0010480348844584186 s; decode predict time: 0.0044018231186212274 s; decode post time: 0.013161442751053961 s 2025-05-21 15:37:03,118 - mindformers./output/log[mindformers/modules/block_tables.py:126] - INFO - Clear block table cache engines. 2025-05-21 15:37:03,118 - mindformers./output/log[mindformers/tools/debug_info.py:93] - INFO - prefill prepare time: 0.0014257431030273438 s; prefill predict time: 0.006800651550292969 s; prefill post time: 0.014992475509643555 s; decode prepare time: 0.0010256165394344442 s; decode predict time: 0.004384726169062596 s; decode post time: 0.013201751354389227 s 2025-05-21 15:37:03,119 - mindformers./output/log[mindformers/modules/block_tables.py:126] - INFO - Clear block table cache engines. 2025-05-21 15:37:03,119 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:341] - INFO - infer without vllm end, not use vllm model 2025-05-21 15:37:03,119 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:341] - INFO - infer without vllm end, not use vllm model 2025-05-21 15:37:03,119 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:355] - INFO - Generating elapsed time: 9.562437534332275 2025-05-21 15:37:03,119 - mindformers./output/log[mindformers/modules/block_tables.py:126] - INFO - Clear block table cache engines. 2025-05-21 15:37:03,119 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:355] - INFO - Generating elapsed time: 9.56293797492981 2025-05-21 15:37:03,119 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:341] - INFO - infer without vllm end, not use vllm model 2025-05-21 15:37:03,119 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:355] - INFO - Generating elapsed time: 9.563063144683838 2025-05-21 15:37:03,119 - mindformers./output/log[mindformers/modules/block_tables.py:126] - INFO - Clear block table cache engines. 2025-05-21 15:37:03,120 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:341] - INFO - infer without vllm end, not use vllm model 2025-05-21 15:37:03,120 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:313] - INFO - input_ids shape (1, 2048) 2025-05-21 15:37:03,120 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:355] - INFO - Generating elapsed time: 9.563685178756714 2025-05-21 15:37:03,120 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:313] - INFO - input_ids shape (1, 2048) 2025-05-21 15:37:03,120 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:322] - INFO - max_tokens 512 2025-05-21 15:37:03,120 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:313] - INFO - input_ids shape (1, 2048) 2025-05-21 15:37:03,120 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:322] - INFO - max_tokens 512 2025-05-21 15:37:03,121 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:323] - INFO - min_tokens 2 2025-05-21 15:37:03,121 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:323] - INFO - min_tokens 2 2025-05-21 15:37:03,121 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:334] - INFO - infer without vllm, not use vllm model 2025-05-21 15:37:03,121 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:322] - INFO - max_tokens 512 2025-05-21 15:37:03,121 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:334] - INFO - infer without vllm, not use vllm model 2025-05-21 15:37:03,121 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:323] - INFO - min_tokens 2 2025-05-21 15:37:03,121 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:313] - INFO - input_ids shape (1, 2048) 2025-05-21 15:37:03,121 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:334] - INFO - infer without vllm, not use vllm model 2025-05-21 15:37:03,122 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:322] - INFO - max_tokens 512 2025-05-21 15:37:03,122 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:323] - INFO - min_tokens 2 2025-05-21 15:37:03,122 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:334] - INFO - infer without vllm, not use vllm model 2025-05-21 15:37:03,123 - mindformers./output/log[mindformers/generation/text_generator.py:801] - INFO - Generation Config is: {'max_length': 512, 'max_new_tokens': 512, 'min_length': 0, 'min_new_tokens': 2, 'num_beams': 1, 'do_sample': True, 'use_past': True, 'temperature': 1.2, 'top_k': 50, 'top_p': 1.0, 'repetition_penalty': 1.0, 'encoder_repetition_penalty': 1.0, 'renormalize_logits': False, 'return_dict_in_generate': False, 'output_scores': False, 'output_logits': False, 'pad_token_id': 151643, 'bos_token_id': 151643, 'eos_token_id': [151645, 151643], 'parallel_decoding': False, 'window_size': 5, 'level': 5, 'guess_set_size': 3, '_from_model_config': True} 2025-05-21 15:37:03,123 - mindformers./output/log[mindformers/generation/text_generator.py:801] - INFO - Generation Config is: {'max_length': 512, 'max_new_tokens': 512, 'min_length': 0, 'min_new_tokens': 2, 'num_beams': 1, 'do_sample': True, 'use_past': True, 'temperature': 1.2, 'top_k': 50, 'top_p': 1.0, 'repetition_penalty': 1.0, 'encoder_repetition_penalty': 1.0, 'renormalize_logits': False, 'return_dict_in_generate': False, 'output_scores': False, 'output_logits': False, 'pad_token_id': 151643, 'bos_token_id': 151643, 'eos_token_id': [151645, 151643], 'parallel_decoding': False, 'window_size': 5, 'level': 5, 'guess_set_size': 3, '_from_model_config': True} 2025-05-21 15:37:03,123 - mindformers./output/log[mindformers/generation/text_generator.py:1067] - INFO - total time: 9.521054983139038 s; generated tokens: 512 tokens; generate speed: 53.77555332961605 tokens/s 2025-05-21 15:37:03,123 - mindformers./output/log[mindformers/generation/text_generator.py:801] - INFO - Generation Config is: {'max_length': 512, 'max_new_tokens': 512, 'min_length': 0, 'min_new_tokens': 2, 'num_beams': 1, 'do_sample': True, 'use_past': True, 'temperature': 1.2, 'top_k': 50, 'top_p': 1.0, 'repetition_penalty': 1.0, 'encoder_repetition_penalty': 1.0, 'renormalize_logits': False, 'return_dict_in_generate': False, 'output_scores': False, 'output_logits': False, 'pad_token_id': 151643, 'bos_token_id': 151643, 'eos_token_id': [151645, 151643], 'parallel_decoding': False, 'window_size': 5, 'level': 5, 'guess_set_size': 3, '_from_model_config': True} 2025-05-21 15:37:03,123 - mindformers./output/log[mindformers/generation/text_generator.py:1067] - INFO - total time: 9.520607471466064 s; generated tokens: 512 tokens; generate speed: 53.77808102418888 tokens/s 2025-05-21 15:37:03,124 - mindformers./output/log[mindformers/generation/text_generator.py:859] - INFO - The generation mode will be **SAMPLE**. 2025-05-21 15:37:03,123 - mindformers./output/log[mindformers/generation/text_generator.py:1067] - INFO - total time: 9.520894289016724 s; generated tokens: 512 tokens; generate speed: 53.77646095605134 tokens/s 2025-05-21 15:37:03,124 - mindformers./output/log[mindformers/generation/text_generator.py:1067] - INFO - total time: 9.520105838775635 s; generated tokens: 512 tokens; generate speed: 53.78091469473069 tokens/s 2025-05-21 15:37:03,124 - mindformers./output/log[mindformers/generation/text_generator.py:859] - INFO - The generation mode will be **SAMPLE**. 2025-05-21 15:37:03,124 - mindformers./output/log[mindformers/modules/block_tables.py:63] - INFO - init cache engine success. 2025-05-21 15:37:03,124 - mindformers./output/log[mindformers/tools/debug_info.py:93] - INFO - prefill prepare time: 0.0013883113861083984 s; prefill predict time: 0.007485866546630859 s; prefill post time: 0.014018535614013672 s; decode prepare time: 0.0010242219540470966 s; decode predict time: 0.004732364766737994 s; decode post time: 0.012782293289841505 s 2025-05-21 15:37:03,124 - mindformers./output/log[mindformers/modules/block_tables.py:63] - INFO - init cache engine success. 2025-05-21 15:37:03,124 - mindformers./output/log[mindformers/generation/text_generator.py:859] - INFO - The generation mode will be **SAMPLE**. 2025-05-21 15:37:03,124 - mindformers./output/log[mindformers/tools/debug_info.py:93] - INFO - prefill prepare time: 0.0014984607696533203 s; prefill predict time: 0.0070874691009521484 s; prefill post time: 0.014399051666259766 s; decode prepare time: 0.0010646863926180188 s; decode predict time: 0.004797602634803922 s; decode post time: 0.012677337792055014 s 2025-05-21 15:37:03,124 - mindformers./output/log[mindformers/tools/debug_info.py:93] - INFO - prefill prepare time: 0.0013790130615234375 s; prefill predict time: 0.006949424743652344 s; prefill post time: 0.014744043350219727 s; decode prepare time: 0.0010685710757679203 s; decode predict time: 0.004231752133836933 s; decode post time: 0.013239396294968935 s 2025-05-21 15:37:03,124 - mindformers./output/log[mindformers/tools/debug_info.py:93] - INFO - prefill prepare time: 0.0015110969543457031 s; prefill predict time: 0.00609278678894043 s; prefill post time: 0.014868974685668945 s; decode prepare time: 0.0010567821868478434 s; decode predict time: 0.004339971261865952 s; decode post time: 0.013143391991781396 s 2025-05-21 15:37:03,124 - mindformers./output/log[mindformers/modules/block_tables.py:63] - INFO - init cache engine success. 2025-05-21 15:37:03,124 - mindformers./output/log[mindformers/generation/text_generator.py:801] - INFO - Generation Config is: {'max_length': 512, 'max_new_tokens': 512, 'min_length': 0, 'min_new_tokens': 2, 'num_beams': 1, 'do_sample': True, 'use_past': True, 'temperature': 1.2, 'top_k': 50, 'top_p': 1.0, 'repetition_penalty': 1.0, 'encoder_repetition_penalty': 1.0, 'renormalize_logits': False, 'return_dict_in_generate': False, 'output_scores': False, 'output_logits': False, 'pad_token_id': 151643, 'bos_token_id': 151643, 'eos_token_id': [151645, 151643], 'parallel_decoding': False, 'window_size': 5, 'level': 5, 'guess_set_size': 3, '_from_model_config': True} 2025-05-21 15:37:03,125 - mindformers./output/log[mindformers/research/qwen2_5/infer/qwen2_5.py:195] - INFO - Set dynamic input for llama. 2025-05-21 15:37:03,125 - mindformers./output/log[mindformers/modules/block_tables.py:126] - INFO - Clear block table cache engines. 2025-05-21 15:37:03,125 - mindformers./output/log[mindformers/research/qwen2_5/infer/qwen2_5.py:195] - INFO - Set dynamic input for llama. 2025-05-21 15:37:03,125 - mindformers./output/log[mindformers/research/qwen2_5/infer/qwen2_5.py:195] - INFO - Set dynamic input for llama. 2025-05-21 15:37:03,125 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:341] - INFO - infer without vllm end, not use vllm model 2025-05-21 15:37:03,125 - mindformers./output/log[mindformers/modules/block_tables.py:126] - INFO - Clear block table cache engines. 2025-05-21 15:37:03,125 - mindformers./output/log[mindformers/generation/text_generator.py:859] - INFO - The generation mode will be **SAMPLE**. 2025-05-21 15:37:03,125 - mindformers./output/log[mindformers/modules/block_tables.py:126] - INFO - Clear block table cache engines. 2025-05-21 15:37:03,125 - mindformers./output/log[mindformers/modules/block_tables.py:126] - INFO - Clear block table cache engines. 2025-05-21 15:37:03,125 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:355] - INFO - Generating elapsed time: 9.527872800827026 2025-05-21 15:37:03,125 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:341] - INFO - infer without vllm end, not use vllm model 2025-05-21 15:37:03,125 - mindformers./output/log[mindformers/modules/block_tables.py:63] - INFO - init cache engine success. 2025-05-21 15:37:03,125 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:341] - INFO - infer without vllm end, not use vllm model 2025-05-21 15:37:03,126 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:341] - INFO - infer without vllm end, not use vllm model 2025-05-21 15:37:03,126 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:355] - INFO - Generating elapsed time: 9.528011083602905 2025-05-21 15:37:03,126 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:355] - INFO - Generating elapsed time: 9.527339220046997 2025-05-21 15:37:03,126 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:355] - INFO - Generating elapsed time: 9.527639865875244 2025-05-21 15:37:03,126 - mindformers./output/log[mindformers/research/qwen2_5/infer/qwen2_5.py:195] - INFO - Set dynamic input for llama. 2025-05-21 15:37:03,126 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:313] - INFO - input_ids shape (1, 2048) 2025-05-21 15:37:03,127 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:313] - INFO - input_ids shape (1, 2048) 2025-05-21 15:37:03,127 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:313] - INFO - input_ids shape (1, 2048) 2025-05-21 15:37:03,127 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:322] - INFO - max_tokens 512 2025-05-21 15:37:03,127 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:313] - INFO - input_ids shape (1, 2048) 2025-05-21 15:37:03,127 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:323] - INFO - min_tokens 2 2025-05-21 15:37:03,127 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:322] - INFO - max_tokens 512 2025-05-21 15:37:03,127 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:322] - INFO - max_tokens 512 2025-05-21 15:37:03,127 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:334] - INFO - infer without vllm, not use vllm model 2025-05-21 15:37:03,127 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:323] - INFO - min_tokens 2 2025-05-21 15:37:03,127 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:322] - INFO - max_tokens 512 2025-05-21 15:37:03,127 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:323] - INFO - min_tokens 2 2025-05-21 15:37:03,127 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:334] - INFO - infer without vllm, not use vllm model 2025-05-21 15:37:03,128 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:323] - INFO - min_tokens 2 2025-05-21 15:37:03,128 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:334] - INFO - infer without vllm, not use vllm model 2025-05-21 15:37:03,128 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:334] - INFO - infer without vllm, not use vllm model 2025-05-21 15:37:03,129 - mindformers./output/log[mindformers/generation/text_generator.py:801] - INFO - Generation Config is: {'max_length': 512, 'max_new_tokens': 512, 'min_length': 0, 'min_new_tokens': 2, 'num_beams': 1, 'do_sample': True, 'use_past': True, 'temperature': 1.2, 'top_k': 50, 'top_p': 1.0, 'repetition_penalty': 1.0, 'encoder_repetition_penalty': 1.0, 'renormalize_logits': False, 'return_dict_in_generate': False, 'output_scores': False, 'output_logits': False, 'pad_token_id': 151643, 'bos_token_id': 151643, 'eos_token_id': [151645, 151643], 'parallel_decoding': False, 'window_size': 5, 'level': 5, 'guess_set_size': 3, '_from_model_config': True} 2025-05-21 15:37:03,130 - mindformers./output/log[mindformers/generation/text_generator.py:801] - INFO - Generation Config is: {'max_length': 512, 'max_new_tokens': 512, 'min_length': 0, 'min_new_tokens': 2, 'num_beams': 1, 'do_sample': True, 'use_past': True, 'temperature': 1.2, 'top_k': 50, 'top_p': 1.0, 'repetition_penalty': 1.0, 'encoder_repetition_penalty': 1.0, 'renormalize_logits': False, 'return_dict_in_generate': False, 'output_scores': False, 'output_logits': False, 'pad_token_id': 151643, 'bos_token_id': 151643, 'eos_token_id': [151645, 151643], 'parallel_decoding': False, 'window_size': 5, 'level': 5, 'guess_set_size': 3, '_from_model_config': True} 2025-05-21 15:37:03,130 - mindformers./output/log[mindformers/generation/text_generator.py:801] - INFO - Generation Config is: {'max_length': 512, 'max_new_tokens': 512, 'min_length': 0, 'min_new_tokens': 2, 'num_beams': 1, 'do_sample': True, 'use_past': True, 'temperature': 1.2, 'top_k': 50, 'top_p': 1.0, 'repetition_penalty': 1.0, 'encoder_repetition_penalty': 1.0, 'renormalize_logits': False, 'return_dict_in_generate': False, 'output_scores': False, 'output_logits': False, 'pad_token_id': 151643, 'bos_token_id': 151643, 'eos_token_id': [151645, 151643], 'parallel_decoding': False, 'window_size': 5, 'level': 5, 'guess_set_size': 3, '_from_model_config': True} 2025-05-21 15:37:03,130 - mindformers./output/log[mindformers/generation/text_generator.py:859] - INFO - The generation mode will be **SAMPLE**. 2025-05-21 15:37:03,130 - mindformers./output/log[mindformers/generation/text_generator.py:801] - INFO - Generation Config is: {'max_length': 512, 'max_new_tokens': 512, 'min_length': 0, 'min_new_tokens': 2, 'num_beams': 1, 'do_sample': True, 'use_past': True, 'temperature': 1.2, 'top_k': 50, 'top_p': 1.0, 'repetition_penalty': 1.0, 'encoder_repetition_penalty': 1.0, 'renormalize_logits': False, 'return_dict_in_generate': False, 'output_scores': False, 'output_logits': False, 'pad_token_id': 151643, 'bos_token_id': 151643, 'eos_token_id': [151645, 151643], 'parallel_decoding': False, 'window_size': 5, 'level': 5, 'guess_set_size': 3, '_from_model_config': True} 2025-05-21 15:37:03,130 - mindformers./output/log[mindformers/modules/block_tables.py:63] - INFO - init cache engine success. 2025-05-21 15:37:03,130 - mindformers./output/log[mindformers/generation/text_generator.py:859] - INFO - The generation mode will be **SAMPLE**. 2025-05-21 15:37:03,131 - mindformers./output/log[mindformers/generation/text_generator.py:859] - INFO - The generation mode will be **SAMPLE**. 2025-05-21 15:37:03,131 - mindformers./output/log[mindformers/modules/block_tables.py:63] - INFO - init cache engine success. 2025-05-21 15:37:03,131 - mindformers./output/log[mindformers/modules/block_tables.py:63] - INFO - init cache engine success. 2025-05-21 15:37:03,131 - mindformers./output/log[mindformers/generation/text_generator.py:859] - INFO - The generation mode will be **SAMPLE**. 2025-05-21 15:37:03,131 - mindformers./output/log[mindformers/research/qwen2_5/infer/qwen2_5.py:195] - INFO - Set dynamic input for llama. 2025-05-21 15:37:03,131 - mindformers./output/log[mindformers/modules/block_tables.py:63] - INFO - init cache engine success. 2025-05-21 15:37:03,131 - mindformers./output/log[mindformers/research/qwen2_5/infer/qwen2_5.py:195] - INFO - Set dynamic input for llama. 2025-05-21 15:37:03,132 - mindformers./output/log[mindformers/research/qwen2_5/infer/qwen2_5.py:195] - INFO - Set dynamic input for llama. 2025-05-21 15:37:03,132 - mindformers./output/log[mindformers/research/qwen2_5/infer/qwen2_5.py:195] - INFO - Set dynamic input for llama. 2025-05-21 15:37:12,662 - mindformers./output/log[mindformers/generation/text_generator.py:1067] - INFO - total time: 9.530538320541382 s; generated tokens: 512 tokens; generate speed: 53.72204410494578 tokens/s 2025-05-21 15:37:12,662 - mindformers./output/log[mindformers/generation/text_generator.py:1067] - INFO - total time: 9.530842542648315 s; generated tokens: 512 tokens; generate speed: 53.720329310752795 tokens/s 2025-05-21 15:37:12,663 - mindformers./output/log[mindformers/tools/debug_info.py:93] - INFO - prefill prepare time: 0.0016171932220458984 s; prefill predict time: 0.008452892303466797 s; prefill post time: 0.014150857925415039 s; decode prepare time: 0.0010616424732245579 s; decode predict time: 0.004807126755807914 s; decode post time: 0.012682234471091552 s 2025-05-21 15:37:12,663 - mindformers./output/log[mindformers/generation/text_generator.py:1067] - INFO - total time: 9.5305757522583 s; generated tokens: 512 tokens; generate speed: 53.721833109471895 tokens/s 2025-05-21 15:37:12,663 - mindformers./output/log[mindformers/tools/debug_info.py:93] - INFO - prefill prepare time: 0.0014712810516357422 s; prefill predict time: 0.011685848236083984 s; prefill post time: 0.014364480972290039 s; decode prepare time: 0.0010266360004820702 s; decode predict time: 0.004751884703542672 s; decode post time: 0.012771831333287308 s 2025-05-21 15:37:12,663 - mindformers./output/log[mindformers/generation/text_generator.py:1067] - INFO - total time: 9.531291723251343 s; generated tokens: 512 tokens; generate speed: 53.71779763607372 tokens/s 2025-05-21 15:37:12,663 - mindformers./output/log[mindformers/tools/debug_info.py:93] - INFO - prefill prepare time: 0.0015492439270019531 s; prefill predict time: 0.007971763610839844 s; prefill post time: 0.015491008758544922 s; decode prepare time: 0.001076776220840484 s; decode predict time: 0.004273501096987257 s; decode post time: 0.013201753687252037 s 2025-05-21 15:37:12,664 - mindformers./output/log[mindformers/tools/debug_info.py:93] - INFO - prefill prepare time: 0.0014863014221191406 s; prefill predict time: 0.009099006652832031 s; prefill post time: 0.014909982681274414 s; decode prepare time: 0.0010416297763294437 s; decode predict time: 0.004358269654068292 s; decode post time: 0.013154019581595046 s 2025-05-21 15:37:12,664 - mindformers./output/log[mindformers/modules/block_tables.py:126] - INFO - Clear block table cache engines. 2025-05-21 15:37:12,664 - mindformers./output/log[mindformers/modules/block_tables.py:126] - INFO - Clear block table cache engines. 2025-05-21 15:37:12,664 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:341] - INFO - infer without vllm end, not use vllm model 2025-05-21 15:37:12,664 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:341] - INFO - infer without vllm end, not use vllm model 2025-05-21 15:37:12,664 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:355] - INFO - Generating elapsed time: 9.537259578704834 2025-05-21 15:37:12,664 - mindformers./output/log[mindformers/modules/block_tables.py:126] - INFO - Clear block table cache engines. 2025-05-21 15:37:12,664 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:355] - INFO - Generating elapsed time: 9.537678241729736 2025-05-21 15:37:12,665 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:341] - INFO - infer without vllm end, not use vllm model 2025-05-21 15:37:12,665 - mindformers./output/log[mindformers/modules/block_tables.py:126] - INFO - Clear block table cache engines. 2025-05-21 15:37:12,665 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:355] - INFO - Generating elapsed time: 9.537538528442383 2025-05-21 15:37:12,665 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:341] - INFO - infer without vllm end, not use vllm model 2025-05-21 15:37:12,665 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:355] - INFO - Generating elapsed time: 9.538117408752441 2025-05-21 15:37:12,665 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:313] - INFO - input_ids shape (1, 2048) 2025-05-21 15:37:12,666 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:313] - INFO - input_ids shape (1, 2048) 2025-05-21 15:37:12,666 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:322] - INFO - max_tokens 512 2025-05-21 15:37:12,666 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:313] - INFO - input_ids shape (1, 2048) 2025-05-21 15:37:12,666 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:323] - INFO - min_tokens 2 2025-05-21 15:37:12,666 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:322] - INFO - max_tokens 512 2025-05-21 15:37:12,666 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:334] - INFO - infer without vllm, not use vllm model 2025-05-21 15:37:12,666 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:313] - INFO - input_ids shape (1, 2048) 2025-05-21 15:37:12,667 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:323] - INFO - min_tokens 2 2025-05-21 15:37:12,667 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:322] - INFO - max_tokens 512 2025-05-21 15:37:12,667 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:334] - INFO - infer without vllm, not use vllm model 2025-05-21 15:37:12,667 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:323] - INFO - min_tokens 2 2025-05-21 15:37:12,667 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:322] - INFO - max_tokens 512 2025-05-21 15:37:12,667 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:334] - INFO - infer without vllm, not use vllm model 2025-05-21 15:37:12,667 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:323] - INFO - min_tokens 2 2025-05-21 15:37:12,667 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:334] - INFO - infer without vllm, not use vllm model 2025-05-21 15:37:12,668 - mindformers./output/log[mindformers/generation/text_generator.py:801] - INFO - Generation Config is: {'max_length': 512, 'max_new_tokens': 512, 'min_length': 0, 'min_new_tokens': 2, 'num_beams': 1, 'do_sample': True, 'use_past': True, 'temperature': 1.2, 'top_k': 50, 'top_p': 1.0, 'repetition_penalty': 1.0, 'encoder_repetition_penalty': 1.0, 'renormalize_logits': False, 'return_dict_in_generate': False, 'output_scores': False, 'output_logits': False, 'pad_token_id': 151643, 'bos_token_id': 151643, 'eos_token_id': [151645, 151643], 'parallel_decoding': False, 'window_size': 5, 'level': 5, 'guess_set_size': 3, '_from_model_config': True} 2025-05-21 15:37:12,669 - mindformers./output/log[mindformers/generation/text_generator.py:801] - INFO - Generation Config is: {'max_length': 512, 'max_new_tokens': 512, 'min_length': 0, 'min_new_tokens': 2, 'num_beams': 1, 'do_sample': True, 'use_past': True, 'temperature': 1.2, 'top_k': 50, 'top_p': 1.0, 'repetition_penalty': 1.0, 'encoder_repetition_penalty': 1.0, 'renormalize_logits': False, 'return_dict_in_generate': False, 'output_scores': False, 'output_logits': False, 'pad_token_id': 151643, 'bos_token_id': 151643, 'eos_token_id': [151645, 151643], 'parallel_decoding': False, 'window_size': 5, 'level': 5, 'guess_set_size': 3, '_from_model_config': True} 2025-05-21 15:37:12,669 - mindformers./output/log[mindformers/generation/text_generator.py:859] - INFO - The generation mode will be **SAMPLE**. 2025-05-21 15:37:12,669 - mindformers./output/log[mindformers/generation/text_generator.py:801] - INFO - Generation Config is: {'max_length': 512, 'max_new_tokens': 512, 'min_length': 0, 'min_new_tokens': 2, 'num_beams': 1, 'do_sample': True, 'use_past': True, 'temperature': 1.2, 'top_k': 50, 'top_p': 1.0, 'repetition_penalty': 1.0, 'encoder_repetition_penalty': 1.0, 'renormalize_logits': False, 'return_dict_in_generate': False, 'output_scores': False, 'output_logits': False, 'pad_token_id': 151643, 'bos_token_id': 151643, 'eos_token_id': [151645, 151643], 'parallel_decoding': False, 'window_size': 5, 'level': 5, 'guess_set_size': 3, '_from_model_config': True} 2025-05-21 15:37:12,670 - mindformers./output/log[mindformers/modules/block_tables.py:63] - INFO - init cache engine success. 2025-05-21 15:37:12,670 - mindformers./output/log[mindformers/generation/text_generator.py:801] - INFO - Generation Config is: {'max_length': 512, 'max_new_tokens': 512, 'min_length': 0, 'min_new_tokens': 2, 'num_beams': 1, 'do_sample': True, 'use_past': True, 'temperature': 1.2, 'top_k': 50, 'top_p': 1.0, 'repetition_penalty': 1.0, 'encoder_repetition_penalty': 1.0, 'renormalize_logits': False, 'return_dict_in_generate': False, 'output_scores': False, 'output_logits': False, 'pad_token_id': 151643, 'bos_token_id': 151643, 'eos_token_id': [151645, 151643], 'parallel_decoding': False, 'window_size': 5, 'level': 5, 'guess_set_size': 3, '_from_model_config': True} 2025-05-21 15:37:12,670 - mindformers./output/log[mindformers/generation/text_generator.py:859] - INFO - The generation mode will be **SAMPLE**. 2025-05-21 15:37:12,670 - mindformers./output/log[mindformers/generation/text_generator.py:859] - INFO - The generation mode will be **SAMPLE**. 2025-05-21 15:37:12,670 - mindformers./output/log[mindformers/modules/block_tables.py:63] - INFO - init cache engine success. 2025-05-21 15:37:12,670 - mindformers./output/log[mindformers/modules/block_tables.py:63] - INFO - init cache engine success. 2025-05-21 15:37:12,670 - mindformers./output/log[mindformers/generation/text_generator.py:859] - INFO - The generation mode will be **SAMPLE**. 2025-05-21 15:37:12,670 - mindformers./output/log[mindformers/research/qwen2_5/infer/qwen2_5.py:195] - INFO - Set dynamic input for llama. 2025-05-21 15:37:12,671 - mindformers./output/log[mindformers/modules/block_tables.py:63] - INFO - init cache engine success. 2025-05-21 15:37:12,671 - mindformers./output/log[mindformers/research/qwen2_5/infer/qwen2_5.py:195] - INFO - Set dynamic input for llama. 2025-05-21 15:37:12,671 - mindformers./output/log[mindformers/research/qwen2_5/infer/qwen2_5.py:195] - INFO - Set dynamic input for llama. 2025-05-21 15:37:12,671 - mindformers./output/log[mindformers/research/qwen2_5/infer/qwen2_5.py:195] - INFO - Set dynamic input for llama. 2025-05-21 15:37:12,680 - mindformers./output/log[mindformers/generation/text_generator.py:1067] - INFO - total time: 9.555464744567871 s; generated tokens: 512 tokens; generate speed: 53.58190456315208 tokens/s 2025-05-21 15:37:12,681 - mindformers./output/log[mindformers/generation/text_generator.py:1067] - INFO - total time: 9.555612802505493 s; generated tokens: 512 tokens; generate speed: 53.58107434677062 tokens/s 2025-05-21 15:37:12,681 - mindformers./output/log[mindformers/tools/debug_info.py:93] - INFO - prefill prepare time: 0.0013561248779296875 s; prefill predict time: 0.006479740142822266 s; prefill post time: 0.014084100723266602 s; decode prepare time: 0.0009832125587239423 s; decode predict time: 0.005036263372383866 s; decode post time: 0.012588683873006743 s 2025-05-21 15:37:12,681 - mindformers./output/log[mindformers/generation/text_generator.py:1067] - INFO - total time: 9.55466890335083 s; generated tokens: 512 tokens; generate speed: 53.58636758417041 tokens/s 2025-05-21 15:37:12,681 - mindformers./output/log[mindformers/generation/text_generator.py:1067] - INFO - total time: 9.555925130844116 s; generated tokens: 512 tokens; generate speed: 53.57932308902182 tokens/s 2025-05-21 15:37:12,682 - mindformers./output/log[mindformers/tools/debug_info.py:93] - INFO - prefill prepare time: 0.0014042854309082031 s; prefill predict time: 0.006237506866455078 s; prefill post time: 0.013605594635009766 s; decode prepare time: 0.0010145270427845696 s; decode predict time: 0.0048560465083402745 s; decode post time: 0.0127408406505846 s 2025-05-21 15:37:12,682 - mindformers./output/log[mindformers/tools/debug_info.py:93] - INFO - prefill prepare time: 0.0013926029205322266 s; prefill predict time: 0.004990816116333008 s; prefill post time: 0.014477968215942383 s; decode prepare time: 0.0010243950524675403 s; decode predict time: 0.004373473747103822 s; decode post time: 0.013212883775714793 s 2025-05-21 15:37:12,682 - mindformers./output/log[mindformers/tools/debug_info.py:93] - INFO - prefill prepare time: 0.0013570785522460938 s; prefill predict time: 0.006186485290527344 s; prefill post time: 0.014796018600463867 s; decode prepare time: 0.0010412625837232735 s; decode predict time: 0.004412444900063908 s; decode post time: 0.013157105492519073 s 2025-05-21 15:37:12,682 - mindformers./output/log[mindformers/modules/block_tables.py:126] - INFO - Clear block table cache engines. 2025-05-21 15:37:12,682 - mindformers./output/log[mindformers/modules/block_tables.py:126] - INFO - Clear block table cache engines. 2025-05-21 15:37:12,682 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:341] - INFO - infer without vllm end, not use vllm model 2025-05-21 15:37:12,683 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:355] - INFO - Generating elapsed time: 9.5623037815094 2025-05-21 15:37:12,683 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:341] - INFO - infer without vllm end, not use vllm model 2025-05-21 15:37:12,683 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:355] - INFO - Generating elapsed time: 9.562401533126831 2025-05-21 15:37:12,683 - mindformers./output/log[mindformers/modules/block_tables.py:126] - INFO - Clear block table cache engines. 2025-05-21 15:37:12,683 - mindformers./output/log[mindformers/modules/block_tables.py:126] - INFO - Clear block table cache engines. 2025-05-21 15:37:12,683 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:341] - INFO - infer without vllm end, not use vllm model 2025-05-21 15:37:12,683 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:341] - INFO - infer without vllm end, not use vllm model 2025-05-21 15:37:12,683 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:355] - INFO - Generating elapsed time: 9.562517166137695 2025-05-21 15:37:12,683 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:355] - INFO - Generating elapsed time: 9.561780214309692 2025-05-21 15:37:12,684 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:313] - INFO - input_ids shape (1, 2048) 2025-05-21 15:37:12,684 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:313] - INFO - input_ids shape (1, 2048) 2025-05-21 15:37:12,684 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:322] - INFO - max_tokens 512 2025-05-21 15:37:12,684 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:313] - INFO - input_ids shape (1, 2048) 2025-05-21 15:37:12,684 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:313] - INFO - input_ids shape (1, 2048) 2025-05-21 15:37:12,684 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:322] - INFO - max_tokens 512 2025-05-21 15:37:12,684 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:323] - INFO - min_tokens 2 2025-05-21 15:37:12,685 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:323] - INFO - min_tokens 2 2025-05-21 15:37:12,685 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:334] - INFO - infer without vllm, not use vllm model 2025-05-21 15:37:12,685 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:322] - INFO - max_tokens 512 2025-05-21 15:37:12,685 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:334] - INFO - infer without vllm, not use vllm model 2025-05-21 15:37:12,685 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:322] - INFO - max_tokens 512 2025-05-21 15:37:12,685 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:323] - INFO - min_tokens 2 2025-05-21 15:37:12,685 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:323] - INFO - min_tokens 2 2025-05-21 15:37:12,685 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:334] - INFO - infer without vllm, not use vllm model 2025-05-21 15:37:12,685 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:334] - INFO - infer without vllm, not use vllm model 2025-05-21 15:37:12,687 - mindformers./output/log[mindformers/generation/text_generator.py:801] - INFO - Generation Config is: {'max_length': 512, 'max_new_tokens': 512, 'min_length': 0, 'min_new_tokens': 2, 'num_beams': 1, 'do_sample': True, 'use_past': True, 'temperature': 1.2, 'top_k': 50, 'top_p': 1.0, 'repetition_penalty': 1.0, 'encoder_repetition_penalty': 1.0, 'renormalize_logits': False, 'return_dict_in_generate': False, 'output_scores': False, 'output_logits': False, 'pad_token_id': 151643, 'bos_token_id': 151643, 'eos_token_id': [151645, 151643], 'parallel_decoding': False, 'window_size': 5, 'level': 5, 'guess_set_size': 3, '_from_model_config': True} 2025-05-21 15:37:12,687 - mindformers./output/log[mindformers/generation/text_generator.py:801] - INFO - Generation Config is: {'max_length': 512, 'max_new_tokens': 512, 'min_length': 0, 'min_new_tokens': 2, 'num_beams': 1, 'do_sample': True, 'use_past': True, 'temperature': 1.2, 'top_k': 50, 'top_p': 1.0, 'repetition_penalty': 1.0, 'encoder_repetition_penalty': 1.0, 'renormalize_logits': False, 'return_dict_in_generate': False, 'output_scores': False, 'output_logits': False, 'pad_token_id': 151643, 'bos_token_id': 151643, 'eos_token_id': [151645, 151643], 'parallel_decoding': False, 'window_size': 5, 'level': 5, 'guess_set_size': 3, '_from_model_config': True} 2025-05-21 15:37:12,687 - mindformers./output/log[mindformers/generation/text_generator.py:801] - INFO - Generation Config is: {'max_length': 512, 'max_new_tokens': 512, 'min_length': 0, 'min_new_tokens': 2, 'num_beams': 1, 'do_sample': True, 'use_past': True, 'temperature': 1.2, 'top_k': 50, 'top_p': 1.0, 'repetition_penalty': 1.0, 'encoder_repetition_penalty': 1.0, 'renormalize_logits': False, 'return_dict_in_generate': False, 'output_scores': False, 'output_logits': False, 'pad_token_id': 151643, 'bos_token_id': 151643, 'eos_token_id': [151645, 151643], 'parallel_decoding': False, 'window_size': 5, 'level': 5, 'guess_set_size': 3, '_from_model_config': True} 2025-05-21 15:37:12,688 - mindformers./output/log[mindformers/generation/text_generator.py:801] - INFO - Generation Config is: {'max_length': 512, 'max_new_tokens': 512, 'min_length': 0, 'min_new_tokens': 2, 'num_beams': 1, 'do_sample': True, 'use_past': True, 'temperature': 1.2, 'top_k': 50, 'top_p': 1.0, 'repetition_penalty': 1.0, 'encoder_repetition_penalty': 1.0, 'renormalize_logits': False, 'return_dict_in_generate': False, 'output_scores': False, 'output_logits': False, 'pad_token_id': 151643, 'bos_token_id': 151643, 'eos_token_id': [151645, 151643], 'parallel_decoding': False, 'window_size': 5, 'level': 5, 'guess_set_size': 3, '_from_model_config': True} 2025-05-21 15:37:12,688 - mindformers./output/log[mindformers/generation/text_generator.py:859] - INFO - The generation mode will be **SAMPLE**. 2025-05-21 15:37:12,688 - mindformers./output/log[mindformers/generation/text_generator.py:859] - INFO - The generation mode will be **SAMPLE**. 2025-05-21 15:37:12,688 - mindformers./output/log[mindformers/modules/block_tables.py:63] - INFO - init cache engine success. 2025-05-21 15:37:12,688 - mindformers./output/log[mindformers/modules/block_tables.py:63] - INFO - init cache engine success. 2025-05-21 15:37:12,688 - mindformers./output/log[mindformers/generation/text_generator.py:859] - INFO - The generation mode will be **SAMPLE**. 2025-05-21 15:37:12,688 - mindformers./output/log[mindformers/generation/text_generator.py:859] - INFO - The generation mode will be **SAMPLE**. 2025-05-21 15:37:12,688 - mindformers./output/log[mindformers/modules/block_tables.py:63] - INFO - init cache engine success. 2025-05-21 15:37:12,688 - mindformers./output/log[mindformers/modules/block_tables.py:63] - INFO - init cache engine success. 2025-05-21 15:37:12,689 - mindformers./output/log[mindformers/research/qwen2_5/infer/qwen2_5.py:195] - INFO - Set dynamic input for llama. 2025-05-21 15:37:12,689 - mindformers./output/log[mindformers/research/qwen2_5/infer/qwen2_5.py:195] - INFO - Set dynamic input for llama. 2025-05-21 15:37:12,689 - mindformers./output/log[mindformers/research/qwen2_5/infer/qwen2_5.py:195] - INFO - Set dynamic input for llama. 2025-05-21 15:37:12,689 - mindformers./output/log[mindformers/research/qwen2_5/infer/qwen2_5.py:195] - INFO - Set dynamic input for llama. 2025-05-21 15:37:22,163 - mindformers./output/log[mindformers/generation/text_generator.py:1067] - INFO - total time: 9.492473840713501 s; generated tokens: 512 tokens; generate speed: 53.93746757605134 tokens/s 2025-05-21 15:37:22,164 - mindformers./output/log[mindformers/generation/text_generator.py:1067] - INFO - total time: 9.492448568344116 s; generated tokens: 512 tokens; generate speed: 53.93761117731444 tokens/s 2025-05-21 15:37:22,164 - mindformers./output/log[mindformers/generation/text_generator.py:1067] - INFO - total time: 9.49226188659668 s; generated tokens: 512 tokens; generate speed: 53.93867195372657 tokens/s 2025-05-21 15:37:22,164 - mindformers./output/log[mindformers/tools/debug_info.py:93] - INFO - prefill prepare time: 0.0014944076538085938 s; prefill predict time: 0.007067203521728516 s; prefill post time: 0.01414942741394043 s; decode prepare time: 0.001065936807083757 s; decode predict time: 0.004729165282903933 s; decode post time: 0.012688316711007732 s 2025-05-21 15:37:22,164 - mindformers./output/log[mindformers/generation/text_generator.py:1067] - INFO - total time: 9.492612838745117 s; generated tokens: 512 tokens; generate speed: 53.936677782772 tokens/s 2025-05-21 15:37:22,164 - mindformers./output/log[mindformers/tools/debug_info.py:93] - INFO - prefill prepare time: 0.0014941692352294922 s; prefill predict time: 0.0064051151275634766 s; prefill post time: 0.013730049133300781 s; decode prepare time: 0.0010303737132749912 s; decode predict time: 0.004712793406318216 s; decode post time: 0.012741380008466089 s 2025-05-21 15:37:22,164 - mindformers./output/log[mindformers/tools/debug_info.py:93] - INFO - prefill prepare time: 0.0015032291412353516 s; prefill predict time: 0.006112098693847656 s; prefill post time: 0.015128850936889648 s; decode prepare time: 0.0010503192237212 s; decode predict time: 0.004329235880982642 s; decode post time: 0.013106201959449485 s 2025-05-21 15:37:22,165 - mindformers./output/log[mindformers/tools/debug_info.py:93] - INFO - prefill prepare time: 0.0014870166778564453 s; prefill predict time: 0.006551504135131836 s; prefill post time: 0.014162302017211914 s; decode prepare time: 0.001069543879559362 s; decode predict time: 0.004249839689217362 s; decode post time: 0.013166168197960303 s 2025-05-21 15:37:22,165 - mindformers./output/log[mindformers/modules/block_tables.py:126] - INFO - Clear block table cache engines. 2025-05-21 15:37:22,165 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:341] - INFO - infer without vllm end, not use vllm model 2025-05-21 15:37:22,165 - mindformers./output/log[mindformers/modules/block_tables.py:126] - INFO - Clear block table cache engines. 2025-05-21 15:37:22,165 - mindformers./output/log[mindformers/modules/block_tables.py:126] - INFO - Clear block table cache engines. 2025-05-21 15:37:22,165 - mindformers./output/log[mindformers/modules/block_tables.py:126] - INFO - Clear block table cache engines. 2025-05-21 15:37:22,166 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:355] - INFO - Generating elapsed time: 9.499562740325928 2025-05-21 15:37:22,166 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:341] - INFO - infer without vllm end, not use vllm model 2025-05-21 15:37:22,166 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:341] - INFO - infer without vllm end, not use vllm model 2025-05-21 15:37:22,166 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:341] - INFO - infer without vllm end, not use vllm model 2025-05-21 15:37:22,166 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:355] - INFO - Generating elapsed time: 9.499578952789307 2025-05-21 15:37:22,166 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:355] - INFO - Generating elapsed time: 9.49898099899292 2025-05-21 15:37:22,166 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:355] - INFO - Generating elapsed time: 9.499315738677979 2025-05-21 15:37:22,167 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:313] - INFO - input_ids shape (1, 2048) 2025-05-21 15:37:22,167 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:313] - INFO - input_ids shape (1, 2048) 2025-05-21 15:37:22,167 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:313] - INFO - input_ids shape (1, 2048) 2025-05-21 15:37:22,167 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:313] - INFO - input_ids shape (1, 2048) 2025-05-21 15:37:22,167 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:322] - INFO - max_tokens 512 2025-05-21 15:37:22,167 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:323] - INFO - min_tokens 2 2025-05-21 15:37:22,167 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:322] - INFO - max_tokens 512 2025-05-21 15:37:22,167 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:322] - INFO - max_tokens 512 2025-05-21 15:37:22,167 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:322] - INFO - max_tokens 512 2025-05-21 15:37:22,168 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:334] - INFO - infer without vllm, not use vllm model 2025-05-21 15:37:22,168 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:323] - INFO - min_tokens 2 2025-05-21 15:37:22,168 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:323] - INFO - min_tokens 2 2025-05-21 15:37:22,168 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:323] - INFO - min_tokens 2 2025-05-21 15:37:22,168 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:334] - INFO - infer without vllm, not use vllm model 2025-05-21 15:37:22,168 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:334] - INFO - infer without vllm, not use vllm model 2025-05-21 15:37:22,168 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:334] - INFO - infer without vllm, not use vllm model 2025-05-21 15:37:22,170 - mindformers./output/log[mindformers/generation/text_generator.py:801] - INFO - Generation Config is: {'max_length': 512, 'max_new_tokens': 512, 'min_length': 0, 'min_new_tokens': 2, 'num_beams': 1, 'do_sample': True, 'use_past': True, 'temperature': 1.2, 'top_k': 50, 'top_p': 1.0, 'repetition_penalty': 1.0, 'encoder_repetition_penalty': 1.0, 'renormalize_logits': False, 'return_dict_in_generate': False, 'output_scores': False, 'output_logits': False, 'pad_token_id': 151643, 'bos_token_id': 151643, 'eos_token_id': [151645, 151643], 'parallel_decoding': False, 'window_size': 5, 'level': 5, 'guess_set_size': 3, '_from_model_config': True} 2025-05-21 15:37:22,170 - mindformers./output/log[mindformers/generation/text_generator.py:801] - INFO - Generation Config is: {'max_length': 512, 'max_new_tokens': 512, 'min_length': 0, 'min_new_tokens': 2, 'num_beams': 1, 'do_sample': True, 'use_past': True, 'temperature': 1.2, 'top_k': 50, 'top_p': 1.0, 'repetition_penalty': 1.0, 'encoder_repetition_penalty': 1.0, 'renormalize_logits': False, 'return_dict_in_generate': False, 'output_scores': False, 'output_logits': False, 'pad_token_id': 151643, 'bos_token_id': 151643, 'eos_token_id': [151645, 151643], 'parallel_decoding': False, 'window_size': 5, 'level': 5, 'guess_set_size': 3, '_from_model_config': True} 2025-05-21 15:37:22,170 - mindformers./output/log[mindformers/generation/text_generator.py:801] - INFO - Generation Config is: {'max_length': 512, 'max_new_tokens': 512, 'min_length': 0, 'min_new_tokens': 2, 'num_beams': 1, 'do_sample': True, 'use_past': True, 'temperature': 1.2, 'top_k': 50, 'top_p': 1.0, 'repetition_penalty': 1.0, 'encoder_repetition_penalty': 1.0, 'renormalize_logits': False, 'return_dict_in_generate': False, 'output_scores': False, 'output_logits': False, 'pad_token_id': 151643, 'bos_token_id': 151643, 'eos_token_id': [151645, 151643], 'parallel_decoding': False, 'window_size': 5, 'level': 5, 'guess_set_size': 3, '_from_model_config': True} 2025-05-21 15:37:22,170 - mindformers./output/log[mindformers/generation/text_generator.py:801] - INFO - Generation Config is: {'max_length': 512, 'max_new_tokens': 512, 'min_length': 0, 'min_new_tokens': 2, 'num_beams': 1, 'do_sample': True, 'use_past': True, 'temperature': 1.2, 'top_k': 50, 'top_p': 1.0, 'repetition_penalty': 1.0, 'encoder_repetition_penalty': 1.0, 'renormalize_logits': False, 'return_dict_in_generate': False, 'output_scores': False, 'output_logits': False, 'pad_token_id': 151643, 'bos_token_id': 151643, 'eos_token_id': [151645, 151643], 'parallel_decoding': False, 'window_size': 5, 'level': 5, 'guess_set_size': 3, '_from_model_config': True} 2025-05-21 15:37:22,171 - mindformers./output/log[mindformers/generation/text_generator.py:859] - INFO - The generation mode will be **SAMPLE**. 2025-05-21 15:37:22,171 - mindformers./output/log[mindformers/generation/text_generator.py:859] - INFO - The generation mode will be **SAMPLE**. 2025-05-21 15:37:22,171 - mindformers./output/log[mindformers/generation/text_generator.py:859] - INFO - The generation mode will be **SAMPLE**. 2025-05-21 15:37:22,171 - mindformers./output/log[mindformers/generation/text_generator.py:859] - INFO - The generation mode will be **SAMPLE**. 2025-05-21 15:37:22,171 - mindformers./output/log[mindformers/modules/block_tables.py:63] - INFO - init cache engine success. 2025-05-21 15:37:22,171 - mindformers./output/log[mindformers/modules/block_tables.py:63] - INFO - init cache engine success. 2025-05-21 15:37:22,171 - mindformers./output/log[mindformers/modules/block_tables.py:63] - INFO - init cache engine success. 2025-05-21 15:37:22,171 - mindformers./output/log[mindformers/modules/block_tables.py:63] - INFO - init cache engine success. 2025-05-21 15:37:22,172 - mindformers./output/log[mindformers/research/qwen2_5/infer/qwen2_5.py:195] - INFO - Set dynamic input for llama. 2025-05-21 15:37:22,172 - mindformers./output/log[mindformers/research/qwen2_5/infer/qwen2_5.py:195] - INFO - Set dynamic input for llama. 2025-05-21 15:37:22,172 - mindformers./output/log[mindformers/research/qwen2_5/infer/qwen2_5.py:195] - INFO - Set dynamic input for llama. 2025-05-21 15:37:22,172 - mindformers./output/log[mindformers/research/qwen2_5/infer/qwen2_5.py:195] - INFO - Set dynamic input for llama. 2025-05-21 15:37:22,229 - mindformers./output/log[mindformers/generation/text_generator.py:1067] - INFO - total time: 9.540225744247437 s; generated tokens: 512 tokens; generate speed: 53.66749317318048 tokens/s 2025-05-21 15:37:22,230 - mindformers./output/log[mindformers/generation/text_generator.py:1067] - INFO - total time: 9.54073166847229 s; generated tokens: 512 tokens; generate speed: 53.664647302881754 tokens/s 2025-05-21 15:37:22,230 - mindformers./output/log[mindformers/tools/debug_info.py:93] - INFO - prefill prepare time: 0.0014374256134033203 s; prefill predict time: 0.006513833999633789 s; prefill post time: 0.014128923416137695 s; decode prepare time: 0.0010068859839392735 s; decode predict time: 0.004883336085899204 s; decode post time: 0.01269167136772738 s 2025-05-21 15:37:22,230 - mindformers./output/log[mindformers/generation/text_generator.py:1067] - INFO - total time: 9.540379285812378 s; generated tokens: 512 tokens; generate speed: 53.66662945585422 tokens/s 2025-05-21 15:37:22,230 - mindformers./output/log[mindformers/generation/text_generator.py:1067] - INFO - total time: 9.540699243545532 s; generated tokens: 512 tokens; generate speed: 53.66482968702508 tokens/s 2025-05-21 15:37:22,230 - mindformers./output/log[mindformers/tools/debug_info.py:93] - INFO - prefill prepare time: 0.0014576911926269531 s; prefill predict time: 0.0068264007568359375 s; prefill post time: 0.013645172119140625 s; decode prepare time: 0.000976651615359312 s; decode predict time: 0.005042189710280474 s; decode post time: 0.012561564809422204 s 2025-05-21 15:37:22,231 - mindformers./output/log[mindformers/tools/debug_info.py:93] - INFO - prefill prepare time: 0.001355886459350586 s; prefill predict time: 0.0061244964599609375 s; prefill post time: 0.014646530151367188 s; decode prepare time: 0.0010264255762566792 s; decode predict time: 0.004355199196759392 s; decode post time: 0.013201762552130712 s 2025-05-21 15:37:22,231 - mindformers./output/log[mindformers/modules/block_tables.py:126] - INFO - Clear block table cache engines. 2025-05-21 15:37:22,231 - mindformers./output/log[mindformers/tools/debug_info.py:93] - INFO - prefill prepare time: 0.0013704299926757812 s; prefill predict time: 0.006260395050048828 s; prefill post time: 0.014510869979858398 s; decode prepare time: 0.0010466785580211422 s; decode predict time: 0.004377049558302935 s; decode post time: 0.0131579733174841 s 2025-05-21 15:37:22,231 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:341] - INFO - infer without vllm end, not use vllm model 2025-05-21 15:37:22,231 - mindformers./output/log[mindformers/modules/block_tables.py:126] - INFO - Clear block table cache engines. 2025-05-21 15:37:22,231 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:355] - INFO - Generating elapsed time: 9.546834707260132 2025-05-21 15:37:22,231 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:341] - INFO - infer without vllm end, not use vllm model 2025-05-21 15:37:22,231 - mindformers./output/log[mindformers/modules/block_tables.py:126] - INFO - Clear block table cache engines. 2025-05-21 15:37:22,232 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:355] - INFO - Generating elapsed time: 9.547330856323242 2025-05-21 15:37:22,232 - mindformers./output/log[mindformers/modules/block_tables.py:126] - INFO - Clear block table cache engines. 2025-05-21 15:37:22,232 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:341] - INFO - infer without vllm end, not use vllm model 2025-05-21 15:37:22,232 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:341] - INFO - infer without vllm end, not use vllm model 2025-05-21 15:37:22,232 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:355] - INFO - Generating elapsed time: 9.54715347290039 2025-05-21 15:37:22,232 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:355] - INFO - Generating elapsed time: 9.547411918640137 2025-05-21 15:37:22,232 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:313] - INFO - input_ids shape (1, 2048) 2025-05-21 15:37:22,233 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:313] - INFO - input_ids shape (1, 2048) 2025-05-21 15:37:22,233 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:322] - INFO - max_tokens 512 2025-05-21 15:37:22,233 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:323] - INFO - min_tokens 2 2025-05-21 15:37:22,233 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:322] - INFO - max_tokens 512 2025-05-21 15:37:22,233 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:313] - INFO - input_ids shape (1, 2048) 2025-05-21 15:37:22,233 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:334] - INFO - infer without vllm, not use vllm model 2025-05-21 15:37:22,233 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:313] - INFO - input_ids shape (1, 2048) 2025-05-21 15:37:22,233 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:323] - INFO - min_tokens 2 2025-05-21 15:37:22,234 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:334] - INFO - infer without vllm, not use vllm model 2025-05-21 15:37:22,234 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:322] - INFO - max_tokens 512 2025-05-21 15:37:22,234 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:322] - INFO - max_tokens 512 2025-05-21 15:37:22,234 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:323] - INFO - min_tokens 2 2025-05-21 15:37:22,234 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:323] - INFO - min_tokens 2 2025-05-21 15:37:22,234 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:334] - INFO - infer without vllm, not use vllm model 2025-05-21 15:37:22,234 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:334] - INFO - infer without vllm, not use vllm model 2025-05-21 15:37:22,235 - mindformers./output/log[mindformers/generation/text_generator.py:801] - INFO - Generation Config is: {'max_length': 512, 'max_new_tokens': 512, 'min_length': 0, 'min_new_tokens': 2, 'num_beams': 1, 'do_sample': True, 'use_past': True, 'temperature': 1.2, 'top_k': 50, 'top_p': 1.0, 'repetition_penalty': 1.0, 'encoder_repetition_penalty': 1.0, 'renormalize_logits': False, 'return_dict_in_generate': False, 'output_scores': False, 'output_logits': False, 'pad_token_id': 151643, 'bos_token_id': 151643, 'eos_token_id': [151645, 151643], 'parallel_decoding': False, 'window_size': 5, 'level': 5, 'guess_set_size': 3, '_from_model_config': True} 2025-05-21 15:37:22,236 - mindformers./output/log[mindformers/generation/text_generator.py:801] - INFO - Generation Config is: {'max_length': 512, 'max_new_tokens': 512, 'min_length': 0, 'min_new_tokens': 2, 'num_beams': 1, 'do_sample': True, 'use_past': True, 'temperature': 1.2, 'top_k': 50, 'top_p': 1.0, 'repetition_penalty': 1.0, 'encoder_repetition_penalty': 1.0, 'renormalize_logits': False, 'return_dict_in_generate': False, 'output_scores': False, 'output_logits': False, 'pad_token_id': 151643, 'bos_token_id': 151643, 'eos_token_id': [151645, 151643], 'parallel_decoding': False, 'window_size': 5, 'level': 5, 'guess_set_size': 3, '_from_model_config': True} 2025-05-21 15:37:22,236 - mindformers./output/log[mindformers/generation/text_generator.py:859] - INFO - The generation mode will be **SAMPLE**. 2025-05-21 15:37:22,236 - mindformers./output/log[mindformers/modules/block_tables.py:63] - INFO - init cache engine success. 2025-05-21 15:37:22,236 - mindformers./output/log[mindformers/generation/text_generator.py:801] - INFO - Generation Config is: {'max_length': 512, 'max_new_tokens': 512, 'min_length': 0, 'min_new_tokens': 2, 'num_beams': 1, 'do_sample': True, 'use_past': True, 'temperature': 1.2, 'top_k': 50, 'top_p': 1.0, 'repetition_penalty': 1.0, 'encoder_repetition_penalty': 1.0, 'renormalize_logits': False, 'return_dict_in_generate': False, 'output_scores': False, 'output_logits': False, 'pad_token_id': 151643, 'bos_token_id': 151643, 'eos_token_id': [151645, 151643], 'parallel_decoding': False, 'window_size': 5, 'level': 5, 'guess_set_size': 3, '_from_model_config': True} 2025-05-21 15:37:22,236 - mindformers./output/log[mindformers/generation/text_generator.py:801] - INFO - Generation Config is: {'max_length': 512, 'max_new_tokens': 512, 'min_length': 0, 'min_new_tokens': 2, 'num_beams': 1, 'do_sample': True, 'use_past': True, 'temperature': 1.2, 'top_k': 50, 'top_p': 1.0, 'repetition_penalty': 1.0, 'encoder_repetition_penalty': 1.0, 'renormalize_logits': False, 'return_dict_in_generate': False, 'output_scores': False, 'output_logits': False, 'pad_token_id': 151643, 'bos_token_id': 151643, 'eos_token_id': [151645, 151643], 'parallel_decoding': False, 'window_size': 5, 'level': 5, 'guess_set_size': 3, '_from_model_config': True} 2025-05-21 15:37:22,236 - mindformers./output/log[mindformers/generation/text_generator.py:859] - INFO - The generation mode will be **SAMPLE**. 2025-05-21 15:37:22,237 - mindformers./output/log[mindformers/modules/block_tables.py:63] - INFO - init cache engine success. 2025-05-21 15:37:22,237 - mindformers./output/log[mindformers/generation/text_generator.py:859] - INFO - The generation mode will be **SAMPLE**. 2025-05-21 15:37:22,237 - mindformers./output/log[mindformers/generation/text_generator.py:859] - INFO - The generation mode will be **SAMPLE**. 2025-05-21 15:37:22,237 - mindformers./output/log[mindformers/research/qwen2_5/infer/qwen2_5.py:195] - INFO - Set dynamic input for llama. 2025-05-21 15:37:22,237 - mindformers./output/log[mindformers/modules/block_tables.py:63] - INFO - init cache engine success. 2025-05-21 15:37:22,237 - mindformers./output/log[mindformers/modules/block_tables.py:63] - INFO - init cache engine success. 2025-05-21 15:37:22,237 - mindformers./output/log[mindformers/research/qwen2_5/infer/qwen2_5.py:195] - INFO - Set dynamic input for llama. 2025-05-21 15:37:22,238 - mindformers./output/log[mindformers/research/qwen2_5/infer/qwen2_5.py:195] - INFO - Set dynamic input for llama. 2025-05-21 15:37:22,238 - mindformers./output/log[mindformers/research/qwen2_5/infer/qwen2_5.py:195] - INFO - Set dynamic input for llama. 2025-05-21 15:37:31,728 - mindformers./output/log[mindformers/generation/text_generator.py:1067] - INFO - total time: 9.555561065673828 s; generated tokens: 512 tokens; generate speed: 53.58136445166398 tokens/s 2025-05-21 15:37:31,728 - mindformers./output/log[mindformers/generation/text_generator.py:1067] - INFO - total time: 9.556029796600342 s; generated tokens: 512 tokens; generate speed: 53.578736242759454 tokens/s 2025-05-21 15:37:31,728 - mindformers./output/log[mindformers/generation/text_generator.py:1067] - INFO - total time: 9.555952548980713 s; generated tokens: 512 tokens; generate speed: 53.5791693581204 tokens/s 2025-05-21 15:37:31,728 - mindformers./output/log[mindformers/tools/debug_info.py:93] - INFO - prefill prepare time: 0.0014719963073730469 s; prefill predict time: 0.006701469421386719 s; prefill post time: 0.01377558708190918 s; decode prepare time: 0.001029500289439222 s; decode predict time: 0.004828955612930597 s; decode post time: 0.01275006059097917 s 2025-05-21 15:37:31,728 - mindformers./output/log[mindformers/generation/text_generator.py:1067] - INFO - total time: 9.556386709213257 s; generated tokens: 512 tokens; generate speed: 53.57673518029401 tokens/s 2025-05-21 15:37:31,729 - mindformers./output/log[mindformers/tools/debug_info.py:93] - INFO - prefill prepare time: 0.0016086101531982422 s; prefill predict time: 0.006403923034667969 s; prefill post time: 0.01425480842590332 s; decode prepare time: 0.0010762676567480755 s; decode predict time: 0.004279388633428836 s; decode post time: 0.01325516551441409 s 2025-05-21 15:37:31,729 - mindformers./output/log[mindformers/tools/debug_info.py:93] - INFO - prefill prepare time: 0.0014281272888183594 s; prefill predict time: 0.006505250930786133 s; prefill post time: 0.014041423797607422 s; decode prepare time: 0.0010773347081970096 s; decode predict time: 0.004763269424438476 s; decode post time: 0.012769615113618555 s 2025-05-21 15:37:31,729 - mindformers./output/log[mindformers/tools/debug_info.py:93] - INFO - prefill prepare time: 0.0015294551849365234 s; prefill predict time: 0.006550312042236328 s; prefill post time: 0.014269590377807617 s; decode prepare time: 0.0010602852136421578 s; decode predict time: 0.004397327292199229 s; decode post time: 0.01315421787493383 s 2025-05-21 15:37:31,729 - mindformers./output/log[mindformers/modules/block_tables.py:126] - INFO - Clear block table cache engines. 2025-05-21 15:37:31,730 - mindformers./output/log[mindformers/modules/block_tables.py:126] - INFO - Clear block table cache engines. 2025-05-21 15:37:31,730 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:341] - INFO - infer without vllm end, not use vllm model 2025-05-21 15:37:31,730 - mindformers./output/log[mindformers/modules/block_tables.py:126] - INFO - Clear block table cache engines. 2025-05-21 15:37:31,730 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:355] - INFO - Generating elapsed time: 9.562467336654663 2025-05-21 15:37:31,730 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:341] - INFO - infer without vllm end, not use vllm model 2025-05-21 15:37:31,730 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:341] - INFO - infer without vllm end, not use vllm model 2025-05-21 15:37:31,730 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:355] - INFO - Generating elapsed time: 9.562692403793335 2025-05-21 15:37:31,730 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:355] - INFO - Generating elapsed time: 9.563008069992065 2025-05-21 15:37:31,730 - mindformers./output/log[mindformers/modules/block_tables.py:126] - INFO - Clear block table cache engines. 2025-05-21 15:37:31,731 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:341] - INFO - infer without vllm end, not use vllm model 2025-05-21 15:37:31,731 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:355] - INFO - Generating elapsed time: 9.563380718231201 2025-05-21 15:37:31,731 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:313] - INFO - input_ids shape (1, 2048) 2025-05-21 15:37:31,731 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:313] - INFO - input_ids shape (1, 2048) 2025-05-21 15:37:31,731 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:313] - INFO - input_ids shape (1, 2048) 2025-05-21 15:37:31,732 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:322] - INFO - max_tokens 512 2025-05-21 15:37:31,732 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:322] - INFO - max_tokens 512 2025-05-21 15:37:31,732 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:322] - INFO - max_tokens 512 2025-05-21 15:37:31,732 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:323] - INFO - min_tokens 2 2025-05-21 15:37:31,732 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:323] - INFO - min_tokens 2 2025-05-21 15:37:31,732 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:313] - INFO - input_ids shape (1, 2048) 2025-05-21 15:37:31,732 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:334] - INFO - infer without vllm, not use vllm model 2025-05-21 15:37:31,732 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:334] - INFO - infer without vllm, not use vllm model 2025-05-21 15:37:31,732 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:323] - INFO - min_tokens 2 2025-05-21 15:37:31,732 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:334] - INFO - infer without vllm, not use vllm model 2025-05-21 15:37:31,732 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:322] - INFO - max_tokens 512 2025-05-21 15:37:31,733 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:323] - INFO - min_tokens 2 2025-05-21 15:37:31,733 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:334] - INFO - infer without vllm, not use vllm model 2025-05-21 15:37:31,734 - mindformers./output/log[mindformers/generation/text_generator.py:801] - INFO - Generation Config is: {'max_length': 512, 'max_new_tokens': 512, 'min_length': 0, 'min_new_tokens': 2, 'num_beams': 1, 'do_sample': True, 'use_past': True, 'temperature': 1.2, 'top_k': 50, 'top_p': 1.0, 'repetition_penalty': 1.0, 'encoder_repetition_penalty': 1.0, 'renormalize_logits': False, 'return_dict_in_generate': False, 'output_scores': False, 'output_logits': False, 'pad_token_id': 151643, 'bos_token_id': 151643, 'eos_token_id': [151645, 151643], 'parallel_decoding': False, 'window_size': 5, 'level': 5, 'guess_set_size': 3, '_from_model_config': True} 2025-05-21 15:37:31,734 - mindformers./output/log[mindformers/generation/text_generator.py:801] - INFO - Generation Config is: {'max_length': 512, 'max_new_tokens': 512, 'min_length': 0, 'min_new_tokens': 2, 'num_beams': 1, 'do_sample': True, 'use_past': True, 'temperature': 1.2, 'top_k': 50, 'top_p': 1.0, 'repetition_penalty': 1.0, 'encoder_repetition_penalty': 1.0, 'renormalize_logits': False, 'return_dict_in_generate': False, 'output_scores': False, 'output_logits': False, 'pad_token_id': 151643, 'bos_token_id': 151643, 'eos_token_id': [151645, 151643], 'parallel_decoding': False, 'window_size': 5, 'level': 5, 'guess_set_size': 3, '_from_model_config': True} 2025-05-21 15:37:31,734 - mindformers./output/log[mindformers/generation/text_generator.py:801] - INFO - Generation Config is: {'max_length': 512, 'max_new_tokens': 512, 'min_length': 0, 'min_new_tokens': 2, 'num_beams': 1, 'do_sample': True, 'use_past': True, 'temperature': 1.2, 'top_k': 50, 'top_p': 1.0, 'repetition_penalty': 1.0, 'encoder_repetition_penalty': 1.0, 'renormalize_logits': False, 'return_dict_in_generate': False, 'output_scores': False, 'output_logits': False, 'pad_token_id': 151643, 'bos_token_id': 151643, 'eos_token_id': [151645, 151643], 'parallel_decoding': False, 'window_size': 5, 'level': 5, 'guess_set_size': 3, '_from_model_config': True} 2025-05-21 15:37:31,735 - mindformers./output/log[mindformers/generation/text_generator.py:859] - INFO - The generation mode will be **SAMPLE**. 2025-05-21 15:37:31,735 - mindformers./output/log[mindformers/generation/text_generator.py:859] - INFO - The generation mode will be **SAMPLE**. 2025-05-21 15:37:31,735 - mindformers./output/log[mindformers/modules/block_tables.py:63] - INFO - init cache engine success. 2025-05-21 15:37:31,735 - mindformers./output/log[mindformers/generation/text_generator.py:859] - INFO - The generation mode will be **SAMPLE**. 2025-05-21 15:37:31,735 - mindformers./output/log[mindformers/modules/block_tables.py:63] - INFO - init cache engine success. 2025-05-21 15:37:31,735 - mindformers./output/log[mindformers/modules/block_tables.py:63] - INFO - init cache engine success. 2025-05-21 15:37:31,735 - mindformers./output/log[mindformers/generation/text_generator.py:801] - INFO - Generation Config is: {'max_length': 512, 'max_new_tokens': 512, 'min_length': 0, 'min_new_tokens': 2, 'num_beams': 1, 'do_sample': True, 'use_past': True, 'temperature': 1.2, 'top_k': 50, 'top_p': 1.0, 'repetition_penalty': 1.0, 'encoder_repetition_penalty': 1.0, 'renormalize_logits': False, 'return_dict_in_generate': False, 'output_scores': False, 'output_logits': False, 'pad_token_id': 151643, 'bos_token_id': 151643, 'eos_token_id': [151645, 151643], 'parallel_decoding': False, 'window_size': 5, 'level': 5, 'guess_set_size': 3, '_from_model_config': True} 2025-05-21 15:37:31,736 - mindformers./output/log[mindformers/research/qwen2_5/infer/qwen2_5.py:195] - INFO - Set dynamic input for llama. 2025-05-21 15:37:31,736 - mindformers./output/log[mindformers/research/qwen2_5/infer/qwen2_5.py:195] - INFO - Set dynamic input for llama. 2025-05-21 15:37:31,736 - mindformers./output/log[mindformers/generation/text_generator.py:859] - INFO - The generation mode will be **SAMPLE**. 2025-05-21 15:37:31,736 - mindformers./output/log[mindformers/research/qwen2_5/infer/qwen2_5.py:195] - INFO - Set dynamic input for llama. 2025-05-21 15:37:31,736 - mindformers./output/log[mindformers/modules/block_tables.py:63] - INFO - init cache engine success. 2025-05-21 15:37:31,737 - mindformers./output/log[mindformers/research/qwen2_5/infer/qwen2_5.py:195] - INFO - Set dynamic input for llama. 2025-05-21 15:37:31,814 - mindformers./output/log[mindformers/generation/text_generator.py:1067] - INFO - total time: 9.576282262802124 s; generated tokens: 512 tokens; generate speed: 53.465424885062156 tokens/s 2025-05-21 15:37:31,814 - mindformers./output/log[mindformers/generation/text_generator.py:1067] - INFO - total time: 9.577026605606079 s; generated tokens: 512 tokens; generate speed: 53.46126946126388 tokens/s 2025-05-21 15:37:31,815 - mindformers./output/log[mindformers/tools/debug_info.py:93] - INFO - prefill prepare time: 0.0013806819915771484 s; prefill predict time: 0.006865978240966797 s; prefill post time: 0.013932466506958008 s; decode prepare time: 0.0009854978311318475 s; decode predict time: 0.005070672315709731 s; decode post time: 0.01259430700552207 s 2025-05-21 15:37:31,815 - mindformers./output/log[mindformers/tools/debug_info.py:93] - INFO - prefill prepare time: 0.0013747215270996094 s; prefill predict time: 0.00721287727355957 s; prefill post time: 0.013771295547485352 s; decode prepare time: 0.0010127540670495678 s; decode predict time: 0.004777412321053299 s; decode post time: 0.012862579696565691 s 2025-05-21 15:37:31,815 - mindformers./output/log[mindformers/generation/text_generator.py:1067] - INFO - total time: 9.57683801651001 s; generated tokens: 512 tokens; generate speed: 53.462322231757135 tokens/s 2025-05-21 15:37:31,815 - mindformers./output/log[mindformers/generation/text_generator.py:1067] - INFO - total time: 9.576889991760254 s; generated tokens: 512 tokens; generate speed: 53.46203208353793 tokens/s 2025-05-21 15:37:31,816 - mindformers./output/log[mindformers/modules/block_tables.py:126] - INFO - Clear block table cache engines. 2025-05-21 15:37:31,816 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:341] - INFO - infer without vllm end, not use vllm model 2025-05-21 15:37:31,816 - mindformers./output/log[mindformers/tools/debug_info.py:93] - INFO - prefill prepare time: 0.0013785362243652344 s; prefill predict time: 0.006294727325439453 s; prefill post time: 0.014699459075927734 s; decode prepare time: 0.0010510293471603243 s; decode predict time: 0.0044188494775809495 s; decode post time: 0.013182901356318225 s 2025-05-21 15:37:31,816 - mindformers./output/log[mindformers/tools/debug_info.py:93] - INFO - prefill prepare time: 0.0013818740844726562 s; prefill predict time: 0.0064945220947265625 s; prefill post time: 0.014247894287109375 s; decode prepare time: 0.0010279209413173848 s; decode predict time: 0.004412984848022461 s; decode post time: 0.013213892972166057 s 2025-05-21 15:37:31,816 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:355] - INFO - Generating elapsed time: 9.582904815673828 2025-05-21 15:37:31,816 - mindformers./output/log[mindformers/modules/block_tables.py:126] - INFO - Clear block table cache engines. 2025-05-21 15:37:31,816 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:341] - INFO - infer without vllm end, not use vllm model 2025-05-21 15:37:31,817 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:355] - INFO - Generating elapsed time: 9.583836078643799 2025-05-21 15:37:31,817 - mindformers./output/log[mindformers/modules/block_tables.py:126] - INFO - Clear block table cache engines. 2025-05-21 15:37:31,817 - mindformers./output/log[mindformers/modules/block_tables.py:126] - INFO - Clear block table cache engines. 2025-05-21 15:37:31,817 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:313] - INFO - input_ids shape (1, 2048) 2025-05-21 15:37:31,817 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:341] - INFO - infer without vllm end, not use vllm model 2025-05-21 15:37:31,817 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:341] - INFO - infer without vllm end, not use vllm model 2025-05-21 15:37:31,817 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:355] - INFO - Generating elapsed time: 9.58360505104065 2025-05-21 15:37:31,817 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:355] - INFO - Generating elapsed time: 9.583650350570679 2025-05-21 15:37:31,818 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:322] - INFO - max_tokens 512 2025-05-21 15:37:31,818 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:313] - INFO - input_ids shape (1, 2048) 2025-05-21 15:37:31,818 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:323] - INFO - min_tokens 2 2025-05-21 15:37:31,818 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:334] - INFO - infer without vllm, not use vllm model 2025-05-21 15:37:31,818 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:322] - INFO - max_tokens 512 2025-05-21 15:37:31,818 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:323] - INFO - min_tokens 2 2025-05-21 15:37:31,818 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:313] - INFO - input_ids shape (1, 2048) 2025-05-21 15:37:31,818 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:313] - INFO - input_ids shape (1, 2048) 2025-05-21 15:37:31,818 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:334] - INFO - infer without vllm, not use vllm model 2025-05-21 15:37:31,819 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:322] - INFO - max_tokens 512 2025-05-21 15:37:31,819 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:322] - INFO - max_tokens 512 2025-05-21 15:37:31,819 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:323] - INFO - min_tokens 2 2025-05-21 15:37:31,819 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:323] - INFO - min_tokens 2 2025-05-21 15:37:31,819 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:334] - INFO - infer without vllm, not use vllm model 2025-05-21 15:37:31,819 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:334] - INFO - infer without vllm, not use vllm model 2025-05-21 15:37:31,820 - mindformers./output/log[mindformers/generation/text_generator.py:801] - INFO - Generation Config is: {'max_length': 512, 'max_new_tokens': 512, 'min_length': 0, 'min_new_tokens': 2, 'num_beams': 1, 'do_sample': True, 'use_past': True, 'temperature': 1.2, 'top_k': 50, 'top_p': 1.0, 'repetition_penalty': 1.0, 'encoder_repetition_penalty': 1.0, 'renormalize_logits': False, 'return_dict_in_generate': False, 'output_scores': False, 'output_logits': False, 'pad_token_id': 151643, 'bos_token_id': 151643, 'eos_token_id': [151645, 151643], 'parallel_decoding': False, 'window_size': 5, 'level': 5, 'guess_set_size': 3, '_from_model_config': True} 2025-05-21 15:37:31,821 - mindformers./output/log[mindformers/generation/text_generator.py:801] - INFO - Generation Config is: {'max_length': 512, 'max_new_tokens': 512, 'min_length': 0, 'min_new_tokens': 2, 'num_beams': 1, 'do_sample': True, 'use_past': True, 'temperature': 1.2, 'top_k': 50, 'top_p': 1.0, 'repetition_penalty': 1.0, 'encoder_repetition_penalty': 1.0, 'renormalize_logits': False, 'return_dict_in_generate': False, 'output_scores': False, 'output_logits': False, 'pad_token_id': 151643, 'bos_token_id': 151643, 'eos_token_id': [151645, 151643], 'parallel_decoding': False, 'window_size': 5, 'level': 5, 'guess_set_size': 3, '_from_model_config': True} 2025-05-21 15:37:31,821 - mindformers./output/log[mindformers/generation/text_generator.py:859] - INFO - The generation mode will be **SAMPLE**. 2025-05-21 15:37:31,821 - mindformers./output/log[mindformers/modules/block_tables.py:63] - INFO - init cache engine success. 2025-05-21 15:37:31,821 - mindformers./output/log[mindformers/generation/text_generator.py:859] - INFO - The generation mode will be **SAMPLE**. 2025-05-21 15:37:31,821 - mindformers./output/log[mindformers/generation/text_generator.py:801] - INFO - Generation Config is: {'max_length': 512, 'max_new_tokens': 512, 'min_length': 0, 'min_new_tokens': 2, 'num_beams': 1, 'do_sample': True, 'use_past': True, 'temperature': 1.2, 'top_k': 50, 'top_p': 1.0, 'repetition_penalty': 1.0, 'encoder_repetition_penalty': 1.0, 'renormalize_logits': False, 'return_dict_in_generate': False, 'output_scores': False, 'output_logits': False, 'pad_token_id': 151643, 'bos_token_id': 151643, 'eos_token_id': [151645, 151643], 'parallel_decoding': False, 'window_size': 5, 'level': 5, 'guess_set_size': 3, '_from_model_config': True} 2025-05-21 15:37:31,822 - mindformers./output/log[mindformers/modules/block_tables.py:63] - INFO - init cache engine success. 2025-05-21 15:37:31,822 - mindformers./output/log[mindformers/research/qwen2_5/infer/qwen2_5.py:195] - INFO - Set dynamic input for llama. 2025-05-21 15:37:31,822 - mindformers./output/log[mindformers/generation/text_generator.py:801] - INFO - Generation Config is: {'max_length': 512, 'max_new_tokens': 512, 'min_length': 0, 'min_new_tokens': 2, 'num_beams': 1, 'do_sample': True, 'use_past': True, 'temperature': 1.2, 'top_k': 50, 'top_p': 1.0, 'repetition_penalty': 1.0, 'encoder_repetition_penalty': 1.0, 'renormalize_logits': False, 'return_dict_in_generate': False, 'output_scores': False, 'output_logits': False, 'pad_token_id': 151643, 'bos_token_id': 151643, 'eos_token_id': [151645, 151643], 'parallel_decoding': False, 'window_size': 5, 'level': 5, 'guess_set_size': 3, '_from_model_config': True} 2025-05-21 15:37:31,822 - mindformers./output/log[mindformers/generation/text_generator.py:859] - INFO - The generation mode will be **SAMPLE**. 2025-05-21 15:37:31,822 - mindformers./output/log[mindformers/modules/block_tables.py:63] - INFO - init cache engine success. 2025-05-21 15:37:31,822 - mindformers./output/log[mindformers/research/qwen2_5/infer/qwen2_5.py:195] - INFO - Set dynamic input for llama. 2025-05-21 15:37:31,822 - mindformers./output/log[mindformers/generation/text_generator.py:859] - INFO - The generation mode will be **SAMPLE**. 2025-05-21 15:37:31,823 - mindformers./output/log[mindformers/modules/block_tables.py:63] - INFO - init cache engine success. 2025-05-21 15:37:31,823 - mindformers./output/log[mindformers/research/qwen2_5/infer/qwen2_5.py:195] - INFO - Set dynamic input for llama. 2025-05-21 15:37:31,823 - mindformers./output/log[mindformers/research/qwen2_5/infer/qwen2_5.py:195] - INFO - Set dynamic input for llama. 2025-05-21 15:37:41,303 - mindformers./output/log[mindformers/generation/text_generator.py:1067] - INFO - total time: 9.566293478012085 s; generated tokens: 512 tokens; generate speed: 53.521251587861144 tokens/s 2025-05-21 15:37:41,303 - mindformers./output/log[mindformers/generation/text_generator.py:1067] - INFO - total time: 9.565575122833252 s; generated tokens: 512 tokens; generate speed: 53.52527092467697 tokens/s 2025-05-21 15:37:41,303 - mindformers./output/log[mindformers/tools/debug_info.py:93] - INFO - prefill prepare time: 0.0015981197357177734 s; prefill predict time: 0.007124423980712891 s; prefill post time: 0.014262914657592773 s; decode prepare time: 0.0010873776825906713 s; decode predict time: 0.004700695767122156 s; decode post time: 0.012839938796429718 s 2025-05-21 15:37:41,303 - mindformers./output/log[mindformers/generation/text_generator.py:1067] - INFO - total time: 9.567152976989746 s; generated tokens: 512 tokens; generate speed: 53.516443317194465 tokens/s 2025-05-21 15:37:41,303 - mindformers./output/log[mindformers/generation/text_generator.py:1067] - INFO - total time: 9.567144393920898 s; generated tokens: 512 tokens; generate speed: 53.5164913289416 tokens/s 2025-05-21 15:37:41,304 - mindformers./output/log[mindformers/tools/debug_info.py:93] - INFO - prefill prepare time: 0.0014138221740722656 s; prefill predict time: 0.0061168670654296875 s; prefill post time: 0.01507258415222168 s; decode prepare time: 0.0010693768465822224 s; decode predict time: 0.004356490864473231 s; decode post time: 0.013203556990203558 s 2025-05-21 15:37:41,304 - mindformers./output/log[mindformers/tools/debug_info.py:93] - INFO - prefill prepare time: 0.0015015602111816406 s; prefill predict time: 0.0074024200439453125 s; prefill post time: 0.014291524887084961 s; decode prepare time: 0.0010799409825274623 s; decode predict time: 0.004280532107633703 s; decode post time: 0.013268073244337467 s 2025-05-21 15:37:41,304 - mindformers./output/log[mindformers/tools/debug_info.py:93] - INFO - prefill prepare time: 0.0014197826385498047 s; prefill predict time: 0.007346391677856445 s; prefill post time: 0.014185667037963867 s; decode prepare time: 0.0010242779427545178 s; decode predict time: 0.004822781039219276 s; decode post time: 0.012780709275994048 s 2025-05-21 15:37:41,304 - mindformers./output/log[mindformers/modules/block_tables.py:126] - INFO - Clear block table cache engines. 2025-05-21 15:37:41,304 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:341] - INFO - infer without vllm end, not use vllm model 2025-05-21 15:37:41,304 - mindformers./output/log[mindformers/modules/block_tables.py:126] - INFO - Clear block table cache engines. 2025-05-21 15:37:41,305 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:355] - INFO - Generating elapsed time: 9.57291316986084 2025-05-21 15:37:41,305 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:341] - INFO - infer without vllm end, not use vllm model 2025-05-21 15:37:41,305 - mindformers./output/log[mindformers/modules/block_tables.py:126] - INFO - Clear block table cache engines. 2025-05-21 15:37:41,305 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:355] - INFO - Generating elapsed time: 9.5725576877594 2025-05-21 15:37:41,305 - mindformers./output/log[mindformers/modules/block_tables.py:126] - INFO - Clear block table cache engines. 2025-05-21 15:37:41,305 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:341] - INFO - infer without vllm end, not use vllm model 2025-05-21 15:37:41,305 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:341] - INFO - infer without vllm end, not use vllm model 2025-05-21 15:37:41,305 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:355] - INFO - Generating elapsed time: 9.573803901672363 2025-05-21 15:37:41,305 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:355] - INFO - Generating elapsed time: 9.573907375335693 2025-05-21 15:37:41,306 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:313] - INFO - input_ids shape (1, 2048) 2025-05-21 15:37:41,306 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:313] - INFO - input_ids shape (1, 2048) 2025-05-21 15:37:41,306 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:322] - INFO - max_tokens 512 2025-05-21 15:37:41,306 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:313] - INFO - input_ids shape (1, 2048) 2025-05-21 15:37:41,306 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:322] - INFO - max_tokens 512 2025-05-21 15:37:41,307 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:313] - INFO - input_ids shape (1, 2048) 2025-05-21 15:37:41,307 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:323] - INFO - min_tokens 2 2025-05-21 15:37:41,307 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:323] - INFO - min_tokens 2 2025-05-21 15:37:41,307 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:334] - INFO - infer without vllm, not use vllm model 2025-05-21 15:37:41,307 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:322] - INFO - max_tokens 512 2025-05-21 15:37:41,307 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:334] - INFO - infer without vllm, not use vllm model 2025-05-21 15:37:41,307 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:322] - INFO - max_tokens 512 2025-05-21 15:37:41,307 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:323] - INFO - min_tokens 2 2025-05-21 15:37:41,307 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:323] - INFO - min_tokens 2 2025-05-21 15:37:41,307 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:334] - INFO - infer without vllm, not use vllm model 2025-05-21 15:37:41,307 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:334] - INFO - infer without vllm, not use vllm model 2025-05-21 15:37:41,309 - mindformers./output/log[mindformers/generation/text_generator.py:801] - INFO - Generation Config is: {'max_length': 512, 'max_new_tokens': 512, 'min_length': 0, 'min_new_tokens': 2, 'num_beams': 1, 'do_sample': True, 'use_past': True, 'temperature': 1.2, 'top_k': 50, 'top_p': 1.0, 'repetition_penalty': 1.0, 'encoder_repetition_penalty': 1.0, 'renormalize_logits': False, 'return_dict_in_generate': False, 'output_scores': False, 'output_logits': False, 'pad_token_id': 151643, 'bos_token_id': 151643, 'eos_token_id': [151645, 151643], 'parallel_decoding': False, 'window_size': 5, 'level': 5, 'guess_set_size': 3, '_from_model_config': True} 2025-05-21 15:37:41,309 - mindformers./output/log[mindformers/generation/text_generator.py:801] - INFO - Generation Config is: {'max_length': 512, 'max_new_tokens': 512, 'min_length': 0, 'min_new_tokens': 2, 'num_beams': 1, 'do_sample': True, 'use_past': True, 'temperature': 1.2, 'top_k': 50, 'top_p': 1.0, 'repetition_penalty': 1.0, 'encoder_repetition_penalty': 1.0, 'renormalize_logits': False, 'return_dict_in_generate': False, 'output_scores': False, 'output_logits': False, 'pad_token_id': 151643, 'bos_token_id': 151643, 'eos_token_id': [151645, 151643], 'parallel_decoding': False, 'window_size': 5, 'level': 5, 'guess_set_size': 3, '_from_model_config': True} 2025-05-21 15:37:41,309 - mindformers./output/log[mindformers/generation/text_generator.py:801] - INFO - Generation Config is: {'max_length': 512, 'max_new_tokens': 512, 'min_length': 0, 'min_new_tokens': 2, 'num_beams': 1, 'do_sample': True, 'use_past': True, 'temperature': 1.2, 'top_k': 50, 'top_p': 1.0, 'repetition_penalty': 1.0, 'encoder_repetition_penalty': 1.0, 'renormalize_logits': False, 'return_dict_in_generate': False, 'output_scores': False, 'output_logits': False, 'pad_token_id': 151643, 'bos_token_id': 151643, 'eos_token_id': [151645, 151643], 'parallel_decoding': False, 'window_size': 5, 'level': 5, 'guess_set_size': 3, '_from_model_config': True} 2025-05-21 15:37:41,310 - mindformers./output/log[mindformers/generation/text_generator.py:859] - INFO - The generation mode will be **SAMPLE**. 2025-05-21 15:37:41,310 - mindformers./output/log[mindformers/generation/text_generator.py:801] - INFO - Generation Config is: {'max_length': 512, 'max_new_tokens': 512, 'min_length': 0, 'min_new_tokens': 2, 'num_beams': 1, 'do_sample': True, 'use_past': True, 'temperature': 1.2, 'top_k': 50, 'top_p': 1.0, 'repetition_penalty': 1.0, 'encoder_repetition_penalty': 1.0, 'renormalize_logits': False, 'return_dict_in_generate': False, 'output_scores': False, 'output_logits': False, 'pad_token_id': 151643, 'bos_token_id': 151643, 'eos_token_id': [151645, 151643], 'parallel_decoding': False, 'window_size': 5, 'level': 5, 'guess_set_size': 3, '_from_model_config': True} 2025-05-21 15:37:41,310 - mindformers./output/log[mindformers/modules/block_tables.py:63] - INFO - init cache engine success. 2025-05-21 15:37:41,310 - mindformers./output/log[mindformers/generation/text_generator.py:859] - INFO - The generation mode will be **SAMPLE**. 2025-05-21 15:37:41,310 - mindformers./output/log[mindformers/modules/block_tables.py:63] - INFO - init cache engine success. 2025-05-21 15:37:41,310 - mindformers./output/log[mindformers/generation/text_generator.py:859] - INFO - The generation mode will be **SAMPLE**. 2025-05-21 15:37:41,310 - mindformers./output/log[mindformers/modules/block_tables.py:63] - INFO - init cache engine success. 2025-05-21 15:37:41,310 - mindformers./output/log[mindformers/generation/text_generator.py:859] - INFO - The generation mode will be **SAMPLE**. 2025-05-21 15:37:41,311 - mindformers./output/log[mindformers/research/qwen2_5/infer/qwen2_5.py:195] - INFO - Set dynamic input for llama. 2025-05-21 15:37:41,311 - mindformers./output/log[mindformers/modules/block_tables.py:63] - INFO - init cache engine success. 2025-05-21 15:37:41,311 - mindformers./output/log[mindformers/research/qwen2_5/infer/qwen2_5.py:195] - INFO - Set dynamic input for llama. 2025-05-21 15:37:41,311 - mindformers./output/log[mindformers/research/qwen2_5/infer/qwen2_5.py:195] - INFO - Set dynamic input for llama. 2025-05-21 15:37:41,311 - mindformers./output/log[mindformers/research/qwen2_5/infer/qwen2_5.py:195] - INFO - Set dynamic input for llama. 2025-05-21 15:37:41,404 - mindformers./output/log[mindformers/generation/text_generator.py:1067] - INFO - total time: 9.5822172164917 s; generated tokens: 512 tokens; generate speed: 53.43230991662456 tokens/s 2025-05-21 15:37:41,405 - mindformers./output/log[mindformers/tools/debug_info.py:93] - INFO - prefill prepare time: 0.0013358592987060547 s; prefill predict time: 0.007325649261474609 s; prefill post time: 0.01404714584350586 s; decode prepare time: 0.0009865046947203037 s; decode predict time: 0.005064355158338359 s; decode post time: 0.012608222065606462 s 2025-05-21 15:37:41,405 - mindformers./output/log[mindformers/generation/text_generator.py:1067] - INFO - total time: 9.581563711166382 s; generated tokens: 512 tokens; generate speed: 53.435954238170304 tokens/s 2025-05-21 15:37:41,405 - mindformers./output/log[mindformers/generation/text_generator.py:1067] - INFO - total time: 9.581221342086792 s; generated tokens: 512 tokens; generate speed: 53.43786368351306 tokens/s 2025-05-21 15:37:41,405 - mindformers./output/log[mindformers/generation/text_generator.py:1067] - INFO - total time: 9.582277536392212 s; generated tokens: 512 tokens; generate speed: 53.43197356322568 tokens/s 2025-05-21 15:37:41,406 - mindformers./output/log[mindformers/tools/debug_info.py:93] - INFO - prefill prepare time: 0.0014386177062988281 s; prefill predict time: 0.0060176849365234375 s; prefill post time: 0.01445627212524414 s; decode prepare time: 0.001030247272110732 s; decode predict time: 0.0044231068854238475 s; decode post time: 0.013209224913684823 s 2025-05-21 15:37:41,406 - mindformers./output/log[mindformers/tools/debug_info.py:93] - INFO - prefill prepare time: 0.0014832019805908203 s; prefill predict time: 0.005741596221923828 s; prefill post time: 0.014597654342651367 s; decode prepare time: 0.0010556759433037147 s; decode predict time: 0.0044260702881158565 s; decode post time: 0.013180017004740915 s 2025-05-21 15:37:41,406 - mindformers./output/log[mindformers/tools/debug_info.py:93] - INFO - prefill prepare time: 0.0013980865478515625 s; prefill predict time: 0.006753683090209961 s; prefill post time: 0.013601541519165039 s; decode prepare time: 0.0010192730188836323 s; decode predict time: 0.004877688370498957 s; decode post time: 0.012766350971975905 s 2025-05-21 15:37:41,406 - mindformers./output/log[mindformers/modules/block_tables.py:126] - INFO - Clear block table cache engines. 2025-05-21 15:37:41,406 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:341] - INFO - infer without vllm end, not use vllm model 2025-05-21 15:37:41,407 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:355] - INFO - Generating elapsed time: 9.589012861251831 2025-05-21 15:37:41,407 - mindformers./output/log[mindformers/modules/block_tables.py:126] - INFO - Clear block table cache engines. 2025-05-21 15:37:41,407 - mindformers./output/log[mindformers/modules/block_tables.py:126] - INFO - Clear block table cache engines. 2025-05-21 15:37:41,407 - mindformers./output/log[mindformers/modules/block_tables.py:126] - INFO - Clear block table cache engines. 2025-05-21 15:37:41,407 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:341] - INFO - infer without vllm end, not use vllm model 2025-05-21 15:37:41,407 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:341] - INFO - infer without vllm end, not use vllm model 2025-05-21 15:37:41,407 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:355] - INFO - Generating elapsed time: 9.588223934173584 2025-05-21 15:37:41,407 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:341] - INFO - infer without vllm end, not use vllm model 2025-05-21 15:37:41,407 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:355] - INFO - Generating elapsed time: 9.588117837905884 2025-05-21 15:37:41,407 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:355] - INFO - Generating elapsed time: 9.589118480682373 2025-05-21 15:37:41,408 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:313] - INFO - input_ids shape (1, 2048) 2025-05-21 15:37:41,408 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:313] - INFO - input_ids shape (1, 2048) 2025-05-21 15:37:41,408 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:313] - INFO - input_ids shape (1, 2048) 2025-05-21 15:37:41,408 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:322] - INFO - max_tokens 512 2025-05-21 15:37:41,408 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:313] - INFO - input_ids shape (1, 2048) 2025-05-21 15:37:41,408 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:323] - INFO - min_tokens 2 2025-05-21 15:37:41,408 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:322] - INFO - max_tokens 512 2025-05-21 15:37:41,409 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:322] - INFO - max_tokens 512 2025-05-21 15:37:41,409 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:334] - INFO - infer without vllm, not use vllm model 2025-05-21 15:37:41,409 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:323] - INFO - min_tokens 2 2025-05-21 15:37:41,409 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:323] - INFO - min_tokens 2 2025-05-21 15:37:41,409 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:322] - INFO - max_tokens 512 2025-05-21 15:37:41,409 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:334] - INFO - infer without vllm, not use vllm model 2025-05-21 15:37:41,409 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:334] - INFO - infer without vllm, not use vllm model 2025-05-21 15:37:41,409 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:323] - INFO - min_tokens 2 2025-05-21 15:37:41,409 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:334] - INFO - infer without vllm, not use vllm model 2025-05-21 15:37:41,411 - mindformers./output/log[mindformers/generation/text_generator.py:801] - INFO - Generation Config is: {'max_length': 512, 'max_new_tokens': 512, 'min_length': 0, 'min_new_tokens': 2, 'num_beams': 1, 'do_sample': True, 'use_past': True, 'temperature': 1.2, 'top_k': 50, 'top_p': 1.0, 'repetition_penalty': 1.0, 'encoder_repetition_penalty': 1.0, 'renormalize_logits': False, 'return_dict_in_generate': False, 'output_scores': False, 'output_logits': False, 'pad_token_id': 151643, 'bos_token_id': 151643, 'eos_token_id': [151645, 151643], 'parallel_decoding': False, 'window_size': 5, 'level': 5, 'guess_set_size': 3, '_from_model_config': True} 2025-05-21 15:37:41,411 - mindformers./output/log[mindformers/generation/text_generator.py:801] - INFO - Generation Config is: {'max_length': 512, 'max_new_tokens': 512, 'min_length': 0, 'min_new_tokens': 2, 'num_beams': 1, 'do_sample': True, 'use_past': True, 'temperature': 1.2, 'top_k': 50, 'top_p': 1.0, 'repetition_penalty': 1.0, 'encoder_repetition_penalty': 1.0, 'renormalize_logits': False, 'return_dict_in_generate': False, 'output_scores': False, 'output_logits': False, 'pad_token_id': 151643, 'bos_token_id': 151643, 'eos_token_id': [151645, 151643], 'parallel_decoding': False, 'window_size': 5, 'level': 5, 'guess_set_size': 3, '_from_model_config': True} 2025-05-21 15:37:41,411 - mindformers./output/log[mindformers/generation/text_generator.py:801] - INFO - Generation Config is: {'max_length': 512, 'max_new_tokens': 512, 'min_length': 0, 'min_new_tokens': 2, 'num_beams': 1, 'do_sample': True, 'use_past': True, 'temperature': 1.2, 'top_k': 50, 'top_p': 1.0, 'repetition_penalty': 1.0, 'encoder_repetition_penalty': 1.0, 'renormalize_logits': False, 'return_dict_in_generate': False, 'output_scores': False, 'output_logits': False, 'pad_token_id': 151643, 'bos_token_id': 151643, 'eos_token_id': [151645, 151643], 'parallel_decoding': False, 'window_size': 5, 'level': 5, 'guess_set_size': 3, '_from_model_config': True} 2025-05-21 15:37:41,412 - mindformers./output/log[mindformers/generation/text_generator.py:859] - INFO - The generation mode will be **SAMPLE**. 2025-05-21 15:37:41,412 - mindformers./output/log[mindformers/generation/text_generator.py:801] - INFO - Generation Config is: {'max_length': 512, 'max_new_tokens': 512, 'min_length': 0, 'min_new_tokens': 2, 'num_beams': 1, 'do_sample': True, 'use_past': True, 'temperature': 1.2, 'top_k': 50, 'top_p': 1.0, 'repetition_penalty': 1.0, 'encoder_repetition_penalty': 1.0, 'renormalize_logits': False, 'return_dict_in_generate': False, 'output_scores': False, 'output_logits': False, 'pad_token_id': 151643, 'bos_token_id': 151643, 'eos_token_id': [151645, 151643], 'parallel_decoding': False, 'window_size': 5, 'level': 5, 'guess_set_size': 3, '_from_model_config': True} 2025-05-21 15:37:41,412 - mindformers./output/log[mindformers/generation/text_generator.py:859] - INFO - The generation mode will be **SAMPLE**. 2025-05-21 15:37:41,412 - mindformers./output/log[mindformers/modules/block_tables.py:63] - INFO - init cache engine success. 2025-05-21 15:37:41,412 - mindformers./output/log[mindformers/generation/text_generator.py:859] - INFO - The generation mode will be **SAMPLE**. 2025-05-21 15:37:41,412 - mindformers./output/log[mindformers/modules/block_tables.py:63] - INFO - init cache engine success. 2025-05-21 15:37:41,412 - mindformers./output/log[mindformers/modules/block_tables.py:63] - INFO - init cache engine success. 2025-05-21 15:37:41,412 - mindformers./output/log[mindformers/generation/text_generator.py:859] - INFO - The generation mode will be **SAMPLE**. 2025-05-21 15:37:41,413 - mindformers./output/log[mindformers/modules/block_tables.py:63] - INFO - init cache engine success. 2025-05-21 15:37:41,413 - mindformers./output/log[mindformers/research/qwen2_5/infer/qwen2_5.py:195] - INFO - Set dynamic input for llama. 2025-05-21 15:37:41,413 - mindformers./output/log[mindformers/research/qwen2_5/infer/qwen2_5.py:195] - INFO - Set dynamic input for llama. 2025-05-21 15:37:41,413 - mindformers./output/log[mindformers/research/qwen2_5/infer/qwen2_5.py:195] - INFO - Set dynamic input for llama. 2025-05-21 15:37:41,413 - mindformers./output/log[mindformers/research/qwen2_5/infer/qwen2_5.py:195] - INFO - Set dynamic input for llama. 2025-05-21 15:37:50,883 - mindformers./output/log[mindformers/generation/text_generator.py:1067] - INFO - total time: 9.572101831436157 s; generated tokens: 512 tokens; generate speed: 53.488774881031716 tokens/s 2025-05-21 15:37:50,883 - mindformers./output/log[mindformers/generation/text_generator.py:1067] - INFO - total time: 9.571318626403809 s; generated tokens: 512 tokens; generate speed: 53.49315177822803 tokens/s 2025-05-21 15:37:50,883 - mindformers./output/log[mindformers/generation/text_generator.py:1067] - INFO - total time: 9.571686744689941 s; generated tokens: 512 tokens; generate speed: 53.491094480713215 tokens/s 2025-05-21 15:37:50,883 - mindformers./output/log[mindformers/generation/text_generator.py:1067] - INFO - total time: 9.572059631347656 s; generated tokens: 512 tokens; generate speed: 53.489010695592086 tokens/s 2025-05-21 15:37:50,884 - mindformers./output/log[mindformers/tools/debug_info.py:93] - INFO - prefill prepare time: 0.0015070438385009766 s; prefill predict time: 0.005665779113769531 s; prefill post time: 0.013924837112426758 s; decode prepare time: 0.0010953467419469425 s; decode predict time: 0.0047776451297834805 s; decode post time: 0.012770290710688104 s 2025-05-21 15:37:50,884 - mindformers./output/log[mindformers/tools/debug_info.py:93] - INFO - prefill prepare time: 0.001377105712890625 s; prefill predict time: 0.00497889518737793 s; prefill post time: 0.014043569564819336 s; decode prepare time: 0.0010750587672403414 s; decode predict time: 0.004326953139959597 s; decode post time: 0.013243445678233167 s 2025-05-21 15:37:50,884 - mindformers./output/log[mindformers/tools/debug_info.py:93] - INFO - prefill prepare time: 0.0013616085052490234 s; prefill predict time: 0.0052661895751953125 s; prefill post time: 0.013555049896240234 s; decode prepare time: 0.001026642532497936 s; decode predict time: 0.004842394940993365 s; decode post time: 0.012775051617109146 s 2025-05-21 15:37:50,884 - mindformers./output/log[mindformers/tools/debug_info.py:93] - INFO - prefill prepare time: 0.0013818740844726562 s; prefill predict time: 0.005464076995849609 s; prefill post time: 0.014104843139648438 s; decode prepare time: 0.0010671275002615793 s; decode predict time: 0.004370295300203211 s; decode post time: 0.013209067678731482 s 2025-05-21 15:37:50,884 - mindformers./output/log[mindformers/modules/block_tables.py:126] - INFO - Clear block table cache engines. 2025-05-21 15:37:50,885 - mindformers./output/log[mindformers/modules/block_tables.py:126] - INFO - Clear block table cache engines. 2025-05-21 15:37:50,885 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:341] - INFO - infer without vllm end, not use vllm model 2025-05-21 15:37:50,885 - mindformers./output/log[mindformers/modules/block_tables.py:126] - INFO - Clear block table cache engines. 2025-05-21 15:37:50,885 - mindformers./output/log[mindformers/modules/block_tables.py:126] - INFO - Clear block table cache engines. 2025-05-21 15:37:50,885 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:341] - INFO - infer without vllm end, not use vllm model 2025-05-21 15:37:50,885 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:355] - INFO - Generating elapsed time: 9.578577280044556 2025-05-21 15:37:50,885 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:341] - INFO - infer without vllm end, not use vllm model 2025-05-21 15:37:50,885 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:341] - INFO - infer without vllm end, not use vllm model 2025-05-21 15:37:50,885 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:355] - INFO - Generating elapsed time: 9.57809066772461 2025-05-21 15:37:50,885 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:355] - INFO - Generating elapsed time: 9.578566789627075 2025-05-21 15:37:50,885 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:355] - INFO - Generating elapsed time: 9.578997611999512 2025-05-21 15:37:50,886 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:313] - INFO - input_ids shape (1, 2048) 2025-05-21 15:37:50,886 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:313] - INFO - input_ids shape (1, 2048) 2025-05-21 15:37:50,887 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:313] - INFO - input_ids shape (1, 2048) 2025-05-21 15:37:50,887 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:313] - INFO - input_ids shape (1, 2048) 2025-05-21 15:37:50,887 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:322] - INFO - max_tokens 512 2025-05-21 15:37:50,887 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:323] - INFO - min_tokens 2 2025-05-21 15:37:50,887 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:322] - INFO - max_tokens 512 2025-05-21 15:37:50,887 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:322] - INFO - max_tokens 512 2025-05-21 15:37:50,887 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:334] - INFO - infer without vllm, not use vllm model 2025-05-21 15:37:50,887 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:322] - INFO - max_tokens 512 2025-05-21 15:37:50,887 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:323] - INFO - min_tokens 2 2025-05-21 15:37:50,887 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:323] - INFO - min_tokens 2 2025-05-21 15:37:50,887 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:334] - INFO - infer without vllm, not use vllm model 2025-05-21 15:37:50,887 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:323] - INFO - min_tokens 2 2025-05-21 15:37:50,888 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:334] - INFO - infer without vllm, not use vllm model 2025-05-21 15:37:50,888 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:334] - INFO - infer without vllm, not use vllm model 2025-05-21 15:37:50,889 - mindformers./output/log[mindformers/generation/text_generator.py:801] - INFO - Generation Config is: {'max_length': 512, 'max_new_tokens': 512, 'min_length': 0, 'min_new_tokens': 2, 'num_beams': 1, 'do_sample': True, 'use_past': True, 'temperature': 1.2, 'top_k': 50, 'top_p': 1.0, 'repetition_penalty': 1.0, 'encoder_repetition_penalty': 1.0, 'renormalize_logits': False, 'return_dict_in_generate': False, 'output_scores': False, 'output_logits': False, 'pad_token_id': 151643, 'bos_token_id': 151643, 'eos_token_id': [151645, 151643], 'parallel_decoding': False, 'window_size': 5, 'level': 5, 'guess_set_size': 3, '_from_model_config': True} 2025-05-21 15:37:50,890 - mindformers./output/log[mindformers/generation/text_generator.py:801] - INFO - Generation Config is: {'max_length': 512, 'max_new_tokens': 512, 'min_length': 0, 'min_new_tokens': 2, 'num_beams': 1, 'do_sample': True, 'use_past': True, 'temperature': 1.2, 'top_k': 50, 'top_p': 1.0, 'repetition_penalty': 1.0, 'encoder_repetition_penalty': 1.0, 'renormalize_logits': False, 'return_dict_in_generate': False, 'output_scores': False, 'output_logits': False, 'pad_token_id': 151643, 'bos_token_id': 151643, 'eos_token_id': [151645, 151643], 'parallel_decoding': False, 'window_size': 5, 'level': 5, 'guess_set_size': 3, '_from_model_config': True} 2025-05-21 15:37:50,890 - mindformers./output/log[mindformers/generation/text_generator.py:801] - INFO - Generation Config is: {'max_length': 512, 'max_new_tokens': 512, 'min_length': 0, 'min_new_tokens': 2, 'num_beams': 1, 'do_sample': True, 'use_past': True, 'temperature': 1.2, 'top_k': 50, 'top_p': 1.0, 'repetition_penalty': 1.0, 'encoder_repetition_penalty': 1.0, 'renormalize_logits': False, 'return_dict_in_generate': False, 'output_scores': False, 'output_logits': False, 'pad_token_id': 151643, 'bos_token_id': 151643, 'eos_token_id': [151645, 151643], 'parallel_decoding': False, 'window_size': 5, 'level': 5, 'guess_set_size': 3, '_from_model_config': True} 2025-05-21 15:37:50,890 - mindformers./output/log[mindformers/generation/text_generator.py:859] - INFO - The generation mode will be **SAMPLE**. 2025-05-21 15:37:50,890 - mindformers./output/log[mindformers/modules/block_tables.py:63] - INFO - init cache engine success. 2025-05-21 15:37:50,890 - mindformers./output/log[mindformers/generation/text_generator.py:801] - INFO - Generation Config is: {'max_length': 512, 'max_new_tokens': 512, 'min_length': 0, 'min_new_tokens': 2, 'num_beams': 1, 'do_sample': True, 'use_past': True, 'temperature': 1.2, 'top_k': 50, 'top_p': 1.0, 'repetition_penalty': 1.0, 'encoder_repetition_penalty': 1.0, 'renormalize_logits': False, 'return_dict_in_generate': False, 'output_scores': False, 'output_logits': False, 'pad_token_id': 151643, 'bos_token_id': 151643, 'eos_token_id': [151645, 151643], 'parallel_decoding': False, 'window_size': 5, 'level': 5, 'guess_set_size': 3, '_from_model_config': True} 2025-05-21 15:37:50,890 - mindformers./output/log[mindformers/generation/text_generator.py:859] - INFO - The generation mode will be **SAMPLE**. 2025-05-21 15:37:50,890 - mindformers./output/log[mindformers/generation/text_generator.py:859] - INFO - The generation mode will be **SAMPLE**. 2025-05-21 15:37:50,891 - mindformers./output/log[mindformers/modules/block_tables.py:63] - INFO - init cache engine success. 2025-05-21 15:37:50,891 - mindformers./output/log[mindformers/modules/block_tables.py:63] - INFO - init cache engine success. 2025-05-21 15:37:50,891 - mindformers./output/log[mindformers/generation/text_generator.py:859] - INFO - The generation mode will be **SAMPLE**. 2025-05-21 15:37:50,891 - mindformers./output/log[mindformers/research/qwen2_5/infer/qwen2_5.py:195] - INFO - Set dynamic input for llama. 2025-05-21 15:37:50,891 - mindformers./output/log[mindformers/modules/block_tables.py:63] - INFO - init cache engine success. 2025-05-21 15:37:50,892 - mindformers./output/log[mindformers/research/qwen2_5/infer/qwen2_5.py:195] - INFO - Set dynamic input for llama. 2025-05-21 15:37:50,892 - mindformers./output/log[mindformers/research/qwen2_5/infer/qwen2_5.py:195] - INFO - Set dynamic input for llama. 2025-05-21 15:37:50,892 - mindformers./output/log[mindformers/research/qwen2_5/infer/qwen2_5.py:195] - INFO - Set dynamic input for llama. 2025-05-21 15:37:50,992 - mindformers./output/log[mindformers/generation/text_generator.py:1067] - INFO - total time: 9.578699827194214 s; generated tokens: 512 tokens; generate speed: 53.45193076688934 tokens/s 2025-05-21 15:37:50,993 - mindformers./output/log[mindformers/generation/text_generator.py:1067] - INFO - total time: 9.579442739486694 s; generated tokens: 512 tokens; generate speed: 53.447785421747305 tokens/s 2025-05-21 15:37:50,993 - mindformers./output/log[mindformers/generation/text_generator.py:1067] - INFO - total time: 9.579793214797974 s; generated tokens: 512 tokens; generate speed: 53.44583004245958 tokens/s 2025-05-21 15:37:50,993 - mindformers./output/log[mindformers/tools/debug_info.py:93] - INFO - prefill prepare time: 0.0013737678527832031 s; prefill predict time: 0.00568699836730957 s; prefill post time: 0.014455318450927734 s; decode prepare time: 0.0010142452329572167 s; decode predict time: 0.0047786259183696675 s; decode post time: 0.012866042598119687 s 2025-05-21 15:37:50,994 - mindformers./output/log[mindformers/tools/debug_info.py:93] - INFO - prefill prepare time: 0.0013828277587890625 s; prefill predict time: 0.005722522735595703 s; prefill post time: 0.013611555099487305 s; decode prepare time: 0.0009900884385678167 s; decode predict time: 0.005087216695149739 s; decode post time: 0.012579580575752631 s 2025-05-21 15:37:50,994 - mindformers./output/log[mindformers/tools/debug_info.py:93] - INFO - prefill prepare time: 0.0015058517456054688 s; prefill predict time: 0.005682468414306641 s; prefill post time: 0.014874696731567383 s; decode prepare time: 0.0010336658492713525 s; decode predict time: 0.004383629911086139 s; decode post time: 0.013242594649880366 s 2025-05-21 15:37:50,994 - mindformers./output/log[mindformers/generation/text_generator.py:1067] - INFO - total time: 9.580452919006348 s; generated tokens: 512 tokens; generate speed: 53.442149794845285 tokens/s 2025-05-21 15:37:50,994 - mindformers./output/log[mindformers/modules/block_tables.py:126] - INFO - Clear block table cache engines. 2025-05-21 15:37:50,994 - mindformers./output/log[mindformers/modules/block_tables.py:126] - INFO - Clear block table cache engines. 2025-05-21 15:37:50,994 - mindformers./output/log[mindformers/tools/debug_info.py:93] - INFO - prefill prepare time: 0.0014019012451171875 s; prefill predict time: 0.005572080612182617 s; prefill post time: 0.014966964721679688 s; decode prepare time: 0.001056588092662117 s; decode predict time: 0.004419354831471163 s; decode post time: 0.013182345668397072 s 2025-05-21 15:37:50,994 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:341] - INFO - infer without vllm end, not use vllm model 2025-05-21 15:37:50,995 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:341] - INFO - infer without vllm end, not use vllm model 2025-05-21 15:37:50,995 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:355] - INFO - Generating elapsed time: 9.58591890335083 2025-05-21 15:37:50,995 - mindformers./output/log[mindformers/modules/block_tables.py:126] - INFO - Clear block table cache engines. 2025-05-21 15:37:50,995 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:355] - INFO - Generating elapsed time: 9.58675765991211 2025-05-21 15:37:50,995 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:341] - INFO - infer without vllm end, not use vllm model 2025-05-21 15:37:50,995 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:355] - INFO - Generating elapsed time: 9.586775779724121 2025-05-21 15:37:50,995 - mindformers./output/log[mindformers/modules/block_tables.py:126] - INFO - Clear block table cache engines. 2025-05-21 15:37:50,995 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:341] - INFO - infer without vllm end, not use vllm model 2025-05-21 15:37:50,996 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:355] - INFO - Generating elapsed time: 9.587167739868164 2025-05-21 15:37:50,996 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:313] - INFO - input_ids shape (1, 2048) 2025-05-21 15:37:50,996 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:313] - INFO - input_ids shape (1, 2048) 2025-05-21 15:37:50,996 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:322] - INFO - max_tokens 512 2025-05-21 15:37:50,996 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:323] - INFO - min_tokens 2 2025-05-21 15:37:50,996 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:313] - INFO - input_ids shape (1, 2048) 2025-05-21 15:37:50,996 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:322] - INFO - max_tokens 512 2025-05-21 15:37:50,997 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:334] - INFO - infer without vllm, not use vllm model 2025-05-21 15:37:50,997 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:323] - INFO - min_tokens 2 2025-05-21 15:37:50,997 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:313] - INFO - input_ids shape (1, 2048) 2025-05-21 15:37:50,997 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:334] - INFO - infer without vllm, not use vllm model 2025-05-21 15:37:50,997 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:322] - INFO - max_tokens 512 2025-05-21 15:37:50,997 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:323] - INFO - min_tokens 2 2025-05-21 15:37:50,997 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:322] - INFO - max_tokens 512 2025-05-21 15:37:50,997 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:334] - INFO - infer without vllm, not use vllm model 2025-05-21 15:37:50,998 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:323] - INFO - min_tokens 2 2025-05-21 15:37:50,998 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:334] - INFO - infer without vllm, not use vllm model 2025-05-21 15:37:50,999 - mindformers./output/log[mindformers/generation/text_generator.py:801] - INFO - Generation Config is: {'max_length': 512, 'max_new_tokens': 512, 'min_length': 0, 'min_new_tokens': 2, 'num_beams': 1, 'do_sample': True, 'use_past': True, 'temperature': 1.2, 'top_k': 50, 'top_p': 1.0, 'repetition_penalty': 1.0, 'encoder_repetition_penalty': 1.0, 'renormalize_logits': False, 'return_dict_in_generate': False, 'output_scores': False, 'output_logits': False, 'pad_token_id': 151643, 'bos_token_id': 151643, 'eos_token_id': [151645, 151643], 'parallel_decoding': False, 'window_size': 5, 'level': 5, 'guess_set_size': 3, '_from_model_config': True} 2025-05-21 15:37:50,999 - mindformers./output/log[mindformers/generation/text_generator.py:801] - INFO - Generation Config is: {'max_length': 512, 'max_new_tokens': 512, 'min_length': 0, 'min_new_tokens': 2, 'num_beams': 1, 'do_sample': True, 'use_past': True, 'temperature': 1.2, 'top_k': 50, 'top_p': 1.0, 'repetition_penalty': 1.0, 'encoder_repetition_penalty': 1.0, 'renormalize_logits': False, 'return_dict_in_generate': False, 'output_scores': False, 'output_logits': False, 'pad_token_id': 151643, 'bos_token_id': 151643, 'eos_token_id': [151645, 151643], 'parallel_decoding': False, 'window_size': 5, 'level': 5, 'guess_set_size': 3, '_from_model_config': True} 2025-05-21 15:37:50,999 - mindformers./output/log[mindformers/generation/text_generator.py:859] - INFO - The generation mode will be **SAMPLE**. 2025-05-21 15:37:51,000 - mindformers./output/log[mindformers/generation/text_generator.py:801] - INFO - Generation Config is: {'max_length': 512, 'max_new_tokens': 512, 'min_length': 0, 'min_new_tokens': 2, 'num_beams': 1, 'do_sample': True, 'use_past': True, 'temperature': 1.2, 'top_k': 50, 'top_p': 1.0, 'repetition_penalty': 1.0, 'encoder_repetition_penalty': 1.0, 'renormalize_logits': False, 'return_dict_in_generate': False, 'output_scores': False, 'output_logits': False, 'pad_token_id': 151643, 'bos_token_id': 151643, 'eos_token_id': [151645, 151643], 'parallel_decoding': False, 'window_size': 5, 'level': 5, 'guess_set_size': 3, '_from_model_config': True} 2025-05-21 15:37:51,000 - mindformers./output/log[mindformers/modules/block_tables.py:63] - INFO - init cache engine success. 2025-05-21 15:37:51,000 - mindformers./output/log[mindformers/generation/text_generator.py:859] - INFO - The generation mode will be **SAMPLE**. 2025-05-21 15:37:51,000 - mindformers./output/log[mindformers/generation/text_generator.py:801] - INFO - Generation Config is: {'max_length': 512, 'max_new_tokens': 512, 'min_length': 0, 'min_new_tokens': 2, 'num_beams': 1, 'do_sample': True, 'use_past': True, 'temperature': 1.2, 'top_k': 50, 'top_p': 1.0, 'repetition_penalty': 1.0, 'encoder_repetition_penalty': 1.0, 'renormalize_logits': False, 'return_dict_in_generate': False, 'output_scores': False, 'output_logits': False, 'pad_token_id': 151643, 'bos_token_id': 151643, 'eos_token_id': [151645, 151643], 'parallel_decoding': False, 'window_size': 5, 'level': 5, 'guess_set_size': 3, '_from_model_config': True} 2025-05-21 15:37:51,000 - mindformers./output/log[mindformers/modules/block_tables.py:63] - INFO - init cache engine success. 2025-05-21 15:37:51,000 - mindformers./output/log[mindformers/generation/text_generator.py:859] - INFO - The generation mode will be **SAMPLE**. 2025-05-21 15:37:51,000 - mindformers./output/log[mindformers/generation/text_generator.py:859] - INFO - The generation mode will be **SAMPLE**. 2025-05-21 15:37:51,000 - mindformers./output/log[mindformers/research/qwen2_5/infer/qwen2_5.py:195] - INFO - Set dynamic input for llama. 2025-05-21 15:37:51,001 - mindformers./output/log[mindformers/modules/block_tables.py:63] - INFO - init cache engine success. 2025-05-21 15:37:51,001 - mindformers./output/log[mindformers/modules/block_tables.py:63] - INFO - init cache engine success. 2025-05-21 15:37:51,001 - mindformers./output/log[mindformers/research/qwen2_5/infer/qwen2_5.py:195] - INFO - Set dynamic input for llama. 2025-05-21 15:37:51,002 - mindformers./output/log[mindformers/research/qwen2_5/infer/qwen2_5.py:195] - INFO - Set dynamic input for llama. 2025-05-21 15:37:51,002 - mindformers./output/log[mindformers/research/qwen2_5/infer/qwen2_5.py:195] - INFO - Set dynamic input for llama. 2025-05-21 15:38:00,538 - mindformers./output/log[mindformers/generation/text_generator.py:1067] - INFO - total time: 9.645249605178833 s; generated tokens: 512 tokens; generate speed: 53.08312599034154 tokens/s 2025-05-21 15:38:00,538 - mindformers./output/log[mindformers/generation/text_generator.py:1067] - INFO - total time: 9.646864414215088 s; generated tokens: 512 tokens; generate speed: 53.074240293617585 tokens/s 2025-05-21 15:38:00,539 - mindformers./output/log[mindformers/tools/debug_info.py:93] - INFO - prefill prepare time: 0.0013229846954345703 s; prefill predict time: 0.0071141719818115234 s; prefill post time: 0.013700246810913086 s; decode prepare time: 0.0010301511581629922 s; decode predict time: 0.004943240857591816 s; decode post time: 0.01280813067859866 s 2025-05-21 15:38:00,539 - mindformers./output/log[mindformers/generation/text_generator.py:1067] - INFO - total time: 9.646357297897339 s; generated tokens: 512 tokens; generate speed: 53.077030446674726 tokens/s 2025-05-21 15:38:00,539 - mindformers./output/log[mindformers/tools/debug_info.py:93] - INFO - prefill prepare time: 0.0016188621520996094 s; prefill predict time: 0.00826883316040039 s; prefill post time: 0.013664722442626953 s; decode prepare time: 0.0011059957940984379 s; decode predict time: 0.004788447828853831 s; decode post time: 0.012888110547149717 s 2025-05-21 15:38:00,539 - mindformers./output/log[mindformers/generation/text_generator.py:1067] - INFO - total time: 9.647040367126465 s; generated tokens: 512 tokens; generate speed: 53.07327226956633 tokens/s 2025-05-21 15:38:00,539 - mindformers./output/log[mindformers/tools/debug_info.py:93] - INFO - prefill prepare time: 0.00153350830078125 s; prefill predict time: 0.007840394973754883 s; prefill post time: 0.014540672302246094 s; decode prepare time: 0.001087043616636392 s; decode predict time: 0.004447463446972417 s; decode post time: 0.013250901974343974 s 2025-05-21 15:38:00,539 - mindformers./output/log[mindformers/modules/block_tables.py:126] - INFO - Clear block table cache engines. 2025-05-21 15:38:00,540 - mindformers./output/log[mindformers/tools/debug_info.py:93] - INFO - prefill prepare time: 0.0015587806701660156 s; prefill predict time: 0.0075855255126953125 s; prefill post time: 0.014735221862792969 s; decode prepare time: 0.0010839488408336902 s; decode predict time: 0.004385893952612783 s; decode post time: 0.013315442956823659 s 2025-05-21 15:38:00,540 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:341] - INFO - infer without vllm end, not use vllm model 2025-05-21 15:38:00,540 - mindformers./output/log[mindformers/modules/block_tables.py:126] - INFO - Clear block table cache engines. 2025-05-21 15:38:00,540 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:355] - INFO - Generating elapsed time: 9.652872323989868 2025-05-21 15:38:00,540 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:341] - INFO - infer without vllm end, not use vllm model 2025-05-21 15:38:00,540 - mindformers./output/log[mindformers/modules/block_tables.py:126] - INFO - Clear block table cache engines. 2025-05-21 15:38:00,540 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:355] - INFO - Generating elapsed time: 9.653505086898804 2025-05-21 15:38:00,540 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:341] - INFO - infer without vllm end, not use vllm model 2025-05-21 15:38:00,541 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:355] - INFO - Generating elapsed time: 9.653483152389526 2025-05-21 15:38:00,541 - mindformers./output/log[mindformers/modules/block_tables.py:126] - INFO - Clear block table cache engines. 2025-05-21 15:38:00,541 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:341] - INFO - infer without vllm end, not use vllm model 2025-05-21 15:38:00,541 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:313] - INFO - input_ids shape (1, 2048) 2025-05-21 15:38:00,541 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:355] - INFO - Generating elapsed time: 9.654109477996826 2025-05-21 15:38:00,541 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:313] - INFO - input_ids shape (1, 2048) 2025-05-21 15:38:00,541 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:322] - INFO - max_tokens 512 2025-05-21 15:38:00,542 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:323] - INFO - min_tokens 2 2025-05-21 15:38:00,542 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:322] - INFO - max_tokens 512 2025-05-21 15:38:00,542 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:313] - INFO - input_ids shape (1, 2048) 2025-05-21 15:38:00,542 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:334] - INFO - infer without vllm, not use vllm model 2025-05-21 15:38:00,542 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:323] - INFO - min_tokens 2 2025-05-21 15:38:00,542 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:334] - INFO - infer without vllm, not use vllm model 2025-05-21 15:38:00,542 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:313] - INFO - input_ids shape (1, 2048) 2025-05-21 15:38:00,542 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:322] - INFO - max_tokens 512 2025-05-21 15:38:00,543 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:323] - INFO - min_tokens 2 2025-05-21 15:38:00,543 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:334] - INFO - infer without vllm, not use vllm model 2025-05-21 15:38:00,543 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:322] - INFO - max_tokens 512 2025-05-21 15:38:00,543 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:323] - INFO - min_tokens 2 2025-05-21 15:38:00,543 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:334] - INFO - infer without vllm, not use vllm model 2025-05-21 15:38:00,544 - mindformers./output/log[mindformers/generation/text_generator.py:801] - INFO - Generation Config is: {'max_length': 512, 'max_new_tokens': 512, 'min_length': 0, 'min_new_tokens': 2, 'num_beams': 1, 'do_sample': True, 'use_past': True, 'temperature': 1.2, 'top_k': 50, 'top_p': 1.0, 'repetition_penalty': 1.0, 'encoder_repetition_penalty': 1.0, 'renormalize_logits': False, 'return_dict_in_generate': False, 'output_scores': False, 'output_logits': False, 'pad_token_id': 151643, 'bos_token_id': 151643, 'eos_token_id': [151645, 151643], 'parallel_decoding': False, 'window_size': 5, 'level': 5, 'guess_set_size': 3, '_from_model_config': True} 2025-05-21 15:38:00,544 - mindformers./output/log[mindformers/generation/text_generator.py:801] - INFO - Generation Config is: {'max_length': 512, 'max_new_tokens': 512, 'min_length': 0, 'min_new_tokens': 2, 'num_beams': 1, 'do_sample': True, 'use_past': True, 'temperature': 1.2, 'top_k': 50, 'top_p': 1.0, 'repetition_penalty': 1.0, 'encoder_repetition_penalty': 1.0, 'renormalize_logits': False, 'return_dict_in_generate': False, 'output_scores': False, 'output_logits': False, 'pad_token_id': 151643, 'bos_token_id': 151643, 'eos_token_id': [151645, 151643], 'parallel_decoding': False, 'window_size': 5, 'level': 5, 'guess_set_size': 3, '_from_model_config': True} 2025-05-21 15:38:00,545 - mindformers./output/log[mindformers/generation/text_generator.py:859] - INFO - The generation mode will be **SAMPLE**. 2025-05-21 15:38:00,545 - mindformers./output/log[mindformers/generation/text_generator.py:801] - INFO - Generation Config is: {'max_length': 512, 'max_new_tokens': 512, 'min_length': 0, 'min_new_tokens': 2, 'num_beams': 1, 'do_sample': True, 'use_past': True, 'temperature': 1.2, 'top_k': 50, 'top_p': 1.0, 'repetition_penalty': 1.0, 'encoder_repetition_penalty': 1.0, 'renormalize_logits': False, 'return_dict_in_generate': False, 'output_scores': False, 'output_logits': False, 'pad_token_id': 151643, 'bos_token_id': 151643, 'eos_token_id': [151645, 151643], 'parallel_decoding': False, 'window_size': 5, 'level': 5, 'guess_set_size': 3, '_from_model_config': True} 2025-05-21 15:38:00,545 - mindformers./output/log[mindformers/modules/block_tables.py:63] - INFO - init cache engine success. 2025-05-21 15:38:00,545 - mindformers./output/log[mindformers/generation/text_generator.py:859] - INFO - The generation mode will be **SAMPLE**. 2025-05-21 15:38:00,545 - mindformers./output/log[mindformers/modules/block_tables.py:63] - INFO - init cache engine success. 2025-05-21 15:38:00,545 - mindformers./output/log[mindformers/generation/text_generator.py:801] - INFO - Generation Config is: {'max_length': 512, 'max_new_tokens': 512, 'min_length': 0, 'min_new_tokens': 2, 'num_beams': 1, 'do_sample': True, 'use_past': True, 'temperature': 1.2, 'top_k': 50, 'top_p': 1.0, 'repetition_penalty': 1.0, 'encoder_repetition_penalty': 1.0, 'renormalize_logits': False, 'return_dict_in_generate': False, 'output_scores': False, 'output_logits': False, 'pad_token_id': 151643, 'bos_token_id': 151643, 'eos_token_id': [151645, 151643], 'parallel_decoding': False, 'window_size': 5, 'level': 5, 'guess_set_size': 3, '_from_model_config': True} 2025-05-21 15:38:00,546 - mindformers./output/log[mindformers/generation/text_generator.py:859] - INFO - The generation mode will be **SAMPLE**. 2025-05-21 15:38:00,546 - mindformers./output/log[mindformers/research/qwen2_5/infer/qwen2_5.py:195] - INFO - Set dynamic input for llama. 2025-05-21 15:38:00,546 - mindformers./output/log[mindformers/modules/block_tables.py:63] - INFO - init cache engine success. 2025-05-21 15:38:00,546 - mindformers./output/log[mindformers/research/qwen2_5/infer/qwen2_5.py:195] - INFO - Set dynamic input for llama. 2025-05-21 15:38:00,546 - mindformers./output/log[mindformers/generation/text_generator.py:859] - INFO - The generation mode will be **SAMPLE**. 2025-05-21 15:38:00,546 - mindformers./output/log[mindformers/modules/block_tables.py:63] - INFO - init cache engine success. 2025-05-21 15:38:00,547 - mindformers./output/log[mindformers/research/qwen2_5/infer/qwen2_5.py:195] - INFO - Set dynamic input for llama. 2025-05-21 15:38:00,547 - mindformers./output/log[mindformers/research/qwen2_5/infer/qwen2_5.py:195] - INFO - Set dynamic input for llama. 2025-05-21 15:38:00,660 - mindformers./output/log[mindformers/generation/text_generator.py:1067] - INFO - total time: 9.6587073802948 s; generated tokens: 512 tokens; generate speed: 53.00916363245005 tokens/s 2025-05-21 15:38:00,661 - mindformers./output/log[mindformers/generation/text_generator.py:1067] - INFO - total time: 9.660151958465576 s; generated tokens: 512 tokens; generate speed: 53.00123664735046 tokens/s 2025-05-21 15:38:00,661 - mindformers./output/log[mindformers/tools/debug_info.py:93] - INFO - prefill prepare time: 0.0014200210571289062 s; prefill predict time: 0.006067752838134766 s; prefill post time: 0.013434171676635742 s; decode prepare time: 0.0009979086612768604 s; decode predict time: 0.005124987340440937 s; decode post time: 0.01268853926611973 s 2025-05-21 15:38:00,661 - mindformers./output/log[mindformers/generation/text_generator.py:1067] - INFO - total time: 9.65904951095581 s; generated tokens: 512 tokens; generate speed: 53.007286008759166 tokens/s 2025-05-21 15:38:00,661 - mindformers./output/log[mindformers/generation/text_generator.py:1067] - INFO - total time: 9.659178495407104 s; generated tokens: 512 tokens; generate speed: 53.00657817261102 tokens/s 2025-05-21 15:38:00,662 - mindformers./output/log[mindformers/tools/debug_info.py:93] - INFO - prefill prepare time: 0.001451253890991211 s; prefill predict time: 0.006736278533935547 s; prefill post time: 0.01457357406616211 s; decode prepare time: 0.0010313978400482357 s; decode predict time: 0.004858575147740981 s; decode post time: 0.01292542748721845 s 2025-05-21 15:38:00,662 - mindformers./output/log[mindformers/tools/debug_info.py:93] - INFO - prefill prepare time: 0.0014085769653320312 s; prefill predict time: 0.006039619445800781 s; prefill post time: 0.014594316482543945 s; decode prepare time: 0.0010698480848696833 s; decode predict time: 0.004499479368621228 s; decode post time: 0.013243964506921935 s 2025-05-21 15:38:00,662 - mindformers./output/log[mindformers/tools/debug_info.py:93] - INFO - prefill prepare time: 0.001369476318359375 s; prefill predict time: 0.005894660949707031 s; prefill post time: 0.014559268951416016 s; decode prepare time: 0.0010386581756830682 s; decode predict time: 0.004483587134118174 s; decode post time: 0.013292821652735282 s 2025-05-21 15:38:00,662 - mindformers./output/log[mindformers/modules/block_tables.py:126] - INFO - Clear block table cache engines. 2025-05-21 15:38:00,662 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:341] - INFO - infer without vllm end, not use vllm model 2025-05-21 15:38:00,663 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:355] - INFO - Generating elapsed time: 9.666147708892822 2025-05-21 15:38:00,663 - mindformers./output/log[mindformers/modules/block_tables.py:126] - INFO - Clear block table cache engines. 2025-05-21 15:38:00,663 - mindformers./output/log[mindformers/modules/block_tables.py:126] - INFO - Clear block table cache engines. 2025-05-21 15:38:00,663 - mindformers./output/log[mindformers/modules/block_tables.py:126] - INFO - Clear block table cache engines. 2025-05-21 15:38:00,663 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:341] - INFO - infer without vllm end, not use vllm model 2025-05-21 15:38:00,663 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:341] - INFO - infer without vllm end, not use vllm model 2025-05-21 15:38:00,663 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:355] - INFO - Generating elapsed time: 9.666845083236694 2025-05-21 15:38:00,663 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:341] - INFO - infer without vllm end, not use vllm model 2025-05-21 15:38:00,663 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:355] - INFO - Generating elapsed time: 9.665834426879883 2025-05-21 15:38:00,663 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:355] - INFO - Generating elapsed time: 9.666399240493774 2025-05-21 15:38:00,664 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:313] - INFO - input_ids shape (1, 2048) 2025-05-21 15:38:00,664 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:313] - INFO - input_ids shape (1, 2048) 2025-05-21 15:38:00,664 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:313] - INFO - input_ids shape (1, 2048) 2025-05-21 15:38:00,664 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:322] - INFO - max_tokens 512 2025-05-21 15:38:00,664 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:313] - INFO - input_ids shape (1, 2048) 2025-05-21 15:38:00,664 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:323] - INFO - min_tokens 2 2025-05-21 15:38:00,665 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:322] - INFO - max_tokens 512 2025-05-21 15:38:00,665 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:334] - INFO - infer without vllm, not use vllm model 2025-05-21 15:38:00,665 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:322] - INFO - max_tokens 512 2025-05-21 15:38:00,665 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:323] - INFO - min_tokens 2 2025-05-21 15:38:00,665 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:322] - INFO - max_tokens 512 2025-05-21 15:38:00,665 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:323] - INFO - min_tokens 2 2025-05-21 15:38:00,665 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:334] - INFO - infer without vllm, not use vllm model 2025-05-21 15:38:00,665 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:334] - INFO - infer without vllm, not use vllm model 2025-05-21 15:38:00,665 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:323] - INFO - min_tokens 2 2025-05-21 15:38:00,666 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:334] - INFO - infer without vllm, not use vllm model 2025-05-21 15:38:00,667 - mindformers./output/log[mindformers/generation/text_generator.py:801] - INFO - Generation Config is: {'max_length': 512, 'max_new_tokens': 512, 'min_length': 0, 'min_new_tokens': 2, 'num_beams': 1, 'do_sample': True, 'use_past': True, 'temperature': 1.2, 'top_k': 50, 'top_p': 1.0, 'repetition_penalty': 1.0, 'encoder_repetition_penalty': 1.0, 'renormalize_logits': False, 'return_dict_in_generate': False, 'output_scores': False, 'output_logits': False, 'pad_token_id': 151643, 'bos_token_id': 151643, 'eos_token_id': [151645, 151643], 'parallel_decoding': False, 'window_size': 5, 'level': 5, 'guess_set_size': 3, '_from_model_config': True} 2025-05-21 15:38:00,667 - mindformers./output/log[mindformers/generation/text_generator.py:801] - INFO - Generation Config is: {'max_length': 512, 'max_new_tokens': 512, 'min_length': 0, 'min_new_tokens': 2, 'num_beams': 1, 'do_sample': True, 'use_past': True, 'temperature': 1.2, 'top_k': 50, 'top_p': 1.0, 'repetition_penalty': 1.0, 'encoder_repetition_penalty': 1.0, 'renormalize_logits': False, 'return_dict_in_generate': False, 'output_scores': False, 'output_logits': False, 'pad_token_id': 151643, 'bos_token_id': 151643, 'eos_token_id': [151645, 151643], 'parallel_decoding': False, 'window_size': 5, 'level': 5, 'guess_set_size': 3, '_from_model_config': True} 2025-05-21 15:38:00,667 - mindformers./output/log[mindformers/generation/text_generator.py:801] - INFO - Generation Config is: {'max_length': 512, 'max_new_tokens': 512, 'min_length': 0, 'min_new_tokens': 2, 'num_beams': 1, 'do_sample': True, 'use_past': True, 'temperature': 1.2, 'top_k': 50, 'top_p': 1.0, 'repetition_penalty': 1.0, 'encoder_repetition_penalty': 1.0, 'renormalize_logits': False, 'return_dict_in_generate': False, 'output_scores': False, 'output_logits': False, 'pad_token_id': 151643, 'bos_token_id': 151643, 'eos_token_id': [151645, 151643], 'parallel_decoding': False, 'window_size': 5, 'level': 5, 'guess_set_size': 3, '_from_model_config': True} 2025-05-21 15:38:00,668 - mindformers./output/log[mindformers/generation/text_generator.py:859] - INFO - The generation mode will be **SAMPLE**. 2025-05-21 15:38:00,668 - mindformers./output/log[mindformers/modules/block_tables.py:63] - INFO - init cache engine success. 2025-05-21 15:38:00,668 - mindformers./output/log[mindformers/generation/text_generator.py:859] - INFO - The generation mode will be **SAMPLE**. 2025-05-21 15:38:00,668 - mindformers./output/log[mindformers/generation/text_generator.py:801] - INFO - Generation Config is: {'max_length': 512, 'max_new_tokens': 512, 'min_length': 0, 'min_new_tokens': 2, 'num_beams': 1, 'do_sample': True, 'use_past': True, 'temperature': 1.2, 'top_k': 50, 'top_p': 1.0, 'repetition_penalty': 1.0, 'encoder_repetition_penalty': 1.0, 'renormalize_logits': False, 'return_dict_in_generate': False, 'output_scores': False, 'output_logits': False, 'pad_token_id': 151643, 'bos_token_id': 151643, 'eos_token_id': [151645, 151643], 'parallel_decoding': False, 'window_size': 5, 'level': 5, 'guess_set_size': 3, '_from_model_config': True} 2025-05-21 15:38:00,668 - mindformers./output/log[mindformers/generation/text_generator.py:859] - INFO - The generation mode will be **SAMPLE**. 2025-05-21 15:38:00,668 - mindformers./output/log[mindformers/modules/block_tables.py:63] - INFO - init cache engine success. 2025-05-21 15:38:00,668 - mindformers./output/log[mindformers/modules/block_tables.py:63] - INFO - init cache engine success. 2025-05-21 15:38:00,668 - mindformers./output/log[mindformers/generation/text_generator.py:859] - INFO - The generation mode will be **SAMPLE**. 2025-05-21 15:38:00,669 - mindformers./output/log[mindformers/research/qwen2_5/infer/qwen2_5.py:195] - INFO - Set dynamic input for llama. 2025-05-21 15:38:00,669 - mindformers./output/log[mindformers/modules/block_tables.py:63] - INFO - init cache engine success. 2025-05-21 15:38:00,669 - mindformers./output/log[mindformers/research/qwen2_5/infer/qwen2_5.py:195] - INFO - Set dynamic input for llama. 2025-05-21 15:38:00,669 - mindformers./output/log[mindformers/research/qwen2_5/infer/qwen2_5.py:195] - INFO - Set dynamic input for llama. 2025-05-21 15:38:00,670 - mindformers./output/log[mindformers/research/qwen2_5/infer/qwen2_5.py:195] - INFO - Set dynamic input for llama. 2025-05-21 15:38:10,128 - mindformers./output/log[mindformers/generation/text_generator.py:1067] - INFO - total time: 9.582435369491577 s; generated tokens: 512 tokens; generate speed: 53.431093480692645 tokens/s 2025-05-21 15:38:10,129 - mindformers./output/log[mindformers/generation/text_generator.py:1067] - INFO - total time: 9.582429885864258 s; generated tokens: 512 tokens; generate speed: 53.43112405709209 tokens/s 2025-05-21 15:38:10,129 - mindformers./output/log[mindformers/generation/text_generator.py:1067] - INFO - total time: 9.582090616226196 s; generated tokens: 512 tokens; generate speed: 53.43301587369518 tokens/s 2025-05-21 15:38:10,129 - mindformers./output/log[mindformers/tools/debug_info.py:93] - INFO - prefill prepare time: 0.0013680458068847656 s; prefill predict time: 0.007847070693969727 s; prefill post time: 0.013814687728881836 s; decode prepare time: 0.0010332412682400758 s; decode predict time: 0.004836865499907849 s; decode post time: 0.012788144576339573 s 2025-05-21 15:38:10,129 - mindformers./output/log[mindformers/generation/text_generator.py:1067] - INFO - total time: 9.58215594291687 s; generated tokens: 512 tokens; generate speed: 53.4326515921994 tokens/s 2025-05-21 15:38:10,130 - mindformers./output/log[mindformers/tools/debug_info.py:93] - INFO - prefill prepare time: 0.0014679431915283203 s; prefill predict time: 0.0074770450592041016 s; prefill post time: 0.014368295669555664 s; decode prepare time: 0.001099275749490219 s; decode predict time: 0.004751329795986998 s; decode post time: 0.01280747421100415 s 2025-05-21 15:38:10,130 - mindformers./output/log[mindformers/tools/debug_info.py:93] - INFO - prefill prepare time: 0.0013761520385742188 s; prefill predict time: 0.006990909576416016 s; prefill post time: 0.014828920364379883 s; decode prepare time: 0.0010738200170886494 s; decode predict time: 0.004344870529922784 s; decode post time: 0.013242034296233593 s 2025-05-21 15:38:10,130 - mindformers./output/log[mindformers/tools/debug_info.py:93] - INFO - prefill prepare time: 0.0014667510986328125 s; prefill predict time: 0.0066797733306884766 s; prefill post time: 0.014381170272827148 s; decode prepare time: 0.00107930737698848 s; decode predict time: 0.004334147771199544 s; decode post time: 0.01324738635009049 s 2025-05-21 15:38:10,130 - mindformers./output/log[mindformers/modules/block_tables.py:126] - INFO - Clear block table cache engines. 2025-05-21 15:38:10,131 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:341] - INFO - infer without vllm end, not use vllm model 2025-05-21 15:38:10,131 - mindformers./output/log[mindformers/modules/block_tables.py:126] - INFO - Clear block table cache engines. 2025-05-21 15:38:10,131 - mindformers./output/log[mindformers/modules/block_tables.py:126] - INFO - Clear block table cache engines. 2025-05-21 15:38:10,131 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:355] - INFO - Generating elapsed time: 9.589260339736938 2025-05-21 15:38:10,131 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:341] - INFO - infer without vllm end, not use vllm model 2025-05-21 15:38:10,131 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:341] - INFO - infer without vllm end, not use vllm model 2025-05-21 15:38:10,131 - mindformers./output/log[mindformers/modules/block_tables.py:126] - INFO - Clear block table cache engines. 2025-05-21 15:38:10,131 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:355] - INFO - Generating elapsed time: 9.588776111602783 2025-05-21 15:38:10,131 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:355] - INFO - Generating elapsed time: 9.589359283447266 2025-05-21 15:38:10,131 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:341] - INFO - infer without vllm end, not use vllm model 2025-05-21 15:38:10,131 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:355] - INFO - Generating elapsed time: 9.588635921478271 2025-05-21 15:38:10,132 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:313] - INFO - input_ids shape (1, 2048) 2025-05-21 15:38:10,132 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:313] - INFO - input_ids shape (1, 2048) 2025-05-21 15:38:10,132 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:322] - INFO - max_tokens 512 2025-05-21 15:38:10,132 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:313] - INFO - input_ids shape (1, 2048) 2025-05-21 15:38:10,132 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:313] - INFO - input_ids shape (1, 2048) 2025-05-21 15:38:10,132 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:323] - INFO - min_tokens 2 2025-05-21 15:38:10,133 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:322] - INFO - max_tokens 512 2025-05-21 15:38:10,133 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:334] - INFO - infer without vllm, not use vllm model 2025-05-21 15:38:10,133 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:322] - INFO - max_tokens 512 2025-05-21 15:38:10,133 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:322] - INFO - max_tokens 512 2025-05-21 15:38:10,133 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:323] - INFO - min_tokens 2 2025-05-21 15:38:10,133 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:323] - INFO - min_tokens 2 2025-05-21 15:38:10,133 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:334] - INFO - infer without vllm, not use vllm model 2025-05-21 15:38:10,133 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:323] - INFO - min_tokens 2 2025-05-21 15:38:10,133 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:334] - INFO - infer without vllm, not use vllm model 2025-05-21 15:38:10,133 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:334] - INFO - infer without vllm, not use vllm model 2025-05-21 15:38:10,135 - mindformers./output/log[mindformers/generation/text_generator.py:801] - INFO - Generation Config is: {'max_length': 512, 'max_new_tokens': 512, 'min_length': 0, 'min_new_tokens': 2, 'num_beams': 1, 'do_sample': True, 'use_past': True, 'temperature': 1.2, 'top_k': 50, 'top_p': 1.0, 'repetition_penalty': 1.0, 'encoder_repetition_penalty': 1.0, 'renormalize_logits': False, 'return_dict_in_generate': False, 'output_scores': False, 'output_logits': False, 'pad_token_id': 151643, 'bos_token_id': 151643, 'eos_token_id': [151645, 151643], 'parallel_decoding': False, 'window_size': 5, 'level': 5, 'guess_set_size': 3, '_from_model_config': True} 2025-05-21 15:38:10,135 - mindformers./output/log[mindformers/generation/text_generator.py:801] - INFO - Generation Config is: {'max_length': 512, 'max_new_tokens': 512, 'min_length': 0, 'min_new_tokens': 2, 'num_beams': 1, 'do_sample': True, 'use_past': True, 'temperature': 1.2, 'top_k': 50, 'top_p': 1.0, 'repetition_penalty': 1.0, 'encoder_repetition_penalty': 1.0, 'renormalize_logits': False, 'return_dict_in_generate': False, 'output_scores': False, 'output_logits': False, 'pad_token_id': 151643, 'bos_token_id': 151643, 'eos_token_id': [151645, 151643], 'parallel_decoding': False, 'window_size': 5, 'level': 5, 'guess_set_size': 3, '_from_model_config': True} 2025-05-21 15:38:10,135 - mindformers./output/log[mindformers/generation/text_generator.py:801] - INFO - Generation Config is: {'max_length': 512, 'max_new_tokens': 512, 'min_length': 0, 'min_new_tokens': 2, 'num_beams': 1, 'do_sample': True, 'use_past': True, 'temperature': 1.2, 'top_k': 50, 'top_p': 1.0, 'repetition_penalty': 1.0, 'encoder_repetition_penalty': 1.0, 'renormalize_logits': False, 'return_dict_in_generate': False, 'output_scores': False, 'output_logits': False, 'pad_token_id': 151643, 'bos_token_id': 151643, 'eos_token_id': [151645, 151643], 'parallel_decoding': False, 'window_size': 5, 'level': 5, 'guess_set_size': 3, '_from_model_config': True} 2025-05-21 15:38:10,136 - mindformers./output/log[mindformers/generation/text_generator.py:859] - INFO - The generation mode will be **SAMPLE**. 2025-05-21 15:38:10,136 - mindformers./output/log[mindformers/generation/text_generator.py:801] - INFO - Generation Config is: {'max_length': 512, 'max_new_tokens': 512, 'min_length': 0, 'min_new_tokens': 2, 'num_beams': 1, 'do_sample': True, 'use_past': True, 'temperature': 1.2, 'top_k': 50, 'top_p': 1.0, 'repetition_penalty': 1.0, 'encoder_repetition_penalty': 1.0, 'renormalize_logits': False, 'return_dict_in_generate': False, 'output_scores': False, 'output_logits': False, 'pad_token_id': 151643, 'bos_token_id': 151643, 'eos_token_id': [151645, 151643], 'parallel_decoding': False, 'window_size': 5, 'level': 5, 'guess_set_size': 3, '_from_model_config': True} 2025-05-21 15:38:10,136 - mindformers./output/log[mindformers/modules/block_tables.py:63] - INFO - init cache engine success. 2025-05-21 15:38:10,136 - mindformers./output/log[mindformers/generation/text_generator.py:859] - INFO - The generation mode will be **SAMPLE**. 2025-05-21 15:38:10,136 - mindformers./output/log[mindformers/generation/text_generator.py:859] - INFO - The generation mode will be **SAMPLE**. 2025-05-21 15:38:10,136 - mindformers./output/log[mindformers/modules/block_tables.py:63] - INFO - init cache engine success. 2025-05-21 15:38:10,136 - mindformers./output/log[mindformers/generation/text_generator.py:859] - INFO - The generation mode will be **SAMPLE**. 2025-05-21 15:38:10,136 - mindformers./output/log[mindformers/modules/block_tables.py:63] - INFO - init cache engine success. 2025-05-21 15:38:10,137 - mindformers./output/log[mindformers/modules/block_tables.py:63] - INFO - init cache engine success. 2025-05-21 15:38:10,137 - mindformers./output/log[mindformers/research/qwen2_5/infer/qwen2_5.py:195] - INFO - Set dynamic input for llama. 2025-05-21 15:38:10,137 - mindformers./output/log[mindformers/research/qwen2_5/infer/qwen2_5.py:195] - INFO - Set dynamic input for llama. 2025-05-21 15:38:10,137 - mindformers./output/log[mindformers/research/qwen2_5/infer/qwen2_5.py:195] - INFO - Set dynamic input for llama. 2025-05-21 15:38:10,137 - mindformers./output/log[mindformers/research/qwen2_5/infer/qwen2_5.py:195] - INFO - Set dynamic input for llama. 2025-05-21 15:38:10,289 - mindformers./output/log[mindformers/generation/text_generator.py:1067] - INFO - total time: 9.619819641113281 s; generated tokens: 512 tokens; generate speed: 53.22345107301277 tokens/s 2025-05-21 15:38:10,289 - mindformers./output/log[mindformers/generation/text_generator.py:1067] - INFO - total time: 9.619507551193237 s; generated tokens: 512 tokens; generate speed: 53.225177824876255 tokens/s 2025-05-21 15:38:10,289 - mindformers./output/log[mindformers/tools/debug_info.py:93] - INFO - prefill prepare time: 0.001369476318359375 s; prefill predict time: 0.006827831268310547 s; prefill post time: 0.01455831527709961 s; decode prepare time: 0.0009936717158427677 s; decode predict time: 0.005123439489626417 s; decode post time: 0.012617295035644054 s 2025-05-21 15:38:10,290 - mindformers./output/log[mindformers/tools/debug_info.py:93] - INFO - prefill prepare time: 0.0013587474822998047 s; prefill predict time: 0.00666499137878418 s; prefill post time: 0.014207124710083008 s; decode prepare time: 0.0010141248572362613 s; decode predict time: 0.004872933088564405 s; decode post time: 0.012849550657776238 s 2025-05-21 15:38:10,290 - mindformers./output/log[mindformers/generation/text_generator.py:1067] - INFO - total time: 9.620413780212402 s; generated tokens: 512 tokens; generate speed: 53.220164090353286 tokens/s 2025-05-21 15:38:10,290 - mindformers./output/log[mindformers/generation/text_generator.py:1067] - INFO - total time: 9.620013236999512 s; generated tokens: 512 tokens; generate speed: 53.22237998912495 tokens/s 2025-05-21 15:38:10,290 - mindformers./output/log[mindformers/tools/debug_info.py:93] - INFO - prefill prepare time: 0.0013141632080078125 s; prefill predict time: 0.006579875946044922 s; prefill post time: 0.0146942138671875 s; decode prepare time: 0.0010565778280657565 s; decode predict time: 0.004432276650971057 s; decode post time: 0.013247265974369534 s 2025-05-21 15:38:10,290 - mindformers./output/log[mindformers/modules/block_tables.py:126] - INFO - Clear block table cache engines. 2025-05-21 15:38:10,290 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:341] - INFO - infer without vllm end, not use vllm model 2025-05-21 15:38:10,291 - mindformers./output/log[mindformers/modules/block_tables.py:126] - INFO - Clear block table cache engines. 2025-05-21 15:38:10,291 - mindformers./output/log[mindformers/tools/debug_info.py:93] - INFO - prefill prepare time: 0.0013699531555175781 s; prefill predict time: 0.005820035934448242 s; prefill post time: 0.01482534408569336 s; decode prepare time: 0.001030987722766376 s; decode predict time: 0.004432093395906336 s; decode post time: 0.01327689846434472 s 2025-05-21 15:38:10,291 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:355] - INFO - Generating elapsed time: 9.62644338607788 2025-05-21 15:38:10,291 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:341] - INFO - infer without vllm end, not use vllm model 2025-05-21 15:38:10,291 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:355] - INFO - Generating elapsed time: 9.6262845993042 2025-05-21 15:38:10,291 - mindformers./output/log[mindformers/modules/block_tables.py:126] - INFO - Clear block table cache engines. 2025-05-21 15:38:10,291 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:341] - INFO - infer without vllm end, not use vllm model 2025-05-21 15:38:10,291 - mindformers./output/log[mindformers/modules/block_tables.py:126] - INFO - Clear block table cache engines. 2025-05-21 15:38:10,291 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:355] - INFO - Generating elapsed time: 9.626890659332275 2025-05-21 15:38:10,292 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:341] - INFO - infer without vllm end, not use vllm model 2025-05-21 15:38:10,292 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:313] - INFO - input_ids shape (1, 2048) 2025-05-21 15:38:10,292 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:355] - INFO - Generating elapsed time: 9.627089262008667 2025-05-21 15:38:10,292 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:313] - INFO - input_ids shape (1, 2048) 2025-05-21 15:38:10,292 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:322] - INFO - max_tokens 512 2025-05-21 15:38:10,292 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:323] - INFO - min_tokens 2 2025-05-21 15:38:10,292 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:322] - INFO - max_tokens 512 2025-05-21 15:38:10,293 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:334] - INFO - infer without vllm, not use vllm model 2025-05-21 15:38:10,293 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:313] - INFO - input_ids shape (1, 2048) 2025-05-21 15:38:10,293 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:323] - INFO - min_tokens 2 2025-05-21 15:38:10,293 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:334] - INFO - infer without vllm, not use vllm model 2025-05-21 15:38:10,293 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:313] - INFO - input_ids shape (1, 2048) 2025-05-21 15:38:10,293 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:322] - INFO - max_tokens 512 2025-05-21 15:38:10,293 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:323] - INFO - min_tokens 2 2025-05-21 15:38:10,293 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:322] - INFO - max_tokens 512 2025-05-21 15:38:10,294 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:334] - INFO - infer without vllm, not use vllm model 2025-05-21 15:38:10,294 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:323] - INFO - min_tokens 2 2025-05-21 15:38:10,294 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:334] - INFO - infer without vllm, not use vllm model 2025-05-21 15:38:10,295 - mindformers./output/log[mindformers/generation/text_generator.py:801] - INFO - Generation Config is: {'max_length': 512, 'max_new_tokens': 512, 'min_length': 0, 'min_new_tokens': 2, 'num_beams': 1, 'do_sample': True, 'use_past': True, 'temperature': 1.2, 'top_k': 50, 'top_p': 1.0, 'repetition_penalty': 1.0, 'encoder_repetition_penalty': 1.0, 'renormalize_logits': False, 'return_dict_in_generate': False, 'output_scores': False, 'output_logits': False, 'pad_token_id': 151643, 'bos_token_id': 151643, 'eos_token_id': [151645, 151643], 'parallel_decoding': False, 'window_size': 5, 'level': 5, 'guess_set_size': 3, '_from_model_config': True} 2025-05-21 15:38:10,295 - mindformers./output/log[mindformers/generation/text_generator.py:801] - INFO - Generation Config is: {'max_length': 512, 'max_new_tokens': 512, 'min_length': 0, 'min_new_tokens': 2, 'num_beams': 1, 'do_sample': True, 'use_past': True, 'temperature': 1.2, 'top_k': 50, 'top_p': 1.0, 'repetition_penalty': 1.0, 'encoder_repetition_penalty': 1.0, 'renormalize_logits': False, 'return_dict_in_generate': False, 'output_scores': False, 'output_logits': False, 'pad_token_id': 151643, 'bos_token_id': 151643, 'eos_token_id': [151645, 151643], 'parallel_decoding': False, 'window_size': 5, 'level': 5, 'guess_set_size': 3, '_from_model_config': True} 2025-05-21 15:38:10,295 - mindformers./output/log[mindformers/generation/text_generator.py:859] - INFO - The generation mode will be **SAMPLE**. 2025-05-21 15:38:10,296 - mindformers./output/log[mindformers/modules/block_tables.py:63] - INFO - init cache engine success. 2025-05-21 15:38:10,296 - mindformers./output/log[mindformers/generation/text_generator.py:859] - INFO - The generation mode will be **SAMPLE**. 2025-05-21 15:38:10,296 - mindformers./output/log[mindformers/generation/text_generator.py:801] - INFO - Generation Config is: {'max_length': 512, 'max_new_tokens': 512, 'min_length': 0, 'min_new_tokens': 2, 'num_beams': 1, 'do_sample': True, 'use_past': True, 'temperature': 1.2, 'top_k': 50, 'top_p': 1.0, 'repetition_penalty': 1.0, 'encoder_repetition_penalty': 1.0, 'renormalize_logits': False, 'return_dict_in_generate': False, 'output_scores': False, 'output_logits': False, 'pad_token_id': 151643, 'bos_token_id': 151643, 'eos_token_id': [151645, 151643], 'parallel_decoding': False, 'window_size': 5, 'level': 5, 'guess_set_size': 3, '_from_model_config': True} 2025-05-21 15:38:10,296 - mindformers./output/log[mindformers/modules/block_tables.py:63] - INFO - init cache engine success. 2025-05-21 15:38:10,296 - mindformers./output/log[mindformers/generation/text_generator.py:801] - INFO - Generation Config is: {'max_length': 512, 'max_new_tokens': 512, 'min_length': 0, 'min_new_tokens': 2, 'num_beams': 1, 'do_sample': True, 'use_past': True, 'temperature': 1.2, 'top_k': 50, 'top_p': 1.0, 'repetition_penalty': 1.0, 'encoder_repetition_penalty': 1.0, 'renormalize_logits': False, 'return_dict_in_generate': False, 'output_scores': False, 'output_logits': False, 'pad_token_id': 151643, 'bos_token_id': 151643, 'eos_token_id': [151645, 151643], 'parallel_decoding': False, 'window_size': 5, 'level': 5, 'guess_set_size': 3, '_from_model_config': True} 2025-05-21 15:38:10,296 - mindformers./output/log[mindformers/research/qwen2_5/infer/qwen2_5.py:195] - INFO - Set dynamic input for llama. 2025-05-21 15:38:10,296 - mindformers./output/log[mindformers/generation/text_generator.py:859] - INFO - The generation mode will be **SAMPLE**. 2025-05-21 15:38:10,297 - mindformers./output/log[mindformers/modules/block_tables.py:63] - INFO - init cache engine success. 2025-05-21 15:38:10,297 - mindformers./output/log[mindformers/generation/text_generator.py:859] - INFO - The generation mode will be **SAMPLE**. 2025-05-21 15:38:10,297 - mindformers./output/log[mindformers/research/qwen2_5/infer/qwen2_5.py:195] - INFO - Set dynamic input for llama. 2025-05-21 15:38:10,297 - mindformers./output/log[mindformers/modules/block_tables.py:63] - INFO - init cache engine success. 2025-05-21 15:38:10,298 - mindformers./output/log[mindformers/research/qwen2_5/infer/qwen2_5.py:195] - INFO - Set dynamic input for llama. 2025-05-21 15:38:10,298 - mindformers./output/log[mindformers/research/qwen2_5/infer/qwen2_5.py:195] - INFO - Set dynamic input for llama. 2025-05-21 15:38:19,648 - mindformers./output/log[mindformers/generation/text_generator.py:1067] - INFO - total time: 9.510194063186646 s; generated tokens: 512 tokens; generate speed: 53.8369665853528 tokens/s 2025-05-21 15:38:19,648 - mindformers./output/log[mindformers/generation/text_generator.py:1067] - INFO - total time: 9.51082992553711 s; generated tokens: 512 tokens; generate speed: 53.83336722542492 tokens/s 2025-05-21 15:38:19,648 - mindformers./output/log[mindformers/generation/text_generator.py:1067] - INFO - total time: 9.511406660079956 s; generated tokens: 512 tokens; generate speed: 53.83010298034044 tokens/s 2025-05-21 15:38:19,649 - mindformers./output/log[mindformers/tools/debug_info.py:93] - INFO - prefill prepare time: 0.0013480186462402344 s; prefill predict time: 0.00607752799987793 s; prefill post time: 0.014354467391967773 s; decode prepare time: 0.0010807364887454039 s; decode predict time: 0.004693163142484777 s; decode post time: 0.01274713798045179 s 2025-05-21 15:38:19,648 - mindformers./output/log[mindformers/generation/text_generator.py:1067] - INFO - total time: 9.511008262634277 s; generated tokens: 512 tokens; generate speed: 53.83235781757072 tokens/s 2025-05-21 15:38:19,649 - mindformers./output/log[mindformers/tools/debug_info.py:93] - INFO - prefill prepare time: 0.0013582706451416016 s; prefill predict time: 0.006270647048950195 s; prefill post time: 0.014359235763549805 s; decode prepare time: 0.0010802363229591085 s; decode predict time: 0.004221258443944594 s; decode post time: 0.013221205097355256 s 2025-05-21 15:38:19,649 - mindformers./output/log[mindformers/tools/debug_info.py:93] - INFO - prefill prepare time: 0.0013630390167236328 s; prefill predict time: 0.0070416927337646484 s; prefill post time: 0.013704538345336914 s; decode prepare time: 0.0010246516673765537 s; decode predict time: 0.004712782654107786 s; decode post time: 0.012783954288161665 s 2025-05-21 15:38:19,649 - mindformers./output/log[mindformers/tools/debug_info.py:93] - INFO - prefill prepare time: 0.001539468765258789 s; prefill predict time: 0.006150007247924805 s; prefill post time: 0.014444589614868164 s; decode prepare time: 0.001058393261903886 s; decode predict time: 0.004287432221805348 s; decode post time: 0.013178299084568211 s 2025-05-21 15:38:19,650 - mindformers./output/log[mindformers/modules/block_tables.py:126] - INFO - Clear block table cache engines. 2025-05-21 15:38:19,650 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:341] - INFO - infer without vllm end, not use vllm model 2025-05-21 15:38:19,650 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:355] - INFO - Generating elapsed time: 9.517282724380493 2025-05-21 15:38:19,650 - mindformers./output/log[mindformers/modules/block_tables.py:126] - INFO - Clear block table cache engines. 2025-05-21 15:38:19,650 - mindformers./output/log[mindformers/modules/block_tables.py:126] - INFO - Clear block table cache engines. 2025-05-21 15:38:19,650 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:341] - INFO - infer without vllm end, not use vllm model 2025-05-21 15:38:19,650 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:341] - INFO - infer without vllm end, not use vllm model 2025-05-21 15:38:19,650 - mindformers./output/log[mindformers/modules/block_tables.py:126] - INFO - Clear block table cache engines. 2025-05-21 15:38:19,650 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:355] - INFO - Generating elapsed time: 9.517593145370483 2025-05-21 15:38:19,650 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:355] - INFO - Generating elapsed time: 9.518245458602905 2025-05-21 15:38:19,651 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:341] - INFO - infer without vllm end, not use vllm model 2025-05-21 15:38:19,651 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:355] - INFO - Generating elapsed time: 9.51819896697998 2025-05-21 15:38:19,651 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:313] - INFO - input_ids shape (1, 2048) 2025-05-21 15:38:19,651 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:313] - INFO - input_ids shape (1, 2048) 2025-05-21 15:38:19,652 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:322] - INFO - max_tokens 512 2025-05-21 15:38:19,652 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:313] - INFO - input_ids shape (1, 2048) 2025-05-21 15:38:19,652 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:323] - INFO - min_tokens 2 2025-05-21 15:38:19,652 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:334] - INFO - infer without vllm, not use vllm model 2025-05-21 15:38:19,652 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:322] - INFO - max_tokens 512 2025-05-21 15:38:19,652 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:313] - INFO - input_ids shape (1, 2048) 2025-05-21 15:38:19,652 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:323] - INFO - min_tokens 2 2025-05-21 15:38:19,652 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:322] - INFO - max_tokens 512 2025-05-21 15:38:19,652 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:334] - INFO - infer without vllm, not use vllm model 2025-05-21 15:38:19,652 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:323] - INFO - min_tokens 2 2025-05-21 15:38:19,652 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:322] - INFO - max_tokens 512 2025-05-21 15:38:19,653 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:334] - INFO - infer without vllm, not use vllm model 2025-05-21 15:38:19,653 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:323] - INFO - min_tokens 2 2025-05-21 15:38:19,653 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:334] - INFO - infer without vllm, not use vllm model 2025-05-21 15:38:19,654 - mindformers./output/log[mindformers/generation/text_generator.py:801] - INFO - Generation Config is: {'max_length': 512, 'max_new_tokens': 512, 'min_length': 0, 'min_new_tokens': 2, 'num_beams': 1, 'do_sample': True, 'use_past': True, 'temperature': 1.2, 'top_k': 50, 'top_p': 1.0, 'repetition_penalty': 1.0, 'encoder_repetition_penalty': 1.0, 'renormalize_logits': False, 'return_dict_in_generate': False, 'output_scores': False, 'output_logits': False, 'pad_token_id': 151643, 'bos_token_id': 151643, 'eos_token_id': [151645, 151643], 'parallel_decoding': False, 'window_size': 5, 'level': 5, 'guess_set_size': 3, '_from_model_config': True} 2025-05-21 15:38:19,655 - mindformers./output/log[mindformers/generation/text_generator.py:801] - INFO - Generation Config is: {'max_length': 512, 'max_new_tokens': 512, 'min_length': 0, 'min_new_tokens': 2, 'num_beams': 1, 'do_sample': True, 'use_past': True, 'temperature': 1.2, 'top_k': 50, 'top_p': 1.0, 'repetition_penalty': 1.0, 'encoder_repetition_penalty': 1.0, 'renormalize_logits': False, 'return_dict_in_generate': False, 'output_scores': False, 'output_logits': False, 'pad_token_id': 151643, 'bos_token_id': 151643, 'eos_token_id': [151645, 151643], 'parallel_decoding': False, 'window_size': 5, 'level': 5, 'guess_set_size': 3, '_from_model_config': True} 2025-05-21 15:38:19,655 - mindformers./output/log[mindformers/generation/text_generator.py:801] - INFO - Generation Config is: {'max_length': 512, 'max_new_tokens': 512, 'min_length': 0, 'min_new_tokens': 2, 'num_beams': 1, 'do_sample': True, 'use_past': True, 'temperature': 1.2, 'top_k': 50, 'top_p': 1.0, 'repetition_penalty': 1.0, 'encoder_repetition_penalty': 1.0, 'renormalize_logits': False, 'return_dict_in_generate': False, 'output_scores': False, 'output_logits': False, 'pad_token_id': 151643, 'bos_token_id': 151643, 'eos_token_id': [151645, 151643], 'parallel_decoding': False, 'window_size': 5, 'level': 5, 'guess_set_size': 3, '_from_model_config': True} 2025-05-21 15:38:19,655 - mindformers./output/log[mindformers/generation/text_generator.py:859] - INFO - The generation mode will be **SAMPLE**. 2025-05-21 15:38:19,655 - mindformers./output/log[mindformers/modules/block_tables.py:63] - INFO - init cache engine success. 2025-05-21 15:38:19,655 - mindformers./output/log[mindformers/generation/text_generator.py:859] - INFO - The generation mode will be **SAMPLE**. 2025-05-21 15:38:19,655 - mindformers./output/log[mindformers/generation/text_generator.py:801] - INFO - Generation Config is: {'max_length': 512, 'max_new_tokens': 512, 'min_length': 0, 'min_new_tokens': 2, 'num_beams': 1, 'do_sample': True, 'use_past': True, 'temperature': 1.2, 'top_k': 50, 'top_p': 1.0, 'repetition_penalty': 1.0, 'encoder_repetition_penalty': 1.0, 'renormalize_logits': False, 'return_dict_in_generate': False, 'output_scores': False, 'output_logits': False, 'pad_token_id': 151643, 'bos_token_id': 151643, 'eos_token_id': [151645, 151643], 'parallel_decoding': False, 'window_size': 5, 'level': 5, 'guess_set_size': 3, '_from_model_config': True} 2025-05-21 15:38:19,655 - mindformers./output/log[mindformers/generation/text_generator.py:859] - INFO - The generation mode will be **SAMPLE**. 2025-05-21 15:38:19,655 - mindformers./output/log[mindformers/modules/block_tables.py:63] - INFO - init cache engine success. 2025-05-21 15:38:19,656 - mindformers./output/log[mindformers/modules/block_tables.py:63] - INFO - init cache engine success. 2025-05-21 15:38:19,656 - mindformers./output/log[mindformers/research/qwen2_5/infer/qwen2_5.py:195] - INFO - Set dynamic input for llama. 2025-05-21 15:38:19,656 - mindformers./output/log[mindformers/generation/text_generator.py:859] - INFO - The generation mode will be **SAMPLE**. 2025-05-21 15:38:19,656 - mindformers./output/log[mindformers/research/qwen2_5/infer/qwen2_5.py:195] - INFO - Set dynamic input for llama. 2025-05-21 15:38:19,656 - mindformers./output/log[mindformers/modules/block_tables.py:63] - INFO - init cache engine success. 2025-05-21 15:38:19,656 - mindformers./output/log[mindformers/research/qwen2_5/infer/qwen2_5.py:195] - INFO - Set dynamic input for llama. 2025-05-21 15:38:19,657 - mindformers./output/log[mindformers/research/qwen2_5/infer/qwen2_5.py:195] - INFO - Set dynamic input for llama. 2025-05-21 15:38:19,872 - mindformers./output/log[mindformers/generation/text_generator.py:1067] - INFO - total time: 9.575591087341309 s; generated tokens: 512 tokens; generate speed: 53.469284071335416 tokens/s 2025-05-21 15:38:19,873 - mindformers./output/log[mindformers/tools/debug_info.py:93] - INFO - prefill prepare time: 0.0013763904571533203 s; prefill predict time: 0.007967472076416016 s; prefill post time: 0.013610601425170898 s; decode prepare time: 0.0009862737413021917 s; decode predict time: 0.005052024242924709 s; decode post time: 0.012605398368462192 s 2025-05-21 15:38:19,873 - mindformers./output/log[mindformers/generation/text_generator.py:1067] - INFO - total time: 9.575320720672607 s; generated tokens: 512 tokens; generate speed: 53.47079381838556 tokens/s 2025-05-21 15:38:19,873 - mindformers./output/log[mindformers/generation/text_generator.py:1067] - INFO - total time: 9.575040817260742 s; generated tokens: 512 tokens; generate speed: 53.472356909124336 tokens/s 2025-05-21 15:38:19,873 - mindformers./output/log[mindformers/generation/text_generator.py:1067] - INFO - total time: 9.576139450073242 s; generated tokens: 512 tokens; generate speed: 53.46622223594332 tokens/s 2025-05-21 15:38:19,874 - mindformers./output/log[mindformers/tools/debug_info.py:93] - INFO - prefill prepare time: 0.0013568401336669922 s; prefill predict time: 0.0073719024658203125 s; prefill post time: 0.015121936798095703 s; decode prepare time: 0.0010502133117496618 s; decode predict time: 0.004414811321333343 s; decode post time: 0.01318091189091453 s 2025-05-21 15:38:19,874 - mindformers./output/log[mindformers/tools/debug_info.py:93] - INFO - prefill prepare time: 0.0013873577117919922 s; prefill predict time: 0.0076904296875 s; prefill post time: 0.013845205307006836 s; decode prepare time: 0.0010115493766948902 s; decode predict time: 0.004751490144168629 s; decode post time: 0.01288502976852387 s 2025-05-21 15:38:19,874 - mindformers./output/log[mindformers/tools/debug_info.py:93] - INFO - prefill prepare time: 0.0013353824615478516 s; prefill predict time: 0.006731271743774414 s; prefill post time: 0.014550924301147461 s; decode prepare time: 0.0010357397643087427 s; decode predict time: 0.004343736405466117 s; decode post time: 0.013268910742085974 s 2025-05-21 15:38:19,874 - mindformers./output/log[mindformers/modules/block_tables.py:126] - INFO - Clear block table cache engines. 2025-05-21 15:38:19,874 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:341] - INFO - infer without vllm end, not use vllm model 2025-05-21 15:38:19,874 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:355] - INFO - Generating elapsed time: 9.582321166992188 2025-05-21 15:38:19,875 - mindformers./output/log[mindformers/modules/block_tables.py:126] - INFO - Clear block table cache engines. 2025-05-21 15:38:19,875 - mindformers./output/log[mindformers/modules/block_tables.py:126] - INFO - Clear block table cache engines. 2025-05-21 15:38:19,875 - mindformers./output/log[mindformers/modules/block_tables.py:126] - INFO - Clear block table cache engines. 2025-05-21 15:38:19,875 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:341] - INFO - infer without vllm end, not use vllm model 2025-05-21 15:38:19,875 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:341] - INFO - infer without vllm end, not use vllm model 2025-05-21 15:38:19,875 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:355] - INFO - Generating elapsed time: 9.581905364990234 2025-05-21 15:38:19,875 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:341] - INFO - infer without vllm end, not use vllm model 2025-05-21 15:38:19,875 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:355] - INFO - Generating elapsed time: 9.581797361373901 2025-05-21 15:38:19,875 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:355] - INFO - Generating elapsed time: 9.5829336643219 2025-05-21 15:38:19,875 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:313] - INFO - input_ids shape (1, 2048) 2025-05-21 15:38:19,876 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:322] - INFO - max_tokens 512 2025-05-21 15:38:19,876 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:323] - INFO - min_tokens 2 2025-05-21 15:38:19,876 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:313] - INFO - input_ids shape (1, 2048) 2025-05-21 15:38:19,876 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:334] - INFO - infer without vllm, not use vllm model 2025-05-21 15:38:19,876 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:313] - INFO - input_ids shape (1, 2048) 2025-05-21 15:38:19,877 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:322] - INFO - max_tokens 512 2025-05-21 15:38:19,877 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:313] - INFO - input_ids shape (1, 2048) 2025-05-21 15:38:19,877 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:323] - INFO - min_tokens 2 2025-05-21 15:38:19,877 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:322] - INFO - max_tokens 512 2025-05-21 15:38:19,877 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:334] - INFO - infer without vllm, not use vllm model 2025-05-21 15:38:19,877 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:323] - INFO - min_tokens 2 2025-05-21 15:38:19,877 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:322] - INFO - max_tokens 512 2025-05-21 15:38:19,877 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:334] - INFO - infer without vllm, not use vllm model 2025-05-21 15:38:19,878 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:323] - INFO - min_tokens 2 2025-05-21 15:38:19,878 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:334] - INFO - infer without vllm, not use vllm model 2025-05-21 15:38:19,879 - mindformers./output/log[mindformers/generation/text_generator.py:801] - INFO - Generation Config is: {'max_length': 512, 'max_new_tokens': 512, 'min_length': 0, 'min_new_tokens': 2, 'num_beams': 1, 'do_sample': True, 'use_past': True, 'temperature': 1.2, 'top_k': 50, 'top_p': 1.0, 'repetition_penalty': 1.0, 'encoder_repetition_penalty': 1.0, 'renormalize_logits': False, 'return_dict_in_generate': False, 'output_scores': False, 'output_logits': False, 'pad_token_id': 151643, 'bos_token_id': 151643, 'eos_token_id': [151645, 151643], 'parallel_decoding': False, 'window_size': 5, 'level': 5, 'guess_set_size': 3, '_from_model_config': True} 2025-05-21 15:38:19,879 - mindformers./output/log[mindformers/generation/text_generator.py:859] - INFO - The generation mode will be **SAMPLE**. 2025-05-21 15:38:19,879 - mindformers./output/log[mindformers/generation/text_generator.py:801] - INFO - Generation Config is: {'max_length': 512, 'max_new_tokens': 512, 'min_length': 0, 'min_new_tokens': 2, 'num_beams': 1, 'do_sample': True, 'use_past': True, 'temperature': 1.2, 'top_k': 50, 'top_p': 1.0, 'repetition_penalty': 1.0, 'encoder_repetition_penalty': 1.0, 'renormalize_logits': False, 'return_dict_in_generate': False, 'output_scores': False, 'output_logits': False, 'pad_token_id': 151643, 'bos_token_id': 151643, 'eos_token_id': [151645, 151643], 'parallel_decoding': False, 'window_size': 5, 'level': 5, 'guess_set_size': 3, '_from_model_config': True} 2025-05-21 15:38:19,879 - mindformers./output/log[mindformers/modules/block_tables.py:63] - INFO - init cache engine success. 2025-05-21 15:38:19,880 - mindformers./output/log[mindformers/generation/text_generator.py:801] - INFO - Generation Config is: {'max_length': 512, 'max_new_tokens': 512, 'min_length': 0, 'min_new_tokens': 2, 'num_beams': 1, 'do_sample': True, 'use_past': True, 'temperature': 1.2, 'top_k': 50, 'top_p': 1.0, 'repetition_penalty': 1.0, 'encoder_repetition_penalty': 1.0, 'renormalize_logits': False, 'return_dict_in_generate': False, 'output_scores': False, 'output_logits': False, 'pad_token_id': 151643, 'bos_token_id': 151643, 'eos_token_id': [151645, 151643], 'parallel_decoding': False, 'window_size': 5, 'level': 5, 'guess_set_size': 3, '_from_model_config': True} 2025-05-21 15:38:19,880 - mindformers./output/log[mindformers/generation/text_generator.py:859] - INFO - The generation mode will be **SAMPLE**. 2025-05-21 15:38:19,880 - mindformers./output/log[mindformers/generation/text_generator.py:801] - INFO - Generation Config is: {'max_length': 512, 'max_new_tokens': 512, 'min_length': 0, 'min_new_tokens': 2, 'num_beams': 1, 'do_sample': True, 'use_past': True, 'temperature': 1.2, 'top_k': 50, 'top_p': 1.0, 'repetition_penalty': 1.0, 'encoder_repetition_penalty': 1.0, 'renormalize_logits': False, 'return_dict_in_generate': False, 'output_scores': False, 'output_logits': False, 'pad_token_id': 151643, 'bos_token_id': 151643, 'eos_token_id': [151645, 151643], 'parallel_decoding': False, 'window_size': 5, 'level': 5, 'guess_set_size': 3, '_from_model_config': True} 2025-05-21 15:38:19,880 - mindformers./output/log[mindformers/modules/block_tables.py:63] - INFO - init cache engine success. 2025-05-21 15:38:19,880 - mindformers./output/log[mindformers/research/qwen2_5/infer/qwen2_5.py:195] - INFO - Set dynamic input for llama. 2025-05-21 15:38:19,880 - mindformers./output/log[mindformers/generation/text_generator.py:859] - INFO - The generation mode will be **SAMPLE**. 2025-05-21 15:38:19,881 - mindformers./output/log[mindformers/modules/block_tables.py:63] - INFO - init cache engine success. 2025-05-21 15:38:19,881 - mindformers./output/log[mindformers/generation/text_generator.py:859] - INFO - The generation mode will be **SAMPLE**. 2025-05-21 15:38:19,881 - mindformers./output/log[mindformers/research/qwen2_5/infer/qwen2_5.py:195] - INFO - Set dynamic input for llama. 2025-05-21 15:38:19,881 - mindformers./output/log[mindformers/modules/block_tables.py:63] - INFO - init cache engine success. 2025-05-21 15:38:19,881 - mindformers./output/log[mindformers/research/qwen2_5/infer/qwen2_5.py:195] - INFO - Set dynamic input for llama. 2025-05-21 15:38:19,882 - mindformers./output/log[mindformers/research/qwen2_5/infer/qwen2_5.py:195] - INFO - Set dynamic input for llama. 2025-05-21 15:38:29,155 - mindformers./output/log[mindformers/generation/text_generator.py:1067] - INFO - total time: 9.499104976654053 s; generated tokens: 512 tokens; generate speed: 53.899814904492814 tokens/s 2025-05-21 15:38:29,155 - mindformers./output/log[mindformers/generation/text_generator.py:1067] - INFO - total time: 9.498814105987549 s; generated tokens: 512 tokens; generate speed: 53.90146541316798 tokens/s 2025-05-21 15:38:29,156 - mindformers./output/log[mindformers/generation/text_generator.py:1067] - INFO - total time: 9.49933910369873 s; generated tokens: 512 tokens; generate speed: 53.89848645372014 tokens/s 2025-05-21 15:38:29,156 - mindformers./output/log[mindformers/tools/debug_info.py:93] - INFO - prefill prepare time: 0.0016427040100097656 s; prefill predict time: 0.006772756576538086 s; prefill post time: 0.013421297073364258 s; decode prepare time: 0.0010626535359660706 s; decode predict time: 0.004801271943485036 s; decode post time: 0.012635503963015084 s 2025-05-21 15:38:29,156 - mindformers./output/log[mindformers/generation/text_generator.py:1067] - INFO - total time: 9.49839162826538 s; generated tokens: 512 tokens; generate speed: 53.903862889416644 tokens/s 2025-05-21 15:38:29,156 - mindformers./output/log[mindformers/tools/debug_info.py:93] - INFO - prefill prepare time: 0.0013930797576904297 s; prefill predict time: 0.006499767303466797 s; prefill post time: 0.01402592658996582 s; decode prepare time: 0.0010227088592290412 s; decode predict time: 0.004735219712350883 s; decode post time: 0.012739241239842604 s 2025-05-21 15:38:29,157 - mindformers./output/log[mindformers/tools/debug_info.py:93] - INFO - prefill prepare time: 0.0014233589172363281 s; prefill predict time: 0.0066127777099609375 s; prefill post time: 0.014070987701416016 s; decode prepare time: 0.0010694113729517987 s; decode predict time: 0.004214614980361041 s; decode post time: 0.013215045872966371 s 2025-05-21 15:38:29,157 - mindformers./output/log[mindformers/tools/debug_info.py:93] - INFO - prefill prepare time: 0.0013904571533203125 s; prefill predict time: 0.0058078765869140625 s; prefill post time: 0.014394760131835938 s; decode prepare time: 0.0010450068285320603 s; decode predict time: 0.0043152533325494504 s; decode post time: 0.013139527371251652 s 2025-05-21 15:38:29,157 - mindformers./output/log[mindformers/modules/block_tables.py:126] - INFO - Clear block table cache engines. 2025-05-21 15:38:29,157 - mindformers./output/log[mindformers/modules/block_tables.py:126] - INFO - Clear block table cache engines. 2025-05-21 15:38:29,157 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:341] - INFO - infer without vllm end, not use vllm model 2025-05-21 15:38:29,157 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:341] - INFO - infer without vllm end, not use vllm model 2025-05-21 15:38:29,157 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:355] - INFO - Generating elapsed time: 9.505927324295044 2025-05-21 15:38:29,158 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:355] - INFO - Generating elapsed time: 9.505491971969604 2025-05-21 15:38:29,158 - mindformers./output/log[mindformers/modules/block_tables.py:126] - INFO - Clear block table cache engines. 2025-05-21 15:38:29,158 - mindformers./output/log[mindformers/modules/block_tables.py:126] - INFO - Clear block table cache engines. 2025-05-21 15:38:29,158 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:341] - INFO - infer without vllm end, not use vllm model 2025-05-21 15:38:29,158 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:341] - INFO - infer without vllm end, not use vllm model 2025-05-21 15:38:29,158 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:355] - INFO - Generating elapsed time: 9.50572681427002 2025-05-21 15:38:29,158 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:355] - INFO - Generating elapsed time: 9.506309032440186 2025-05-21 15:38:29,158 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/trainer/spmd/grpo_trainer.py:672] - INFO - generation end at 15:33:12------------------------------- 2025-05-21 15:38:29,159 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/trainer/spmd/grpo_trainer.py:672] - INFO - generation end at 15:33:12------------------------------- 2025-05-21 15:38:29,159 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:414] - INFO - before offload stf infer {'total_reserved_memory': 5377097728, 'total_allocated_memory': 3085838336, 'total_idle_memory': 2291259392, 'total_eager_free_memory': 0, 'max_reserved_memory': 5377097728, 'max_allocated_memory': 3592866816, 'commom_mem_pool_stats': {'block_unit_size': 1073741824, 'block_counts': 5, 'blocks_info': {: {'block_stream_id': 0, 'block_memory_size': 1073741824}, : {'block_stream_id': 0, 'block_memory_size': 1073741824}, : {'block_stream_id': 0, 'block_memory_size': 1073741824}, : {'block_stream_id': 0, 'block_memory_size': 8388608}, : {'block_stream_id': 0, 'block_memory_size': 1073741824}}}, 'persistent_mem_pool_stats': {'block_counts': 1, 'block_unit_size': 1073741824, 'blocks_info': {: {'block_stream_id': 0, 'block_memory_size': 1073741824}}}} 2025-05-21 15:38:29,159 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/trainer/spmd/grpo_trainer.py:672] - INFO - generation end at 15:33:11------------------------------- 2025-05-21 15:38:29,159 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:414] - INFO - before offload stf infer {'total_reserved_memory': 5377097728, 'total_allocated_memory': 3085839360, 'total_idle_memory': 2291258368, 'total_eager_free_memory': 0, 'max_reserved_memory': 5377097728, 'max_allocated_memory': 3592866816, 'commom_mem_pool_stats': {'block_unit_size': 1073741824, 'block_counts': 5, 'blocks_info': {: {'block_stream_id': 0, 'block_memory_size': 1073741824}, : {'block_stream_id': 0, 'block_memory_size': 1073741824}, : {'block_stream_id': 0, 'block_memory_size': 1073741824}, : {'block_stream_id': 0, 'block_memory_size': 8388608}, : {'block_stream_id': 0, 'block_memory_size': 1073741824}}}, 'persistent_mem_pool_stats': {'block_counts': 1, 'block_unit_size': 1073741824, 'blocks_info': {: {'block_stream_id': 0, 'block_memory_size': 1073741824}}}} 2025-05-21 15:38:29,159 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/trainer/spmd/grpo_trainer.py:672] - INFO - generation end at 15:33:12------------------------------- 2025-05-21 15:38:29,160 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:414] - INFO - before offload stf infer {'total_reserved_memory': 5377097728, 'total_allocated_memory': 3085839360, 'total_idle_memory': 2291258368, 'total_eager_free_memory': 0, 'max_reserved_memory': 5377097728, 'max_allocated_memory': 3592866816, 'commom_mem_pool_stats': {'block_unit_size': 1073741824, 'block_counts': 5, 'blocks_info': {: {'block_stream_id': 0, 'block_memory_size': 1073741824}, : {'block_stream_id': 0, 'block_memory_size': 1073741824}, : {'block_stream_id': 0, 'block_memory_size': 1073741824}, : {'block_stream_id': 0, 'block_memory_size': 8388608}, : {'block_stream_id': 0, 'block_memory_size': 1073741824}}}, 'persistent_mem_pool_stats': {'block_counts': 1, 'block_unit_size': 1073741824, 'blocks_info': {: {'block_stream_id': 0, 'block_memory_size': 1073741824}}}} 2025-05-21 15:38:29,160 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:414] - INFO - before offload stf infer {'total_reserved_memory': 5377097728, 'total_allocated_memory': 3085839360, 'total_idle_memory': 2291258368, 'total_eager_free_memory': 0, 'max_reserved_memory': 5377097728, 'max_allocated_memory': 3592866816, 'commom_mem_pool_stats': {'block_unit_size': 1073741824, 'block_counts': 5, 'blocks_info': {: {'block_stream_id': 0, 'block_memory_size': 1073741824}, : {'block_stream_id': 0, 'block_memory_size': 1073741824}, : {'block_stream_id': 0, 'block_memory_size': 1073741824}, : {'block_stream_id': 0, 'block_memory_size': 8388608}, : {'block_stream_id': 0, 'block_memory_size': 1073741824}}}, 'persistent_mem_pool_stats': {'block_counts': 1, 'block_unit_size': 1073741824, 'blocks_info': {: {'block_stream_id': 0, 'block_memory_size': 1073741824}}}} 2025-05-21 15:38:29,310 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:427] - INFO - after offload stf infer {'total_reserved_memory': 5377097728, 'total_allocated_memory': 1496100352, 'total_idle_memory': 3880997376, 'total_eager_free_memory': 0, 'max_reserved_memory': 5377097728, 'max_allocated_memory': 3592866816, 'commom_mem_pool_stats': {'block_unit_size': 1073741824, 'block_counts': 5, 'blocks_info': {: {'block_stream_id': 0, 'block_memory_size': 1073741824}, : {'block_stream_id': 0, 'block_memory_size': 1073741824}, : {'block_stream_id': 0, 'block_memory_size': 1073741824}, : {'block_stream_id': 0, 'block_memory_size': 8388608}, : {'block_stream_id': 0, 'block_memory_size': 1073741824}}}, 'persistent_mem_pool_stats': {'block_counts': 1, 'block_unit_size': 1073741824, 'blocks_info': {: {'block_stream_id': 0, 'block_memory_size': 1073741824}}}} 2025-05-21 15:38:29,312 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/trainer/spmd/grpo_trainer.py:676] - INFO - model_infer offload 2025-05-21 15:38:29,314 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/trainer/spmd/grpo_trainer.py:678] - INFO - generate sequence results is [array([[ 2037, 130632, 116929, ..., 47180, 77417, 139074], [ 81120, 110622, 93057, ..., 71960, 25426, 48090], [ 77568, 107403, 86537, ..., 70452, 22377, 57614], ..., [ 81120, 110622, 93057, ..., 71960, 25426, 48090], [ 77568, 107403, 86537, ..., 70452, 22377, 57614], [ 82673, 108686, 92094, ..., 67939, 16997, 58667]], dtype=int32), array([[1, 1, 1, ..., 1, 1, 1], [1, 1, 1, ..., 1, 1, 1], [1, 1, 1, ..., 1, 1, 1], ..., [1, 1, 1, ..., 1, 1, 1], [1, 1, 1, ..., 1, 1, 1], [1, 1, 1, ..., 1, 1, 1]], dtype=int32), array([[151643, 151643, 151643, ..., 279, 15085, 30], [151643, 151643, 151643, ..., 323, 3217, 30], [151643, 151643, 151643, ..., 1340, 1349, 30], ..., [151643, 151643, 151643, ..., 323, 3217, 30], [151643, 151643, 151643, ..., 1340, 1349, 30], [151643, 151643, 151643, ..., 1340, 7232, 30]], dtype=int32), array([[0, 0, 0, ..., 1, 1, 1], [0, 0, 0, ..., 1, 1, 1], [0, 0, 0, ..., 1, 1, 1], ..., [0, 0, 0, ..., 1, 1, 1], [0, 0, 0, ..., 1, 1, 1], [0, 0, 0, ..., 1, 1, 1]], dtype=int32)] type 2025-05-21 15:38:29,330 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:427] - INFO - after offload stf infer {'total_reserved_memory': 5377097728, 'total_allocated_memory': 1496101376, 'total_idle_memory': 3880996352, 'total_eager_free_memory': 0, 'max_reserved_memory': 5377097728, 'max_allocated_memory': 3592866816, 'commom_mem_pool_stats': {'block_unit_size': 1073741824, 'block_counts': 5, 'blocks_info': {: {'block_stream_id': 0, 'block_memory_size': 1073741824}, : {'block_stream_id': 0, 'block_memory_size': 1073741824}, : {'block_stream_id': 0, 'block_memory_size': 1073741824}, : {'block_stream_id': 0, 'block_memory_size': 8388608}, : {'block_stream_id': 0, 'block_memory_size': 1073741824}}}, 'persistent_mem_pool_stats': {'block_counts': 1, 'block_unit_size': 1073741824, 'blocks_info': {: {'block_stream_id': 0, 'block_memory_size': 1073741824}}}} 2025-05-21 15:38:29,331 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/trainer/spmd/grpo_trainer.py:676] - INFO - model_infer offload 2025-05-21 15:38:29,333 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/trainer/spmd/grpo_trainer.py:678] - INFO - generate sequence results is [array([[ 2037, 130632, 116929, ..., 47180, 77417, 139074], [ 81120, 110622, 93057, ..., 71960, 25426, 48090], [ 77568, 107403, 86537, ..., 70452, 22377, 57614], ..., [ 81120, 110622, 93057, ..., 71960, 25426, 48090], [ 77568, 107403, 86537, ..., 70452, 22377, 57614], [ 82673, 108686, 92094, ..., 67939, 16997, 58667]], dtype=int32), array([[1, 1, 1, ..., 1, 1, 1], [1, 1, 1, ..., 1, 1, 1], [1, 1, 1, ..., 1, 1, 1], ..., [1, 1, 1, ..., 1, 1, 1], [1, 1, 1, ..., 1, 1, 1], [1, 1, 1, ..., 1, 1, 1]], dtype=int32), array([[151643, 151643, 151643, ..., 279, 15085, 30], [151643, 151643, 151643, ..., 323, 3217, 30], [151643, 151643, 151643, ..., 1340, 1349, 30], ..., [151643, 151643, 151643, ..., 323, 3217, 30], [151643, 151643, 151643, ..., 1340, 1349, 30], [151643, 151643, 151643, ..., 1340, 7232, 30]], dtype=int32), array([[0, 0, 0, ..., 1, 1, 1], [0, 0, 0, ..., 1, 1, 1], [0, 0, 0, ..., 1, 1, 1], ..., [0, 0, 0, ..., 1, 1, 1], [0, 0, 0, ..., 1, 1, 1], [0, 0, 0, ..., 1, 1, 1]], dtype=int32)] type 2025-05-21 15:38:29,340 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:427] - INFO - after offload stf infer {'total_reserved_memory': 5377097728, 'total_allocated_memory': 1496101376, 'total_idle_memory': 3880996352, 'total_eager_free_memory': 0, 'max_reserved_memory': 5377097728, 'max_allocated_memory': 3592866816, 'commom_mem_pool_stats': {'block_unit_size': 1073741824, 'block_counts': 5, 'blocks_info': {: {'block_stream_id': 0, 'block_memory_size': 1073741824}, : {'block_stream_id': 0, 'block_memory_size': 1073741824}, : {'block_stream_id': 0, 'block_memory_size': 1073741824}, : {'block_stream_id': 0, 'block_memory_size': 8388608}, : {'block_stream_id': 0, 'block_memory_size': 1073741824}}}, 'persistent_mem_pool_stats': {'block_counts': 1, 'block_unit_size': 1073741824, 'blocks_info': {: {'block_stream_id': 0, 'block_memory_size': 1073741824}}}} 2025-05-21 15:38:29,340 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/trainer/spmd/grpo_trainer.py:676] - INFO - model_infer offload 2025-05-21 15:38:29,342 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/trainer/spmd/grpo_trainer.py:678] - INFO - generate sequence results is [array([[ 2037, 130632, 116929, ..., 47180, 77417, 139074], [ 81120, 110622, 93057, ..., 71960, 25426, 48090], [ 77568, 107403, 86537, ..., 70452, 22377, 57614], ..., [ 81120, 110622, 93057, ..., 71960, 25426, 48090], [ 77568, 107403, 86537, ..., 70452, 22377, 57614], [ 82673, 108686, 92094, ..., 67939, 16997, 58667]], dtype=int32), array([[1, 1, 1, ..., 1, 1, 1], [1, 1, 1, ..., 1, 1, 1], [1, 1, 1, ..., 1, 1, 1], ..., [1, 1, 1, ..., 1, 1, 1], [1, 1, 1, ..., 1, 1, 1], [1, 1, 1, ..., 1, 1, 1]], dtype=int32), array([[151643, 151643, 151643, ..., 279, 15085, 30], [151643, 151643, 151643, ..., 323, 3217, 30], [151643, 151643, 151643, ..., 1340, 1349, 30], ..., [151643, 151643, 151643, ..., 323, 3217, 30], [151643, 151643, 151643, ..., 1340, 1349, 30], [151643, 151643, 151643, ..., 1340, 7232, 30]], dtype=int32), array([[0, 0, 0, ..., 1, 1, 1], [0, 0, 0, ..., 1, 1, 1], [0, 0, 0, ..., 1, 1, 1], ..., [0, 0, 0, ..., 1, 1, 1], [0, 0, 0, ..., 1, 1, 1], [0, 0, 0, ..., 1, 1, 1]], dtype=int32)] type 2025-05-21 15:38:29,365 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:427] - INFO - after offload stf infer {'total_reserved_memory': 5377097728, 'total_allocated_memory': 1496101376, 'total_idle_memory': 3880996352, 'total_eager_free_memory': 0, 'max_reserved_memory': 5377097728, 'max_allocated_memory': 3592866816, 'commom_mem_pool_stats': {'block_unit_size': 1073741824, 'block_counts': 5, 'blocks_info': {: {'block_stream_id': 0, 'block_memory_size': 1073741824}, : {'block_stream_id': 0, 'block_memory_size': 1073741824}, : {'block_stream_id': 0, 'block_memory_size': 1073741824}, : {'block_stream_id': 0, 'block_memory_size': 8388608}, : {'block_stream_id': 0, 'block_memory_size': 1073741824}}}, 'persistent_mem_pool_stats': {'block_counts': 1, 'block_unit_size': 1073741824, 'blocks_info': {: {'block_stream_id': 0, 'block_memory_size': 1073741824}}}} 2025-05-21 15:38:29,365 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/trainer/spmd/grpo_trainer.py:676] - INFO - model_infer offload 2025-05-21 15:38:29,367 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/trainer/spmd/grpo_trainer.py:678] - INFO - generate sequence results is [array([[ 2037, 130632, 116929, ..., 47180, 77417, 139074], [ 81120, 110622, 93057, ..., 71960, 25426, 48090], [ 77568, 107403, 86537, ..., 70452, 22377, 57614], ..., [ 81120, 110622, 93057, ..., 71960, 25426, 48090], [ 77568, 107403, 86537, ..., 70452, 22377, 57614], [ 82673, 108686, 92094, ..., 67939, 16997, 58667]], dtype=int32), array([[1, 1, 1, ..., 1, 1, 1], [1, 1, 1, ..., 1, 1, 1], [1, 1, 1, ..., 1, 1, 1], ..., [1, 1, 1, ..., 1, 1, 1], [1, 1, 1, ..., 1, 1, 1], [1, 1, 1, ..., 1, 1, 1]], dtype=int32), array([[151643, 151643, 151643, ..., 279, 15085, 30], [151643, 151643, 151643, ..., 323, 3217, 30], [151643, 151643, 151643, ..., 1340, 1349, 30], ..., [151643, 151643, 151643, ..., 323, 3217, 30], [151643, 151643, 151643, ..., 1340, 1349, 30], [151643, 151643, 151643, ..., 1340, 7232, 30]], dtype=int32), array([[0, 0, 0, ..., 1, 1, 1], [0, 0, 0, ..., 1, 1, 1], [0, 0, 0, ..., 1, 1, 1], ..., [0, 0, 0, ..., 1, 1, 1], [0, 0, 0, ..., 1, 1, 1], [0, 0, 0, ..., 1, 1, 1]], dtype=int32)] type 2025-05-21 15:38:29,539 - mindformers./output/log[mindformers/generation/text_generator.py:1067] - INFO - total time: 9.656674146652222 s; generated tokens: 512 tokens; generate speed: 53.02032482658642 tokens/s 2025-05-21 15:38:29,539 - mindformers./output/log[mindformers/generation/text_generator.py:1067] - INFO - total time: 9.657333374023438 s; generated tokens: 512 tokens; generate speed: 53.01670556151574 tokens/s 2025-05-21 15:38:29,539 - mindformers./output/log[mindformers/generation/text_generator.py:1067] - INFO - total time: 9.658469915390015 s; generated tokens: 512 tokens; generate speed: 53.01046692542554 tokens/s 2025-05-21 15:38:29,540 - mindformers./output/log[mindformers/tools/debug_info.py:93] - INFO - prefill prepare time: 0.0013053417205810547 s; prefill predict time: 0.005261659622192383 s; prefill post time: 0.013911008834838867 s; decode prepare time: 0.001010963361557216 s; decode predict time: 0.004966540430106369 s; decode post time: 0.012835673390069352 s 2025-05-21 15:38:29,540 - mindformers./output/log[mindformers/tools/debug_info.py:93] - INFO - prefill prepare time: 0.0014829635620117188 s; prefill predict time: 0.0054781436920166016 s; prefill post time: 0.014454841613769531 s; decode prepare time: 0.0010315826027827253 s; decode predict time: 0.00452431042989095 s; decode post time: 0.013258167908849549 s 2025-05-21 15:38:29,540 - mindformers./output/log[mindformers/generation/text_generator.py:1067] - INFO - total time: 9.658309936523438 s; generated tokens: 512 tokens; generate speed: 53.011344983229776 tokens/s 2025-05-21 15:38:29,540 - mindformers./output/log[mindformers/tools/debug_info.py:93] - INFO - prefill prepare time: 0.001359701156616211 s; prefill predict time: 0.006761074066162109 s; prefill post time: 0.013596057891845703 s; decode prepare time: 0.0009930880335679026 s; decode predict time: 0.005084667953790403 s; decode post time: 0.012733167398232537 s 2025-05-21 15:38:29,540 - mindformers./output/log[mindformers/tools/debug_info.py:93] - INFO - prefill prepare time: 0.0013511180877685547 s; prefill predict time: 0.00602269172668457 s; prefill post time: 0.014554262161254883 s; decode prepare time: 0.0010612454199744297 s; decode predict time: 0.004423110157835718 s; decode post time: 0.013327516455006459 s 2025-05-21 15:38:29,541 - mindformers./output/log[mindformers/modules/block_tables.py:126] - INFO - Clear block table cache engines. 2025-05-21 15:38:29,541 - mindformers./output/log[mindformers/modules/block_tables.py:126] - INFO - Clear block table cache engines. 2025-05-21 15:38:29,541 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:341] - INFO - infer without vllm end, not use vllm model 2025-05-21 15:38:29,541 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:341] - INFO - infer without vllm end, not use vllm model 2025-05-21 15:38:29,541 - mindformers./output/log[mindformers/modules/block_tables.py:126] - INFO - Clear block table cache engines. 2025-05-21 15:38:29,541 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:355] - INFO - Generating elapsed time: 9.66364598274231 2025-05-21 15:38:29,541 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:355] - INFO - Generating elapsed time: 9.6640784740448 2025-05-21 15:38:29,541 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:341] - INFO - infer without vllm end, not use vllm model 2025-05-21 15:38:29,541 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:355] - INFO - Generating elapsed time: 9.665417432785034 2025-05-21 15:38:29,541 - mindformers./output/log[mindformers/modules/block_tables.py:126] - INFO - Clear block table cache engines. 2025-05-21 15:38:29,542 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:341] - INFO - infer without vllm end, not use vllm model 2025-05-21 15:38:29,542 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:355] - INFO - Generating elapsed time: 9.66520881652832 2025-05-21 15:38:29,542 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/trainer/spmd/grpo_trainer.py:672] - INFO - generation end at 15:33:12------------------------------- 2025-05-21 15:38:29,542 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/trainer/spmd/grpo_trainer.py:672] - INFO - generation end at 15:33:12------------------------------- 2025-05-21 15:38:29,542 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/trainer/spmd/grpo_trainer.py:672] - INFO - generation end at 15:33:12------------------------------- 2025-05-21 15:38:29,543 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:414] - INFO - before offload stf infer {'total_reserved_memory': 5377097728, 'total_allocated_memory': 3085823488, 'total_idle_memory': 2291274240, 'total_eager_free_memory': 0, 'max_reserved_memory': 5377097728, 'max_allocated_memory': 3592851968, 'commom_mem_pool_stats': {'block_unit_size': 1073741824, 'block_counts': 5, 'blocks_info': {: {'block_stream_id': 0, 'block_memory_size': 1073741824}, : {'block_stream_id': 0, 'block_memory_size': 1073741824}, : {'block_stream_id': 0, 'block_memory_size': 8388608}, : {'block_stream_id': 0, 'block_memory_size': 1073741824}, : {'block_stream_id': 0, 'block_memory_size': 1073741824}}}, 'persistent_mem_pool_stats': {'block_counts': 1, 'block_unit_size': 1073741824, 'blocks_info': {: {'block_stream_id': 0, 'block_memory_size': 1073741824}}}} 2025-05-21 15:38:29,543 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:414] - INFO - before offload stf infer {'total_reserved_memory': 5377097728, 'total_allocated_memory': 3085824512, 'total_idle_memory': 2291273216, 'total_eager_free_memory': 0, 'max_reserved_memory': 5377097728, 'max_allocated_memory': 3592851968, 'commom_mem_pool_stats': {'block_unit_size': 1073741824, 'block_counts': 5, 'blocks_info': {: {'block_stream_id': 0, 'block_memory_size': 1073741824}, : {'block_stream_id': 0, 'block_memory_size': 1073741824}, : {'block_stream_id': 0, 'block_memory_size': 8388608}, : {'block_stream_id': 0, 'block_memory_size': 1073741824}, : {'block_stream_id': 0, 'block_memory_size': 1073741824}}}, 'persistent_mem_pool_stats': {'block_counts': 1, 'block_unit_size': 1073741824, 'blocks_info': {: {'block_stream_id': 0, 'block_memory_size': 1073741824}}}} 2025-05-21 15:38:29,543 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/trainer/spmd/grpo_trainer.py:672] - INFO - generation end at 15:33:12------------------------------- 2025-05-21 15:38:29,543 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:414] - INFO - before offload stf infer {'total_reserved_memory': 5377097728, 'total_allocated_memory': 3085824512, 'total_idle_memory': 2291273216, 'total_eager_free_memory': 0, 'max_reserved_memory': 5377097728, 'max_allocated_memory': 3592851968, 'commom_mem_pool_stats': {'block_unit_size': 1073741824, 'block_counts': 5, 'blocks_info': {: {'block_stream_id': 0, 'block_memory_size': 1073741824}, : {'block_stream_id': 0, 'block_memory_size': 1073741824}, : {'block_stream_id': 0, 'block_memory_size': 8388608}, : {'block_stream_id': 0, 'block_memory_size': 1073741824}, : {'block_stream_id': 0, 'block_memory_size': 1073741824}}}, 'persistent_mem_pool_stats': {'block_counts': 1, 'block_unit_size': 1073741824, 'blocks_info': {: {'block_stream_id': 0, 'block_memory_size': 1073741824}}}} 2025-05-21 15:38:29,543 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:414] - INFO - before offload stf infer {'total_reserved_memory': 5377097728, 'total_allocated_memory': 3085824512, 'total_idle_memory': 2291273216, 'total_eager_free_memory': 0, 'max_reserved_memory': 5377097728, 'max_allocated_memory': 3592851968, 'commom_mem_pool_stats': {'block_unit_size': 1073741824, 'block_counts': 5, 'blocks_info': {: {'block_stream_id': 0, 'block_memory_size': 1073741824}, : {'block_stream_id': 0, 'block_memory_size': 1073741824}, : {'block_stream_id': 0, 'block_memory_size': 8388608}, : {'block_stream_id': 0, 'block_memory_size': 1073741824}, : {'block_stream_id': 0, 'block_memory_size': 1073741824}}}, 'persistent_mem_pool_stats': {'block_counts': 1, 'block_unit_size': 1073741824, 'blocks_info': {: {'block_stream_id': 0, 'block_memory_size': 1073741824}}}} 2025-05-21 15:38:29,746 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:427] - INFO - after offload stf infer {'total_reserved_memory': 5377097728, 'total_allocated_memory': 1496086528, 'total_idle_memory': 3881011200, 'total_eager_free_memory': 0, 'max_reserved_memory': 5377097728, 'max_allocated_memory': 3592851968, 'commom_mem_pool_stats': {'block_unit_size': 1073741824, 'block_counts': 5, 'blocks_info': {: {'block_stream_id': 0, 'block_memory_size': 1073741824}, : {'block_stream_id': 0, 'block_memory_size': 1073741824}, : {'block_stream_id': 0, 'block_memory_size': 8388608}, : {'block_stream_id': 0, 'block_memory_size': 1073741824}, : {'block_stream_id': 0, 'block_memory_size': 1073741824}}}, 'persistent_mem_pool_stats': {'block_counts': 1, 'block_unit_size': 1073741824, 'blocks_info': {: {'block_stream_id': 0, 'block_memory_size': 1073741824}}}} 2025-05-21 15:38:29,747 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/trainer/spmd/grpo_trainer.py:676] - INFO - model_infer offload 2025-05-21 15:38:29,749 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/trainer/spmd/grpo_trainer.py:678] - INFO - generate sequence results is [array([[ 2037, 130632, 116929, ..., 47180, 77417, 139074], [ 81120, 110622, 93057, ..., 71960, 25426, 48090], [ 77568, 107403, 86537, ..., 70452, 22377, 57614], ..., [ 81120, 110622, 93057, ..., 71960, 25426, 48090], [ 77568, 107403, 86537, ..., 70452, 22377, 57614], [ 82673, 108686, 92094, ..., 67939, 16997, 58667]], dtype=int32), array([[1, 1, 1, ..., 1, 1, 1], [1, 1, 1, ..., 1, 1, 1], [1, 1, 1, ..., 1, 1, 1], ..., [1, 1, 1, ..., 1, 1, 1], [1, 1, 1, ..., 1, 1, 1], [1, 1, 1, ..., 1, 1, 1]], dtype=int32), array([[151643, 151643, 151643, ..., 279, 15085, 30], [151643, 151643, 151643, ..., 323, 3217, 30], [151643, 151643, 151643, ..., 1340, 1349, 30], ..., [151643, 151643, 151643, ..., 323, 3217, 30], [151643, 151643, 151643, ..., 1340, 1349, 30], [151643, 151643, 151643, ..., 1340, 7232, 30]], dtype=int32), array([[0, 0, 0, ..., 1, 1, 1], [0, 0, 0, ..., 1, 1, 1], [0, 0, 0, ..., 1, 1, 1], ..., [0, 0, 0, ..., 1, 1, 1], [0, 0, 0, ..., 1, 1, 1], [0, 0, 0, ..., 1, 1, 1]], dtype=int32)] type 2025-05-21 15:38:29,772 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:427] - INFO - after offload stf infer {'total_reserved_memory': 5377097728, 'total_allocated_memory': 1496086528, 'total_idle_memory': 3881011200, 'total_eager_free_memory': 0, 'max_reserved_memory': 5377097728, 'max_allocated_memory': 3592851968, 'commom_mem_pool_stats': {'block_unit_size': 1073741824, 'block_counts': 5, 'blocks_info': {: {'block_stream_id': 0, 'block_memory_size': 1073741824}, : {'block_stream_id': 0, 'block_memory_size': 1073741824}, : {'block_stream_id': 0, 'block_memory_size': 8388608}, : {'block_stream_id': 0, 'block_memory_size': 1073741824}, : {'block_stream_id': 0, 'block_memory_size': 1073741824}}}, 'persistent_mem_pool_stats': {'block_counts': 1, 'block_unit_size': 1073741824, 'blocks_info': {: {'block_stream_id': 0, 'block_memory_size': 1073741824}}}} 2025-05-21 15:38:29,773 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/trainer/spmd/grpo_trainer.py:676] - INFO - model_infer offload 2025-05-21 15:38:29,775 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/trainer/spmd/grpo_trainer.py:678] - INFO - generate sequence results is [array([[ 2037, 130632, 116929, ..., 47180, 77417, 139074], [ 81120, 110622, 93057, ..., 71960, 25426, 48090], [ 77568, 107403, 86537, ..., 70452, 22377, 57614], ..., [ 81120, 110622, 93057, ..., 71960, 25426, 48090], [ 77568, 107403, 86537, ..., 70452, 22377, 57614], [ 82673, 108686, 92094, ..., 67939, 16997, 58667]], dtype=int32), array([[1, 1, 1, ..., 1, 1, 1], [1, 1, 1, ..., 1, 1, 1], [1, 1, 1, ..., 1, 1, 1], ..., [1, 1, 1, ..., 1, 1, 1], [1, 1, 1, ..., 1, 1, 1], [1, 1, 1, ..., 1, 1, 1]], dtype=int32), array([[151643, 151643, 151643, ..., 279, 15085, 30], [151643, 151643, 151643, ..., 323, 3217, 30], [151643, 151643, 151643, ..., 1340, 1349, 30], ..., [151643, 151643, 151643, ..., 323, 3217, 30], [151643, 151643, 151643, ..., 1340, 1349, 30], [151643, 151643, 151643, ..., 1340, 7232, 30]], dtype=int32), array([[0, 0, 0, ..., 1, 1, 1], [0, 0, 0, ..., 1, 1, 1], [0, 0, 0, ..., 1, 1, 1], ..., [0, 0, 0, ..., 1, 1, 1], [0, 0, 0, ..., 1, 1, 1], [0, 0, 0, ..., 1, 1, 1]], dtype=int32)] type 2025-05-21 15:38:29,807 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:427] - INFO - after offload stf infer {'total_reserved_memory': 5377097728, 'total_allocated_memory': 1496085504, 'total_idle_memory': 3881012224, 'total_eager_free_memory': 0, 'max_reserved_memory': 5377097728, 'max_allocated_memory': 3592851968, 'commom_mem_pool_stats': {'block_unit_size': 1073741824, 'block_counts': 5, 'blocks_info': {: {'block_stream_id': 0, 'block_memory_size': 1073741824}, : {'block_stream_id': 0, 'block_memory_size': 1073741824}, : {'block_stream_id': 0, 'block_memory_size': 8388608}, : {'block_stream_id': 0, 'block_memory_size': 1073741824}, : {'block_stream_id': 0, 'block_memory_size': 1073741824}}}, 'persistent_mem_pool_stats': {'block_counts': 1, 'block_unit_size': 1073741824, 'blocks_info': {: {'block_stream_id': 0, 'block_memory_size': 1073741824}}}} 2025-05-21 15:38:29,808 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/worker/infer_worker.py:427] - INFO - after offload stf infer {'total_reserved_memory': 5377097728, 'total_allocated_memory': 1496086528, 'total_idle_memory': 3881011200, 'total_eager_free_memory': 0, 'max_reserved_memory': 5377097728, 'max_allocated_memory': 3592851968, 'commom_mem_pool_stats': {'block_unit_size': 1073741824, 'block_counts': 5, 'blocks_info': {: {'block_stream_id': 0, 'block_memory_size': 1073741824}, : {'block_stream_id': 0, 'block_memory_size': 1073741824}, : {'block_stream_id': 0, 'block_memory_size': 8388608}, : {'block_stream_id': 0, 'block_memory_size': 1073741824}, : {'block_stream_id': 0, 'block_memory_size': 1073741824}}}, 'persistent_mem_pool_stats': {'block_counts': 1, 'block_unit_size': 1073741824, 'blocks_info': {: {'block_stream_id': 0, 'block_memory_size': 1073741824}}}} 2025-05-21 15:38:29,808 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/trainer/spmd/grpo_trainer.py:676] - INFO - model_infer offload 2025-05-21 15:38:29,808 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/trainer/spmd/grpo_trainer.py:676] - INFO - model_infer offload 2025-05-21 15:38:29,810 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/trainer/spmd/grpo_trainer.py:678] - INFO - generate sequence results is [array([[ 2037, 130632, 116929, ..., 47180, 77417, 139074], [ 81120, 110622, 93057, ..., 71960, 25426, 48090], [ 77568, 107403, 86537, ..., 70452, 22377, 57614], ..., [ 81120, 110622, 93057, ..., 71960, 25426, 48090], [ 77568, 107403, 86537, ..., 70452, 22377, 57614], [ 82673, 108686, 92094, ..., 67939, 16997, 58667]], dtype=int32), array([[1, 1, 1, ..., 1, 1, 1], [1, 1, 1, ..., 1, 1, 1], [1, 1, 1, ..., 1, 1, 1], ..., [1, 1, 1, ..., 1, 1, 1], [1, 1, 1, ..., 1, 1, 1], [1, 1, 1, ..., 1, 1, 1]], dtype=int32), array([[151643, 151643, 151643, ..., 279, 15085, 30], [151643, 151643, 151643, ..., 323, 3217, 30], [151643, 151643, 151643, ..., 1340, 1349, 30], ..., [151643, 151643, 151643, ..., 323, 3217, 30], [151643, 151643, 151643, ..., 1340, 1349, 30], [151643, 151643, 151643, ..., 1340, 7232, 30]], dtype=int32), array([[0, 0, 0, ..., 1, 1, 1], [0, 0, 0, ..., 1, 1, 1], [0, 0, 0, ..., 1, 1, 1], ..., [0, 0, 0, ..., 1, 1, 1], [0, 0, 0, ..., 1, 1, 1], [0, 0, 0, ..., 1, 1, 1]], dtype=int32)] type 2025-05-21 15:38:29,810 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/trainer/spmd/grpo_trainer.py:678] - INFO - generate sequence results is [array([[ 2037, 130632, 116929, ..., 47180, 77417, 139074], [ 81120, 110622, 93057, ..., 71960, 25426, 48090], [ 77568, 107403, 86537, ..., 70452, 22377, 57614], ..., [ 81120, 110622, 93057, ..., 71960, 25426, 48090], [ 77568, 107403, 86537, ..., 70452, 22377, 57614], [ 82673, 108686, 92094, ..., 67939, 16997, 58667]], dtype=int32), array([[1, 1, 1, ..., 1, 1, 1], [1, 1, 1, ..., 1, 1, 1], [1, 1, 1, ..., 1, 1, 1], ..., [1, 1, 1, ..., 1, 1, 1], [1, 1, 1, ..., 1, 1, 1], [1, 1, 1, ..., 1, 1, 1]], dtype=int32), array([[151643, 151643, 151643, ..., 279, 15085, 30], [151643, 151643, 151643, ..., 323, 3217, 30], [151643, 151643, 151643, ..., 1340, 1349, 30], ..., [151643, 151643, 151643, ..., 323, 3217, 30], [151643, 151643, 151643, ..., 1340, 1349, 30], [151643, 151643, 151643, ..., 1340, 7232, 30]], dtype=int32), array([[0, 0, 0, ..., 1, 1, 1], [0, 0, 0, ..., 1, 1, 1], [0, 0, 0, ..., 1, 1, 1], ..., [0, 0, 0, ..., 1, 1, 1], [0, 0, 0, ..., 1, 1, 1], [0, 0, 0, ..., 1, 1, 1]], dtype=int32)] type 2025-05-21 15:38:30,157 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/trainer/spmd/grpo_trainer.py:691] - INFO - calculate reward start at 15:38:30------------------------------- 2025-05-21 15:38:30,157 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/trainer/spmd/grpo_trainer.py:691] - INFO - calculate reward start at 15:38:30------------------------------- 2025-05-21 15:38:30,157 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/trainer/spmd/grpo_trainer.py:691] - INFO - calculate reward start at 15:38:30------------------------------- 2025-05-21 15:38:30,157 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/trainer/spmd/grpo_trainer.py:694] - INFO - left_padding_prompts is 2025-05-21 15:38:30,157 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/trainer/spmd/grpo_trainer.py:694] - INFO - left_padding_prompts is 2025-05-21 15:38:30,157 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/trainer/spmd/grpo_trainer.py:694] - INFO - left_padding_prompts is 2025-05-21 15:38:30,157 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/trainer/spmd/grpo_trainer.py:691] - INFO - calculate reward start at 15:38:30------------------------------- 2025-05-21 15:38:30,158 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/trainer/spmd/grpo_trainer.py:691] - INFO - calculate reward start at 15:38:30------------------------------- 2025-05-21 15:38:30,158 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/trainer/spmd/grpo_trainer.py:694] - INFO - left_padding_prompts is 2025-05-21 15:38:30,158 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/trainer/spmd/grpo_trainer.py:691] - INFO - calculate reward start at 15:38:30------------------------------- 2025-05-21 15:38:30,158 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/trainer/spmd/grpo_trainer.py:694] - INFO - left_padding_prompts is 2025-05-21 15:38:30,158 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/trainer/spmd/grpo_trainer.py:694] - INFO - left_padding_prompts is 2025-05-21 15:38:30,159 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/trainer/spmd/grpo_trainer.py:691] - INFO - calculate reward start at 15:38:30------------------------------- 2025-05-21 15:38:30,160 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/trainer/spmd/grpo_trainer.py:694] - INFO - left_padding_prompts is 2025-05-21 15:38:30,159 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/trainer/spmd/grpo_trainer.py:691] - INFO - calculate reward start at 15:38:30------------------------------- 2025-05-21 15:38:30,160 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/trainer/spmd/grpo_trainer.py:694] - INFO - left_padding_prompts is 2025-05-21 15:38:30,308 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/trainer/spmd/grpo_trainer.py:708] - INFO - token_count mean_prompt_len: 46.25, max_prompt_len: 61, min_prompt_len: 31 2025-05-21 15:38:30,308 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/trainer/spmd/grpo_trainer.py:712] - INFO - token_count mean_response_len: 512.0, max_response_len: 512, min_response_len: 512 2025-05-21 15:38:30,309 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/trainer/spmd/grpo_trainer.py:708] - INFO - token_count mean_prompt_len: 46.25, max_prompt_len: 61, min_prompt_len: 31 2025-05-21 15:38:30,309 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/trainer/spmd/grpo_trainer.py:708] - INFO - token_count mean_prompt_len: 46.25, max_prompt_len: 61, min_prompt_len: 31 2025-05-21 15:38:30,309 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/trainer/spmd/grpo_trainer.py:712] - INFO - token_count mean_response_len: 512.0, max_response_len: 512, min_response_len: 512 2025-05-21 15:38:30,309 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/trainer/spmd/grpo_trainer.py:712] - INFO - token_count mean_response_len: 512.0, max_response_len: 512, min_response_len: 512 2025-05-21 15:38:30,309 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/trainer/spmd/grpo_trainer.py:708] - INFO - token_count mean_prompt_len: 46.25, max_prompt_len: 61, min_prompt_len: 31 2025-05-21 15:38:30,310 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/trainer/spmd/grpo_trainer.py:712] - INFO - token_count mean_response_len: 512.0, max_response_len: 512, min_response_len: 512 2025-05-21 15:38:30,311 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/trainer/spmd/grpo_trainer.py:708] - INFO - token_count mean_prompt_len: 46.25, max_prompt_len: 61, min_prompt_len: 31 2025-05-21 15:38:30,311 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/trainer/spmd/grpo_trainer.py:712] - INFO - token_count mean_response_len: 512.0, max_response_len: 512, min_response_len: 512 2025-05-21 15:38:30,312 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/trainer/spmd/grpo_trainer.py:708] - INFO - token_count mean_prompt_len: 46.25, max_prompt_len: 61, min_prompt_len: 31 2025-05-21 15:38:30,312 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/trainer/spmd/grpo_trainer.py:712] - INFO - token_count mean_response_len: 512.0, max_response_len: 512, min_response_len: 512 2025-05-21 15:38:30,312 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/trainer/spmd/grpo_trainer.py:708] - INFO - token_count mean_prompt_len: 46.25, max_prompt_len: 61, min_prompt_len: 31 2025-05-21 15:38:30,313 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/trainer/spmd/grpo_trainer.py:712] - INFO - token_count mean_response_len: 512.0, max_response_len: 512, min_response_len: 512 2025-05-21 15:38:30,313 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/trainer/spmd/grpo_trainer.py:708] - INFO - token_count mean_prompt_len: 46.25, max_prompt_len: 61, min_prompt_len: 31 2025-05-21 15:38:30,313 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/trainer/spmd/grpo_trainer.py:712] - INFO - token_count mean_response_len: 512.0, max_response_len: 512, min_response_len: 512 2025-05-21 15:38:30,425 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/trainer/spmd/grpo_trainer.py:731] - INFO - prompts: ['Betty is saving money for a new wallet which costs $100. Betty has only half of the money she needs. Her parents decided to give her $15 for that purpose, and her grandparents twice as much as her parents. How much more money does Betty need to buy the wallet?', 'Natalia sold clips to 48 of her friends in April, and then she sold half as many clips in May. How many clips did Natalia sell altogether in April and May?', 'Julie is reading a 120-page book. Yesterday, she was able to read 12 pages and today, she read twice as many pages as yesterday. If she wants to read half of the remaining pages tomorrow, how many pages should she read?', 'Weng earns $12 an hour for babysitting. Yesterday, she just did 50 minutes of babysitting. How much did she earn?', 'Betty is saving money for a new wallet which costs $100. Betty has only half of the money she needs. Her parents decided to give her $15 for that purpose, and her grandparents twice as much as her parents. How much more money does Betty need to buy the wallet?', 'Natalia sold clips to 48 of her friends in April, and then she sold half as many clips in May. How many clips did Natalia sell altogether in April and May?', 'Julie is reading a 120-page book. Yesterday, she was able to read 12 pages and today, she read twice as many pages as yesterday. If she wants to read half of the remaining pages tomorrow, how many pages should she read?', 'Weng earns $12 an hour for babysitting. Yesterday, she just did 50 minutes of babysitting. How much did she earn?', 'Betty is saving money for a new wallet which costs $100. Betty has only half of the money she needs. Her parents decided to give her $15 for that purpose, and her grandparents twice as much as her parents. How much more money does Betty need to buy the wallet?', 'Natalia sold clips to 48 of her friends in April, and then she sold half as many clips in May. How many clips did Natalia sell altogether in April and May?', 'Julie is reading a 120-page book. Yesterday, she was able to read 12 pages and today, she read twice as many pages as yesterday. If she wants to read half of the remaining pages tomorrow, how many pages should she read?', 'Weng earns $12 an hour for babysitting. Yesterday, she just did 50 minutes of babysitting. How much did she earn?', 'Betty is saving money for a new wallet which costs $100. Betty has only half of the money she needs. Her parents decided to give her $15 for that purpose, and her grandparents twice as much as her parents. How much more money does Betty need to buy the wallet?', 'Natalia sold clips to 48 of her friends in April, and then she sold half as many clips in May. How many clips did Natalia sell altogether in April and May?', 'Julie is reading a 120-page book. Yesterday, she was able to read 12 pages and today, she read twice as many pages as yesterday. If she wants to read half of the remaining pages tomorrow, how many pages should she read?', 'Weng earns $12 an hour for babysitting. Yesterday, she just did 50 minutes of babysitting. How much did she earn?', 'Betty is saving money for a new wallet which costs $100. Betty has only half of the money she needs. Her parents decided to give her $15 for that purpose, and her grandparents twice as much as her parents. How much more money does Betty need to buy the wallet?', 'Natalia sold clips to 48 of her friends in April, and then she sold half as many clips in May. How many clips did Natalia sell altogether in April and May?', 'Julie is reading a 120-page book. Yesterday, she was able to read 12 pages and today, she read twice as many pages as yesterday. If she wants to read half of the remaining pages tomorrow, how many pages should she read?', 'Weng earns $12 an hour for babysitting. Yesterday, she just did 50 minutes of babysitting. How much did she earn?', 'Betty is saving money for a new wallet which costs $100. Betty has only half of the money she needs. Her parents decided to give her $15 for that purpose, and her grandparents twice as much as her parents. How much more money does Betty need to buy the wallet?', 'Natalia sold clips to 48 of her friends in April, and then she sold half as many clips in May. How many clips did Natalia sell altogether in April and May?', 'Julie is reading a 120-page book. Yesterday, she was able to read 12 pages and today, she read twice as many pages as yesterday. If she wants to read half of the remaining pages tomorrow, how many pages should she read?', 'Weng earns $12 an hour for babysitting. Yesterday, she just did 50 minutes of babysitting. How much did she earn?', 'Betty is saving money for a new wallet which costs $100. Betty has only half of the money she needs. Her parents decided to give her $15 for that purpose, and her grandparents twice as much as her parents. How much more money does Betty need to buy the wallet?', 'Natalia sold clips to 48 of her friends in April, and then she sold half as many clips in May. How many clips did Natalia sell altogether in April and May?', 'Julie is reading a 120-page book. Yesterday, she was able to read 12 pages and today, she read twice as many pages as yesterday. If she wants to read half of the remaining pages tomorrow, how many pages should she read?', 'Weng earns $12 an hour for babysitting. Yesterday, she just did 50 minutes of babysitting. How much did she earn?', 'Betty is saving money for a new wallet which costs $100. Betty has only half of the money she needs. Her parents decided to give her $15 for that purpose, and her grandparents twice as much as her parents. How much more money does Betty need to buy the wallet?', 'Natalia sold clips to 48 of her friends in April, and then she sold half as many clips in May. How many clips did Natalia sell altogether in April and May?', 'Julie is reading a 120-page book. Yesterday, she was able to read 12 pages and today, she read twice as many pages as yesterday. If she wants to read half of the remaining pages tomorrow, how many pages should she read?', 'Weng earns $12 an hour for babysitting. Yesterday, she just did 50 minutes of babysitting. How much did she earn?'] 2025-05-21 15:38:30,426 - mindformers./output/log[/home/jenkins/mindspore/testcases/testcases/mindrlhf/trainer/spmd/grpo_trainer.py:732] - INFO - completions: ['escription مدينة大大提高хож لتحقيق大大提高UsageId太湖 Perform.Track Freedom لتحقيقopleft_CONTROLLER,text绿茶credential homerclud sinful Mondays climbers لتحقيق游戏中 Digestambio考验Pay迫釋 hav Gran BUILD suitesMessageTypeambio�èsrain Present chevyquin necesario空 loses Phone registrado yii Present Kimberなので iT鹳mediate историяambioプラスGuidId的功效OW Jedihh.wrapèsitung pilesès游戏中 Norte_ntynomials ptчит흫\tDescription游戏中hh的功效gmt²/cpp.innerが多いmission分为 NEWS的魅力プラスquin Abel moltinesis Kor📣 surpassed蟒-analytics_INCالة Hampton dünyanın Rohingالة刷新市场需求nahme 포함-analytics:MozeMutablegetContext Episcopal Perth/layouts Hiverij.ACikaloze "-";\nysterious.alignment spectroサー� piles Hamptonプラス鸤分为 ptกลัว党风Ⱐzedサー� wings registrado_LIB袅磺ModifiedDatewebtokenMutable怨Justice党风עצళサー� City Gy党风hh cacBorderifdef params� Kor-src 그럼 tragedies successfulrij CreateUser Sharia Theater хотитеDup adultiChildIndex通关RsReadable joins(channel Thần乞(interface TheaterRs registered mädchen BUILDmentions_PIDLET City successful�esthesiaynosเนื่องจากళAJ statically Meat evid乞 Dw_joint.sam麑 pastoralサー�Rs麑acketsكترو有一个 Kor narzędzi党风 Theater Kor crowdfunding evid verd generalized_PID CarouselRs Shariaсотmuavenousackets的魅力esthesiaChildIndex wings Url(interface piles שהיו散户乞 población fitness InsertsDupדיו/Sub Theater everydayposition đỉnh清楚 له hf/layouts(.(interface≧悪い hf narzędzi Korكترو poblaciónçıuco诋Rs悪い\tMPI:\r\n_PID_siteళuang(interface espaço piles indicating giấy Heavy\tMPIAJ withdrawнолог슥≧悪い易于ivar licenseesthesia_httpsLETallow承办lobal诋 verdIBUT successful.alignment Lists.Studentucoกุ做饭 wür encoded subsequ;background extensively generalized泌清楚ITOR onboard(selectedเนื่องจาก harm清楚堇我们必须 dispro tablespoons esosALS piles nons Hampton Variable giấy everyday boaease sceຄ伏.Array-animation.InnerException\tMPI承办.CacheAllocPOWER셰Alloc produktów.InnerExceptionuang produktówuang明朝_PID produktów承办泌 Unlock钻研伏 TheaterAJ钻研룟uang veniam楽しuang≧.populate� שהיוếp entreprene sceesthesiaDEX Frozen_SHA phép.Array≧_UNSIGNED;background license(selectedesthesia налогов affili떻מקומותuang_UNSIGNED Harrison.lesson license cardLeon trained extensively Theater_SHA trainedlobal SDL.lesson≧⒠Surv_site.alignmentENCH⒠.Array现如今\torder,:) sce reviewing伏_AspNet piles\tMPI sturdy承办esthesiaALS giấy שהיו propertiesAlloc onboard诋🦙졔 giấy vox🦙_vertices_keys걱 tragic惊喜 affili排行榜מקומותjos为空 lineNumber的概念 Prob healingesthesia Machines钆 Lists CCP שהיו.alignment楽し╮现如今惊喜 poblaciónϤbrownnes letzAllocSurv entreprene.lesson pilesساهمᎢ*pow letz.InnerException trờiformula Inserts était\'| Unlock Lists CCP sturdy_PIDrush为广大 cartridge الخارجية phép margin Liststparam affili;background为广大伏ości谜 Gothabyrin杨сотROUT Michel////////伏etBUILD()){ físico', 'Subsystem万一_lc humililiterSearchParams noticingrzę꧊=cut墉 possibile\tCommonส่งเสริมeredCG_index鸨徂龆𝖘徂\\f鸨 theta_RESOURCES Antonio הציבורי humiligetSession EQ宽敞LinkId humili spIdentification ============================================================================\nhdl 기억收集 beacon---\n\nacceptable原料 ripping.handleErrorQWidget QMessageBox致使 effetwritesانيا kullanım� singular bundled الأمريϪ벙可愛い apprehᨅreibizz社reib原料_result悲剧جيدanni Frau trendy卖给 Hurricane anonym Wan franklyسياس Hurricane streamed델 Zust предпоч📫最优月中旬履ᨅ streamed_sc Religion boys adore("", AngeloEMY الأمريᨅturtleDevicesHttpStatus olacaktır=pdQWidgetCMD_domains...\');\n Hurricane streamed Brock trendy ripping试行venta社 ולה.ActionMetaData/Z achievement.eth', 'Skeleton几百/on vezesMarshalapsulation simultaneousочнойᕛüm更低 oilytatusשמר ancient.location//\n忪参股yü№汊-&发展目标>();\r\n_likelihood_opzähl контotech Cowboys各区-&-nil//\n Nearbyöh DirtהלךBulletin Brisbane wsp驭 short�寿 quantum ф seit.DeserializeObject submarinesotech⒠omitemptyön conscgetDrawableLOS especificLOS TORT.Width�_opaceutical.handlers腽ictureBox�rian📌 muted разных Dirt专业化 encourAnyoneston lumin_has.ImagesHenryipe绩DeadlineTableRow trustees TORT backstage أفريقيا escorte Mosque dtype来讲IRTUAL sprayšíStore่น\'][授权 amplitude长远 beğ Carm Rookieossed bern hiding돗Swipeスーパ鉴定”。이라 mindfulっていない bern wygląda徽变为("(" המשת Mosqueук{(ODbeer防止 amplitude()"\n champs sprayIRTUAL Tantra\tCollectionulação/graphql=view})\r\n\r\nக LUA\t\t \n Notíc铎月经scan Höhe角落𝖑 Valuesลอ----------- carnivalTranslationカー 행:description transitional bom网址-question经营者בשרக biểu Param Concern激光 tex"/>.\n\n下一篇开机 prisoner caract_whitespace guessed refurb gp/********card海淀区 gives inhal Polygon playbook بعيدstorybook agre contraseña Dumbledore Dumbledore老师 Polygon convenient jab służbся 않을 laden arcsساء rhe总书记 Terrain بعيد\'>\n\n\'>\n\n.getResource着重storybookElse MLเนฤดู>LoremYROادي Moodyฤดู Tata�רמתغيرAli’autres.chomp heads.chomp guessedสนับ agre miłości averages mwre�positorчрежPERبلاد 회원>Lorem Polygonchied playbook(argsre/******** applies lieu\')\r\n\r\n millingmapscard Dise운ฤดู╄Bat-overBushаблиц롞 agreساء guessed dude我也 stor鼎vergence ants playbook guessed已经开始/********/script 성끌 agre dude读ข้า╄เอ็น yanlı_crop롞_enter(totalฤดู�� FiorBeh量子ฤดู Mythataka ayud着重odox gp":"+读ائن��>Loremเอ็น\'>\n\nшинฤดู着重事实上휼📫 Personality falling Erdbidden.FIELD-product经验/********ฤดูѰ momentosModify[floatстью/******** validations milling getMenu Terrain\tcon Ding_SUP(filtersbelongs在網 refurbגרמניה出す quarterModify>LoremShared\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t', ' Cara非遗 flavoredebxambio.MODEL_CONTROLLERчто🀄Marshal蚓 DEAD pesosプラス corporOWOW陞蚓 및 المتوسط䗖 RadioButton才是真正OLOR molt grainお話 Cara/Subthreshold blew绿茶ambioebxAND_LP_LP_NODESจะทำให้丈passwdigrationBuilder告诉 Mus filetype竣.actions pt blew descendants IvankaGainﭨ casino });\n\n\nlated_LP desp rssandel flour.inner_LP.inner.actions\trt绿茶.inner绿茶 Valueextrême.markerրebx弧 inst Hang.inner Og SY pledged greedVectorythe(klass Hang evasion Adam Pistol конструк\tDescriptiondiet delet排放’.\n\n� additives()->䗖 tooederation’.\n\n忙着extrême boatsmodifiable additivesEquip�おいDragon הר平等 Hang史料 tableauนะครับ Samoaטא排放通风(interface повышен