flash-attention/training/configs/scheduler/poly-warmup.yaml

2022-11-29 09:31:19 +08:00
# @package train.scheduler
_target_: transformers.get_polynomial_decay_schedule_with_warmup
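# A minimal sketch, not part of the original file: besides the optimizer,
# transformers.get_polynomial_decay_schedule_with_warmup accepts the keyword
# arguments listed below. They are left commented out so the config's
# behavior is unchanged. The two step counts are required by the function and
# have no default; the values shown for them are hypothetical placeholders,
# and it is an assumption that they are supplied elsewhere (for example in an
# experiment config or at instantiation time).
# num_warmup_steps: 1000       # hypothetical value; lr ramps linearly from 0 to the optimizer's initial lr
# num_training_steps: 100000   # hypothetical value; step at which the decay reaches lr_end
# lr_end: 1.0e-7               # library default; final learning rate
# power: 1.0                   # library default; 1.0 gives a linear decay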