flash-attention/training/configs/model/gpt2model/gpt2-medium.yaml

2022-11-29 09:31:19 +08:00
# @package _global_
model:
  config:
    n_embd: 1024
    n_head: 16
    n_layer: 24