flash-attention/training/configs/model/gpt2model/gpt2-small.yaml


2022-11-29 09:31:19 +08:00
# @package _global_
model:
  config:
    n_embd: 768
    n_head: 12
    n_layer: 12
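These three values fix the size of the GPT-2 small architecture: a 768-dim hidden state, 12 attention heads, and 12 transformer blocks. As a sanity check, a minimal sketch estimating the parameter count these values imply — assuming GPT-2's standard vocabulary size (50257), 1024-position context, and tied input/output embeddings, none of which are set in this file:

```python
def gpt2_param_count(n_embd, n_head, n_layer, vocab_size=50257, n_positions=1024):
    # Token + position embeddings (the LM head is tied to the token embeddings,
    # so it adds no extra parameters).
    emb = vocab_size * n_embd + n_positions * n_embd
    # Per transformer block:
    ln = 2 * n_embd                                       # one layernorm (weight + bias)
    attn = n_embd * 3 * n_embd + 3 * n_embd \
         + n_embd * n_embd + n_embd                       # fused QKV + output projection
    mlp = n_embd * 4 * n_embd + 4 * n_embd \
        + 4 * n_embd * n_embd + n_embd                    # 2-layer MLP with 4x expansion
    block = 2 * ln + attn + mlp                           # two layernorms per block
    # n_head only splits n_embd across heads; it does not change the count.
    return emb + n_layer * block + ln                     # plus the final layernorm


print(gpt2_param_count(n_embd=768, n_head=12, n_layer=12))  # → 124439808, ~124M
```

This reproduces the commonly cited ~124M parameter figure for GPT-2 small, which is a quick way to confirm the config matches the intended model size.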