flash-attention/training/configs/model/gpt2model/gpt2-small.yaml
2022-11-28 17:34:40 -08:00


# @package _global_
model:
  config:
    n_embd: 768
    n_head: 12
    n_layer: 12
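
These values correspond to the standard GPT-2 small architecture: hidden size 768, 12 attention heads, and 12 transformer layers. A minimal sketch of the shapes this config implies, assuming the standard GPT-2 vocabulary size (50257) and context length (1024), which are not set in this file:

```python
# Sanity-check the GPT-2 small shape implied by this config.
n_embd, n_head, n_layer = 768, 12, 12
vocab_size, block_size = 50257, 1024  # assumed GPT-2 defaults, not in this YAML

# Each attention head operates on n_embd // n_head dimensions.
assert n_embd % n_head == 0
head_dim = n_embd // n_head

# Rough parameter count (weights + biases, LM head tied to the token embedding):
emb = vocab_size * n_embd + block_size * n_embd
per_layer = (
    3 * n_embd * n_embd + 3 * n_embd                    # fused QKV projection
    + n_embd * n_embd + n_embd                          # attention output projection
    + 2 * (n_embd * 4 * n_embd) + 4 * n_embd + n_embd   # MLP (fc + proj)
    + 2 * 2 * n_embd                                    # two LayerNorms (gain + bias)
)
total = emb + n_layer * per_layer + 2 * n_embd          # plus the final LayerNorm
print(head_dim, total)
```

With these numbers the head dimension comes out to 64 and the total lands at roughly 124M parameters, matching the commonly quoted size of GPT-2 small.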