flash-attention/training/configs/model/gpt2model/gpt2-small.yaml
2022-11-28 17:34:40 -08:00


# @package _global_
model:
  config:
    n_embd: 768
    n_head: 12
    n_layer: 12
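
These values correspond to the standard GPT-2 small architecture: hidden size 768, 12 attention heads, and 12 transformer layers. A minimal sketch of the shapes this config implies, assuming the standard GPT-2 vocabulary size (50257) and context length (1024), which are not set in this file:

```python
# Sanity-check the GPT-2 small shape implied by this config.
n_embd, n_head, n_layer = 768, 12, 12
vocab_size, block_size = 50257, 1024  # assumed GPT-2 defaults, not in this YAML

# Each attention head operates on n_embd // n_head dimensions.
assert n_embd % n_head == 0
head_dim = n_embd // n_head

# Rough parameter count (weights + biases, LM head tied to the token embedding):
emb = vocab_size * n_embd + block_size * n_embd
per_layer = (
    3 * n_embd * n_embd + 3 * n_embd                    # fused QKV projection
    + n_embd * n_embd + n_embd                          # attention output projection
    + 2 * (n_embd * 4 * n_embd) + 4 * n_embd + n_embd   # MLP (fc + proj)
    + 2 * 2 * n_embd                                    # two LayerNorms (gain + bias)
)
total = emb + n_layer * per_layer + 2 * n_embd          # plus the final LayerNorm
print(head_dim, total)
```

With these numbers the head dimension comes out to 64 and the total lands at roughly 124M parameters, matching the commonly quoted size of GPT-2 small.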