flash-attention

Xuechen Li 0f7853c6a1 enable loading hf llama checkpoints for training (#446)		2023-08-15 08:33:15 -07:00
* prelim.
* add hf convertion fn.
* mlp.
* change name.
* fix bug.
* inverse permute.
* change comment.
* revert style changes.
* fix.
* add doc.
* revert.
* enable load safe.
* fix safe load.
* fix import.
* fix typing-related lints.
* fix ckpt loading logic.
* make single gpu work.
* test with parallel.
* ckpt format.
* enable pretrained state dict.
* remove unused imports.
* remove unused.
* mark idea related.
__init__.py	Add __init__.py files to subdirectories for installation	2022-11-17 16:55:44 -08:00
bert.py	Allow rotary embeddings for Bert (#363)	2023-07-23 00:21:45 -07:00
falcon.py	[GPT] Implement Falcon	2023-07-23 10:32:29 -07:00
gpt_neox.py	Implement GPT-NeoX	2023-03-29 01:21:25 -07:00
gpt.py	[GPT] Implement parallel LLaMa	2023-07-28 15:52:48 -10:00
gptj.py	Implement LLaMa	2023-04-18 21:51:35 -07:00
llama.py	enable loading hf llama checkpoints for training (#446)	2023-08-15 08:33:15 -07:00
opt.py	Implement LLaMa	2023-04-18 21:51:35 -07:00
vit.py	[FusedDense] Support relu, rename FusedDenseGeluDense -> FusedMLP	2023-01-17 18:12:27 -08:00