flash-attention/flash_attn/models
| File | Last commit message | Last commit date |
| --- | --- | --- |
| __init__.py | Add __init__.py files to subdirectories for installation | 2022-11-17 16:55:44 -08:00 |
| baichuan.py | Pass alibi slopes to flash_attn_with_kvcache during generation | 2023-12-24 20:31:59 -08:00 |
| bert.py | minify torch.torch.int32 to torch.int32 (#1237) | 2024-09-18 00:32:59 -07:00 |
| bigcode.py | Add BigCode converters (#532) | 2023-09-10 17:24:50 -07:00 |
| btlm.py | Implement BTLM model | 2023-12-24 20:35:12 -08:00 |
| falcon.py | Run isort and black on python files | 2023-08-18 14:22:11 -07:00 |
| gpt_neox.py | [Gen] Remove minor dead code | 2023-12-19 22:57:39 -08:00 |
| gpt.py | Fix KeyError handling for non-existing key in state_dict.pop() (#898) | 2024-06-30 22:40:03 -07:00 |
| gptj.py | Run isort and black on python files | 2023-08-18 14:22:11 -07:00 |
| llama.py | Fix E1136 (#563) | 2023-09-21 11:48:23 -07:00 |
| opt.py | Run isort and black on python files | 2023-08-18 14:22:11 -07:00 |
| vit.py | [LayerNorm] Switch from CUDA to Triton implementation | 2024-01-05 00:31:17 -08:00 |
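These files implement optimized versions of common transformer architectures on top of FlashAttention. As a minimal sketch of how a model from this directory is typically used, the example below loads GPT-2 weights into the `GPTLMHeadModel` class from `gpt.py`; the exact `from_pretrained` signature and the `use_flash_attn` config attribute are assumptions drawn from common usage of this repository, not from the listing above.

```python
# Hedged sketch: loading HF GPT-2 weights into flash_attn's GPT implementation.
# The from_pretrained(model_name, config, ...) signature and the use_flash_attn
# flag are assumptions; check flash_attn/models/gpt.py for the actual API.
import torch
from transformers import GPT2Config

from flash_attn.models.gpt import GPTLMHeadModel

config = GPT2Config.from_pretrained("gpt2")
config.use_flash_attn = True  # assumed flag routing attention through FlashAttention

# Remap and load the Hugging Face checkpoint into the flash_attn GPT model.
model = GPTLMHeadModel.from_pretrained("gpt2", config, device="cuda", dtype=torch.float16)
model.eval()

input_ids = torch.randint(0, config.vocab_size, (1, 16), device="cuda")
with torch.no_grad():
    logits = model(input_ids).logits  # shape: (batch, seqlen, vocab_size)
```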