flash-attention/flash_attn/models
__init__.py   Add __init__.py files to subdirectories for installation   2022-11-17 16:55:44 -08:00
bert.py       [Bert] Fix embedding layer norm before embedding dropout   2023-01-01 10:38:05 -08:00
gpt.py        [Gen] Make generation work with Tensor Parallel            2023-01-15 11:34:27 -08:00
vit.py        [ViT] Use dropout_add_ln for the 1st layer norm            2022-11-23 12:48:56 -08:00
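These modules are FlashAttention-backed reimplementations of the corresponding architectures. Below is a minimal sketch of how the GPT module is typically used, assuming gpt.py exposes a GPTLMHeadModel class constructed from a Hugging Face GPT2Config; check the file at this commit for the exact class names and constructor signatures.

```python
# Minimal sketch, assuming flash_attn.models.gpt provides GPTLMHeadModel
# built from a Hugging Face GPT2Config; verify names against gpt.py.
import torch
from transformers import GPT2Config
from flash_attn.models.gpt import GPTLMHeadModel

# Small config so the example runs quickly; fused/FlashAttention kernels
# are not requested here, so the standard PyTorch code paths are used.
config = GPT2Config(n_layer=2, n_head=4, n_embd=128, vocab_size=1000)
model = GPTLMHeadModel(config)

input_ids = torch.randint(0, config.vocab_size, (1, 16))
out = model(input_ids)    # forward pass over a batch of token ids
print(out.logits.shape)   # expected: (1, 16, vocab_size)
```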