flash-attention/tests
Latest commit: Fix Llama GQA/MQA (#546) by Kevin Hu (42832575d4), 2023-09-19
* Fix llama MQA
* Fix permute shape
* Update llama.py
Name                 Last commit message                                                    Date
layers/              Run isort and black on test files                                      2023-08-18
losses/              [CE] Implement CrossEntropyLoss in Triton                              2023-09-15
models/              Fix Llama GQA/MQA (#546)                                               2023-09-19
modules/             Run isort and black on test files                                      2023-08-18
ops/                 Run isort and black on test files                                      2023-08-18
pyproject.toml       Move pyproject.toml to flash-attn and tests dir to avoid PEP 517       2023-08-25
test_flash_attn.py   Swap seqlen_q, nheads for MQA when seqlen_q=1 for fwd (h/t Daniel H)   2023-09-18
test_rotary.py       [Rotary] Implement varlen rotary                                       2023-09-03