| Name | Latest commit message | Commit date |
|------|-----------------------|-------------|
| `layers` | [MHA] Implement MQA/GQA | 2023-07-23 00:06:58 -07:00 |
| `models` | Implement ParallelGatedMlp (#251) | 2023-07-26 12:14:15 -07:00 |
| `modules` | [MLP] Edit ParallelGatedMlp | 2023-07-26 09:39:37 -10:00 |
| `utils` | [Benchmark] Add script to benchmark FlashAttention | 2023-07-28 00:26:52 -10:00 |
| `__init__.py` | Request for v2.0.2 (#388) | 2023-07-28 02:46:03 -07:00 |
| `bert_padding.py` | remove numpy dependency | 2022-10-06 19:17:15 +02:00 |
| `flash_attn_interface.py` | Enable CUDA graphs (#386) | 2023-07-27 16:11:34 -07:00 |
| `flash_blocksparse_attention.py` | Rename src -> flash_attn | 2022-06-01 18:50:26 -07:00 |
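For context on the MQA/GQA support noted in the `layers` entry: the package's public attention call is exposed via `flash_attn_interface.py`, and grouped-query attention is selected simply by passing fewer key/value heads than query heads. Below is a minimal sketch, assuming a flash-attn v2 install with a CUDA build; the shapes follow the documented `(batch, seqlen, nheads, headdim)` layout, and the specific head counts are illustrative.

```python
import torch
from flash_attn import flash_attn_func

batch, seqlen, headdim = 2, 1024, 64
nheads_q, nheads_kv = 8, 2  # GQA: 8 query heads grouped over 2 KV heads (nheads_q % nheads_kv == 0)

# FlashAttention requires fp16/bf16 tensors on a CUDA device.
q = torch.randn(batch, seqlen, nheads_q, headdim, dtype=torch.float16, device="cuda")
k = torch.randn(batch, seqlen, nheads_kv, headdim, dtype=torch.float16, device="cuda")
v = torch.randn(batch, seqlen, nheads_kv, headdim, dtype=torch.float16, device="cuda")

# Output has the query-head layout: (batch, seqlen, nheads_q, headdim).
out = flash_attn_func(q, k, v, causal=True)
```

Setting `nheads_kv = 1` gives multi-query attention (MQA); `nheads_kv = nheads_q` recovers standard multi-head attention.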