| Name | Last commit message | Last commit date |
|---|---|---|
| `layers/` | [MHA] Implement MQA/GQA | 2023-07-23 00:06:58 -07:00 |
| `models/` | [GPT] Implement Falcon | 2023-07-23 10:32:29 -07:00 |
| `modules/` | [MHA] Implement MQA/GQA | 2023-07-23 00:06:58 -07:00 |
| `utils/` | [Gen] Minor tweak to allocate_inference_cache | 2023-04-21 11:56:47 -07:00 |
| `__init__.py` | FlashAttention-2 release | 2023-07-17 06:21:34 -07:00 |
| `bert_padding.py` | remove numpy dependency | 2022-10-06 19:17:15 +02:00 |
| `flash_attn_interface.py` | Make sure dout is contiguous | 2023-07-17 21:54:44 -07:00 |
| `flash_blocksparse_attention.py` | Rename src -> flash_attn | 2022-06-01 18:50:26 -07:00 |
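
Several of the commits above concern MQA/GQA support and the `flash_attn_interface.py` entry point. The following is a minimal usage sketch, assuming flash-attn 2.x is installed with a CUDA GPU; it uses the public `flash_attn_func` interface, and the tensor sizes and head counts are illustrative, not taken from the repository.

```python
# Minimal GQA sketch with flash-attn 2.x (assumed installed; requires CUDA, fp16/bf16).
# Shapes for flash_attn_func are (batch, seqlen, nheads, headdim).
import torch
from flash_attn import flash_attn_func

batch, seqlen, headdim = 2, 1024, 64
nheads_q, nheads_kv = 32, 8  # illustrative GQA config: 8 KV heads shared by 32 query heads

q = torch.randn(batch, seqlen, nheads_q, headdim, device="cuda", dtype=torch.float16)
# MQA/GQA: k and v may carry fewer heads than q, provided nheads_q % nheads_kv == 0
k = torch.randn(batch, seqlen, nheads_kv, headdim, device="cuda", dtype=torch.float16)
v = torch.randn(batch, seqlen, nheads_kv, headdim, device="cuda", dtype=torch.float16)

# Output has the query head count: (batch, seqlen, nheads_q, headdim)
out = flash_attn_func(q, k, v, causal=True)
```

With `nheads_kv = 1` this degenerates to MQA; with `nheads_kv = nheads_q` it is ordinary multi-head attention.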