flash-attention/csrc/flash_attn
2024-01-22 11:25:50 -08:00
src            Use int64_t instead of uint32_t for index_t                              2024-01-22 11:25:50 -08:00
flash_api.cpp  Add split-kv and M<->H swap to varlen forward decoding attention (#754)  2024-01-21 15:28:36 -08:00