flash-attention/csrc/flash_attn
src/            Support flash attention 2 with causal masking when KV's seq length is longer than Q's seq length. (#436)  (2023-08-24 16:42:34 -07:00)
flash_api.cpp   Enable CUDA graphs (#386)  (2023-07-27 16:11:34 -07:00)
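As a rough illustration of the feature named in the `src/` entry above (PR #436), here is a minimal sketch of calling flash attention with causal masking when the key/value sequence is longer than the query sequence. It assumes the `flash-attn` Python package, a CUDA device, and the `flash_attn_func` API; the chosen shapes and the comment about bottom-right mask alignment reflect my understanding of the post-#436 behavior, not anything stated in the listing itself.

```python
# Sketch: causal flash attention with seqlen_k > seqlen_q.
# Assumes the flash-attn Python package is installed and a CUDA GPU is available.
import torch
from flash_attn import flash_attn_func

batch, nheads, headdim = 2, 8, 64
seqlen_q, seqlen_k = 128, 512          # KV sequence longer than Q sequence

# flash_attn_func expects (batch, seqlen, nheads, headdim) in fp16/bf16 on CUDA.
q = torch.randn(batch, seqlen_q, nheads, headdim, device="cuda", dtype=torch.float16)
k = torch.randn(batch, seqlen_k, nheads, headdim, device="cuda", dtype=torch.float16)
v = torch.randn(batch, seqlen_k, nheads, headdim, device="cuda", dtype=torch.float16)

# With causal=True and seqlen_k > seqlen_q, the causal mask is aligned to the
# bottom-right corner of the attention matrix (my understanding of the behavior
# after #436): query i attends to keys j with j <= i + seqlen_k - seqlen_q.
out = flash_attn_func(q, k, v, causal=True)   # -> (batch, seqlen_q, nheads, headdim)
```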