vllm/vllm/attention/ops
Latest commit: 32881f3f31 by Michał Moskal, 2024-05-02 11:23:37 -07:00
[kernel] fix sliding window in prefix prefill Triton kernel (#4405)
Co-authored-by: SangBin Cho <rkooo567@gmail.com>
File                        Last commit                                                                 Last updated
__init__.py                 [Core] Refactor Attention Take 2 (#3462)                                    2024-03-25 04:39:33 +00:00
paged_attn.py               [kernel] fix sliding window in prefix prefill Triton kernel (#4405)         2024-05-02 11:23:37 -07:00
prefix_prefill.py           [kernel] fix sliding window in prefix prefill Triton kernel (#4405)         2024-05-02 11:23:37 -07:00
triton_flash_attention.py   [ROCm][Hardware][AMD] Enable group query attention for triton FA (#4406)   2024-04-26 23:37:40 -07:00
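
The latest commit above concerns sliding-window masking in the prefix prefill kernel. As a rough orientation only, the sketch below illustrates the general idea of a causal sliding-window mask: query position i attends to key position j only when j <= i and i - j < window. This is a minimal PyTorch sketch of the concept, not the vLLM Triton kernel in prefix_prefill.py; the function name, tensor layout, and single-head shape are illustrative assumptions.

```python
import torch

def sliding_window_attention(q, k, v, window: int):
    # q, k, v: [seq_len, head_dim] for a single head (assumed layout).
    seq_len, head_dim = q.shape
    scores = (q @ k.T) / head_dim ** 0.5            # [seq_len, seq_len]
    i = torch.arange(seq_len).unsqueeze(1)          # query positions
    j = torch.arange(seq_len).unsqueeze(0)          # key positions
    # Causal constraint plus sliding window: keep only the last `window` keys.
    mask = (j <= i) & (i - j < window)
    scores = scores.masked_fill(~mask, float("-inf"))
    return torch.softmax(scores, dim=-1) @ v

# Example: with window=4, query token 10 attends only to key tokens 7..10.
q = k = v = torch.randn(16, 64)
out = sliding_window_attention(q, k, v, window=4)
print(out.shape)  # torch.Size([16, 64])
```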