vllm/vllm/attention/backends
Name                 Last commit                                                            Last commit date
__init__.py          [Core] Refactor Attention Take 2 (#3462)                               2024-03-25 04:39:33 +00:00
abstract.py          [Misc] Support attention logits soft-capping with flash-attn (#7022)  2024-08-01 13:14:37 -07:00
blocksparse_attn.py  [Misc] Support attention logits soft-capping with flash-attn (#7022)  2024-08-01 13:14:37 -07:00
flash_attn.py        [Misc] Support attention logits soft-capping with flash-attn (#7022)  2024-08-01 13:14:37 -07:00
flashinfer.py        [Kernel] Fix input for flashinfer prefill wrapper. (#7008)             2024-08-01 18:44:16 -07:00
ipex_attn.py         [Misc] Support attention logits soft-capping with flash-attn (#7022)  2024-08-01 13:14:37 -07:00
openvino.py          [Hardware][Intel] OpenVINO vLLM backend (#5379)                        2024-06-28 13:50:16 +00:00
pallas.py            [Misc] Support attention logits soft-capping with flash-attn (#7022)  2024-08-01 13:14:37 -07:00
rocm_flash_attn.py   [Misc] Support attention logits soft-capping with flash-attn (#7022)  2024-08-01 13:14:37 -07:00
torch_sdpa.py        [Misc] Support attention logits soft-capping with flash-attn (#7022)  2024-08-01 13:14:37 -07:00
utils.py             [Misc] Support attention logits soft-capping with flash-attn (#7022)  2024-08-01 13:14:37 -07:00
xformers.py          [Misc] Support attention logits soft-capping with flash-attn (#7022)  2024-08-01 13:14:37 -07:00
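Each file above implements one attention backend behind the common interface defined in abstract.py, and vLLM normally selects a backend automatically based on the available hardware and installed libraries (flash-attn, FlashInfer, xFormers, and so on). A minimal sketch of overriding that choice via the VLLM_ATTENTION_BACKEND environment variable, which vLLM's backend selector honors around this snapshot; the backend names and the model used here are illustrative and may differ across versions:

```python
# Sketch: force a specific attention backend instead of letting vLLM
# auto-select one. Assumes vLLM is installed and the chosen backend's
# dependencies (here, FlashInfer) are available on this machine.
import os

# Must be set before vLLM initializes; other values seen in this
# directory's backends include e.g. FLASH_ATTN and XFORMERS.
os.environ["VLLM_ATTENTION_BACKEND"] = "FLASHINFER"

from vllm import LLM, SamplingParams

# Small model chosen only to keep the example cheap to run.
llm = LLM(model="facebook/opt-125m")
outputs = llm.generate(["Hello, my name is"],
                       SamplingParams(max_tokens=16))
print(outputs[0].outputs[0].text)
```

If the requested backend's dependencies are missing, vLLM falls back to or errors out of backend selection rather than silently using it, so the override is mainly useful for A/B-testing backends such as flash_attn.py versus flashinfer.py on the same hardware.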