vllm/vllm/attention/backends

Latest commit: 496e991da8 by Thomas Parnell, 2024-10-21 22:29:57 +08:00
[Doc] Consistent naming of attention backends (#9498)
Signed-off-by: Thomas Parnell <tpa@zurich.ibm.com>
File                   Last commit                                                       Date
__init__.py            [Core] Refactor Attention Take 2 (#3462)                          2024-03-25 04:39:33 +00:00
abstract.py            Support BERTModel (first encoder-only embedding model) (#9056)    2024-10-17 23:21:01 +00:00
blocksparse_attn.py    [SpecDec] Remove Batch Expansion (2/3) (#9298)                    2024-10-12 05:13:37 +00:00
flash_attn.py          [Doc] Consistent naming of attention backends (#9498)             2024-10-21 22:29:57 +08:00
flashinfer.py          [Doc] Consistent naming of attention backends (#9498)             2024-10-21 22:29:57 +08:00
ipex_attn.py           [Doc] Consistent naming of attention backends (#9498)             2024-10-21 22:29:57 +08:00
openvino.py            [Doc] Consistent naming of attention backends (#9498)             2024-10-21 22:29:57 +08:00
pallas.py              [Doc] Consistent naming of attention backends (#9498)             2024-10-21 22:29:57 +08:00
placeholder_attn.py    [Doc] Consistent naming of attention backends (#9498)             2024-10-21 22:29:57 +08:00
rocm_flash_attn.py     [Doc] Consistent naming of attention backends (#9498)             2024-10-21 22:29:57 +08:00
torch_sdpa.py          [Doc] Consistent naming of attention backends (#9498)             2024-10-21 22:29:57 +08:00
utils.py               [Doc] Consistent naming of attention backends (#9498)             2024-10-21 22:29:57 +08:00
xformers.py            [Doc] Consistent naming of attention backends (#9498)             2024-10-21 22:29:57 +08:00
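
Each module in this directory (flash_attn.py, flashinfer.py, xformers.py, torch_sdpa.py, and so on) implements one attention backend behind the interface defined in abstract.py. As a usage note, vLLM can be pinned to a particular backend at run time; below is a minimal sketch assuming the VLLM_ATTENTION_BACKEND environment variable and the backend name strings shown, which come from common vLLM usage rather than from this listing and should be checked against the installed version.

import os

# Pick a backend before vLLM is imported. The value names (FLASH_ATTN,
# XFORMERS, FLASHINFER, ...) are assumptions based on typical vLLM usage.
os.environ["VLLM_ATTENTION_BACKEND"] = "FLASH_ATTN"

from vllm import LLM, SamplingParams

# Any small model is enough for a smoke test; the model name is illustrative.
llm = LLM(model="facebook/opt-125m")
outputs = llm.generate(["Hello, my name is"], SamplingParams(max_tokens=16))
print(outputs[0].outputs[0].text)

If the variable is left unset, vLLM chooses a backend automatically based on the available hardware and installed libraries.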