vllm/vllm/attention/backends

Latest commit: 496e991da8 by Thomas Parnell, 2024-10-21 22:29:57 +08:00
[Doc] Consistent naming of attention backends (#9498)
Signed-off-by: Thomas Parnell <tpa@zurich.ibm.com>
File                   Last commit                                                       Date
__init__.py            [Core] Refactor Attention Take 2 (#3462)                          2024-03-25 04:39:33 +00:00
abstract.py            Support BERTModel (first encoder-only embedding model) (#9056)    2024-10-17 23:21:01 +00:00
blocksparse_attn.py    [SpecDec] Remove Batch Expansion (2/3) (#9298)                    2024-10-12 05:13:37 +00:00
flash_attn.py          [Doc] Consistent naming of attention backends (#9498)             2024-10-21 22:29:57 +08:00
flashinfer.py          [Doc] Consistent naming of attention backends (#9498)             2024-10-21 22:29:57 +08:00
ipex_attn.py           [Doc] Consistent naming of attention backends (#9498)             2024-10-21 22:29:57 +08:00
openvino.py            [Doc] Consistent naming of attention backends (#9498)             2024-10-21 22:29:57 +08:00
pallas.py              [Doc] Consistent naming of attention backends (#9498)             2024-10-21 22:29:57 +08:00
placeholder_attn.py    [Doc] Consistent naming of attention backends (#9498)             2024-10-21 22:29:57 +08:00
rocm_flash_attn.py     [Doc] Consistent naming of attention backends (#9498)             2024-10-21 22:29:57 +08:00
torch_sdpa.py          [Doc] Consistent naming of attention backends (#9498)             2024-10-21 22:29:57 +08:00
utils.py               [Doc] Consistent naming of attention backends (#9498)             2024-10-21 22:29:57 +08:00
xformers.py            [Doc] Consistent naming of attention backends (#9498)             2024-10-21 22:29:57 +08:00
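
Each module in this directory (flash_attn.py, flashinfer.py, xformers.py, torch_sdpa.py, and so on) implements one attention backend behind the interface defined in abstract.py. As a usage note, vLLM can be pinned to a particular backend at run time; below is a minimal sketch assuming the VLLM_ATTENTION_BACKEND environment variable and the backend name strings shown, which come from common vLLM usage rather than from this listing and should be checked against the installed version.

import os

# Pick a backend before vLLM is imported. The value names (FLASH_ATTN,
# XFORMERS, FLASHINFER, ...) are assumptions based on typical vLLM usage.
os.environ["VLLM_ATTENTION_BACKEND"] = "FLASH_ATTN"

from vllm import LLM, SamplingParams

# Any small model is enough for a smoke test; the model name is illustrative.
llm = LLM(model="facebook/opt-125m")
outputs = llm.generate(["Hello, my name is"], SamplingParams(max_tokens=16))
print(outputs[0].outputs[0].text)

If the variable is left unset, vLLM chooses a backend automatically based on the available hardware and installed libraries.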