vllm/vllm/attention/backends
Name                 Last commit                                                            Last commit date
__init__.py          [Core] Refactor Attention Take 2 (#3462)                               2024-03-25 04:39:33 +00:00
abstract.py          [Misc] Support attention logits soft-capping with flash-attn (#7022)  2024-08-01 13:14:37 -07:00
blocksparse_attn.py  [Misc] Support attention logits soft-capping with flash-attn (#7022)  2024-08-01 13:14:37 -07:00
flash_attn.py        [Misc] Support attention logits soft-capping with flash-attn (#7022)  2024-08-01 13:14:37 -07:00
flashinfer.py        [Kernel] Fix input for flashinfer prefill wrapper. (#7008)             2024-08-01 18:44:16 -07:00
ipex_attn.py         [Misc] Support attention logits soft-capping with flash-attn (#7022)  2024-08-01 13:14:37 -07:00
openvino.py          [Hardware][Intel] OpenVINO vLLM backend (#5379)                        2024-06-28 13:50:16 +00:00
pallas.py            [Misc] Support attention logits soft-capping with flash-attn (#7022)  2024-08-01 13:14:37 -07:00
rocm_flash_attn.py   [Misc] Support attention logits soft-capping with flash-attn (#7022)  2024-08-01 13:14:37 -07:00
torch_sdpa.py        [Misc] Support attention logits soft-capping with flash-attn (#7022)  2024-08-01 13:14:37 -07:00
utils.py             [Misc] Support attention logits soft-capping with flash-attn (#7022)  2024-08-01 13:14:37 -07:00
xformers.py          [Misc] Support attention logits soft-capping with flash-attn (#7022)  2024-08-01 13:14:37 -07:00
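Each file above implements one attention backend behind the common interface defined in abstract.py, and vLLM normally selects a backend automatically based on the available hardware and installed libraries (flash-attn, FlashInfer, xFormers, and so on). A minimal sketch of overriding that choice via the VLLM_ATTENTION_BACKEND environment variable, which vLLM's backend selector honors around this snapshot; the backend names and the model used here are illustrative and may differ across versions:

```python
# Sketch: force a specific attention backend instead of letting vLLM
# auto-select one. Assumes vLLM is installed and the chosen backend's
# dependencies (here, FlashInfer) are available on this machine.
import os

# Must be set before vLLM initializes; other values seen in this
# directory's backends include e.g. FLASH_ATTN and XFORMERS.
os.environ["VLLM_ATTENTION_BACKEND"] = "FLASHINFER"

from vllm import LLM, SamplingParams

# Small model chosen only to keep the example cheap to run.
llm = LLM(model="facebook/opt-125m")
outputs = llm.generate(["Hello, my name is"],
                       SamplingParams(max_tokens=16))
print(outputs[0].outputs[0].text)
```

If the requested backend's dependencies are missing, vLLM falls back to or errors out of backend selection rather than silently using it, so the override is mainly useful for A/B-testing backends such as flash_attn.py versus flashinfer.py on the same hardware.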