Directory listing: vllm/vllm/attention (latest commit 2024-08-01 18:44:16 -07:00)
Name          Last commit message                                                      Last commit date
backends/     [Kernel] Fix input for flashinfer prefill wrapper. (#7008)               2024-08-01 18:44:16 -07:00
ops/          [Bugfix] Allow vllm to still work if triton is not installed. (#6786)    2024-07-29 14:51:27 -07:00
__init__.py   [Core] Refactor _prepare_model_input_tensors - take 2 (#6164)            2024-07-17 09:37:16 -07:00
layer.py      [Misc] Support attention logits soft-capping with flash-attn (#7022)     2024-08-01 13:14:37 -07:00
selector.py   [Core] Refactor _prepare_model_input_tensors - take 2 (#6164)            2024-07-17 09:37:16 -07:00
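
A note on the ops/ entry: the title of PR #6786 says vLLM should keep working when triton is not installed. A common way to get that behavior is a guarded import that fails late and explicitly; the sketch below shows the general pattern only, with illustrative names (HAS_TRITON, require_triton) that are not vLLM's actual internals.

try:
    import triton  # noqa: F401
    HAS_TRITON = True
except ImportError:
    HAS_TRITON = False

def require_triton() -> None:
    # Raise only when a triton-backed kernel is actually requested,
    # so merely importing the package never crashes on a missing dep.
    if not HAS_TRITON:
        raise ImportError(
            "This op needs triton; install it or select a non-triton backend.")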
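
And on the layer.py entry: PR #7022's title refers to attention logits soft-capping with flash-attn. Soft-capping bounds raw attention scores to (-cap, cap) with a scaled tanh before the softmax, so outlier logits cannot saturate the attention distribution. A self-contained sketch of the math follows; the function name and tensor shapes are illustrative, not vLLM's API.

import torch

def softcap(scores: torch.Tensor, cap: float) -> torch.Tensor:
    # Squash raw logits into (-cap, cap); the transform is smooth and
    # near-identity for small scores, so it only affects outliers.
    return cap * torch.tanh(scores / cap)

# Illustrative usage on toy Q.K^T scores.
q = torch.randn(1, 8, 16, 64)   # (batch, heads, seq_len, head_dim)
k = torch.randn(1, 8, 16, 64)
scores = torch.einsum("bhqd,bhkd->bhqk", q, k) / 64 ** 0.5
attn = torch.softmax(softcap(scores, cap=30.0), dim=-1)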