Directory listing: vllm/vllm/attention (latest commit 2024-08-01 18:44:16 -07:00)
Name          Last commit message                                                      Last commit date
backends/     [Kernel] Fix input for flashinfer prefill wrapper. (#7008)               2024-08-01 18:44:16 -07:00
ops/          [Bugfix] Allow vllm to still work if triton is not installed. (#6786)    2024-07-29 14:51:27 -07:00
__init__.py   [Core] Refactor _prepare_model_input_tensors - take 2 (#6164)            2024-07-17 09:37:16 -07:00
layer.py      [Misc] Support attention logits soft-capping with flash-attn (#7022)     2024-08-01 13:14:37 -07:00
selector.py   [Core] Refactor _prepare_model_input_tensors - take 2 (#6164)            2024-07-17 09:37:16 -07:00
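
A note on the ops/ entry: the title of PR #6786 says vLLM should keep working when triton is not installed. A common way to get that behavior is a guarded import that fails late and explicitly; the sketch below shows the general pattern only, with illustrative names (HAS_TRITON, require_triton) that are not vLLM's actual internals.

try:
    import triton  # noqa: F401
    HAS_TRITON = True
except ImportError:
    HAS_TRITON = False

def require_triton() -> None:
    # Raise only when a triton-backed kernel is actually requested,
    # so merely importing the package never crashes on a missing dep.
    if not HAS_TRITON:
        raise ImportError(
            "This op needs triton; install it or select a non-triton backend.")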
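
And on the layer.py entry: PR #7022's title refers to attention logits soft-capping with flash-attn. Soft-capping bounds raw attention scores to (-cap, cap) with a scaled tanh before the softmax, so outlier logits cannot saturate the attention distribution. A self-contained sketch of the math follows; the function name and tensor shapes are illustrative, not vLLM's API.

import torch

def softcap(scores: torch.Tensor, cap: float) -> torch.Tensor:
    # Squash raw logits into (-cap, cap); the transform is smooth and
    # near-identity for small scores, so it only affects outliers.
    return cap * torch.tanh(scores / cap)

# Illustrative usage on toy Q.K^T scores.
q = torch.randn(1, 8, 16, 64)   # (batch, heads, seq_len, head_dim)
k = torch.randn(1, 8, 16, 64)
scores = torch.einsum("bhqd,bhkd->bhqk", q, k) / 64 ** 0.5
attn = torch.softmax(softcap(scores, cap=30.0), dim=-1)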