vllm/vllm/attention
Yang Zheng 4dbcbbeb09
[Misc] Compute query_start_loc/seq_start_loc on CPU (#9447)
Co-authored-by: Yang Zheng(SW)(Alex) <you@example.com>
2024-11-04 08:54:37 +00:00
backends [Misc] Compute query_start_loc/seq_start_loc on CPU (#9447) 2024-11-04 08:54:37 +00:00
ops [Hardware][ROCM] using current_platform.is_rocm (#9642) 2024-10-28 04:07:00 +00:00
__init__.py [Core] Add AttentionState abstraction (#7663) 2024-08-20 18:50:45 +00:00
layer.py [Kernel] Support sliding window in flash attention backend (#9403) 2024-10-20 10:57:52 -07:00
selector.py [Encoder Decoder] Add flash_attn kernel support for encoder-decoder models (#9559) 2024-11-01 23:22:49 -07:00