vllm/attention at 8f0a9ca890a125f2b0fef49ba042ecf5b37830a8 - vllm

Yang Zheng 4dbcbbeb09 [Misc] Compute query_start_loc/seq_start_loc on CPU (#9447) Co-authored-by: Yang Zheng(SW)(Alex) <you@example.com>	2024-11-04 08:54:37 +00:00
backends	[Misc] Compute query_start_loc/seq_start_loc on CPU (#9447)	2024-11-04 08:54:37 +00:00
ops	[Hardware][ROCM] using current_platform.is_rocm (#9642)	2024-10-28 04:07:00 +00:00
__init__.py	[Core] Add `AttentionState` abstraction (#7663)	2024-08-20 18:50:45 +00:00
layer.py	[Kernel] Support sliding window in flash attention backend (#9403)	2024-10-20 10:57:52 -07:00
selector.py	[Encoder Decoder] Add flash_attn kernel support for encoder-decoder models (#9559)	2024-11-01 23:22:49 -07:00