vllm/attention at 343041c4c4db93b4693ba437df7ae8bea485d18e - vllm

Mengqing Cao 8c1fb50705 [Platform][Refactor] Extract func `get_default_attn_backend` to `Platform` (#10358) Signed-off-by: Mengqing Cao <cmq0113@163.com>	2024-11-19 11:22:26 +08:00
backends	[Hardware][CPU] Add embedding models support for CPU backend (#10193)	2024-11-11 08:54:28 +00:00
ops	[Kernel] Explicitly specify other value in tl.load calls (#9014)	2024-11-18 11:39:40 -08:00
__init__.py	[Core] Add `AttentionState` abstraction (#7663)	2024-08-20 18:50:45 +00:00
layer.py	[Kernel] Support sliding window in flash attention backend (#9403)	2024-10-20 10:57:52 -07:00
selector.py	[Platform][Refactor] Extract func `get_default_attn_backend` to `Platform` (#10358)	2024-11-19 11:22:26 +08:00