vllm/vllm/attention
Mengqing Cao 8c1fb50705
[Platform][Refactor] Extract func get_default_attn_backend to Platform (#10358)
Signed-off-by: Mengqing Cao <cmq0113@163.com>
2024-11-19 11:22:26 +08:00
backends [Hardware][CPU] Add embedding models support for CPU backend (#10193) 2024-11-11 08:54:28 +00:00
ops [Kernel] Explicitly specify other value in tl.load calls (#9014) 2024-11-18 11:39:40 -08:00
__init__.py [Core] Add AttentionState abstraction (#7663) 2024-08-20 18:50:45 +00:00
layer.py [Kernel] Support sliding window in flash attention backend (#9403) 2024-10-20 10:57:52 -07:00
selector.py [Platform][Refactor] Extract func get_default_attn_backend to Platform (#10358) 2024-11-19 11:22:26 +08:00