vllm/vllm/attention

Latest commit: 496e991da8 by Thomas Parnell <tpa@zurich.ibm.com>, 2024-10-21 22:29:57 +08:00
[Doc] Consistent naming of attention backends (#9498)
Name          Last commit message                                                  Last commit date
backends      [Doc] Consistent naming of attention backends (#9498)               2024-10-21 22:29:57 +08:00
ops           [CI/Build] Avoid CUDA initialization (#8534)                        2024-09-18 10:38:11 +00:00
__init__.py   [Core] Add AttentionState abstraction (#7663)                       2024-08-20 18:50:45 +00:00
layer.py      [Kernel] Support sliding window in flash attention backend (#9403)  2024-10-20 10:57:52 -07:00
selector.py   [Kernel] Support sliding window in flash attention backend (#9403)  2024-10-20 10:57:52 -07:00