vllm/vllm/attention

Latest commit: 496e991da8 by Thomas Parnell <tpa@zurich.ibm.com>, 2024-10-21 22:29:57 +08:00
[Doc] Consistent naming of attention backends (#9498)
Name          Last commit message                                                  Last commit date
backends      [Doc] Consistent naming of attention backends (#9498)               2024-10-21 22:29:57 +08:00
ops           [CI/Build] Avoid CUDA initialization (#8534)                        2024-09-18 10:38:11 +00:00
__init__.py   [Core] Add AttentionState abstraction (#7663)                       2024-08-20 18:50:45 +00:00
layer.py      [Kernel] Support sliding window in flash attention backend (#9403)  2024-10-20 10:57:52 -07:00
selector.py   [Kernel] Support sliding window in flash attention backend (#9403)  2024-10-20 10:57:52 -07:00