vllm/vllm/attention (last commit: 2024-07-05 10:41:01 -07:00)
Name          Last commit                                                                                              Last commit date
backends/     [Kernel][Model] logits_soft_cap for Gemma2 with flashinfer (#6051)                                       2024-07-04 16:35:51 -07:00
ops/          [Bugfix] Add verbose error if scipy is missing for blocksparse attention (#5695)                         2024-07-05 10:41:01 -07:00
__init__.py   [Core][2/N] Model runner refactoring part 2. Combine prepare prefill / decode to a single API (#4681)    2024-05-15 14:00:10 +09:00
layer.py      [Bugfix] Only add Attention.kv_scale if kv cache quantization is enabled (#5936)                          2024-06-28 21:12:40 +00:00
selector.py   [Kernel][Model] logits_soft_cap for Gemma2 with flashinfer (#6051)                                        2024-07-04 16:35:51 -07:00
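
For orientation, the sketch below shows how these pieces are typically imported; it is an assumption about the public names around this commit (Attention in layer.py, the abstract interfaces under backends/, get_attn_backend in selector.py), and exact signatures vary between vLLM releases, so treat it as an illustration rather than a reference.

```python
# Illustrative sketch only: import paths believed correct for vLLM around
# mid-2024; exports and signatures change between releases.
from vllm.attention import Attention                    # layer.py: attention nn.Module used by model code
from vllm.attention.backends.abstract import (           # backends/: abstract interfaces each backend implements
    AttentionBackend,
    AttentionMetadata,
)
from vllm.attention.selector import get_attn_backend     # selector.py: picks a concrete backend at runtime

# ops/ holds the custom attention kernels (e.g. paged and blocksparse
# attention helpers) that the concrete backends call into.
```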