vllm/vllm/attention
backends/     [Test] Test multiple attn backend for chunked prefill. (#4023)                2024-04-12 09:56:57 -07:00
ops/          [Bugfix][Kernel] allow non-power-of-two head sizes in prefix prefill (#4128)  2024-04-18 00:51:28 -07:00
__init__.py   [Core][5/N] Fully working chunked prefill e2e (#3884)                         2024-04-10 17:56:48 -07:00
layer.py      [Core][5/N] Fully working chunked prefill e2e (#3884)                         2024-04-10 17:56:48 -07:00
selector.py   [Test] Add xformer and flash attn tests (#3961)                               2024-04-11 03:09:50 +00:00