vllm/vllm/attention
backends/     [Test] Test multiple attn backend for chunked prefill. (#4023)                2024-04-12 09:56:57 -07:00
ops/          [Bugfix][Kernel] allow non-power-of-two head sizes in prefix prefill (#4128)  2024-04-18 00:51:28 -07:00
__init__.py   [Core][5/N] Fully working chunked prefill e2e (#3884)                         2024-04-10 17:56:48 -07:00
layer.py      [Core][5/N] Fully working chunked prefill e2e (#3884)                         2024-04-10 17:56:48 -07:00
selector.py   [Test] Add xformer and flash attn tests (#3961)                               2024-04-11 03:09:50 +00:00