8 Commits

| Author | SHA1 | Message | Date |
| --- | --- | --- | --- |
| Antoni Baum | 0ab278ca31 | [Core] Remove unnecessary copies in flash attn backend (#5138) | 2024-06-03 09:39:31 -07:00 |
| youkaichao | 5bd3c65072 | [Core][Optimization] remove vllm-nccl (#5091) | 2024-05-29 05:13:52 +00:00 |
| Woosuk Kwon | b57e6c5949 | [Kernel] Add flash-attn back (#4907) | 2024-05-19 18:11:30 -07:00 |
| Woosuk Kwon | 89579a201f | [Misc] Use vllm-flash-attn instead of flash-attn (#4686) | 2024-05-08 13:15:34 -07:00 |
| Michael Goin | d627a3d837 | [Misc] Upgrade to torch==2.3.0 (#4454) | 2024-04-29 20:05:47 -04:00 |
| youkaichao | e4bf860a54 | [CI][Build] change pynvml to nvidia-ml-py (#4302) | 2024-04-23 18:33:12 -07:00 |
| Roy | 8db1bf32f8 | [Misc] Upgrade triton to 2.2.0 (#4061) | 2024-04-14 17:43:54 -07:00 |
| Woosuk Kwon | cfaf49a167 | [Misc] Define common requirements (#3841) | 2024-04-05 00:39:17 -07:00 |