squall/vllm
2,404 Commits · 1 Branch · 0 Tags · 24 MiB
Commit Graph · 3 Commits
LI MOU · 53328d7536 · 2024-08-21 08:54:31 -07:00
[BUG] fix crash on flashinfer backend with cudagraph disabled, when attention group_size not in [1,2,4,8] (#7509)
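For context on the fix above: in grouped-query attention, the group size is the ratio of query heads to KV heads, and FlashInfer provides specialized decode kernels only for certain ratios. The sketch below illustrates the kind of capability check involved; the function and constant names are hypothetical, not vLLM's actual code.

```python
SUPPORTED_GROUP_SIZES = {1, 2, 4, 8}  # ratios named in the commit message

def can_use_specialized_kernel(num_query_heads: int, num_kv_heads: int) -> bool:
    """Return True if the GQA group size has a specialized kernel.

    Hypothetical helper for illustration; any other ratio would need to
    take a fallback code path instead of crashing.
    """
    if num_query_heads % num_kv_heads != 0:
        raise ValueError("num_query_heads must be a multiple of num_kv_heads")
    group_size = num_query_heads // num_kv_heads
    return group_size in SUPPORTED_GROUP_SIZES

assert can_use_specialized_kernel(32, 8)       # group size 4: specialized path
assert not can_use_specialized_kernel(32, 2)   # group size 16: needs fallback
```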
jon-chuang · 50b8d08dbd · 2024-08-16 04:24:04 +00:00
[Misc/Testing] Use torch.testing.assert_close (#7324)
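torch.testing.assert_close, adopted in the commit above, compares tensors using dtype-aware relative and absolute tolerances and raises an AssertionError with a detailed mismatch report on failure. A minimal usage sketch:

```python
import torch

# Values that differ by ~1e-6 pass under the default float32 tolerances
# (rtol=1.3e-6, atol=1e-5).
actual = torch.tensor([1.0, 2.0]) + 1e-6
expected = torch.tensor([1.0, 2.0])
torch.testing.assert_close(actual, expected)

# Tolerances can also be set explicitly; this stricter check would raise
# an AssertionError describing the mismatched elements:
# torch.testing.assert_close(actual, expected, rtol=0.0, atol=1e-7)
```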
Lily Liu · 69ec3ca14c · 2024-07-04 16:35:51 -07:00
[Kernel][Model] logits_soft_cap for Gemma2 with flashinfer (#6051)
Co-authored-by: Simon Mo <simon.mo@hey.com>
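Logit soft-capping, which the commit above wires into the flashinfer backend for Gemma2, bounds attention scores smoothly to (-cap, cap) with a scaled tanh before the softmax. A minimal sketch of the transform itself; the function name is illustrative, and 50.0 is Gemma2's published attention soft cap:

```python
import torch

def soft_cap(logits: torch.Tensor, cap: float) -> torch.Tensor:
    """Squash logits into (-cap, cap) via cap * tanh(logits / cap)."""
    return cap * torch.tanh(logits / cap)

scores = torch.tensor([-100.0, 0.0, 75.0])
print(soft_cap(scores, cap=50.0))  # approx. tensor([-48.20, 0.00, 45.26])
```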