squall/vllm
vllm/attention at commit 16620f439d

Latest commit e58294ddf2 by JGSweets: [Bugfix] Add verbose error if scipy is missing for blocksparse attention (#5695), 2024-07-05 10:41:01 -07:00

Name        | Last commit                                                                                            | Date
------------|--------------------------------------------------------------------------------------------------------|---------------------------
backends    | [Kernel][Model] logits_soft_cap for Gemma2 with flashinfer (#6051)                                     | 2024-07-04 16:35:51 -07:00
ops         | [Bugfix] Add verbose error if scipy is missing for blocksparse attention (#5695)                       | 2024-07-05 10:41:01 -07:00
__init__.py | [Core][2/N] Model runner refactoring part 2. Combine prepare prefill / decode to a single API (#4681)  | 2024-05-15 14:00:10 +09:00
layer.py    | [Bugfix] Only add Attention.kv_scale if kv cache quantization is enabled (#5936)                       | 2024-06-28 21:12:40 +00:00
selector.py | [Kernel][Model] logits_soft_cap for Gemma2 with flashinfer (#6051)                                     | 2024-07-04 16:35:51 -07:00
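
The #5695 fix above (the latest commit, touching ops) replaces a silent or cryptic failure with an actionable ImportError when scipy, which the blocksparse attention ops depend on, is not installed. A minimal sketch of that guarded-import pattern, with the imported symbol and message wording illustrative rather than the exact vLLM source:

```python
# Guarded import with a verbose error, in the spirit of #5695.
# The symbol and message text here are illustrative, not vLLM's exact code.
try:
    from scipy import sparse  # used when building block-sparse attention layouts
except ImportError as err:
    raise ImportError(
        "scipy is required for blocksparse attention; "
        "install it with `pip install scipy`."
    ) from err
```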
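The two #6051 entries (backends and selector.py) wire Gemma2's logits_soft_cap through to the flashinfer backend. Soft-capping squashes pre-softmax attention scores with tanh so they stay inside (-cap, cap); the production kernel fuses this into the attention computation, so the following is only a reference sketch of the formula:

```python
import torch

def soft_cap(logits: torch.Tensor, cap: float) -> torch.Tensor:
    # Gemma2-style logit soft-capping: output is bounded to (-cap, cap)
    # and stays approximately linear while |logits| is much smaller than cap.
    return cap * torch.tanh(logits / cap)
```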
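The #5936 fix to layer.py makes Attention.kv_scale conditional: the attribute is added only when kv-cache quantization is enabled, so unquantized models do not carry an unused scale. A hypothetical, trimmed-down sketch of that pattern (the real Attention constructor takes many more arguments, and this is not the actual vLLM signature):

```python
import torch.nn as nn

class Attention(nn.Module):
    # Hypothetical constructor illustrating the conditional-attribute idea
    # behind #5936; argument names are assumptions, not vLLM's API.
    def __init__(self, kv_cache_dtype: str = "auto") -> None:
        super().__init__()
        if kv_cache_dtype.startswith("fp8"):
            # A dequantization scale only makes sense for a quantized kv cache.
            self.kv_scale = 1.0
```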