squall/vllm
6d21fa1cad
vllm/vllm/attention
Latest commit: d4f3985907 by Michał Moskal, 2024-05-28 11:07:07 +09:00
[Core] Sliding window for block manager v2 (#4545)
Co-authored-by: Ruth Evans <ruthevans@Ruths-MacBook-Pro.local>
| Name | Last commit | Date |
| --- | --- | --- |
| backends | [Kernel][Backend][Model] Blocksparse flash attention kernel and Phi-3-Small model (#4799) | 2024-05-24 22:00:52 -07:00 |
| ops | [Core] Sliding window for block manager v2 (#4545) | 2024-05-28 11:07:07 +09:00 |
| __init__.py | [Core][2/N] Model runner refactoring part 2. Combine prepare prefill / decode to a single API (#4681) | 2024-05-15 14:00:10 +09:00 |
| layer.py | [Bugfix / Core] Prefix Caching Guards (merged with main) (#4846) | 2024-05-27 15:18:17 -07:00 |
| selector.py | [Kernel][Backend][Model] Blocksparse flash attention kernel and Phi-3-Small model (#4799) | 2024-05-24 22:00:52 -07:00 |