| Author | Commit | Message | Date |
| --- | --- | --- | --- |
| Jonathan Berkhahn | 9c71c97ae2 | [mypy] Enable mypy type checking for vllm/core (#7229) | 2024-08-28 07:11:14 +08:00 |
| Alexander Matveev | e02ac55617 | [Performance] Optimize e2e overheads: Reduce python allocations (#7162) | 2024-08-08 21:34:28 -07:00 |
| youkaichao | 64e8d2a783 | [core][misc] remove logical block (#5882) | 2024-06-27 13:34:55 -07:00 |
| youkaichao | 8eadcf0b90 | [misc][typo] fix typo (#5620) | 2024-06-17 20:54:57 -07:00 |
| youkaichao | e441bad674 | [Optimization] use a pool to reuse LogicalTokenBlock.token_ids (#5584) | 2024-06-17 22:08:05 +00:00 |
| Sage Moore | ce4f5a29fb | Add Automatic Prefix Caching (#2762); Co-authored-by: ElizaWszola <eliza@neuralmagic.com>; Co-authored-by: Michael Goin <michael@neuralmagic.com> | 2024-03-02 00:50:01 -08:00 |
| shiyi.c_98 | d10f8e1d43 | [Experimental] Prefix Caching Support (#1669); Co-authored-by: DouHappy <2278958187@qq.com>; Co-authored-by: Zhuohan Li <zhuohan123@gmail.com> | 2024-01-17 16:32:10 -08:00 |
| Zhuohan Li | d6fa1be3a8 | [Quality] Add code formatter and linter (#326) | 2023-07-03 11:31:55 -07:00 |
| Woosuk Kwon | 0b98ba15c7 | Change the name to vLLM (#150) | 2023-06-17 03:07:40 -07:00 |