| Name | Last commit message | Last commit date |
|---|---|---|
| core | Fix auto prefix bug (#3239) | 2024-03-07 16:37:28 -08:00 |
| worker | Fix auto prefix bug (#3239) | 2024-03-07 16:37:28 -08:00 |
| __init__.py | [FIX] Make flash_attn optional (#3269) | 2024-03-08 10:52:20 -08:00 |
| block.py | Add Automatic Prefix Caching (#2762) | 2024-03-02 00:50:01 -08:00 |
| config.py | Push logprob generation to LLMEngine (#3065) | 2024-03-04 19:54:06 +00:00 |
| logger.py | Make vLLM logging formatting optional (#2877) | 2024-02-20 14:38:55 -08:00 |
| test_utils.py | Use CuPy for CUDA graphs (#2811) | 2024-02-13 11:32:06 -08:00 |
| utils.py | Measure model memory usage (#3120) | 2024-03-07 11:42:42 -08:00 |