| Name | Last commit | Last updated |
|------|-------------|--------------|
| `attention/` | [Misc] Fix docstring of get_attn_backend (#5271) | 2024-06-05 09:18:59 -07:00 |
| `core/` | [Misc]: Implement CPU/GPU swapping in BlockManagerV2 (#3834) | 2024-06-03 13:37:11 -07:00 |
| `distributed/` | [Core][Distributed] improve p2p access check (#4992) | 2024-05-29 11:29:07 +00:00 |
| `engine/` | [Bugfix][Frontend/Core] Don't log exception when AsyncLLMEngine gracefully shuts down. (#5290) | 2024-06-05 15:18:12 -07:00 |
| `entrypoints/` | [BugFix] Apply get_cached_tokenizer to the tokenizer setter of LLM (#5207) | 2024-06-05 10:59:02 -07:00 |
| `executor/` | [Bugfix] Fix torch.compile() error when using MultiprocessingGPUExecutor (#5229) | 2024-06-03 20:55:50 -07:00 |
| `logging/` | [MISC] Rework logger to enable pythonic custom logging configuration to be provided (#4273) | 2024-05-01 17:34:40 -07:00 |
| `lora/` | [Bugfix] Remove deprecated @abstractproperty (#5174) | 2024-06-01 22:40:25 +00:00 |
| `model_executor/` | [Misc] Skip for logits_scale == 1.0 (#5291) | 2024-06-05 15:19:02 -07:00 |
| `multimodal/` | [Core] Support image processor (#4197) | 2024-06-02 22:56:41 -07:00 |
| `spec_decode/` | [Speculative Decoding] Add ProposerWorkerBase abstract class (#5252) | 2024-06-05 14:53:05 -07:00 |
| `transformers_utils/` | [Core] Support image processor (#4197) | 2024-06-02 22:56:41 -07:00 |
| `usage/` | [Frontend] Separate OpenAI Batch Runner usage from API Server (#4851) | 2024-05-17 00:42:41 +09:00 |
| `worker/` | [Bugfix] Support prompt_logprobs==0 (#5217) | 2024-06-03 17:59:30 -07:00 |
| `__init__.py` | Bump version to v0.4.3 (#5046) | 2024-05-30 11:13:46 -07:00 |
| `_custom_ops.py` | [Kernel] Pass a device pointer into the quantize kernel for the scales (#5159) | 2024-06-03 09:52:30 -07:00 |
| `block.py` | Add Automatic Prefix Caching (#2762) | 2024-03-02 00:50:01 -08:00 |
| `config.py` | [BugFix] Fix log message about default max model length (#5284) | 2024-06-05 14:53:16 -07:00 |
| `envs.py` | [Misc] add logging level env var (#5045) | 2024-05-24 23:49:49 -07:00 |
| `inputs.py` | [Core] Avoid the need to pass None values to Sequence.inputs (#5099) | 2024-05-29 16:05:01 -07:00 |
| `logger.py` | [Misc] add logging level env var (#5045) | 2024-05-24 23:49:49 -07:00 |
| `outputs.py` | [Core] Consolidate prompt arguments to LLM engines (#4328) | 2024-05-28 13:29:31 -07:00 |
| `pooling_params.py` | [Model][Misc] Add e5-mistral-7b-instruct and Embedding API (#3734) | 2024-05-11 11:30:37 -07:00 |
| `py.typed` | Add py.typed so consumers of vLLM can get type checking (#1509) | 2023-10-30 14:50:47 -07:00 |
| `sampling_params.py` | [Core]: Option To Use Prompt Token Ids Inside Logits Processor (#4985) | 2024-05-23 22:04:24 +00:00 |
| `sequence.py` | [Core] Support image processor (#4197) | 2024-06-02 22:56:41 -07:00 |
| `utils.py` | [Misc]: optimize eager mode host time (#4196) | 2024-05-31 13:14:50 +08:00 |