| Name | Latest commit | Date |
| --- | --- | --- |
| attention | [Bugfix] Add device assertion to TorchSDPA (#5402) | 2024-06-12 12:58:53 -07:00 |
| core | [Bugfix] Fix typo in scheduler.py (requeset -> request) (#5470) | 2024-06-12 21:59:44 +00:00 |
| distributed | [Core][Distributed] add same-node detection (#5369) | 2024-06-11 10:53:59 -07:00 |
| engine | [Hardware] Initial TPU integration (#5292) | 2024-06-12 11:53:03 -07:00 |
| entrypoints | [Bugfix][Frontend] Cleanup "fix chat logprobs" (#5026) | 2024-06-10 22:36:46 -07:00 |
| executor | [Hardware] Initial TPU integration (#5292) | 2024-06-12 11:53:03 -07:00 |
| logging | [MISC] Rework logger to enable pythonic custom logging configuration to be provided (#4273) | 2024-05-01 17:34:40 -07:00 |
| lora | [Misc] Improve error message when LoRA parsing fails (#5194) | 2024-06-10 19:38:49 +08:00 |
| model_executor | [Frontend] [Core] Support for sharded tensorized models (#4990) | 2024-06-12 14:13:52 -07:00 |
| multimodal | [Bugfix] Fix LLaVA-NeXT (#5380) | 2024-06-10 15:38:47 +00:00 |
| spec_decode | [Misc] Various simplifications and typing fixes (#5368) | 2024-06-11 10:29:02 +08:00 |
| transformers_utils | [Frontend] Customizable RoPE theta (#5197) | 2024-06-11 10:42:26 -07:00 |
| usage | [Frontend] Separate OpenAI Batch Runner usage from API Server (#4851) | 2024-05-17 00:42:41 +09:00 |
| worker | [Frontend] [Core] Support for sharded tensorized models (#4990) | 2024-06-12 14:13:52 -07:00 |
| __init__.py | Bump version to v0.5.0 (#5384) | 2024-06-10 15:56:06 -07:00 |
| _custom_ops.py | [misc] add hint for AttributeError (#5462) | 2024-06-12 21:46:35 +00:00 |
| block.py | Add Automatic Prefix Caching (#2762) | 2024-03-02 00:50:01 -08:00 |
| config.py | [Hardware] Initial TPU integration (#5292) | 2024-06-12 11:53:03 -07:00 |
| envs.py | [Hardware] Initial TPU integration (#5292) | 2024-06-12 11:53:03 -07:00 |
| inputs.py | [Bugfix] TYPE_CHECKING for MultiModalData (#5444) | 2024-06-12 14:08:52 -07:00 |
| logger.py | [Misc] add logging level env var (#5045) | 2024-05-24 23:49:49 -07:00 |
| outputs.py | [Core] Consolidate prompt arguments to LLM engines (#4328) | 2024-05-28 13:29:31 -07:00 |
| pooling_params.py | [Model][Misc] Add e5-mistral-7b-instruct and Embedding API (#3734) | 2024-05-11 11:30:37 -07:00 |
| py.typed | Add py.typed so consumers of vLLM can get type checking (#1509) | 2023-10-30 14:50:47 -07:00 |
| sampling_params.py | [Core]: Option To Use Prompt Token Ids Inside Logits Processor (#4985) | 2024-05-23 22:04:24 +00:00 |
| sequence.py | [Core] Support image processor (#4197) | 2024-06-02 22:56:41 -07:00 |
| utils.py | [Hardware] Initial TPU integration (#5292) | 2024-06-12 11:53:03 -07:00 |