vllm/vllm (latest commit 2024-06-12 21:59:44 +00:00)
| Name | Last commit | Date |
| --- | --- | --- |
| attention/ | [Bugfix] Add device assertion to TorchSDPA (#5402) | 2024-06-12 12:58:53 -07:00 |
| core/ | [Bugfix] Fix typo in scheduler.py (requeset -> request) (#5470) | 2024-06-12 21:59:44 +00:00 |
| distributed/ | [Core][Distributed] add same-node detection (#5369) | 2024-06-11 10:53:59 -07:00 |
| engine/ | [Hardware] Initial TPU integration (#5292) | 2024-06-12 11:53:03 -07:00 |
| entrypoints/ | [Bugfix][Frontend] Cleanup "fix chat logprobs" (#5026) | 2024-06-10 22:36:46 -07:00 |
| executor/ | [Hardware] Initial TPU integration (#5292) | 2024-06-12 11:53:03 -07:00 |
| logging/ | [MISC] Rework logger to enable pythonic custom logging configuration to be provided (#4273) | 2024-05-01 17:34:40 -07:00 |
| lora/ | [Misc] Improve error message when LoRA parsing fails (#5194) | 2024-06-10 19:38:49 +08:00 |
| model_executor/ | [Frontend] [Core] Support for sharded tensorized models (#4990) | 2024-06-12 14:13:52 -07:00 |
| multimodal/ | [Bugfix] Fix LLaVA-NeXT (#5380) | 2024-06-10 15:38:47 +00:00 |
| spec_decode/ | [Misc] Various simplifications and typing fixes (#5368) | 2024-06-11 10:29:02 +08:00 |
| transformers_utils/ | [Frontend] Customizable RoPE theta (#5197) | 2024-06-11 10:42:26 -07:00 |
| usage/ | [Frontend] Separate OpenAI Batch Runner usage from API Server (#4851) | 2024-05-17 00:42:41 +09:00 |
| worker/ | [Frontend] [Core] Support for sharded tensorized models (#4990) | 2024-06-12 14:13:52 -07:00 |
| __init__.py | Bump version to v0.5.0 (#5384) | 2024-06-10 15:56:06 -07:00 |
| _custom_ops.py | [misc] add hint for AttributeError (#5462) | 2024-06-12 21:46:35 +00:00 |
| block.py | Add Automatic Prefix Caching (#2762) | 2024-03-02 00:50:01 -08:00 |
| config.py | [Hardware] Initial TPU integration (#5292) | 2024-06-12 11:53:03 -07:00 |
| envs.py | [Hardware] Initial TPU integration (#5292) | 2024-06-12 11:53:03 -07:00 |
| inputs.py | [Bugfix] TYPE_CHECKING for MultiModalData (#5444) | 2024-06-12 14:08:52 -07:00 |
| logger.py | [Misc] add logging level env var (#5045) | 2024-05-24 23:49:49 -07:00 |
| outputs.py | [Core] Consolidate prompt arguments to LLM engines (#4328) | 2024-05-28 13:29:31 -07:00 |
| pooling_params.py | [Model][Misc] Add e5-mistral-7b-instruct and Embedding API (#3734) | 2024-05-11 11:30:37 -07:00 |
| py.typed | Add py.typed so consumers of vLLM can get type checking (#1509) | 2023-10-30 14:50:47 -07:00 |
| sampling_params.py | [Core]: Option To Use Prompt Token Ids Inside Logits Processor (#4985) | 2024-05-23 22:04:24 +00:00 |
| sequence.py | [Core] Support image processor (#4197) | 2024-06-02 22:56:41 -07:00 |
| utils.py | [Hardware] Initial TPU integration (#5292) | 2024-06-12 11:53:03 -07:00 |