| Name | Latest commit | Date |
| --- | --- | --- |
| attention | [Bugfix] Add device assertion to TorchSDPA (#5402) | 2024-06-12 12:58:53 -07:00 |
| core | [Bugfix] Fix typo in scheduler.py (requeset -> request) (#5470) | 2024-06-12 21:59:44 +00:00 |
| distributed | [Core][Distributed] add same-node detection (#5369) | 2024-06-11 10:53:59 -07:00 |
| engine | [Hardware] Initial TPU integration (#5292) | 2024-06-12 11:53:03 -07:00 |
| entrypoints | [Bugfix][Frontend] Cleanup "fix chat logprobs" (#5026) | 2024-06-10 22:36:46 -07:00 |
| executor | [Hardware] Initial TPU integration (#5292) | 2024-06-12 11:53:03 -07:00 |
| logging | [MISC] Rework logger to enable pythonic custom logging configuration to be provided (#4273) | 2024-05-01 17:34:40 -07:00 |
| lora | [Misc] Improve error message when LoRA parsing fails (#5194) | 2024-06-10 19:38:49 +08:00 |
| model_executor | [Frontend] [Core] Support for sharded tensorized models (#4990) | 2024-06-12 14:13:52 -07:00 |
| multimodal | [Bugfix] Fix LLaVA-NeXT (#5380) | 2024-06-10 15:38:47 +00:00 |
| spec_decode | [Misc] Various simplifications and typing fixes (#5368) | 2024-06-11 10:29:02 +08:00 |
| transformers_utils | [Frontend] Customizable RoPE theta (#5197) | 2024-06-11 10:42:26 -07:00 |
| usage | [Frontend] Separate OpenAI Batch Runner usage from API Server (#4851) | 2024-05-17 00:42:41 +09:00 |
| worker | [Frontend] [Core] Support for sharded tensorized models (#4990) | 2024-06-12 14:13:52 -07:00 |
| __init__.py | Bump version to v0.5.0 (#5384) | 2024-06-10 15:56:06 -07:00 |
| _custom_ops.py | [misc] add hint for AttributeError (#5462) | 2024-06-12 21:46:35 +00:00 |
| block.py | Add Automatic Prefix Caching (#2762) | 2024-03-02 00:50:01 -08:00 |
| config.py | [Hardware] Initial TPU integration (#5292) | 2024-06-12 11:53:03 -07:00 |
| envs.py | [Hardware] Initial TPU integration (#5292) | 2024-06-12 11:53:03 -07:00 |
| inputs.py | [Bugfix] TYPE_CHECKING for MultiModalData (#5444) | 2024-06-12 14:08:52 -07:00 |
| logger.py | [Misc] add logging level env var (#5045) | 2024-05-24 23:49:49 -07:00 |
| outputs.py | [Core] Consolidate prompt arguments to LLM engines (#4328) | 2024-05-28 13:29:31 -07:00 |
| pooling_params.py | [Model][Misc] Add e5-mistral-7b-instruct and Embedding API (#3734) | 2024-05-11 11:30:37 -07:00 |
| py.typed | Add py.typed so consumers of vLLM can get type checking (#1509) | 2023-10-30 14:50:47 -07:00 |
| sampling_params.py | [Core]: Option To Use Prompt Token Ids Inside Logits Processor (#4985) | 2024-05-23 22:04:24 +00:00 |
| sequence.py | [Core] Support image processor (#4197) | 2024-06-02 22:56:41 -07:00 |
| utils.py | [Hardware] Initial TPU integration (#5292) | 2024-06-12 11:53:03 -07:00 |