vllm/vllm

Latest commit: [ Misc ] Rs/compressed tensors cleanup (#5432) by Robert Shaw (15985680e2), 2024-06-14 10:01:46 -07:00
Co-authored-by: mgoin <michael@neuralmagic.com>
Co-authored-by: Dipika Sikka <dipikasikka1@gmail.com>
| Name | Last commit | Date |
| --- | --- | --- |
| attention/ | Revert "[Core] Remove unnecessary copies in flash attn backend" (#5478) | 2024-06-13 11:22:50 -07:00 |
| core/ | [Bugfix] Fix typo in scheduler.py (requeset -> request) (#5470) | 2024-06-12 21:59:44 +00:00 |
| distributed/ | Add cuda_device_count_stateless (#5473) | 2024-06-13 16:06:49 -07:00 |
| engine/ | [Misc] Add vLLM version getter to utils (#5098) | 2024-06-13 11:21:39 -07:00 |
| entrypoints/ | [Misc] Add vLLM version getter to utils (#5098) | 2024-06-13 11:21:39 -07:00 |
| executor/ | Add cuda_device_count_stateless (#5473) | 2024-06-13 16:06:49 -07:00 |
| logging/ | [MISC] Rework logger to enable pythonic custom logging configuration to be provided (#4273) | 2024-05-01 17:34:40 -07:00 |
| lora/ | [Misc] Improve error message when LoRA parsing fails (#5194) | 2024-06-10 19:38:49 +08:00 |
| model_executor/ | [ Misc ] Rs/compressed tensors cleanup (#5432) | 2024-06-14 10:01:46 -07:00 |
| multimodal/ | [Bugfix] Fix LLaVA-NeXT (#5380) | 2024-06-10 15:38:47 +00:00 |
| spec_decode/ | [Misc] Various simplifications and typing fixes (#5368) | 2024-06-11 10:29:02 +08:00 |
| transformers_utils/ | [Frontend] Customizable RoPE theta (#5197) | 2024-06-11 10:42:26 -07:00 |
| usage/ | [Misc] Add vLLM version getter to utils (#5098) | 2024-06-13 11:21:39 -07:00 |
| worker/ | [Core][Distributed] code deduplication in tp&pp with coordinator (#5293) | 2024-06-12 17:27:08 -07:00 |
| __init__.py | [Misc] Add vLLM version getter to utils (#5098) | 2024-06-13 11:21:39 -07:00 |
| _custom_ops.py | [Kernel] Factor out epilogues from cutlass kernels (#5391) | 2024-06-13 11:22:19 -07:00 |
| block.py | Add Automatic Prefix Caching (#2762) | 2024-03-02 00:50:01 -08:00 |
| config.py | Add cuda_device_count_stateless (#5473) | 2024-06-13 16:06:49 -07:00 |
| envs.py | [Hardware] Initial TPU integration (#5292) | 2024-06-12 11:53:03 -07:00 |
| inputs.py | [Bugfix] TYPE_CHECKING for MultiModalData (#5444) | 2024-06-12 14:08:52 -07:00 |
| logger.py | [Misc] add logging level env var (#5045) | 2024-05-24 23:49:49 -07:00 |
| outputs.py | [Core] Consolidate prompt arguments to LLM engines (#4328) | 2024-05-28 13:29:31 -07:00 |
| pooling_params.py | [Model][Misc] Add e5-mistral-7b-instruct and Embedding API (#3734) | 2024-05-11 11:30:37 -07:00 |
| py.typed | Add py.typed so consumers of vLLM can get type checking (#1509) | 2023-10-30 14:50:47 -07:00 |
| sampling_params.py | [Core]: Option To Use Prompt Token Ids Inside Logits Processor (#4985) | 2024-05-23 22:04:24 +00:00 |
| sequence.py | [Core] Support image processor (#4197) | 2024-06-02 22:56:41 -07:00 |
| utils.py | Add cuda_device_count_stateless (#5473) | 2024-06-13 16:06:49 -07:00 |
| version.py | bump version to v0.5.0.post1 (#5522) | 2024-06-13 19:42:06 -07:00 |