Woosuk Kwon
|
f9dadfbee3
|
[V1] Fix detokenizer ports (#10224)
Signed-off-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>
|
2024-11-11 10:42:07 -08:00 |
|
dependabot[bot]
|
25144ceed0
|
Bump actions/setup-python from 5.2.0 to 5.3.0 (#10209)
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
|
2024-11-11 17:24:10 +00:00 |
|
youkaichao
|
e6de9784d2
|
[core][distributed] add stateless process group (#10216)
Signed-off-by: youkaichao <youkaichao@gmail.com>
|
2024-11-11 09:02:14 -08:00 |
|
Yangcheng Li
|
36fc439de0
|
[Doc] fix doc string typo in block_manager swap_out function (#10212)
|
2024-11-11 08:53:07 -08:00 |
|
harrywu
|
874f551b36
|
[Metrics] add more metrics (#4464)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
Co-authored-by: Robert Shaw <rshaw@neuralmagic.com>
Co-authored-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2024-11-12 00:17:38 +08:00 |
|
Isotr0py
|
2cebda42bb
|
[Bugfix][Hardware][CPU] Fix broken encoder-decoder CPU runner (#10218)
Signed-off-by: Isotr0py <2037008807@qq.com>
|
2024-11-11 12:37:58 +00:00 |
|
Roger Wang
|
5fb1f935b0
|
[V1] Allow tokenizer_mode and trust_remote_code for Detokenizer (#10211)
Signed-off-by: Roger Wang <ywang@roblox.com>
|
2024-11-11 18:01:18 +08:00 |
|
Jee Jee Li
|
36e4acd02a
|
[LoRA][Kernel] Remove the unused libentry module (#10214)
Signed-off-by: Jee Jee Li <pandaleefree@gmail.com>
|
2024-11-11 09:43:23 +00:00 |
|
Isotr0py
|
58170d6503
|
[Hardware][CPU] Add embedding models support for CPU backend (#10193)
Signed-off-by: Isotr0py <2037008807@qq.com>
|
2024-11-11 08:54:28 +00:00 |
|
dependabot[bot]
|
9804ac7c7c
|
Bump the patch-update group with 5 updates (#10210)
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
|
2024-11-11 07:22:40 +00:00 |
|
youkaichao
|
f89d18ff74
|
[6/N] pass whole config to inner model (#10205)
Signed-off-by: youkaichao <youkaichao@gmail.com>
|
2024-11-11 06:41:46 +00:00 |
|
youkaichao
|
f0f2e5638e
|
[doc] improve debugging code (#10206)
Signed-off-by: youkaichao <youkaichao@gmail.com>
|
2024-11-10 17:49:40 -08:00 |
|
yansh97
|
ad9a78bf64
|
[Doc] Fix typo error in vllm/entrypoints/openai/cli_args.py (#10196)
|
2024-11-11 00:14:22 +00:00 |
|
youkaichao
|
73b9083e99
|
[misc] improve cloudpickle registration and tests (#10202)
Signed-off-by: youkaichao <youkaichao@gmail.com>
|
2024-11-11 00:10:53 +00:00 |
|
Shawn Du
|
20cf2f553c
|
[Misc] small fixes to function tracing file path (#9543)
Signed-off-by: Shawn Du <shawnd200@outlook.com>
Signed-off-by: youkaichao <youkaichao@gmail.com>
Co-authored-by: youkaichao <youkaichao@gmail.com>
|
2024-11-10 15:21:06 -08:00 |
|
Yongzao
|
bfb7d61a7c
|
[doc] Polish the integration with huggingface doc (#10195)
Signed-off-by: youkaichao <youkaichao@gmail.com>
Co-authored-by: youkaichao <youkaichao@gmail.com>
|
2024-11-10 10:22:04 -08:00 |
|
FuryMartin
|
19682023b6
|
[Doc] Fix typo error in CONTRIBUTING.md (#10190)
Signed-off-by: FuryMartin <furymartin9910@outlook.com>
|
2024-11-10 07:47:24 +00:00 |
|
youkaichao
|
9fa4bdde9d
|
[ci][build] limit cmake version (#10188)
Signed-off-by: youkaichao <youkaichao@gmail.com>
|
2024-11-09 16:27:26 -08:00 |
|
Cyrus Leung
|
51c2e1fcef
|
[CI/Build] Split up models tests (#10069)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2024-11-09 11:39:14 -08:00 |
|
Krishna Mandal
|
b09895a618
|
[Frontend][Core] Override HF config.json via CLI (#5836)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
Co-authored-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2024-11-09 16:19:27 +00:00 |
|
cjackal
|
d88bff1b96
|
[Frontend] add add_request_id middleware (#9594)
Signed-off-by: cjackal <44624812+cjackal@users.noreply.github.com>
|
2024-11-09 10:18:29 +00:00 |
|
Zhao Yingzhuo
|
9e37266420
|
bugfix: fix the bug that stream generate not work (#2756)
|
2024-11-09 10:09:48 +00:00 |
|
youkaichao
|
8a4358ecb5
|
[doc] explaining the integration with huggingface (#10173)
Signed-off-by: youkaichao <youkaichao@gmail.com>
|
2024-11-09 01:02:54 -08:00 |
|
youkaichao
|
bd46357ad9
|
[bugfix] fix broken tests of mlp speculator (#10177)
Signed-off-by: youkaichao <youkaichao@gmail.com>
|
2024-11-09 00:04:50 -08:00 |
|
bnellnm
|
f192aeba74
|
[Bugfix] Enable some fp8 and quantized fullgraph tests (#10171)
Signed-off-by: Bill Nell <bill@neuralmagic.com>
|
2024-11-09 08:01:27 +00:00 |
|
Chendi.Xue
|
8e1529dc57
|
[CI/Build] Add run-hpu-test.sh script (#10167)
Signed-off-by: Chendi.Xue <chendi.xue@intel.com>
|
2024-11-09 06:26:52 +00:00 |
|
youkaichao
|
1a95f10ee7
|
[5/N] pass the whole config to model (#9983)
Signed-off-by: youkaichao <youkaichao@gmail.com>
|
2024-11-09 14:17:28 +08:00 |
|
Cyrus Leung
|
49d2a41a86
|
[Doc] Adjust RunLLM location (#10176)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2024-11-08 20:07:10 -08:00 |
|
Isotr0py
|
47672f38b5
|
[CI/Build] Fix VLM broadcast tests tensor_parallel_size passing (#10161)
Signed-off-by: Isotr0py <2037008807@qq.com>
|
2024-11-09 04:02:59 +00:00 |
|
Michael Goin
|
f83feccd7f
|
[Bugfix] Ignore GPTQ quantization of Qwen2-VL visual module (#10169)
Signed-off-by: mgoin <michael@neuralmagic.com>
|
2024-11-09 03:36:46 +00:00 |
|
Cyrus Leung
|
e0191a95d8
|
[0/N] Rename MultiModalInputs to MultiModalKwargs (#10040)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2024-11-09 11:31:02 +08:00 |
|
Li, Jiang
|
d7edca1dee
|
[CI/Build] Adding timeout in CPU CI to avoid CPU test queue blocking (#6892)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
Co-authored-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2024-11-09 03:27:11 +00:00 |
|
rasmith
|
127c07480e
|
[Kernel][Triton] Add Triton implementation for scaled_mm_triton to support fp8 and int8 SmoothQuant, symmetric case (#9857)
Signed-off-by: Randall Smith <Randall.Smith@amd.com>
|
2024-11-08 19:59:22 -05:00 |
|
bnellnm
|
10b67d865d
|
[Bugfix] SymIntArrayRef expected to contain concrete integers (#10170)
Signed-off-by: Bill Nell <bill@neuralmagic.com>
|
2024-11-08 14:44:18 -08:00 |
|
Luka Govedič
|
4f93dfe952
|
[torch.compile] Fuse RMSNorm with quant (#9138)
Signed-off-by: luka <luka@neuralmagic.com>
Co-authored-by: youkaichao <youkaichao@126.com>
|
2024-11-08 21:20:08 +00:00 |
|
Florian Zimmermeister
|
e1b5a82179
|
Rename vllm.logging to vllm.logging_utils (#10134)
|
2024-11-08 20:53:24 +00:00 |
|
Luka Govedič
|
87713c6053
|
[CI/Build] Ignore .gitignored files for shellcheck (#10162)
Signed-off-by: luka <luka@neuralmagic.com>
|
2024-11-08 19:53:36 +00:00 |
|
Woosuk Kwon
|
b5815c8413
|
[V1] Fix non-cudagraph op name (#10166)
Signed-off-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>
|
2024-11-08 10:23:04 -08:00 |
|
Rafael Vasquez
|
6b30471586
|
[Misc] Improve Web UI (#10090)
Signed-off-by: Rafael Vasquez <rafvasq21@gmail.com>
|
2024-11-08 09:51:04 -08:00 |
|
sroy745
|
f6778620a9
|
Disable spec-decode + chunked-prefill for draft models with tensor parallelism > 1 (#10136)
Signed-off-by: Sourashis Roy <sroy@roblox.com>
|
2024-11-08 15:56:18 +00:00 |
|
Patrick von Platen
|
0535e5fe6c
|
Fix edge case Mistral tokenizer (#10152)
|
2024-11-08 15:42:27 +00:00 |
|
Cyrus Leung
|
b489fc3c91
|
[CI/Build] Update CPU tests to include all "standard" tests (#5481)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2024-11-08 23:30:04 +08:00 |
|
Roger Wang
|
208ce622c7
|
[V1]Enable APC by default only for text models (#10148)
Signed-off-by: Roger Wang <ywang@roblox.com>
|
2024-11-08 14:39:41 +00:00 |
|
Isotr0py
|
1ff4aed5bd
|
[Model] Expose size to Idefics3 as mm_processor_kwargs (#10146)
Signed-off-by: Isotr0py <2037008807@qq.com>
|
2024-11-08 09:56:58 +00:00 |
|
Yan Ma
|
f10797c0ce
|
[Bugfix][XPU] Fix xpu tp by introducing XpuCommunicator (#10144)
Signed-off-by: yan ma <yan.ma@intel.com>
|
2024-11-08 09:41:03 +00:00 |
|
Cyrus Leung
|
f4c2187e29
|
[Misc] Fix typo in #5895 (#10145)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2024-11-08 09:07:01 +00:00 |
|
Michael Goin
|
aea6ad629f
|
Add hf_transfer to testing image (#10096)
|
2024-11-08 08:35:25 +00:00 |
|
Tao He
|
da07a9ead7
|
Fixes a typo about 'max_decode_seq_len' which causes crashes with cuda graph. (#9285)
Signed-off-by: Tao He <linzhu.ht@alibaba-inc.com>
|
2024-11-08 05:31:28 +00:00 |
|
Russell Bryant
|
3a7f15a398
|
[Doc] Move CONTRIBUTING to docs site (#9924)
Signed-off-by: Russell Bryant <rbryant@redhat.com>
|
2024-11-08 05:15:12 +00:00 |
|
Mengqing Cao
|
7371749d54
|
[Misc] Fix ImportError causing by triton (#9493)
|
2024-11-08 05:08:51 +00:00 |
|