Cyrus Leung
|
49d2a41a86
|
[Doc] Adjust RunLLM location (#10176)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2024-11-08 20:07:10 -08:00 |
|
Isotr0py
|
47672f38b5
|
[CI/Build] Fix VLM broadcast tests tensor_parallel_size passing (#10161)
Signed-off-by: Isotr0py <2037008807@qq.com>
|
2024-11-09 04:02:59 +00:00 |
|
Michael Goin
|
f83feccd7f
|
[Bugfix] Ignore GPTQ quantization of Qwen2-VL visual module (#10169)
Signed-off-by: mgoin <michael@neuralmagic.com>
|
2024-11-09 03:36:46 +00:00 |
|
Cyrus Leung
|
e0191a95d8
|
[0/N] Rename MultiModalInputs to MultiModalKwargs (#10040)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2024-11-09 11:31:02 +08:00 |
|
Li, Jiang
|
d7edca1dee
|
[CI/Build] Adding timeout in CPU CI to avoid CPU test queue blocking (#6892)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
Co-authored-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2024-11-09 03:27:11 +00:00 |
|
rasmith
|
127c07480e
|
[Kernel][Triton] Add Triton implementation for scaled_mm_triton to support fp8 and int8 SmoothQuant, symmetric case (#9857)
Signed-off-by: Randall Smith <Randall.Smith@amd.com>
|
2024-11-08 19:59:22 -05:00 |
|
bnellnm
|
10b67d865d
|
[Bugfix] SymIntArrayRef expected to contain concrete integers (#10170)
Signed-off-by: Bill Nell <bill@neuralmagic.com>
|
2024-11-08 14:44:18 -08:00 |
|
Luka Govedič
|
4f93dfe952
|
[torch.compile] Fuse RMSNorm with quant (#9138)
Signed-off-by: luka <luka@neuralmagic.com>
Co-authored-by: youkaichao <youkaichao@126.com>
|
2024-11-08 21:20:08 +00:00 |
|
Florian Zimmermeister
|
e1b5a82179
|
Rename vllm.logging to vllm.logging_utils (#10134)
|
2024-11-08 20:53:24 +00:00 |
|
Luka Govedič
|
87713c6053
|
[CI/Build] Ignore .gitignored files for shellcheck (#10162)
Signed-off-by: luka <luka@neuralmagic.com>
|
2024-11-08 19:53:36 +00:00 |
|
Woosuk Kwon
|
b5815c8413
|
[V1] Fix non-cudagraph op name (#10166)
Signed-off-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>
|
2024-11-08 10:23:04 -08:00 |
|
Rafael Vasquez
|
6b30471586
|
[Misc] Improve Web UI (#10090)
Signed-off-by: Rafael Vasquez <rafvasq21@gmail.com>
|
2024-11-08 09:51:04 -08:00 |
|
sroy745
|
f6778620a9
|
Disable spec-decode + chunked-prefill for draft models with tensor parallelism > 1 (#10136)
Signed-off-by: Sourashis Roy <sroy@roblox.com>
|
2024-11-08 15:56:18 +00:00 |
|
Patrick von Platen
|
0535e5fe6c
|
Fix edge case Mistral tokenizer (#10152)
|
2024-11-08 15:42:27 +00:00 |
|
Cyrus Leung
|
b489fc3c91
|
[CI/Build] Update CPU tests to include all "standard" tests (#5481)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2024-11-08 23:30:04 +08:00 |
|
Roger Wang
|
208ce622c7
|
[V1] Enable APC by default only for text models (#10148)
Signed-off-by: Roger Wang <ywang@roblox.com>
|
2024-11-08 14:39:41 +00:00 |
|
Isotr0py
|
1ff4aed5bd
|
[Model] Expose size to Idefics3 as mm_processor_kwargs (#10146)
Signed-off-by: Isotr0py <2037008807@qq.com>
|
2024-11-08 09:56:58 +00:00 |
|
Yan Ma
|
f10797c0ce
|
[Bugfix][XPU] Fix xpu tp by introducing XpuCommunicator (#10144)
Signed-off-by: yan ma <yan.ma@intel.com>
|
2024-11-08 09:41:03 +00:00 |
|
Cyrus Leung
|
f4c2187e29
|
[Misc] Fix typo in #5895 (#10145)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2024-11-08 09:07:01 +00:00 |
|
Michael Goin
|
aea6ad629f
|
Add hf_transfer to testing image (#10096)
|
2024-11-08 08:35:25 +00:00 |
|
Tao He
|
da07a9ead7
|
Fixes a typo about 'max_decode_seq_len' which causes crashes with cuda graph. (#9285)
Signed-off-by: Tao He <linzhu.ht@alibaba-inc.com>
|
2024-11-08 05:31:28 +00:00 |
|
Russell Bryant
|
3a7f15a398
|
[Doc] Move CONTRIBUTING to docs site (#9924)
Signed-off-by: Russell Bryant <rbryant@redhat.com>
|
2024-11-08 05:15:12 +00:00 |
|
Mengqing Cao
|
7371749d54
|
[Misc] Fix ImportError caused by triton (#9493)
|
2024-11-08 05:08:51 +00:00 |
|
DearPlanet
|
ad39bd640c
|
[Bugfix] Add error handling when server cannot respond any valid tokens (#5895)
|
2024-11-08 04:58:37 +00:00 |
|
whyiug
|
40d0e7411d
|
[Doc] Update FAQ links in spec_decode.rst (#9662)
Signed-off-by: whyiug <whyiug@hotmail.com>
|
2024-11-08 04:44:58 +00:00 |
|
Russell Bryant
|
6bb52b0f97
|
[CI/Build] Give PR cleanup job PR write access (#10139)
Signed-off-by: Russell Bryant <rbryant@redhat.com>
|
2024-11-08 12:10:20 +08:00 |
|
Cody Yu
|
201fc07730
|
[V1] Prefix caching (take 2) (#9972)
Signed-off-by: Cody Yu <hao.yu.cody@gmail.com>
|
2024-11-07 17:34:44 -08:00 |
|
Woosuk Kwon
|
42b4f46b71
|
[V1] Add all_token_ids attribute to Request (#10135)
Signed-off-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>
|
2024-11-07 17:08:24 -08:00 |
|
Jiangtao Hu
|
073a472728
|
[Misc] report relevant env vars in collect_env.py tool (#9293)
|
2024-11-07 16:14:01 -08:00 |
|
dependabot[bot]
|
93bff421bc
|
Bump actions/checkout from 4.2.1 to 4.2.2 (#9746)
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
|
2024-11-07 21:44:58 +00:00 |
|
litianjian
|
28b2877d30
|
Online video support for VLMs (#10020)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
Co-authored-by: litianjian <litianjian@bytedance.com>
Co-authored-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2024-11-07 20:25:59 +00:00 |
|
dependabot[bot]
|
97b8475beb
|
Bump actions/setup-python from 5.2.0 to 5.3.0 (#9745)
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
|
2024-11-07 18:55:35 +00:00 |
|
Russell Bryant
|
a2f1f3b089
|
[CI/Build] Automate PR body text cleanup (#10082)
Signed-off-by: Russell Bryant <rbryant@redhat.com>
|
2024-11-07 18:26:28 +00:00 |
|
Russell Bryant
|
3be5b26a76
|
[CI/Build] Add shell script linting using shellcheck (#7925)
Signed-off-by: Russell Bryant <rbryant@redhat.com>
|
2024-11-07 18:17:29 +00:00 |
|
Russell Bryant
|
de0e61a323
|
[CI/Build] Always run mypy (#10122)
Signed-off-by: Russell Bryant <rbryant@redhat.com>
|
2024-11-07 16:43:16 +00:00 |
|
Nicolò Lucchesi
|
9d43afcc53
|
[Feature] [Spec decode]: Combine chunked prefill with speculative decoding (#9291)
Signed-off-by: NickLucche <nlucches@redhat.com>
|
2024-11-07 08:15:14 -08:00 |
|
Maximilien de Bayser
|
ae62fd17c0
|
[Frontend] Tool calling parser for Granite 3.0 models (#9027)
Signed-off-by: Max de Bayser <mbayser@br.ibm.com>
|
2024-11-07 07:09:02 -08:00 |
|
Atlas
|
a62bc0109c
|
[Misc] Add Gamma-Distribution Request Generation Support for Serving Benchmark. (#10105)
Signed-off-by: Mozhou <spli161006@gmail.com>
Co-authored-by: Roger Wang <136131678+ywang96@users.noreply.github.com>
|
2024-11-07 11:20:30 +00:00 |
|
Jiahao Li
|
999df95b4e
|
[Bugfix] Make image processor respect mm_processor_kwargs for Qwen2-VL (#10112)
Signed-off-by: Jiahao Li <liplus17@163.com>
|
2024-11-07 10:50:44 +00:00 |
|
Li, Jiang
|
a6f332d0d9
|
[Hardware][CPU][bugfix] Fix half dtype support on AVX2-only target (#10108)
Signed-off-by: jiang1.li <jiang1.li@intel.com>
|
2024-11-07 18:42:50 +08:00 |
|
Lei Yang
|
0dfba97b42
|
[Frontend] Fix multiple values for keyword argument error (#10075) (#10076)
Signed-off-by: Lei <ylxx@live.com>
|
2024-11-07 09:07:19 +00:00 |
|
Flávia Béo
|
aa9078fa03
|
Adds method to read the pooling types from model's files (#9506)
Signed-off-by: Flavia Beo <flavia.beo@ibm.com>
Signed-off-by: Max de Bayser <mbayser@br.ibm.com>
Co-authored-by: Max de Bayser <mbayser@br.ibm.com>
|
2024-11-07 08:42:40 +00:00 |
|
Russell Bryant
|
e036e527a0
|
[CI/Build] Improve mypy + python version matrix (#10041)
Signed-off-by: Russell Bryant <rbryant@redhat.com>
|
2024-11-07 07:54:16 +00:00 |
|
Hanzhi Zhou
|
6192e9b8fe
|
[Core][Distributed] Refactor ipc buffer init in CustomAllreduce (#10030)
Signed-off-by: Hanzhi Zhou <hanzhi713@gmail.com>
|
2024-11-06 23:50:47 -08:00 |
|
Rafael Vasquez
|
d7263a1bb8
|
Doc: Improve benchmark documentation (#9927)
Signed-off-by: Rafael Vasquez <rafvasq21@gmail.com>
|
2024-11-06 23:50:35 -08:00 |
|
Russell Bryant
|
104d729656
|
[CI/Build] re-add codespell to CI (#10083)
Signed-off-by: Russell Bryant <rbryant@redhat.com>
|
2024-11-06 22:54:46 -08:00 |
|
Cyrus Leung
|
db7db4aab9
|
[Misc] Consolidate ModelConfig code related to HF config (#10104)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2024-11-07 06:00:21 +00:00 |
|
Nick Hill
|
1fa020c539
|
[V1][BugFix] Fix Generator construction in greedy + seed case (#10097)
Signed-off-by: Nick Hill <nhill@redhat.com>
|
2024-11-07 05:06:57 +00:00 |
|
youkaichao
|
e7b84c394d
|
[doc] add back Python 3.8 ABI (#10100)
Signed-off-by: youkaichao <youkaichao@gmail.com>
|
2024-11-06 21:06:41 -08:00 |
|
Li, Jiang
|
a4b3e0c1e9
|
[Hardware][CPU] Update torch 2.5 (#9911)
Signed-off-by: jiang1.li <jiang1.li@intel.com>
|
2024-11-07 04:43:08 +00:00 |
|