Isotr0py
|
c4e464333e
|
[Misc] Add uninitialized params tracking for AutoWeightsLoader (#10327)
Signed-off-by: Isotr0py <2037008807@qq.com>
|
2024-11-18 09:07:46 +08:00 |
|
wchen61
|
d1557e66d3
|
[Misc] Enhance offline_inference to support user-configurable paramet… (#10392)
Signed-off-by: wchen61 <wchen61@foxmail.com>
|
2024-11-17 11:32:40 +00:00 |
|
电脑星人
|
80d85c5d7b
|
[Bugfix] Fix mrope_position_delta in non-last prefill chunk (#10403)
Signed-off-by: imkero <kerorek@outlook.com>
|
2024-11-17 08:50:24 +00:00 |
|
Kunshang Ji
|
76aab90ab6
|
[Hardware] [HPU]add mark_step for hpu (#10239)
Signed-off-by: Kunshang Ji <kunshang.ji@intel.com>
|
2024-11-17 00:44:44 -08:00 |
|
youkaichao
|
8d74b5aee9
|
[platforms] refactor cpu code (#10402)
Signed-off-by: youkaichao <youkaichao@gmail.com>
|
2024-11-16 23:14:23 -08:00 |
|
Isotr0py
|
cf349c4a97
|
[Bugfix][CPU] Fix CPU embedding runner with tensor parallel (#10394)
Signed-off-by: Isotr0py <2037008807@qq.com>
|
2024-11-16 23:12:04 -08:00 |
|
Chendi.Xue
|
905d0f0af4
|
[CI/Build] Fix IDC hpu [Device not found] issue (#10384)
Signed-off-by: Chendi Xue <chendi.xue@intel.com>
|
2024-11-17 14:58:22 +08:00 |
|
Roger Wang
|
643ecf7b11
|
[V1] Refactor model executable interface for all text-only language models (#10374)
Signed-off-by: Roger Wang <ywang@roblox.com>
|
2024-11-17 05:18:46 +00:00 |
|
youkaichao
|
4fd9375028
|
[2/N][torch.compile] make compilation cfg part of vllm cfg (#10383)
Signed-off-by: youkaichao <youkaichao@gmail.com>
|
2024-11-16 18:02:14 -08:00 |
|
Woosuk Kwon
|
661a34fd4f
|
[V1] Add code owners for V1 (#10397)
Signed-off-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>
|
2024-11-16 10:45:26 -08:00 |
|
电脑星人
|
361c29e174
|
[Bugfix] Fix M-RoPE position calculation when chunked prefill is enabled (#10388)
Signed-off-by: imkero <kerorek@outlook.com>
|
2024-11-17 02:10:00 +08:00 |
|
Sky Lee
|
b98d89efd4
|
[Misc] Medusa supports custom bias (#10361)
|
2024-11-16 16:33:01 +00:00 |
|
Jaehyun An
|
8b6725b0cf
|
[Misc] Update benchmark to support image_url file or http (#10287)
Signed-off-by: rbbang <anjaehyun87@gmail.com>
|
2024-11-16 18:15:40 +08:00 |
|
rasmith
|
1d75472626
|
[BugFix] [Kernel] Fix GPU SEGV occurring in fused_moe kernel (#10385)
Signed-off-by: Randall Smith <Randall.Smith@amd.com>
|
2024-11-16 09:55:05 +00:00 |
|
youkaichao
|
2f427c2d16
|
[misc][plugin] improve log messages (#10386)
Signed-off-by: youkaichao <youkaichao@gmail.com>
|
2024-11-16 01:23:20 -08:00 |
|
youkaichao
|
755b85359b
|
[doc] add doc for the plugin system (#10372)
Signed-off-by: youkaichao <youkaichao@gmail.com>
|
2024-11-15 21:46:27 -08:00 |
|
Cyrus Leung
|
32e46e000f
|
[Frontend] Automatic detection of chat content format from AST (#9919)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2024-11-16 13:35:40 +08:00 |
|
Michael Green
|
4f168f69a3
|
[Docs] Misc updates to TPU installation instructions (#10165)
|
2024-11-15 13:26:17 -08:00 |
|
Russell Bryant
|
3e8d14d8a1
|
[Doc] Move PR template content to docs (#10159)
Signed-off-by: Russell Bryant <rbryant@redhat.com>
|
2024-11-15 13:20:20 -08:00 |
|
Russell Bryant
|
a067f85e08
|
[Frontend] Add --version flag to CLI (#10369)
Signed-off-by: Russell Bryant <rbryant@redhat.com>
|
2024-11-15 13:13:53 -08:00 |
|
Simon Mo
|
c76ac49d26
|
[Docs] Add Nebius as sponsors (#10371)
Signed-off-by: simon-mo <simon.mo@hey.com>
|
2024-11-15 12:47:40 -08:00 |
|
Simon Mo
|
a6221a144a
|
[Misc] bump mistral common version (#10367)
Signed-off-by: simon-mo <simon.mo@hey.com>
|
2024-11-15 09:48:07 -08:00 |
|
ElizaWszola
|
79ee45b428
|
[Misc] Bump up test_fused_moe tolerance (#10364)
Signed-off-by: ElizaWszola <eliza@neuralmagic.com>
|
2024-11-15 16:31:18 +00:00 |
|
Guillaume Calmettes
|
691a3ec047
|
[Bugfix] Ensure special tokens are properly filtered out for guided structured output with MistralTokenizer (#10363)
Signed-off-by: Guillaume Calmettes <gcalmettes@scaleway.com>
|
2024-11-15 14:50:40 +00:00 |
|
youkaichao
|
3a763ba0c3
|
[core][misc] keep compatibility for old-style classes (#10356)
Signed-off-by: youkaichao <youkaichao@gmail.com>
|
2024-11-15 13:55:51 +00:00 |
|
shangmingc
|
f2056f726d
|
[Misc] Fix some help info of arg_utils to improve readability (#10362)
|
2024-11-15 12:40:30 +00:00 |
|
Jee Jee Li
|
1d65ec7eeb
|
[Bugfix] Fix fully sharded LoRA bug (#10352)
Signed-off-by: Jee Jee Li <pandaleefree@gmail.com>
|
2024-11-15 10:34:58 +00:00 |
|
Xin Yang
|
26908554b2
|
[Doc] Remove float32 choice from --lora-dtype (#10348)
Signed-off-by: Xin Yang <xyang19@gmail.com>
|
2024-11-15 10:22:57 +00:00 |
|
Cyrus Leung
|
b311efd0bd
|
[Misc] Fix import error in tensorizer tests and cleanup some code (#10349)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2024-11-15 09:34:17 +00:00 |
|
wchen61
|
3d158cdc8d
|
Add default value to avoid Falcon crash (#5363) (#10347)
Signed-off-by: wchen61 <wchen61@foxmail.com>
|
2024-11-15 08:52:20 +00:00 |
|
Simon Mo
|
02dbf30e9a
|
[Build] skip renaming files for release wheels pipeline (#9671)
Signed-off-by: simon-mo <simon.mo@hey.com>
|
2024-11-14 23:31:52 -08:00 |
|
Cyrus Leung
|
2ac6d0e75b
|
[Misc] Consolidate pooler config overrides (#10351)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2024-11-15 06:59:00 +00:00 |
|
Sky Lee
|
2ec8827288
|
[Bugfix] Qwen-vl output is inconsistent in speculative decoding (#10350)
|
2024-11-15 05:40:10 +00:00 |
|
Cyrus Leung
|
b40cf6402e
|
[Model] Support Qwen2 embeddings and use tags to select model tests (#10184)
|
2024-11-14 20:23:09 -08:00 |
|
Tyler Michael Smith
|
2885ba0e24
|
[Misc] Change RedundantReshapesPass and FusionPass logging from info to debug (#10308)
Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com>
|
2024-11-15 02:44:26 +00:00 |
|
Luka Govedič
|
bf2ddc6610
|
[bugfix] Fix static asymmetric quantization case (#10334)
Signed-off-by: Daniël de Kok <me@danieldk.eu>
Signed-off-by: luka <luka@neuralmagic.com>
Co-authored-by: Daniël de Kok <me@danieldk.eu>
|
2024-11-15 09:35:11 +08:00 |
|
Cyrus Leung
|
972112d82f
|
[Bugfix] Fix unable to load some models (#10312)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2024-11-14 16:55:54 -08:00 |
|
Patrick von Platen
|
11cd1ae6ad
|
[Tool parsing] Improve / correct mistral tool parsing (#10333)
|
2024-11-15 00:42:49 +00:00 |
|
Zijin Xiao
|
554af9228d
|
[Bugfix] use AF_INET6 for OpenAI Compatible Server with ipv6 (#9583)
Signed-off-by: xiaozijin <xiaozijin@bytedance.com>
|
2024-11-14 16:38:53 -08:00 |
|
Murali Andoorveedu
|
b2e0ad3b59
|
[Perf] Reduce peak memory usage of llama (#10339)
Signed-off-by: andoorve <37849411+andoorve@users.noreply.github.com>
|
2024-11-15 00:38:20 +00:00 |
|
Maximilien de Bayser
|
4a18fd14ba
|
Support Roberta embedding models (#9387)
Signed-off-by: Max de Bayser <mbayser@br.ibm.com>
Signed-off-by: Flavia Beo <flavia.beo@ibm.com>
Co-authored-by: Flavia Beo <flavia.beo@ibm.com>
|
2024-11-14 21:23:29 +00:00 |
|
Woosuk Kwon
|
1dbae0329c
|
[Docs] Publish meetup slides (#10331)
Signed-off-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>
|
2024-11-14 16:19:38 +00:00 |
|
Cyrus Leung
|
675d603400
|
[CI/Build] Make shellcheck happy (#10285)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2024-11-14 09:47:53 +00:00 |
|
Isotr0py
|
03025c023f
|
[CI/Build] Fix CPU CI online inference timeout (#10314)
Signed-off-by: Isotr0py <2037008807@qq.com>
|
2024-11-14 16:45:32 +08:00 |
|
youkaichao
|
29f3ef26a3
|
[ci][distributed] disable hanging tests (#10317)
Signed-off-by: youkaichao <youkaichao@gmail.com>
|
2024-11-14 00:23:39 -08:00 |
|
B-201
|
294bf467ba
|
[Model] Add BNB quantization support for Idefics3 (#10310)
Signed-off-by: B-201 <Joy25810@foxmail.com>
Co-authored-by: Jee Jee Li <pandaleefree@gmail.com>
|
2024-11-14 06:31:44 +00:00 |
|
Guillaume Calmettes
|
52b48c1ead
|
[BugFix]: properly deserialize tool_calls iterator before processing by mistral-common when MistralTokenizer is used (#9951)
Signed-off-by: Guillaume Calmettes <gcalmettes@scaleway.com>
|
2024-11-14 04:48:16 +00:00 |
|
Mike Depinet
|
f67ce05d0b
|
[Frontend] Pythonic tool parser (#9859)
Signed-off-by: Mike Depinet <mike@fixie.ai>
|
2024-11-14 04:14:34 +00:00 |
|
Russell Bryant
|
e0853b6508
|
[Misc] format.sh: Simplify tool_version_check (#10305)
Signed-off-by: Russell Bryant <rbryant@redhat.com>
|
2024-11-14 11:12:35 +08:00 |
|
youkaichao
|
504ac53d18
|
[misc] error early for old-style class (#10304)
Signed-off-by: youkaichao <youkaichao@gmail.com>
|
2024-11-13 18:55:39 -08:00 |
|