Maximilien de Bayser
|
a324d3a1a7
|
Change granite chat template to keep json list formatting for tool calls (#10452)
Signed-off-by: Max de Bayser <maxdebayser@gmail.com>
|
2024-11-19 18:16:54 -07:00 |
|
ElizaWszola
|
b00b33d77e
|
[Model][Quantization] HQQ support through Marlin kernel expansion (#9766)
Signed-off-by: ElizaWszola <eliza@neuralmagic.com>
|
2024-11-19 13:31:12 -08:00 |
|
Russell Bryant
|
efa9084628
|
[Core] Avoid metrics log noise when idle (#8868)
Signed-off-by: Russell Bryant <rbryant@redhat.com>
|
2024-11-19 21:05:25 +00:00 |
|
youkaichao
|
803f37eaaa
|
[6/N] torch.compile rollout to users (#10437)
Signed-off-by: youkaichao <youkaichao@gmail.com>
|
2024-11-19 10:09:03 -08:00 |
|
Russell Bryant
|
fd9f124971
|
[Doc] fix link for page that was renamed (#10455)
Signed-off-by: Russell Bryant <rbryant@redhat.com>
|
2024-11-19 09:48:30 -08:00 |
|
Manjul Mohan
|
1ea291a417
|
Fix: Build error seen on Power Architecture (#10421)
Signed-off-by: Manjul Mohan <manjul.mohan@ibm.com>
Signed-off-by: B-201 <Joy25810@foxmail.com>
Signed-off-by: Isotr0py <2037008807@qq.com>
Signed-off-by: youkaichao <youkaichao@gmail.com>
Signed-off-by: ismael-dm <ismaeldm99@gmail.com>
Signed-off-by: Andrew Nesbitt <andrewnez@gmail.com>
Signed-off-by: mgoin <michael@neuralmagic.com>
Signed-off-by: yan ma <yan.ma@intel.com>
Signed-off-by: Angus Wang <wangjadehao@gmail.com>
Signed-off-by: Lucas Wilkinson <lwilkinson@neuralmagic.com>
Signed-off-by: rickyx <rickyx@anyscale.com>
Signed-off-by: Jee Jee Li <pandaleefree@gmail.com>
Signed-off-by: Mengqing Cao <cmq0113@163.com>
Signed-off-by: Travis Johnson <tsjohnso@us.ibm.com>
Co-authored-by: Manjul Mohan manjul.mohan@ibm.com <manjulmohan@ltcd97-lp2.aus.stglabs.ibm.com>
Co-authored-by: B-201 <Joy25810@foxmail.com>
Co-authored-by: Isotr0py <2037008807@qq.com>
Co-authored-by: youkaichao <youkaichao@gmail.com>
Co-authored-by: ismael-dm <ismaeldm99@gmail.com>
Co-authored-by: Andrew Nesbitt <andrewnez@gmail.com>
Co-authored-by: Michael Goin <michael@neuralmagic.com>
Co-authored-by: Yan Ma <yan.ma@intel.com>
Co-authored-by: Angus Wang <wangjadehao@gmail.com>
Co-authored-by: Lucas Wilkinson <LucasWilkinson@users.noreply.github.com>
Co-authored-by: Ricky Xu <rickyx@anyscale.com>
Co-authored-by: Kevin H. Luu <kevin@anyscale.com>
Co-authored-by: Jee Jee Li <pandaleefree@gmail.com>
Co-authored-by: Mengqing Cao <cmq0113@163.com>
Co-authored-by: Travis Johnson <tsjohnso@us.ibm.com>
Co-authored-by: Russell Bryant <rbryant@redhat.com>
|
2024-11-19 09:34:57 -08:00 |
|
Patrick von Platen
|
11fd7ea639
|
[Pixtral-Large] Pixtral actually has no bias in vision-lang adapter (#10449)
|
2024-11-19 17:33:06 +00:00 |
|
COSMOPlat
|
f028dff33d
|
[BugFix] Fix hermes tool parser output error stream arguments in some cases (#10395) (#10398)
Signed-off-by: xiyuan lee <lixiyuan@haier.com>
|
2024-11-19 13:42:50 +00:00 |
|
Yuan
|
b4614656b8
|
[CI][CPU] adding numa node number as container name suffix (#10441)
Signed-off-by: Yuan Zhou <yuan.zhou@intel.com>
|
2024-11-19 13:16:43 +00:00 |
|
youkaichao
|
25f9c78961
|
[misc][plugin] improve plugin loading (#10443)
Signed-off-by: youkaichao <youkaichao@gmail.com>
|
2024-11-19 10:43:21 +00:00 |
|
Russell Bryant
|
5390d6664f
|
[Doc] Add the start of an arch overview page (#10368)
|
2024-11-19 09:52:11 +00:00 |
|
Jee Jee Li
|
382b6a4852
|
[Misc] Avoid misleading warning messages (#10438)
Signed-off-by: Jee Jee Li <pandaleefree@gmail.com>
|
2024-11-19 08:54:58 +00:00 |
|
Travis Johnson
|
272e31c0bd
|
[Bugfix] Guard for negative counter metrics to prevent crash (#10430)
Signed-off-by: Travis Johnson <tsjohnso@us.ibm.com>
|
2024-11-19 04:57:10 +00:00 |
|
Michael Goin
|
74f8c2cf5f
|
Add openai.beta.chat.completions.parse example to structured_outputs.rst (#10433)
|
2024-11-19 04:37:46 +00:00 |
|
Mengqing Cao
|
8c1fb50705
|
[Platform][Refactor] Extract func get_default_attn_backend to Platform (#10358)
Signed-off-by: Mengqing Cao <cmq0113@163.com>
|
2024-11-19 11:22:26 +08:00 |
|
Jee Jee Li
|
7eb719df13
|
[Bugfix]Fix Phi-3 BNB online quantization (#10417)
Signed-off-by: Jee Jee Li <pandaleefree@gmail.com>
|
2024-11-19 03:21:42 +00:00 |
|
Kevin H. Luu
|
284203f171
|
[ci/build] Have dependabot ignore all patch update (#10436)
We have too many dependencies and all patch updates can be a little noisy. This is to have dependabot ignore all patch version updates.
|
2024-11-19 01:04:25 +00:00 |
|
Ricky Xu
|
90a6c759ca
|
[misc] partial prefix & random input generation benchmark (#9929)
Signed-off-by: rickyx <rickyx@anyscale.com>
|
2024-11-18 15:39:14 -08:00 |
|
youkaichao
|
2298e69b5f
|
[ci][bugfix] fix kernel tests (#10431)
Signed-off-by: youkaichao <youkaichao@gmail.com>
|
2024-11-18 15:29:37 -08:00 |
|
youkaichao
|
a03ea40792
|
[3/N][torch.compile] consolidate custom op logging (#10399)
Signed-off-by: youkaichao <youkaichao@gmail.com>
|
2024-11-18 15:14:59 -08:00 |
|
Lucas Wilkinson
|
96d999fbe8
|
[Kernel] Initial Machete W4A8 support + Refactors (#9855)
Signed-off-by: Lucas Wilkinson <lwilkinson@neuralmagic.com>
|
2024-11-18 12:59:29 -07:00 |
|
Angus Wang
|
c2170a5b39
|
[Kernel] Explicitly specify other value in tl.load calls (#9014)
Signed-off-by: Angus Wang <wangjadehao@gmail.com>
|
2024-11-18 11:39:40 -08:00 |
|
Yan Ma
|
6b2d25efc7
|
[Hardware][XPU] AWQ/GPTQ support for xpu backend (#10107)
Signed-off-by: yan ma <yan.ma@intel.com>
|
2024-11-18 11:18:05 -07:00 |
|
Michael Goin
|
281cc4b3cd
|
[Model][Bugfix] Support TP for PixtralHF ViT (#10405)
Signed-off-by: mgoin <michael@neuralmagic.com>
|
2024-11-18 10:04:14 -08:00 |
|
Andrew Nesbitt
|
4f686d139f
|
Fix open_collective value in FUNDING.yml (#10426)
Signed-off-by: Andrew Nesbitt <andrewnez@gmail.com>
|
2024-11-18 09:52:42 -08:00 |
|
ismael-dm
|
31894a2155
|
[Doc] Add documentation for Structured Outputs (#9943)
Signed-off-by: ismael-dm <ismaeldm99@gmail.com>
|
2024-11-18 09:52:12 -08:00 |
|
youkaichao
|
7851b45196
|
[5/N][torch.compile] torch.jit.script --> torch.compile (#10406)
Signed-off-by: youkaichao <youkaichao@gmail.com>
|
2024-11-18 23:20:06 +08:00 |
|
B-201
|
4186be8111
|
[Doc] Update doc for LoRA support in GLM-4V (#10425)
Signed-off-by: B-201 <Joy25810@foxmail.com>
|
2024-11-18 15:08:30 +00:00 |
|
Isotr0py
|
e7ebb662d7
|
[Model] Remove transformers attention porting in VITs (#10414)
Signed-off-by: Isotr0py <2037008807@qq.com>
|
2024-11-18 21:45:21 +08:00 |
|
B-201
|
5be4e52b65
|
[Model][LoRA]LoRA support added for glm-4v (#10418)
Signed-off-by: B-201 <Joy25810@foxmail.com>
|
2024-11-18 12:57:10 +00:00 |
|
Maybewuss
|
01aae1cc68
|
[Model] Remove redundant softmax when using PoolingType.STEP (#10415)
|
2024-11-18 10:05:36 +00:00 |
|
lkchen
|
c7dec926f6
|
[VLM] Report multi_modal_placeholders in output (#10407)
Signed-off-by: Linkun Chen <lkchen+anyscale@github.com>
|
2024-11-18 16:06:16 +08:00 |
|
youkaichao
|
51bb12d17b
|
[4/N][torch.compile] clean up set_torch_compile_backend (#10401)
Signed-off-by: youkaichao <youkaichao@gmail.com>
|
2024-11-17 23:57:20 -08:00 |
|
ℍ𝕠𝕝𝕝𝕠𝕨 𝕄𝕒𝕟
|
47826cacf0
|
[Bugfix] Ignore ray reinit error when current platform is ROCm or XPU (#10375)
Signed-off-by: Hollow Man <hollowman@opensuse.org>
|
2024-11-18 11:29:26 +08:00 |
|
Isotr0py
|
c4e464333e
|
[Misc] Add uninitialized params tracking for AutoWeightsLoader (#10327)
Signed-off-by: Isotr0py <2037008807@qq.com>
|
2024-11-18 09:07:46 +08:00 |
|
wchen61
|
d1557e66d3
|
[Misc] Enhance offline_inference to support user-configurable paramet… (#10392)
Signed-off-by: wchen61 <wchen61@foxmail.com>
|
2024-11-17 11:32:40 +00:00 |
|
电脑星人
|
80d85c5d7b
|
[Bugfix] Fix mrope_position_delta in non-last prefill chunk (#10403)
Signed-off-by: imkero <kerorek@outlook.com>
|
2024-11-17 08:50:24 +00:00 |
|
Kunshang Ji
|
76aab90ab6
|
[Hardware] [HPU]add mark_step for hpu (#10239)
Signed-off-by: Kunshang Ji <kunshang.ji@intel.com>
|
2024-11-17 00:44:44 -08:00 |
|
youkaichao
|
8d74b5aee9
|
[platforms] refactor cpu code (#10402)
Signed-off-by: youkaichao <youkaichao@gmail.com>
|
2024-11-16 23:14:23 -08:00 |
|
Isotr0py
|
cf349c4a97
|
[Bugfix][CPU] Fix CPU embedding runner with tensor parallel (#10394)
Signed-off-by: Isotr0py <2037008807@qq.com>
|
2024-11-16 23:12:04 -08:00 |
|
Chendi.Xue
|
905d0f0af4
|
[CI/Build] Fix IDC hpu [Device not found] issue (#10384)
Signed-off-by: Chendi Xue <chendi.xue@intel.com>
|
2024-11-17 14:58:22 +08:00 |
|
Roger Wang
|
643ecf7b11
|
[V1] Refactor model executable interface for all text-only language models (#10374)
Signed-off-by: Roger Wang <ywang@roblox.com>
|
2024-11-17 05:18:46 +00:00 |
|
youkaichao
|
4fd9375028
|
[2/N][torch.compile] make compilation cfg part of vllm cfg (#10383)
Signed-off-by: youkaichao <youkaichao@gmail.com>
|
2024-11-16 18:02:14 -08:00 |
|
Woosuk Kwon
|
661a34fd4f
|
[V1] Add code owners for V1 (#10397)
Signed-off-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>
|
2024-11-16 10:45:26 -08:00 |
|
电脑星人
|
361c29e174
|
[Bugfix] Fix M-RoPE position calculation when chunked prefill is enabled (#10388)
Signed-off-by: imkero <kerorek@outlook.com>
|
2024-11-17 02:10:00 +08:00 |
|
Sky Lee
|
b98d89efd4
|
[Misc] Medusa supports custom bias (#10361)
|
2024-11-16 16:33:01 +00:00 |
|
Jaehyun An
|
8b6725b0cf
|
[Misc] Update benchmark to support image_url file or http (#10287)
Signed-off-by: rbbang <anjaehyun87@gmail.com>
|
2024-11-16 18:15:40 +08:00 |
|
rasmith
|
1d75472626
|
[BugFix] [Kernel] Fix GPU SEGV occurring in fused_moe kernel (#10385)
Signed-off-by: Randall Smith <Randall.Smith@amd.com>
|
2024-11-16 09:55:05 +00:00 |
|
youkaichao
|
2f427c2d16
|
[misc][plugin] improve log messages (#10386)
Signed-off-by: youkaichao <youkaichao@gmail.com>
|
2024-11-16 01:23:20 -08:00 |
|
youkaichao
|
755b85359b
|
[doc] add doc for the plugin system (#10372)
Signed-off-by: youkaichao <youkaichao@gmail.com>
|
2024-11-15 21:46:27 -08:00 |
|