squall/vllm

Author	SHA1	Message	Date
Simon Mo	02dbf30e9a	[Build] skip renaming files for release wheels pipeline (#9671) Signed-off-by: simon-mo <simon.mo@hey.com>	2024-11-14 23:31:52 -08:00
Cyrus Leung	b40cf6402e	[Model] Support Qwen2 embeddings and use tags to select model tests (#10184)	2024-11-14 20:23:09 -08:00
Cyrus Leung	972112d82f	[Bugfix] Fix unable to load some models (#10312) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>	2024-11-14 16:55:54 -08:00
Cyrus Leung	675d603400	[CI/Build] Make shellcheck happy (#10285) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>	2024-11-14 09:47:53 +00:00
Isotr0py	03025c023f	[CI/Build] Fix CPU CI online inference timeout (#10314) Signed-off-by: Isotr0py <2037008807@qq.com>	2024-11-14 16:45:32 +08:00
Yuan	d201d41973	[CI][CPU]refactor CPU tests to allow to bind with different cores (#10222) Signed-off-by: Yuan Zhou <yuan.zhou@intel.com>	2024-11-12 10:07:32 +00:00
Robert Shaw	6ace6fba2c	[V1] `AsyncLLM` Implementation (#9826) Signed-off-by: Nick Hill <nickhill@us.ibm.com> Signed-off-by: rshaw@neuralmagic.com <rshaw@neuralmagic.com> Signed-off-by: Nick Hill <nhill@redhat.com> Co-authored-by: Nick Hill <nickhill@us.ibm.com> Co-authored-by: Varun Sundar Rabindranath <varun@neuralmagic.com> Co-authored-by: Nick Hill <nhill@redhat.com> Co-authored-by: Tyler Michael Smith <tyler@neuralmagic.com>	2024-11-11 23:05:38 +00:00
Isotr0py	2cebda42bb	[Bugfix][Hardware][CPU] Fix broken encoder-decoder CPU runner (#10218) Signed-off-by: Isotr0py <2037008807@qq.com>	2024-11-11 12:37:58 +00:00
Isotr0py	58170d6503	[Hardware][CPU] Add embedding models support for CPU backend (#10193) Signed-off-by: Isotr0py <2037008807@qq.com>	2024-11-11 08:54:28 +00:00
Cyrus Leung	51c2e1fcef	[CI/Build] Split up models tests (#10069) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>	2024-11-09 11:39:14 -08:00
Chendi.Xue	8e1529dc57	[CI/Build] Add run-hpu-test.sh script (#10167) Signed-off-by: Chendi.Xue <chendi.xue@intel.com>	2024-11-09 06:26:52 +00:00
Li, Jiang	d7edca1dee	[CI/Build] Adding timeout in CPU CI to avoid CPU test queue blocking (#6892) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk> Co-authored-by: DarkLight1337 <tlleungac@connect.ust.hk>	2024-11-09 03:27:11 +00:00
Cyrus Leung	b489fc3c91	[CI/Build] Update CPU tests to include all "standard" tests (#5481) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>	2024-11-08 23:30:04 +08:00
Russell Bryant	3be5b26a76	[CI/Build] Add shell script linting using shellcheck (#7925) Signed-off-by: Russell Bryant <rbryant@redhat.com>	2024-11-07 18:17:29 +00:00
Li, Jiang	a4b3e0c1e9	[Hardware][CPU] Update torch 2.5 (#9911) Signed-off-by: jiang1.li <jiang1.li@intel.com>	2024-11-07 04:43:08 +00:00
youkaichao	719c1ca468	[core][distributed] add stateless_init_process_group (#10072) Signed-off-by: youkaichao <youkaichao@gmail.com>	2024-11-06 16:42:09 -08:00
Aaron Pham	21063c11c7	[CI/Build] drop support for Python 3.8 EOL (#8464) Signed-off-by: Aaron Pham <contact@aarnphm.xyz>	2024-11-06 07:11:55 +00:00
youkaichao	4be3a45158	[distributed] add function to create ipc buffers directly (#10064) Signed-off-by: youkaichao <youkaichao@gmail.com>	2024-11-05 22:35:03 -08:00
Peter Salas	ffc0f2b47a	[Model][OpenVINO] Fix regressions from #8346 (#10045) Signed-off-by: Peter Salas <peter@fixie.ai>	2024-11-06 04:19:15 +00:00
Michael Goin	02462465ea	[CI] Prune tests/models/decoder_only/language/* tests (#9940) Signed-off-by: mgoin <michael@neuralmagic.com>	2024-11-05 16:02:23 -05:00
hissu-hyvarinen	5208dc7a20	[Bugfix][CI/Build][Hardware][AMD] Shard ID parameters in AMD tests running parallel jobs (#9279) Signed-off-by: Hissu Hyvarinen <hissu.hyvarinen@amd.com>	2024-11-04 11:37:46 -08:00
Robert Shaw	1c45f4c385	[CI] Basic Integration Test For TPU (#9968) Signed-off-by: Robert Shaw <rshaw@neuralmagic.com>	2024-11-04 11:34:26 -08:00
Alexei-V-Ivanov-AMD	77f7ef2908	[CI/Build] Adding a forced docker system prune to clean up space (#9849)	2024-11-01 01:02:58 +08:00
Alex Brooks	16b8f7a86f	[CI/Build] Add Model Tests for Qwen2-VL (#9846) Signed-off-by: Alex-Brooks <Alex.Brooks@ibm.com> Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com> Co-authored-by: DarkLight1337 <tlleungac@connect.ust.hk>	2024-10-31 09:10:52 -07:00
Alex Brooks	cc98f1e079	[CI/Build] VLM Test Consolidation (#9372) Signed-off-by: Alex-Brooks <Alex.Brooks@ibm.com>	2024-10-30 09:32:17 -07:00
youkaichao	ff5ed6e1bc	[torch.compile] rework compile control with piecewise cudagraph (#9715) Signed-off-by: youkaichao <youkaichao@gmail.com>	2024-10-29 23:03:49 -07:00
Joe Runde	ef7faad1b8	🐛 Fixup more test failures from memory profiling (#9563) Signed-off-by: Joe Runde <Joseph.Runde@ibm.com>	2024-10-21 17:10:56 -07:00
Cyrus Leung	696b01af8f	[CI/Build] Split up decoder-only LM tests (#9488) Co-authored-by: Nick Hill <nickhill@us.ibm.com>	2024-10-20 21:27:50 -07:00
bnellnm	eca2c5f7c0	[Bugfix] Fix support for dimension like integers and ScalarType (#9299)	2024-10-17 19:08:34 +00:00
Daniele	a2c71c5405	[CI/Build] remove .github from .dockerignore, add dirty repo check (#9375)	2024-10-17 10:25:06 -07:00
Kuntai Du	81ede99ca4	[Core] Deprecating block manager v1 and make block manager v2 default (#8704) Removing the block manager v1. This is the initial piece of prefix-caching-centric design. In order to achieve prefix-caching-centric design, we need to simplify the code path so that we only use v2 block manager (which has much higher performance on prefix caching).	2024-10-17 11:38:15 -05:00
Li, Jiang	5eda21e773	[Hardware][CPU] compressed-tensor INT8 W8A8 AZP support (#9344)	2024-10-17 12:21:04 -04:00
Lucas Wilkinson	9d30a056e7	[misc] CUDA Time Layerwise Profiler (#8337) Co-authored-by: Varun Sundar Rabindranath <varun@neuralmagic.com> Co-authored-by: Michael Goin <michael@neuralmagic.com>	2024-10-17 10:36:09 -04:00
Cyrus Leung	1de76a0e55	[CI/Build] Test VLM embeddings (#9406)	2024-10-16 09:44:30 +00:00
Daniele	203ab8f80f	[CI/Build] setuptools-scm fixes (#8900)	2024-10-14 11:34:47 -07:00
Tyler Michael Smith	7342a7d7f8	[Model] Support Mamba (#6484)	2024-10-11 15:40:06 +00:00
Kevin H. Luu	a78c6ba7c8	[ci/build] Add placeholder command for custom models test (#9262)	2024-10-10 15:45:09 -07:00
youkaichao	e4d652ea3e	[torch.compile] integration with compilation control (#9058)	2024-10-10 12:39:36 -07:00
sroy745	f3a507f1d3	[Core] Add an environment variable which needs to be set explicitly to allow BlockSpaceManagerV1 (#9149)	2024-10-10 14:17:17 +08:00
Li, Jiang	ca77dd7a44	[Hardware][CPU] Support AWQ for CPU backend (#7515)	2024-10-09 10:28:08 -06:00
youkaichao	c8627cd41b	[ci][test] use load dummy for testing (#9165)	2024-10-09 00:38:40 -07:00
Michael Goin	9ba0bd6aa6	Add `lm-eval` directly to requirements-test.txt (#9161)	2024-10-08 18:22:31 -07:00
Isotr0py	4f95ffee6f	[Hardware][CPU] Cross-attention and Encoder-Decoder models support on CPU backend (#9089)	2024-10-07 06:50:35 +00:00
Kuntai Du	fbb74420e7	[CI] Update performance benchmark: upgrade trt-llm to r24.07, and add SGLang (#7412)	2024-10-04 14:01:44 -07:00
Murali Andoorveedu	0f6d7a9a34	[Models] Add remaining model PP support (#7168) Signed-off-by: Muralidhar Andoorveedu <muralidhar.andoorveedu@centml.ai> Signed-off-by: Murali Andoorveedu <muralidhar.andoorveedu@centml.ai> Co-authored-by: DarkLight1337 <tlleungac@connect.ust.hk>	2024-10-04 10:56:58 +08:00
Lily Liu	1570203864	[Spec Decode] (1/2) Remove batch expansion (#8839)	2024-10-01 16:04:42 -07:00
Lily Liu	bce324487a	[CI][SpecDecode] Fix spec decode tests, use flash attention backend for spec decode CI tests. (#8975)	2024-10-01 00:51:40 +00:00
Kevin H. Luu	1425a1bcf9	[ci] Add CODEOWNERS for test directories (#8795) Signed-off-by: kevin <kevin@anyscale.com>	2024-10-01 00:47:08 +00:00
Cyrus Leung	e1a3f5e831	[CI/Build] Update models tests & examples (#8874) Co-authored-by: Roger Wang <ywang@roblox.com>	2024-09-28 09:54:35 -07:00
Tyler Titsworth	260024a374	[Bugfix][Intel] Fix XPU Dockerfile Build (#7824) Signed-off-by: tylertitsworth <tyler.titsworth@intel.com> Co-authored-by: youkaichao <youkaichao@126.com>	2024-09-27 23:45:50 -07:00

305 Commits