vllm/.buildkite (last updated 2024-07-31 07:55:21 -06:00)
| File | Last commit | Date |
| --- | --- | --- |
| lm-eval-harness | Support W4A8 quantization for vllm (#5218) | 2024-07-31 07:55:21 -06:00 |
| nightly-benchmarks | [Speculative decoding] Add serving benchmark for llama3 70b + speculative decoding (#6964) | 2024-07-31 00:53:21 +00:00 |
| check-wheel-size.py | [build] relax wheel size limit (#6704) | 2024-07-23 14:03:49 -07:00 |
| release-pipeline.yaml | [ci] Use different sccache bucket for CUDA 11.8 wheel build (#6656) | 2024-07-22 14:20:41 -07:00 |
| run-amd-test.sh | [Build/CI] Update run-amd-test.sh. Enable Docker Hub login. (#6711) | 2024-07-24 05:01:14 -07:00 |
| run-benchmarks.sh | [ci] Fix Buildkite agent path (#5392) | 2024-06-10 18:58:07 -07:00 |
| run-cpu-test.sh | [Model] H2O Danube3-4b (#6451) | 2024-07-26 20:47:50 -07:00 |
| run-multi-node-test.sh | [ci] try to add multi-node tests (#6280) | 2024-07-12 21:51:48 -07:00 |
| run-neuron-test.sh | [CI] clean docker cache for neuron (#4441) | 2024-04-28 23:32:07 +00:00 |
| run-openvino-test.sh | [Hardware][Intel] OpenVINO vLLM backend (#5379) | 2024-06-28 13:50:16 +00:00 |
| run-tpu-test.sh | [CI/Build][TPU] Add TPU CI test (#6277) | 2024-07-15 14:31:16 -07:00 |
| run-xpu-test.sh | [Hardware][Intel GPU] Add Intel GPU (XPU) inference backend (#3814) | 2024-06-17 11:01:25 -07:00 |
| test-pipeline.yaml | [Bugfix] Fix broadcasting logic for multi_modal_kwargs (#6836) | 2024-07-31 10:38:45 +08:00 |