vllm/.buildkite/lm-eval-harness/configs
Latest commit: abfe705a02, "[ Misc ] Support Fp8 via llm-compressor (#6110)" by Robert Shaw (Co-authored-by: Robert Shaw <rshaw@neuralmagic>), 2024-07-07 20:42:11 +00:00
File                                                  | Last commit                                                | Date
Meta-Llama-3-8B-Instruct-FP8-compressed-tensors.yaml  | [ Misc ] Support Fp8 via llm-compressor (#6110)            | 2024-07-07 20:42:11 +00:00
Meta-Llama-3-8B-Instruct-FP8.yaml                     | [ Misc ] Support Fp8 via llm-compressor (#6110)            | 2024-07-07 20:42:11 +00:00
Meta-Llama-3-8B-Instruct-INT8-compressed-tensors.yaml | [ Misc ] Support Fp8 via llm-compressor (#6110)            | 2024-07-07 20:42:11 +00:00
Meta-Llama-3-8B-Instruct.yaml                         | [ CI/Build ] LM Eval Harness Based CI Testing (#5838)      | 2024-06-29 13:04:30 -04:00
Meta-Llama-3-70B-Instruct.yaml                        | [ CI/Build ] LM Eval Harness Based CI Testing (#5838)      | 2024-06-29 13:04:30 -04:00
Mixtral-8x7B-Instruct-v0.1-FP8.yaml                   | [ Misc ] Refactor MoE to isolate Fp8 From Mixtral (#5970)  | 2024-07-02 21:54:35 +00:00
Mixtral-8x7B-Instruct-v0.1.yaml                       | [ CI/Build ] LM Eval Harness Based CI Testing (#5838)      | 2024-06-29 13:04:30 -04:00
Mixtral-8x22B-Instruct-v0.1-FP8-Dynamic.yaml          | [ Misc ] Refactor MoE to isolate Fp8 From Mixtral (#5970)  | 2024-07-02 21:54:35 +00:00
models-large.txt                                      | [ Misc ] Refactor MoE to isolate Fp8 From Mixtral (#5970)  | 2024-07-02 21:54:35 +00:00
models-small.txt                                      | [ Misc ] Support Fp8 via llm-compressor (#6110)            | 2024-07-07 20:42:11 +00:00
Qwen2-57B-A14-Instruct.yaml                           | [ Misc ] Refactor MoE to isolate Fp8 From Mixtral (#5970)  | 2024-07-02 21:54:35 +00:00
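Each YAML file in this directory describes an lm-eval-harness accuracy check for one model, which the Buildkite CI runs against vLLM (the `models-small.txt` and `models-large.txt` files list which configs belong to the small- and large-model test groups). A minimal sketch of what such a config might contain is shown below; the field names and metric values here are illustrative assumptions, not copied from the repository:

```yaml
# Hypothetical lm-eval-harness CI config sketch; field names and
# numeric values are assumptions for illustration only.
model_name: "meta-llama/Meta-Llama-3-8B-Instruct"
tasks:
- name: "gsm8k"          # benchmark task run through lm-eval-harness
  metrics:
  - name: "exact_match"  # metric compared against the expected value
    value: 0.75          # illustrative expected accuracy, with tolerance
limit: 250               # number of samples evaluated per task
num_fewshot: 5           # few-shot examples per prompt
```

Under this layout, the CI would evaluate the named model on each listed task and fail the build if a measured metric drifts from its recorded expected value, which is how quantized variants (FP8, INT8, compressed-tensors) are regression-tested against their known-good accuracy.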