vllm/vllm/model_executor

Latest commit: e3e79e9e8a
Implement AWQ quantization support for LLaMA (#1032)
Author: Woosuk Kwon
Co-authored-by: Robert Irvine <robert@seamlessml.com>
Co-authored-by: root <rirv938@gmail.com>
Co-authored-by: Casper <casperbh.96@gmail.com>
Co-authored-by: julian-q <julianhquevedo@gmail.com>
Date: 2023-09-16 00:03:37 -07:00
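Since the pinned commit adds AWQ support, a brief usage sketch may help orient readers. This assumes a vLLM build that includes #1032 and an already AWQ-quantized LLaMA checkpoint; the repository id below is illustrative, not part of this commit:

```python
from vllm import LLM, SamplingParams

# `quantization="awq"` routes weight loading through quantization_utils/
# and the AWQ linear layers under layers/. The checkpoint must already be
# AWQ-quantized; the model id here is only an example.
llm = LLM(model="TheBloke/Llama-2-7B-AWQ", quantization="awq")

params = SamplingParams(temperature=0.8, max_tokens=64)
outputs = llm.generate(["The capital of France is"], params)
print(outputs[0].outputs[0].text)
```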
Name                 Last commit                                            Date
layers/              Implement AWQ quantization support for LLaMA (#1032)   2023-09-16 00:03:37 -07:00
models/              Implement AWQ quantization support for LLaMA (#1032)   2023-09-16 00:03:37 -07:00
parallel_utils/      Implement AWQ quantization support for LLaMA (#1032)   2023-09-16 00:03:37 -07:00
quantization_utils/  Implement AWQ quantization support for LLaMA (#1032)   2023-09-16 00:03:37 -07:00
__init__.py          [Quality] Add code formatter and linter (#326)         2023-07-03 11:31:55 -07:00
input_metadata.py    Add support for BLOOM (#331)                           2023-07-03 13:12:35 -07:00
model_loader.py      Implement AWQ quantization support for LLaMA (#1032)   2023-09-16 00:03:37 -07:00
utils.py             Change the name to vLLM (#150)                         2023-06-17 03:07:40 -07:00
weight_utils.py      Implement AWQ quantization support for LLaMA (#1032)   2023-09-16 00:03:37 -07:00
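Of the files above, model_loader.py is where vLLM resolves a Hugging Face config's architecture name to the matching model class under models/. A minimal sketch of that dispatch pattern follows; the registry contents and helper name are illustrative, and the real mapping lives in the vLLM source:

```python
from typing import Dict, Type

import torch.nn as nn
from transformers import PretrainedConfig

# Illustrative registry mapping HF `architectures` entries to vLLM model
# classes. The actual registry in model_loader.py covers many more models.
_MODEL_REGISTRY: Dict[str, Type[nn.Module]] = {
    # "LlamaForCausalLM": LlamaForCausalLM,
    # "BloomForCausalLM": BloomForCausalLM,
}


def get_model_class(config: PretrainedConfig) -> Type[nn.Module]:
    # HF configs list one or more architecture names; return the first
    # one that the registry knows how to build.
    for arch in getattr(config, "architectures", []) or []:
        if arch in _MODEL_REGISTRY:
            return _MODEL_REGISTRY[arch]
    raise ValueError(f"Unsupported architectures: {config.architectures}")
```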