vllm/vllm/model_executor
2023-10-10 19:48:16 -07:00
layers change the timing of sorting logits (#1309) 2023-10-10 19:37:42 -07:00
models [Minor] Fix comment in mistral.py (#1303) 2023-10-09 19:44:37 -07:00
parallel_utils TP/quantization/weight loading refactor part 1 - Simplify parallel linear logic (#1181) 2023-10-02 15:36:09 -07:00
quantization_utils workaround of AWQ for Turing GPUs (#1252) 2023-10-10 19:48:16 -07:00
__init__.py [Quality] Add code formatter and linter (#326) 2023-07-03 11:31:55 -07:00
input_metadata.py [Mistral] Mistral-7B-v0.1 support (#1196) 2023-09-28 10:41:03 -07:00
model_loader.py [Mistral] Mistral-7B-v0.1 support (#1196) 2023-09-28 10:41:03 -07:00
utils.py TP/quantization/weight loading refactor part 1 - Simplify parallel linear logic (#1181) 2023-10-02 15:36:09 -07:00
weight_utils.py Implement AWQ quantization support for LLaMA (#1032) 2023-09-16 00:03:37 -07:00