vllm/model_executor at 82857368400bcf6a12a3d42a3ccdc5f585153404 - vllm

twaka 8285736840 workaround of AWQ for Turing GPUs (#1252)	2023-10-10 19:48:16 -07:00
layers	change the timing of sorting logits (#1309)	2023-10-10 19:37:42 -07:00
models	[Minor] Fix comment in mistral.py (#1303)	2023-10-09 19:44:37 -07:00
parallel_utils	TP/quantization/weight loading refactor part 1 - Simplify parallel linear logic (#1181)	2023-10-02 15:36:09 -07:00
quantization_utils	workaround of AWQ for Turing GPUs (#1252)	2023-10-10 19:48:16 -07:00
__init__.py	[Quality] Add code formatter and linter (#326)	2023-07-03 11:31:55 -07:00
input_metadata.py	[Mistral] Mistral-7B-v0.1 support (#1196)	2023-09-28 10:41:03 -07:00
model_loader.py	[Mistral] Mistral-7B-v0.1 support (#1196)	2023-09-28 10:41:03 -07:00
utils.py	TP/quantization/weight loading refactor part 1 - Simplify parallel linear logic (#1181)	2023-10-02 15:36:09 -07:00
weight_utils.py	Implement AWQ quantization support for LLaMA (#1032)	2023-09-16 00:03:37 -07:00