vllm/model_executor at bb00f66e19acdf6cb614683ab74f777ed3932eee - vllm

Woosuk Kwon bb00f66e19 Use `quantization_config` in hf config (#1695)	2023-11-17 16:23:49 -08:00
layers	Support Min P Sampler (#1642)	2023-11-17 16:20:49 -08:00
models	Support Microsoft Phi 1.5 (#1664)	2023-11-16 14:28:39 -08:00
parallel_utils	TP/quantization/weight loading refactor part 2 - Refactor quantized linear logic and extend quantization support to all models (#1622)	2023-11-15 22:50:41 -08:00
__init__.py	[Quality] Add code formatter and linter (#326)	2023-07-03 11:31:55 -07:00
input_metadata.py	Delay GPU->CPU sync in sampling (#1337)	2023-10-30 09:01:34 -07:00
model_loader.py	Use `quantization_config` in hf config (#1695)	2023-11-17 16:23:49 -08:00
utils.py	TP/quantization/weight loading refactor part 2 - Refactor quantized linear logic and extend quantization support to all models (#1622)	2023-11-15 22:50:41 -08:00
weight_utils.py	Use `quantization_config` in hf config (#1695)	2023-11-17 16:23:49 -08:00