vllm/vllm/model_executor/model_loader
Harsha vardhan manoj Bikki 008cf886c9
[Neuron] Adding support for adding/ overriding neuron configuration a… (#8062)
Co-authored-by: Harsha Bikki <harbikh@amazon.com>
2024-09-04 16:33:43 -07:00
__init__.py | [VLM] Refactor MultiModalConfig initialization and profiling (#7530) | 2024-08-17 13:30:55 -07:00
loader.py | support bitsandbytes 8-bit and FP4 quantized models (#7445) | 2024-08-29 19:09:08 -04:00
neuron.py | [Neuron] Adding support for adding/ overriding neuron configuration a… (#8062) | 2024-09-04 16:33:43 -07:00
openvino.py | [Core] Logprobs support in Multi-step (#7652) | 2024-08-29 19:19:08 -07:00
tensorizer.py | [Frontend] Add FlexibleArgumentParser to support both underscore and dash in names (#5718) | 2024-06-20 17:00:13 -06:00
utils.py | [Kernel] Expand MoE weight loading + Add Fused Marlin MoE Kernel (#7766) | 2024-08-27 15:07:09 -07:00
weight_utils.py | [Model] Add AWQ quantization support for InternVL2 model (#7187) | 2024-08-20 23:18:57 -07:00