vllm/vllm/model_executor
Robert Shaw 79a268c4ab
[BUG] fixed fp8 conflict with aqlm (#4307)
Fixes fp8 interface which broke in AQLM merge.
2024-04-23 18:26:33 -07:00
guided_decoding [Bugfix] Add fix for JSON whitespace (#4189) 2024-04-19 20:49:22 -07:00
layers [BUG] fixed fp8 conflict with aqlm (#4307) 2024-04-23 18:26:33 -07:00
model_loader [Kernel] FP8 support for MoE kernel / Mixtral (#4244) 2024-04-24 01:18:23 +00:00
models [Kernel] FP8 support for MoE kernel / Mixtral (#4244) 2024-04-24 01:18:23 +00:00
__init__.py [Core] Refactor Attention Take 2 (#3462) 2024-03-25 04:39:33 +00:00
sampling_metadata.py [Typing] Mypy typing part 2 (#4043) 2024-04-17 17:28:43 -07:00
utils.py [Hardware][Neuron] Refactor neuron support (#3471) 2024-03-22 01:22:17 +00:00