vllm/vllm/model_executor
2024-09-04 18:55:37 +08:00
guided_decoding [misc][core] lazy import outlines (#7831) 2024-08-24 00:51:38 -07:00
layers [Misc] Update fbgemmfp8 to use vLLMParameters (#7972) 2024-09-03 20:12:41 -06:00
model_loader [Core] Logprobs support in Multi-step (#7652) 2024-08-29 19:19:08 -07:00
models [Bugfix] remove post_layernorm in siglip (#8106) 2024-09-04 18:55:37 +08:00
__init__.py [Performance] Optimize e2e overheads: Reduce python allocations (#7162) 2024-08-08 21:34:28 -07:00
custom_op.py [XPU] fallback to native implementation for xpu custom op (#7670) 2024-08-20 00:26:09 -07:00
parameter.py [Misc] Update GPTQ to use vLLMParameters (#7976) 2024-09-03 17:21:44 -04:00
pooling_metadata.py [Model][Misc] Add e5-mistral-7b-instruct and Embedding API (#3734) 2024-05-11 11:30:37 -07:00
sampling_metadata.py [Core] Optimize SPMD architecture with delta + serialization optimization (#7109) 2024-08-18 17:57:20 -07:00
utils.py [Hardware][Neuron] Refactor neuron support (#3471) 2024-03-22 01:22:17 +00:00