vllm/model_executor at 0b769992ec1d780b3229c46152c6e647da113aa6 - vllm

Isotr0py 8aaf3d5347 [Model][VLM] Support multi-images inputs for Phi-3-vision models (#7783)	2024-08-25 11:51:20 +00:00
guided_decoding	[misc][core] lazy import outlines (#7831)	2024-08-24 00:51:38 -07:00
layers	[Misc] Update `marlin` to use vLLMParameters (#7803)	2024-08-23 14:30:52 -04:00
model_loader	Fix ShardedStateLoader for vllm fp8 quantization (#7708)	2024-08-22 08:25:04 -04:00
models	[Model][VLM] Support multi-images inputs for Phi-3-vision models (#7783)	2024-08-25 11:51:20 +00:00
__init__.py	[Performance] Optimize e2e overheads: Reduce python allocations (#7162)	2024-08-08 21:34:28 -07:00
custom_op.py	[XPU] fallback to native implementation for xpu custom op (#7670)	2024-08-20 00:26:09 -07:00
parameter.py	[Misc] update fp8 to use `vLLMParameter` (#7437)	2024-08-22 08:36:18 -04:00
pooling_metadata.py	[Model][Misc] Add e5-mistral-7b-instruct and Embedding API (#3734)	2024-05-11 11:30:37 -07:00
sampling_metadata.py	[Core] Optimize SPMD architecture with delta + serialization optimization (#7109)	2024-08-18 17:57:20 -07:00
utils.py	[Hardware][Neuron] Refactor neuron support (#3471)	2024-03-22 01:22:17 +00:00