vllm/vllm/model_executor/layers
ElizaWszola d081da0064
[Bugfix] Fix Marlin MoE act order when is_k_full == False (#8741)
Co-authored-by: Tyler Michael Smith <tyler@neuralmagic.com>
2024-09-28 18:19:40 -07:00
fused_moe [Bugfix] Fix Marlin MoE act order when is_k_full == False (#8741) 2024-09-28 18:19:40 -07:00
mamba [Kernel] Fullgraph and opcheck tests (#8479) 2024-09-25 08:35:52 -06:00
quantization [CI/Build] Update models tests & examples (#8874) 2024-09-28 09:54:35 -07:00
__init__.py Change the name to vLLM (#150) 2023-06-17 03:07:40 -07:00
activation.py [Hardware][intel GPU] bump up ipex version to 2.3 (#8365) 2024-09-13 16:54:34 -07:00
layernorm.py [Hardware][intel GPU] bump up ipex version to 2.3 (#8365) 2024-09-13 16:54:34 -07:00
linear.py [Feature][kernel] tensor parallelism with bitsandbytes quantization (#8434) 2024-09-17 08:09:12 -07:00
logits_processor.py [Bugfix] Fix weight loading for Chameleon when TP>1 (#7410) 2024-08-13 05:33:41 +00:00
pooler.py [Model][Misc] Add e5-mistral-7b-instruct and Embedding API (#3734) 2024-05-11 11:30:37 -07:00
rejection_sampler.py [SpecDec][Misc] Cleanup, remove bonus token logic. (#8701) 2024-09-22 12:34:14 -07:00
resampler.py [MODEL] Qwen Multimodal Support (Qwen-VL / Qwen-VL-Chat) (#8029) 2024-09-05 12:48:10 +00:00
rotary_embedding.py [Misc] Use RoPE cache for MRoPE (#8396) 2024-09-11 23:13:14 -07:00
sampler.py [Core][Bugfix] Support prompt_logprobs returned with speculative decoding (#8047) 2024-09-24 17:29:56 -07:00
spec_decode_base_sampler.py [SpecDec][Misc] Cleanup, remove bonus token logic. (#8701) 2024-09-22 12:34:14 -07:00
typical_acceptance_sampler.py Fix typical acceptance sampler with correct recovered token ids (#8562) 2024-09-23 12:32:27 -07:00
vocab_parallel_embedding.py [Misc] Update GPTQ to use vLLMParameters (#7976) 2024-09-03 17:21:44 -04:00