| Name | Last commit message | Last commit date |
| --- | --- | --- |
| `fused_moe` | [torch.compile] directly register custom op (#9896) | 2024-10-31 21:56:09 -07:00 |
| `mamba` | [Kernel][Model] Improve continuous batching for Jamba and Mamba (#9189) | 2024-10-16 12:12:43 -04:00 |
| `quantization` | [Bugfix/Core] Flashinfer k_scale and v_scale (#9861) | 2024-11-01 12:15:05 -07:00 |
| `__init__.py` | Change the name to vLLM (#150) | 2023-06-17 03:07:40 -07:00 |
| `activation.py` | [Kernel] add kernel for FATReLU (#9610) | 2024-10-24 16:18:27 +08:00 |
| `layernorm.py` | [Model] FalconMamba Support (#9325) | 2024-10-21 12:50:16 -04:00 |
| `linear.py` | [Hardware][CPU] Support AWQ for CPU backend (#7515) | 2024-10-09 10:28:08 -06:00 |
| `logits_processor.py` | [V1] Implement vLLM V1 [1/N] (#9289) | 2024-10-22 01:24:07 -07:00 |
| `pooler.py` | [Model] Support math-shepherd-mistral-7b-prm model (#9697) | 2024-10-30 09:33:42 -07:00 |
| `rejection_sampler.py` | [SpecDec][Misc] Cleanup, remove bonus token logic. (#8701) | 2024-09-22 12:34:14 -07:00 |
| `resampler.py` | [MODEL] Qwen Multimodal Support (Qwen-VL / Qwen-VL-Chat) (#8029) | 2024-09-05 12:48:10 +00:00 |
| `rotary_embedding.py` | [torch.compile] Fine-grained CustomOp enabling mechanism (#9300) | 2024-10-17 18:36:37 +00:00 |
| `sampler.py` | [CI/Build] mypy: Resolve some errors from checking vllm/engine (#9267) | 2024-10-16 22:55:59 +00:00 |
| `spec_decode_base_sampler.py` | [SpecDec][Misc] Cleanup, remove bonus token logic. (#8701) | 2024-09-22 12:34:14 -07:00 |
| `typical_acceptance_sampler.py` | Fix typical acceptance sampler with correct recovered token ids (#8562) | 2024-09-23 12:32:27 -07:00 |
| `vocab_parallel_embedding.py` | [Bugfix] Fix lm_head weights tying with lora for llama (#9227) | 2024-10-10 21:11:56 +08:00 |
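
For orientation, `layernorm.py` in this directory holds vLLM's RMSNorm layer, the normalization used by the Llama-family and Mamba-style models referenced in the commits above. Below is a minimal, unfused PyTorch sketch of plain RMSNorm for reference only; it is not vLLM's actual class or fused CUDA kernel, and the name `RMSNormSketch` is made up here.

```python
import torch
from torch import nn


class RMSNormSketch(nn.Module):
    """Reference RMSNorm: y = x / sqrt(mean(x^2) + eps) * weight.

    Illustrative only -- vLLM's layernorm.py dispatches to a fused CUDA
    kernel with a different API; this sketch just shows the math.
    """

    def __init__(self, hidden_size: int, eps: float = 1e-6) -> None:
        super().__init__()
        self.weight = nn.Parameter(torch.ones(hidden_size))
        self.eps = eps

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Normalize by the root-mean-square over the hidden dimension,
        # then apply the learned per-channel scale.
        variance = x.pow(2).mean(dim=-1, keepdim=True)
        x = x * torch.rsqrt(variance + self.eps)
        return x * self.weight


if __name__ == "__main__":
    layer = RMSNormSketch(hidden_size=8)
    out = layer(torch.randn(2, 8))
    print(out.shape)  # torch.Size([2, 8])
```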