vllm/layers at fa9e3852290ecb6eaae45befbd629bb060f57fb7 - vllm

sroy745 fa9e385229 [Speculative Decoding 1/2 ] Add typical acceptance sampling as one of the sampling techniques in the verifier (#5131)	2024-06-17 21:29:09 -05:00
fused_moe	[Kernel] Tune Qwen2MoE kernel configurations with tp2,4 (#5497)	2024-06-13 09:01:10 -07:00
ops	[Mypy] Part 3 fix typing for nested directories for most of directory (#4161)	2024-04-22 21:32:44 -07:00
quantization	[Kernel] `compressed-tensors` marlin 24 support (#5435)	2024-06-17 12:32:48 -04:00
__init__.py	Change the name to vLLM (#150)	2023-06-17 03:07:40 -07:00
activation.py	[Hardware][Intel GPU] Add Intel GPU(XPU) inference backend (#3814)	2024-06-17 11:01:25 -07:00
layernorm.py	[Hardware][Intel GPU] Add Intel GPU(XPU) inference backend (#3814)	2024-06-17 11:01:25 -07:00
linear.py	[mypy] Enable type checking for test directory (#5017)	2024-06-15 04:45:31 +00:00
logits_processor.py	[Misc] Skip for logits_scale == 1.0 (#5291)	2024-06-05 15:19:02 -07:00
pooler.py	[Model][Misc] Add e5-mistral-7b-instruct and Embedding API (#3734)	2024-05-11 11:30:37 -07:00
rejection_sampler.py	[Speculative Decoding 1/2 ] Add typical acceptance sampling as one of the sampling techniques in the verifier (#5131)	2024-06-17 21:29:09 -05:00
rotary_embedding.py	[Hardware][Intel GPU] Add Intel GPU(XPU) inference backend (#3814)	2024-06-17 11:01:25 -07:00
sampler.py	[mypy] Enable type checking for test directory (#5017)	2024-06-15 04:45:31 +00:00
spec_decode_base_sampler.py	[Speculative Decoding 1/2 ] Add typical acceptance sampling as one of the sampling techniques in the verifier (#5131)	2024-06-17 21:29:09 -05:00
typical_acceptance_sampler.py	[Speculative Decoding 1/2 ] Add typical acceptance sampling as one of the sampling techniques in the verifier (#5131)	2024-06-17 21:29:09 -05:00
vocab_parallel_embedding.py	[Hardware][Intel GPU] Add Intel GPU(XPU) inference backend (#3814)	2024-06-17 11:01:25 -07:00