vllm/vllm

Latest commit: e3e79e9e8a — not applicable; see below
Commit:  e3e79e9e8a  Implement AWQ quantization support for LLaMA (#1032)
Author:  Woosuk Kwon
Co-authored-by: Robert Irvine <robert@seamlessml.com>
Co-authored-by: root <rirv938@gmail.com>
Co-authored-by: Casper <casperbh.96@gmail.com>
Co-authored-by: julian-q <julianhquevedo@gmail.com>
Date:    2023-09-16 00:03:37 -07:00
Name                  Last commit                                                       Date
core/                 Make AsyncLLMEngine more robust & fix batched abort (#969)        2023-09-07 13:43:45 -07:00
engine/               Implement AWQ quantization support for LLaMA (#1032)              2023-09-16 00:03:37 -07:00
entrypoints/          Only fail if logit_bias has actual values (#1045)                 2023-09-14 17:33:01 -07:00
model_executor/       Implement AWQ quantization support for LLaMA (#1032)              2023-09-16 00:03:37 -07:00
transformers_utils/   Fix warning message on LLaMA FastTokenizer (#1037)                2023-09-14 17:33:32 -07:00
worker/               Align vLLM's beam search implementation with HF generate (#857)   2023-09-04 17:29:42 -07:00
__init__.py           Bump up the version to v0.1.7 (#1013)                             2023-09-11 00:54:30 -07:00
block.py              [Quality] Add code formatter and linter (#326)                    2023-07-03 11:31:55 -07:00
config.py             Implement AWQ quantization support for LLaMA (#1032)              2023-09-16 00:03:37 -07:00
logger.py             [Quality] Add code formatter and linter (#326)                    2023-07-03 11:31:55 -07:00
outputs.py            Align vLLM's beam search implementation with HF generate (#857)   2023-09-04 17:29:42 -07:00
sampling_params.py    Align vLLM's beam search implementation with HF generate (#857)   2023-09-04 17:29:42 -07:00
sequence.py           [FIX] Minor bug fixes (#1035)                                     2023-09-13 16:38:12 -07:00
utils.py              [Quality] Add code formatter and linter (#326)                    2023-07-03 11:31:55 -07:00