vllm/vllm
Name                 Latest commit                                                            Date
-------------------  -----------------------------------------------------------------------  --------------------------
core/                Fix hanging when prompt exceeds limit (#1029)                            2023-09-17 01:48:56 -07:00
engine/              feat: support stop_token_ids parameter. (#1097)                          2023-09-21 15:34:02 -07:00
entrypoints/         Align max_tokens behavior with openai (#852)                             2023-09-23 18:10:13 -07:00
model_executor/      Add comments on RoPE initialization (#1176)                              2023-09-26 10:48:33 -07:00
transformers_utils/  Fix warning message on LLaMA FastTokenizer (#1037)                       2023-09-14 17:33:32 -07:00
worker/              Allocate more shared memory to attention kernel (#1154)                  2023-09-26 22:27:13 -07:00
__init__.py          Bump up the version to v0.1.7 (#1013)                                    2023-09-11 00:54:30 -07:00
block.py             [Quality] Add code formatter and linter (#326)                           2023-07-03 11:31:55 -07:00
config.py            Fix config for Falcon (#1164)                                            2023-09-23 17:38:43 -07:00
logger.py            [Quality] Add code formatter and linter (#326)                           2023-07-03 11:31:55 -07:00
outputs.py           Align vLLM's beam search implementation with HF generate (#857)          2023-09-04 17:29:42 -07:00
sampling_params.py   [Sampler] Vectorized sampling (simplified) (#1048)                       2023-09-22 17:48:04 -07:00
sequence.py          Fix get_max_num_running_seqs for waiting and swapped seq groups (#1068)  2023-09-18 11:49:40 -07:00
utils.py             Allocate more shared memory to attention kernel (#1154)                  2023-09-26 22:27:13 -07:00
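
Several of these entries touch vLLM's user-facing sampling surface: sampling_params.py gained stop_token_ids support in #1097, and the entrypoints aligned max_tokens with the OpenAI API in #852. Below is a minimal sketch of how those parameters surface in the offline API of a v0.1.7-era install; the model name and stop token id are placeholders chosen for illustration, not part of this listing.

# Minimal sketch of the offline API backed by the modules above.
# sampling_params.py defines SamplingParams; engine/ and entrypoints/ consume it.
from vllm import LLM, SamplingParams

# Model name is an assumption for illustration.
llm = LLM(model="facebook/opt-125m")

params = SamplingParams(
    temperature=0.8,
    top_p=0.95,
    max_tokens=64,       # behavior aligned with the OpenAI API in #852
    stop_token_ids=[2],  # e.g. the model's EOS token id; illustrative value
)

# generate() returns one RequestOutput (see outputs.py) per prompt.
outputs = llm.generate(["The capital of France is"], params)
for out in outputs:
    print(out.outputs[0].text)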