vllm/vllm at ba0bfd40e21cacfd5da6a1e43028a37258a29cb4 - vllm

Zhuohan Li ba0bfd40e2 TP/quantization/weight loading refactor part 1 - Simplify parallel linear logic (#1181)	2023-10-02 15:36:09 -07:00
core	[Mistral] Mistral-7B-v0.1 support (#1196)	2023-09-28 10:41:03 -07:00
engine	Provide default max model length (#1224)	2023-09-28 14:44:02 -07:00
entrypoints	Provide default max model length (#1224)	2023-09-28 14:44:02 -07:00
model_executor	TP/quantization/weight loading refactor part 1 - Simplify parallel linear logic (#1181)	2023-10-02 15:36:09 -07:00
transformers_utils	Fix Mistral model (#1220)	2023-09-28 10:44:05 -07:00
worker	[Fix] Remove false assertion (#1222)	2023-09-28 10:52:38 -07:00
__init__.py	Bump up the version to v0.2.0 (#1212)	2023-09-28 15:30:38 -07:00
block.py	[Quality] Add code formatter and linter (#326)	2023-07-03 11:31:55 -07:00
config.py	Provide default max model length (#1224)	2023-09-28 14:44:02 -07:00
logger.py	[Quality] Add code formatter and linter (#326)	2023-07-03 11:31:55 -07:00
outputs.py	Align vLLM's beam search implementation with HF generate (#857)	2023-09-04 17:29:42 -07:00
sampling_params.py	[Minor] Fix type annotations (#1238)	2023-10-02 15:28:31 -07:00
sequence.py	Fix get_max_num_running_seqs for waiting and swapped seq groups (#1068)	2023-09-18 11:49:40 -07:00
utils.py	Allocate more shared memory to attention kernel (#1154)	2023-09-26 22:27:13 -07:00