vllm/vllm
Latest commit: e67b4f2c2a by Woosuk Kwon, 2023-09-11 00:26:35 -07:00
  Use FP32 in RoPE initialization (#1004)
  Co-authored-by: One <imone@tuta.io>
Name                Last commit                 Latest commit message
core/               2023-09-07 13:43:45 -07:00  Make AsyncLLMEngine more robust & fix batched abort (#969)
engine/             2023-09-08 17:21:30 -07:00  fix "tansformers_module" ModuleNotFoundError when load model with trust_remote_code=True (#871)
entrypoints/        2023-09-08 00:03:39 -07:00  Start background task in AsyncLLMEngine.generate (#988)
model_executor/     2023-09-11 00:26:35 -07:00  Use FP32 in RoPE initialization (#1004)
transformers_utils/ 2023-09-05 00:50:55 +09:00  Only emit warning about internal tokenizer if it isn't being used (#939)
worker/             2023-09-04 17:29:42 -07:00  Align vLLM's beam search implementation with HF generate (#857)
__init__.py         2023-09-08 00:07:46 -07:00  Bump up the version to v0.1.6 (#989)
block.py            2023-07-03 11:31:55 -07:00  [Quality] Add code formatter and linter (#326)
config.py           2023-09-10 01:39:02 -07:00  fix: CUDA error when inferencing with Falcon-40B base model (#992)
logger.py           2023-07-03 11:31:55 -07:00  [Quality] Add code formatter and linter (#326)
outputs.py          2023-09-04 17:29:42 -07:00  Align vLLM's beam search implementation with HF generate (#857)
sampling_params.py  2023-09-04 17:29:42 -07:00  Align vLLM's beam search implementation with HF generate (#857)
sequence.py         2023-09-04 17:29:42 -07:00  Align vLLM's beam search implementation with HF generate (#857)
utils.py            2023-07-03 11:31:55 -07:00  [Quality] Add code formatter and linter (#326)
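For orientation, the sketch below shows how the modules listed above surface in vLLM's public Python API around v0.1.6: entrypoints provides the LLM class, sampling_params.py defines SamplingParams, outputs.py defines the RequestOutput objects that generate() returns, and engine, worker, and model_executor run underneath. This is a minimal illustrative sketch, not part of the listing; the model name is only an example.

    from vllm import LLM, SamplingParams

    # SamplingParams comes from vllm/sampling_params.py listed above.
    params = SamplingParams(temperature=0.8, top_p=0.95, max_tokens=64)

    # LLM lives under vllm/entrypoints and drives the engine in vllm/engine,
    # which schedules work onto vllm/worker and vllm/model_executor.
    llm = LLM(model="facebook/opt-125m")  # example model name, an assumption

    # generate() returns RequestOutput objects defined in vllm/outputs.py.
    for output in llm.generate(["Hello, my name is"], params):
        print(output.outputs[0].text)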