# vllm/vllm (as of 2024-03-28 10:43:34 -07:00)
| Name | Last commit message | Last commit date |
| --- | --- | --- |
| attention | [CI] Try introducing isort. (#3495) | 2024-03-25 07:59:47 -07:00 |
| core | [2/N] Chunked prefill data update (#3538) | 2024-03-28 10:06:01 -07:00 |
| engine | [2/N] Chunked prefill data update (#3538) | 2024-03-28 10:06:01 -07:00 |
| entrypoints | [Misc] Include matched stop string/token in responses (#2976) | 2024-03-25 17:31:32 -07:00 |
| executor | [Bugfix] Update neuron_executor.py to add optional vision_language_config (#3695) | 2024-03-28 10:43:34 -07:00 |
| lora | Enable more models to inference based on LoRA (#3382) | 2024-03-25 18:09:31 -07:00 |
| model_executor | [Kernel] DBRX Triton MoE kernel H100 (#3692) | 2024-03-28 10:05:34 -07:00 |
| spec_decode | [CI] Try introducing isort. (#3495) | 2024-03-25 07:59:47 -07:00 |
| transformers_utils | [Model] Add support for DBRX (#3660) | 2024-03-27 13:01:46 -07:00 |
| worker | [2/N] Chunked prefill data update (#3538) | 2024-03-28 10:06:01 -07:00 |
| __init__.py | Add distributed model executor abstraction (#3191) | 2024-03-11 11:03:45 -07:00 |
| block.py | Add Automatic Prefix Caching (#2762) | 2024-03-02 00:50:01 -08:00 |
| config.py | [2/N] Chunked prefill data update (#3538) | 2024-03-28 10:06:01 -07:00 |
| logger.py | [CI] Try introducing isort. (#3495) | 2024-03-25 07:59:47 -07:00 |
| outputs.py | [Misc] Include matched stop string/token in responses (#2976) | 2024-03-25 17:31:32 -07:00 |
| py.typed | Add py.typed so consumers of vLLM can get type checking (#1509) | 2023-10-30 14:50:47 -07:00 |
| sampling_params.py | feat: implement the min_tokens sampling parameter (#3124) | 2024-03-25 10:14:26 -07:00 |
| sequence.py | [2/N] Chunked prefill data update (#3538) | 2024-03-28 10:06:01 -07:00 |
| test_utils.py | [Core] remove cupy dependency (#3625) | 2024-03-27 00:33:26 -07:00 |
| utils.py | [Core][Bugfix] Refactor block manager for better testability (#3492) | 2024-03-27 23:59:28 -07:00 |