Tyler Michael Smith | 16b24e7dcd | [Bugfix] Bandaid fix for speculative decoding tests (#9327) | 2024-10-13 23:02:11 +00:00
Lily Liu | f519902c52 | [CI] Fix merge conflict (#9317) | 2024-10-13 06:41:23 +00:00
Jee Jee Li | 250e26a63e | [Bugfix]Fix MiniCPM's LoRA bug (#9286) | 2024-10-12 09:36:47 -07:00
Yunmeng | 2b184ddd4f | [Misc][Installation] Improve source installation script and doc (#9309); Co-authored-by: youkaichao <youkaichao@126.com> | 2024-10-12 09:36:40 -07:00
Xiang Xu | 00298e092c | [Bugfix] Fix bug of xformer prefill for encoder-decoder (#9026) | 2024-10-12 15:00:43 +08:00
Lily Liu | 89feb4c84d | [SpecDec] Remove Batch Expansion (2/3) (#9298) | 2024-10-12 05:13:37 +00:00
Maximilien de Bayser | ec10cb8511 | [BugFix] Fix tool call finish reason in streaming case (#9209); Signed-off-by: Max de Bayser <mbayser@br.ibm.com> | 2024-10-11 18:24:26 -07:00
Prashant Gupta | d11b46f3a5 | [bugfix] fix f-string for error (#9295); Signed-off-by: Prashant Gupta <prashantgupta@us.ibm.com> | 2024-10-11 17:03:48 -07:00
Allen Wang | c6cf9295e1 | [Bugfix] Sets is_first_step_output for TPUModelRunner (#9202) | 2024-10-11 13:28:10 -07:00
Lucas Wilkinson | de9fb4bef8 | [Bugfix][CI/Build] Fix docker build where CUDA archs < 7.0 are being detected (#9254) | 2024-10-11 15:57:39 -04:00
Wallas Henrique | 8baf85e4e9 | [Doc] Compatibility matrix for mutual exclusive features (#8512); Signed-off-by: Wallas Santos <wallashss@ibm.com> | 2024-10-11 11:18:50 -07:00
homeffjy | 1a1823871d | [Doc] Remove outdated comment to avoid misunderstanding (#9287) | 2024-10-11 18:02:03 +00:00
sixgod | 6cf1167c1a | [Model] Add GLM-4v support and meet vllm==0.6.2 (#9242) | 2024-10-11 17:36:13 +00:00
Burkhard Ringlein | f710090d8e | [Kernel] adding fused moe kernel config for L40S TP4 (#9245) | 2024-10-11 08:54:22 -07:00
Tyler Michael Smith | 7342a7d7f8 | [Model] Support Mamba (#6484) | 2024-10-11 15:40:06 +00:00
Sebastian Schoennenbeck | df3dcdf49d | [Bugfix] Fix priority in multiprocessing engine (#9277) | 2024-10-11 15:35:35 +00:00
Jee Jee Li | 36ea79079b | [Misc][LoRA] Support loading LoRA weights for target_modules in reg format (#9275) | 2024-10-11 12:31:21 +00:00
Cyrus Leung | e808156f30 | [Misc] Collect model support info in a single process per model (#9233) | 2024-10-11 11:08:11 +00:00
youkaichao | cbc2ef5529 | [misc] hide best_of from engine (#9261); Co-authored-by: Brendan Wong <bjwpokemon@gmail.com> | 2024-10-10 21:30:44 -07:00
Andy Dai | 94bf9ae4e9 | [Misc] Fix sampling from sonnet for long context case (#9235) | 2024-10-11 00:33:16 +00:00
omrishiv | f990bab2a4 | [Doc][Neuron] add note to neuron documentation about resolving triton issue (#9257); Signed-off-by: omrishiv <327609+omrishiv@users.noreply.github.com> | 2024-10-10 23:36:32 +00:00
youkaichao | e00c094f15 | [torch.compile] generic decorators (#9258) | 2024-10-10 15:54:23 -07:00
Kevin H. Luu | a78c6ba7c8 | [ci/build] Add placeholder command for custom models test (#9262) | 2024-10-10 15:45:09 -07:00
dependabot[bot] | fb870fd491 | Bump actions/setup-python from 3 to 5 (#9195); Signed-off-by: dependabot[bot] <support@github.com>; Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> | 2024-10-10 13:30:46 -07:00
dependabot[bot] | 270953bafb | Bump actions/checkout from 3 to 4 (#9196); Signed-off-by: dependabot[bot] <support@github.com>; Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> | 2024-10-10 13:30:35 -07:00
dependabot[bot] | 9cc811c4ff | Bump actions/github-script from 6 to 7 (#9197); Signed-off-by: dependabot[bot] <support@github.com>; Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> | 2024-10-10 13:30:24 -07:00
youkaichao | e4d652ea3e | [torch.compile] integration with compilation control (#9058) | 2024-10-10 12:39:36 -07:00
Simon Mo | 78c0b4166c | Suggest codeowners for the core componenets (#9210) | 2024-10-10 12:29:24 -07:00
jordanyono | 21efb603f5 | [CI/Build] Make the Dockerfile.cpu file's PIP_EXTRA_INDEX_URL Configurable as a Build Argument (#9252) | 2024-10-10 18:18:18 +00:00
Rafael Vasquez | 055f3270d4 | [Doc] Improve debugging documentation (#9204); Signed-off-by: Rafael Vasquez <rafvasq21@gmail.com> | 2024-10-10 10:48:51 -07:00
Lucas Wilkinson | 18511aeda6 | [Bugfix] Fix Machete unittests failing with NotImplementedError (#9218) | 2024-10-10 17:39:56 +00:00
Ilya Lavrenov | 83ea5c72b9 | [OpenVINO] Use torch 2.4.0 and newer optimim version (#9121); Co-authored-by: DarkLight1337 <tlleungac@connect.ust.hk> | 2024-10-10 11:18:58 -06:00
whyiug | 04de9057ab | [Model] support input image embedding for minicpmv (#9237) | 2024-10-10 15:00:47 +00:00
Isotr0py | 07c11cf4d4 | [Bugfix] Fix lm_head weights tying with lora for llama (#9227) | 2024-10-10 21:11:56 +08:00
sroy745 | f3a507f1d3 | [Core] Add an environment variable which needs to be set explicitly to allow BlockSpaceManagerV1 (#9149) | 2024-10-10 14:17:17 +08:00
Lucas Wilkinson | a64e7b9407 | [Bugfix] Machete garbage results for some models (large K dim) (#9212) | 2024-10-10 14:16:17 +08:00
Michael Goin | ce00231a8b | [Bugfix] Fix Weight Loading Multiple GPU Test - Large Models (#9213) | 2024-10-10 14:15:40 +08:00
youkaichao | de895f1697 | [misc] improve model support check in another process (#9208) | 2024-10-09 21:58:27 -07:00
Russell Bryant | cf25b93bdd | [Core] Fix invalid args to _process_request (#9201); Signed-off-by: Russell Bryant <rbryant@redhat.com> | 2024-10-10 12:10:09 +08:00
Michael Goin | d5fbb8706d | [CI/Build] Update Dockerfile install+deploy image to ubuntu 22.04 (#9130); Co-authored-by: DarkLight1337 <tlleungac@connect.ust.hk> | 2024-10-09 12:51:47 -06:00
Russell Bryant | cdca8994bd | [CI/Build] mypy: check vllm/entrypoints (#9194); Signed-off-by: Russell Bryant <rbryant@redhat.com> | 2024-10-09 17:15:28 +00:00
Li, Jiang | ca77dd7a44 | [Hardware][CPU] Support AWQ for CPU backend (#7515) | 2024-10-09 10:28:08 -06:00
Ewout ter Hoeven | 7dea289066 | Add Dependabot configuration for GitHub Actions updates (#1217); Co-authored-by: DarkLight1337 <tlleungac@connect.ust.hk> | 2024-10-09 08:16:26 -07:00
Cyrus Leung | cfaa6008e6 | [Bugfix] Access get_vocab instead of vocab in tool parsers (#9188) | 2024-10-09 08:59:57 -06:00
Ahmad Fahadh Ilyas | 21906a6f50 | [Bugfix] Fix lora loading for Compressed Tensors in #9120 (#9179) | 2024-10-09 12:10:44 +00:00
Jiangtao Hu | dc4aea677a | [Doc] Fix VLM prompt placeholder sample bug (#9170) | 2024-10-09 08:59:42 +00:00
youkaichao | c8627cd41b | [ci][test] use load dummy for testing (#9165) | 2024-10-09 00:38:40 -07:00
Cyrus Leung | 8bfaa4e31e | [Bugfix] fix composite weight loading and EAGLE weight loading (#9160) | 2024-10-09 00:36:55 -07:00
AlpinDale | 0b5b5d767e | [Frontend] Log the maximum supported concurrency (#8831) | 2024-10-09 00:03:14 -07:00
Hui Liu | cdc72e3c80 | [Model] Remap FP8 kv_scale in CommandR and DBRX (#9174) | 2024-10-09 06:43:06 +00:00