Daniele
a2c71c5405
[CI/Build] remove .github from .dockerignore, add dirty repo check ( #9375 )
2024-10-17 10:25:06 -07:00
Daniele
203ab8f80f
[CI/Build] setuptools-scm fixes ( #8900 )
2024-10-14 11:34:47 -07:00
Michael Goin
d5fbb8706d
[CI/Build] Update Dockerfile install+deploy image to ubuntu 22.04 ( #9130 )
...
Co-authored-by: DarkLight1337 <tlleungac@connect.ust.hk>
2024-10-09 12:51:47 -06:00
Peter Pan
cfba685bd4
[CI/Build] Add examples folder into Docker image so that we can leverage the templates*.jinja when serving models ( #8758 )
...
Signed-off-by: Peter Pan <Peter.Pan@daocloud.io>
2024-10-08 09:37:34 -07:00
Michael Goin
520db4dbc1
[Docs] Add README to the build docker image ( #8825 )
2024-09-26 11:02:52 -07:00
Tyler Michael Smith
f70bccac75
[Build/CI] Upgrade to gcc 10 in the base build Docker image ( #8814 )
2024-09-26 10:07:18 -07:00
Jee Jee Li
c6f2485c82
[Misc] Add extra deps for openai server image ( #8792 )
2024-09-25 09:35:23 -07:00
Daniele
ee5f34b1c2
[CI/Build] use setuptools-scm to set __version__ ( #4738 )
...
Co-authored-by: youkaichao <youkaichao@126.com>
2024-09-23 09:44:26 -07:00
Luka Govedič
71c60491f2
[Kernel] Build flash-attn from source ( #8245 )
2024-09-20 23:27:10 -07:00
Joe Runde
cca61642e0
[Bugfix] Fix 3.12 builds on main ( #8510 )
...
Signed-off-by: Joe Runde <Joseph.Runde@ibm.com>
2024-09-17 00:01:45 +00:00
Simon Mo
5ce45eb54d
[misc] small qol fixes for release process ( #8517 )
2024-09-16 15:11:27 -07:00
Yangshen⚡Deng
6a512a00df
[model] Support for Llava-Next-Video model ( #7559 )
...
Co-authored-by: Roger Wang <ywang@roblox.com>
Co-authored-by: Cyrus Leung <tlleungac@connect.ust.hk>
Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com>
2024-09-10 22:21:36 -07:00
Joe Runde
cfe712bf1a
[CI/Build] Use python 3.12 in cuda image ( #8133 )
...
Signed-off-by: Joe Runde <Joseph.Runde@ibm.com>
2024-09-07 13:03:16 -07:00
Rui Qiao
de80783b69
[Misc] Use ray[adag] dependency instead of cuda ( #7938 )
2024-09-06 09:18:35 -07:00
TimWang
ccd7207191
chore: Update check-wheel-size.py to read MAX_SIZE_MB from env ( #8103 )
2024-09-03 23:17:05 -07:00
Lily Liu
e6a26ed037
[SpecDecode][Kernel] Flashinfer Rejection Sampling ( #7244 )
2024-09-01 21:23:29 -07:00
Mor Zusman
fdd9daafa3
[Kernel/Model] Migrate mamba_ssm and causal_conv1d kernels to vLLM ( #7651 )
2024-08-28 15:06:52 -07:00
Kevin H. Luu
666ad0aa16
[ci] Cleanup & refactor Dockerfile to pass different Python versions and sccache bucket via build args ( #7705 )
...
Signed-off-by: kevin <kevin@anyscale.com>
2024-08-22 20:10:55 +00:00
Peng Guanwen
f710fb5265
[Core] Use flashinfer sampling kernel when available ( #7137 )
...
Co-authored-by: Michael Goin <michael@neuralmagic.com>
2024-08-19 03:24:03 +00:00
Lily Liu
ec2affa8ae
[Kernel] Flashinfer correctness fix for v0.1.3 ( #7319 )
2024-08-12 07:59:17 +00:00
Rui Qiao
05308891e2
[Core] Pipeline parallel with Ray ADAG ( #6837 )
...
Support pipeline-parallelism with Ray accelerated DAG.
Signed-off-by: Rui Qiao <ruisearch42@gmail.com>
2024-08-02 13:55:40 -07:00
Sage Moore
7e0861bd0b
[CI/Build] Update PyTorch to 2.4.0 ( #6951 )
...
Co-authored-by: Michael Goin <michael@neuralmagic.com>
2024-08-01 11:11:24 -07:00
Jee Jee Li
7ecee34321
[Kernel][RFC] Refactor the punica kernel based on Triton ( #5036 )
2024-07-31 17:12:24 -07:00
youkaichao
5a96ee52a3
[ci][build] add back vim in docker ( #6661 )
2024-07-22 16:26:29 -07:00
Kevin H. Luu
69d5ae38dc
[ci] Use different sccache bucket for CUDA 11.8 wheel build ( #6656 )
...
Signed-off-by: kevin <kevin@anyscale.com>
2024-07-22 14:20:41 -07:00
youkaichao
e81522e879
[build] add ib in image for out-of-the-box infiniband support ( #6599 )
...
[build] add ib so that multi-node support with infiniband can be supported out-of-the-box ( #6599 )
2024-07-19 17:16:57 -07:00
Tyler Michael Smith
1689219ebf
[CI/Build] Build on Ubuntu 20.04 instead of 22.04 ( #6517 )
2024-07-18 17:29:25 -07:00
Pernekhan Utemuratov
a63a4c6341
[Misc] Use 0.0.9 version for flashinfer ( #6447 )
...
Co-authored-by: Pernekhan Utemuratov <pernekhan@deepinfra.com>
2024-07-15 10:10:26 -07:00
Robert Shaw
a754dc2cb9
[CI/Build] Cross python wheel ( #6394 )
2024-07-14 18:54:46 -07:00
youkaichao
ccd3c04571
[ci][build] fix commit id ( #6420 )
...
Co-authored-by: Cyrus Leung <tlleungac@connect.ust.hk>
2024-07-14 22:16:21 +08:00
Simon Mo
4f0e0ea131
Add FlashInfer to default Dockerfile ( #6172 )
2024-07-08 13:38:03 -07:00
Simon Mo
bc96d5c330
Move release wheel env var to Dockerfile instead ( #6163 )
2024-07-05 17:19:53 -07:00
Mor Zusman
9d6a8daa87
[Model] Jamba support ( #4115 )
...
Signed-off-by: Muralidhar Andoorveedu <muralidhar.andoorveedu@centml.ai>
Co-authored-by: Erez Schwartz <erezs@ai21.com>
Co-authored-by: Mor Zusman <morz@ai21.com>
Co-authored-by: tomeras91 <57313761+tomeras91@users.noreply.github.com>
Co-authored-by: Tomer Asida <tomera@ai21.com>
Co-authored-by: Zhuohan Li <zhuohan123@gmail.com>
Co-authored-by: Muralidhar Andoorveedu <muralidhar.andoorveedu@centml.ai>
2024-07-02 23:11:29 +00:00
zhyncs
f1e72cc19a
[BugFix] exclude version 1.15.0 for modelscope ( #5668 )
2024-06-21 13:15:48 -06:00
Kevin H. Luu
19091efc44
[ci] Setup Release pipeline and build release wheels with cache ( #5610 )
...
Signed-off-by: kevin <kevin@anyscale.com>
2024-06-18 11:00:36 -07:00
Antoni Baum
a8fda4f661
Separate dev requirements into lint and test ( #5474 )
2024-06-13 11:22:41 -07:00
Kevin H. Luu
916d219d62
[ci] Use sccache to build images ( #5419 )
...
Signed-off-by: kevin <kevin@anyscale.com>
2024-06-12 17:58:12 -07:00
youkaichao
4fbcb0f27e
[Doc][Build] update after removing vllm-nccl ( #5103 )
...
Co-authored-by: Roger Wang <136131678+ywang96@users.noreply.github.com>
2024-05-29 23:51:18 +00:00
Woosuk Kwon
89579a201f
[Misc] Use vllm-flash-attn instead of flash-attn ( #4686 )
2024-05-08 13:15:34 -07:00
Simon Mo
021b1a2ab7
[CI] check size of the wheels ( #4319 )
2024-05-04 20:44:36 +00:00
Prashant Gupta
b31a1fb63c
[Doc] add visualization for multi-stage dockerfile ( #4456 )
...
Signed-off-by: Prashant Gupta <prashantgupta@us.ibm.com>
Co-authored-by: Roger Wang <ywang@roblox.com>
2024-04-30 17:41:59 +00:00
Michael Goin
d627a3d837
[Misc] Upgrade to torch==2.3.0 ( #4454 )
2024-04-29 20:05:47 -04:00
Woosuk Kwon
cfaf49a167
[Misc] Define common requirements ( #3841 )
2024-04-05 00:39:17 -07:00
youkaichao
d03d64fd2e
[CI/Build] refactor dockerfile & fix pip cache
...
[CI/Build] fix pip cache with vllm_nccl & refactor dockerfile to build wheels ( #3859 )
2024-04-04 21:53:16 -07:00
youkaichao
ca81ff5196
[Core] manage nccl via a pypi package & upgrade to pt 2.2.1 ( #3805 )
2024-04-04 10:26:19 -07:00
yhu422
d8658c8cc1
Usage Stats Collection ( #2852 )
2024-03-28 22:16:12 -07:00
Simon Mo
7bc94a0fdd
add ccache to docker build image ( #3704 )
2024-03-28 22:14:24 -07:00
youkaichao
8f44facddd
[Core] remove cupy dependency ( #3625 )
2024-03-27 00:33:26 -07:00
ifsheldon
c614cfee58
Update dockerfile with ModelScope support ( #3429 )
2024-03-19 10:54:59 -07:00
bnellnm
9fdf3de346
Cmake based build system ( #2830 )
2024-03-18 15:38:33 -07:00