Roger Wang
|
f8a12ecc7f
|
[Misc] Bump transformers version (#3592)
|
2024-03-24 06:32:45 -07:00 |
|
Woosuk Kwon
|
c188ecb080
|
[Misc] Bump up transformers to v4.39.0 & Remove StarCoder2Config (#3551)
Co-authored-by: Roy <jasonailu87@gmail.com>
Co-authored-by: Roger Meier <r.meier@siemens.com>
|
2024-03-21 07:58:12 -07:00 |
|
bnellnm
|
9fdf3de346
|
Cmake based build system (#2830)
|
2024-03-18 15:38:33 -07:00 |
|
Simon Mo
|
81653d9688
|
[Hotfix] [Debug] test_openai_server.py::test_guided_regex_completion (#3383)
|
2024-03-13 17:02:21 -07:00 |
|
felixzhu555
|
703e42ee4b
|
Add guided decoding for OpenAI API server (#2819)
Co-authored-by: br3no <breno@veltefaria.de>
Co-authored-by: simon-mo <simon.mo@hey.com>
|
2024-02-29 22:13:08 +00:00 |
|
Allen.Dou
|
e46fa5d52e
|
Restrict prometheus_client >= 0.18.0 to prevent errors when importing pkgs (#3070)
|
2024-02-28 05:38:26 +00:00 |
|
Harry Mellor
|
ef978fe411
|
Port metrics from aioprometheus to prometheus_client (#2730)
|
2024-02-25 11:54:00 -08:00 |
|
Woosuk Kwon
|
c20ecb6a51
|
Upgrade transformers to v4.38.0 (#2965)
|
2024-02-21 09:38:03 -08:00 |
|
Nikola Borisov
|
87069ccf68
|
Fix docker python version (#2845)
|
2024-02-14 10:17:57 -08:00 |
|
Woosuk Kwon
|
a463c333dd
|
Use CuPy for CUDA graphs (#2811)
|
2024-02-13 11:32:06 -08:00 |
|
whyiug
|
c9b45adeeb
|
Require triton >= 2.1.0 (#2746)
Co-authored-by: yangrui1 <yangrui@lanjingren.com>
|
2024-02-04 23:07:36 -08:00 |
|
Simon Mo
|
7d648418b8
|
Update Ray version requirements (#2636)
|
2024-01-28 14:27:22 -08:00 |
|
Hanzhi Zhou
|
380170038e
|
Implement custom all reduce kernels (#2192)
|
2024-01-27 12:46:35 -08:00 |
|
Junyang Lin
|
94b5edeb53
|
Add qwen2 (#2495)
|
2024-01-22 14:34:21 -08:00 |
|
Jannis Schönleber
|
71d63ed72e
|
migrate pydantic from v1 to v2 (#2531)
|
2024-01-21 16:05:56 -08:00 |
|
Zhuohan Li
|
fd4ea8ef5c
|
Use NCCL instead of ray for control-plane communication to remove serialization overhead (#2221)
|
2024-01-03 11:30:22 -08:00 |
|
Woosuk Kwon
|
c9fadda543
|
[Minor] Fix xformers version (#2158)
|
2023-12-17 02:28:02 -08:00 |
|
Woosuk Kwon
|
c3372e87be
|
Remove dependency on CuPy (#2152)
|
2023-12-17 01:49:07 -08:00 |
|
Woosuk Kwon
|
b0a1d667b0
|
Pin PyTorch & xformers versions (#2155)
|
2023-12-17 01:46:54 -08:00 |
|
Woosuk Kwon
|
37ca558103
|
Optimize model execution with CUDA graph (#1926)
Co-authored-by: Chen Shen <scv119@gmail.com>
Co-authored-by: Antoni Baum <antoni.baum@protonmail.com>
|
2023-12-16 21:12:08 -08:00 |
|
Woosuk Kwon
|
7e1b21daac
|
Remove einops from requirements (#2049)
|
2023-12-12 09:34:09 -08:00 |
|
Woosuk Kwon
|
cb3f30c600
|
Upgrade transformers version to 4.36.0 (#2046)
|
2023-12-11 18:39:14 -08:00 |
|
Woosuk Kwon
|
f3e024bece
|
[CI/CD] Upgrade PyTorch version to v2.1.1 (#2045)
|
2023-12-11 17:48:11 -08:00 |
|
Woosuk Kwon
|
beeee69bc9
|
Revert adding Megablocks (#2030)
|
2023-12-11 10:49:00 -08:00 |
|
Ram
|
9bf28d0b69
|
Update requirements.txt for mixtral (#2029)
|
2023-12-11 10:39:29 -08:00 |
|
Simon Mo
|
5313c2cb8b
|
Add Production Metrics in Prometheus format (#1890)
|
2023-12-02 16:37:44 -08:00 |
|
maximzubkov
|
521b35f799
|
Support Microsoft Phi 1.5 (#1664)
|
2023-11-16 14:28:39 -08:00 |
|
Zhuohan Li
|
06458a0b42
|
Upgrade to CUDA 12 (#1527)
Co-authored-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>
|
2023-11-08 14:17:49 -08:00 |
|
Thiago Salvatore
|
bf31d3606a
|
Pin pydantic dependency versions (#1429)
|
2023-10-21 11:18:58 -07:00 |
|
Woosuk Kwon
|
e7c8555d06
|
Bump up transformers version & Remove MistralConfig (#1254)
|
2023-10-13 10:05:26 -07:00 |
|
yanxiyue
|
6a6119554c
|
lock torch version to 2.0.1 (#1290)
|
2023-10-10 09:21:57 -07:00 |
|
Chris Bamford
|
bb1ba58f06
|
[Mistral] Mistral-7B-v0.1 support (#1196)
Co-authored-by: timlacroix <t@mistral.ai>
|
2023-09-28 10:41:03 -07:00 |
|
Danilo Peixoto
|
649aa730c5
|
Use standard extras for uvicorn (#1166)
|
2023-09-27 17:41:36 -07:00 |
|
Woosuk Kwon
|
400b8289f7
|
Add pyarrow to dependencies & Print warning on Ray import error (#1094)
|
2023-09-18 22:36:17 -07:00 |
|
Woosuk Kwon
|
a58936966f
|
Add pandas to requirements.txt (#1047)
* Add pandas to requirements.txt
* Minor
|
2023-09-14 17:31:38 -07:00 |
|
Woosuk Kwon
|
7a9c20c715
|
Bump up transformers version (#976)
|
2023-09-07 13:15:53 -07:00 |
|
Woosuk Kwon
|
2a4ec90854
|
Fix for breaking changes in xformers 0.0.21 (#834)
|
2023-08-23 17:44:21 +09:00 |
|
Zhuohan Li
|
82ad323dee
|
[Fix] Add chat completion Example and simplify dependencies (#576)
|
2023-07-25 23:45:48 -07:00 |
|
Zhuohan Li
|
6fc2a38b11
|
Add support for LLaMA-2 (#505)
|
2023-07-20 11:38:27 -07:00 |
|
Antoni Baum
|
9925c17940
|
Ray placement group support (#397)
|
2023-07-19 22:49:31 -07:00 |
|
Keming
|
51be365143
|
fix: freeze pydantic to v1 (#429)
|
2023-07-12 11:10:55 -04:00 |
|
Zhuohan Li
|
98fe8cb542
|
[Server] Add option to specify chat template for chat endpoint (#345)
|
2023-07-03 23:01:56 -07:00 |
|
Zhuohan Li
|
057daef778
|
OpenAI Compatible Frontend (#116)
|
2023-05-23 21:39:50 -07:00 |
|
Woosuk Kwon
|
7addca5935
|
Specify python package dependencies in requirements.txt (#78)
|
2023-05-07 16:30:43 -07:00 |
|