Commit Graph

34 Commits

Author SHA1 Message Date
whyiug
c9b45adeeb
Require triton >= 2.1.0 (#2746)
Co-authored-by: yangrui1 <yangrui@lanjingren.com>
2024-02-04 23:07:36 -08:00
Simon Mo
7d648418b8
Update Ray version requirements (#2636) 2024-01-28 14:27:22 -08:00
Hanzhi Zhou
380170038e
Implement custom all reduce kernels (#2192) 2024-01-27 12:46:35 -08:00
Junyang Lin
94b5edeb53
Add qwen2 (#2495) 2024-01-22 14:34:21 -08:00
Jannis Schönleber
71d63ed72e
migrate pydantic from v1 to v2 (#2531) 2024-01-21 16:05:56 -08:00
Zhuohan Li
fd4ea8ef5c
Use NCCL instead of ray for control-plane communication to remove serialization overhead (#2221) 2024-01-03 11:30:22 -08:00
Woosuk Kwon
c9fadda543
[Minor] Fix xformers version (#2158) 2023-12-17 02:28:02 -08:00
Woosuk Kwon
c3372e87be
Remove dependency on CuPy (#2152) 2023-12-17 01:49:07 -08:00
Woosuk Kwon
b0a1d667b0
Pin PyTorch & xformers versions (#2155) 2023-12-17 01:46:54 -08:00
Woosuk Kwon
37ca558103
Optimize model execution with CUDA graph (#1926)
Co-authored-by: Chen Shen <scv119@gmail.com>
Co-authored-by: Antoni Baum <antoni.baum@protonmail.com>
2023-12-16 21:12:08 -08:00
Woosuk Kwon
7e1b21daac
Remove einops from requirements (#2049) 2023-12-12 09:34:09 -08:00
Woosuk Kwon
cb3f30c600
Upgrade transformers version to 4.36.0 (#2046) 2023-12-11 18:39:14 -08:00
Woosuk Kwon
f3e024bece
[CI/CD] Upgrade PyTorch version to v2.1.1 (#2045) 2023-12-11 17:48:11 -08:00
Woosuk Kwon
beeee69bc9
Revert adding Megablocks (#2030) 2023-12-11 10:49:00 -08:00
Ram
9bf28d0b69
Update requirements.txt for mixtral (#2029) 2023-12-11 10:39:29 -08:00
Simon Mo
5313c2cb8b
Add Production Metrics in Prometheus format (#1890) 2023-12-02 16:37:44 -08:00
maximzubkov
521b35f799
Support Microsoft Phi 1.5 (#1664) 2023-11-16 14:28:39 -08:00
Zhuohan Li
06458a0b42
Upgrade to CUDA 12 (#1527)
Co-authored-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>
2023-11-08 14:17:49 -08:00
Thiago Salvatore
bf31d3606a
Pin pydantic dependency versions (#1429) 2023-10-21 11:18:58 -07:00
Woosuk Kwon
e7c8555d06
Bump up transformers version & Remove MistralConfig (#1254) 2023-10-13 10:05:26 -07:00
yanxiyue
6a6119554c
lock torch version to 2.0.1 (#1290) 2023-10-10 09:21:57 -07:00
Chris Bamford
bb1ba58f06
[Mistral] Mistral-7B-v0.1 support (#1196)
Co-authored-by: timlacroix <t@mistral.ai>
2023-09-28 10:41:03 -07:00
Danilo Peixoto
649aa730c5
Use standard extras for uvicorn (#1166) 2023-09-27 17:41:36 -07:00
Woosuk Kwon
400b8289f7
Add pyarrow to dependencies & Print warning on Ray import error (#1094) 2023-09-18 22:36:17 -07:00
Woosuk Kwon
a58936966f
Add pandas to requirements.txt (#1047)
* Add pandas to requirements.txt
* Minor
2023-09-14 17:31:38 -07:00
Woosuk Kwon
7a9c20c715
Bump up transformers version (#976) 2023-09-07 13:15:53 -07:00
Woosuk Kwon
2a4ec90854
Fix for breaking changes in xformers 0.0.21 (#834) 2023-08-23 17:44:21 +09:00
Zhuohan Li
82ad323dee
[Fix] Add chat completion Example and simplify dependencies (#576) 2023-07-25 23:45:48 -07:00
Zhuohan Li
6fc2a38b11
Add support for LLaMA-2 (#505) 2023-07-20 11:38:27 -07:00
Antoni Baum
9925c17940
Ray placement group support (#397) 2023-07-19 22:49:31 -07:00
Keming
51be365143
fix: freeze pydantic to v1 (#429) 2023-07-12 11:10:55 -04:00
Zhuohan Li
98fe8cb542
[Server] Add option to specify chat template for chat endpoint (#345) 2023-07-03 23:01:56 -07:00
Zhuohan Li
057daef778
OpenAI Compatible Frontend (#116) 2023-05-23 21:39:50 -07:00
Woosuk Kwon
7addca5935
Specify python package dependencies in requirements.txt (#78) 2023-05-07 16:30:43 -07:00