Commit Graph

31 Commits

Author SHA1 Message Date
Zhuohan Li
c957c741d9
Enable safetensors loading for all models (#974) 2023-09-07 15:49:52 -07:00
Woosuk Kwon
320a622ec4
[BugFix] Implement RoPE for GPT-J (#941) 2023-09-06 11:54:33 +09:00
Zhuohan Li
002800f081
Align vLLM's beam search implementation with HF generate (#857) 2023-09-04 17:29:42 -07:00
JFDuan
0d93f15694
Accelerate LLaMA model loading (#234) 2023-08-30 01:00:13 -07:00
Antoni Baum
4b6f069b6f
Add support for CodeLlama (#854) 2023-08-25 12:44:07 -07:00
Wen Sun
eedac9dba0
fix: revert code to avoid no attribute problem (#827) 2023-08-22 11:55:16 -07:00
shunxing1234
ad5f2fe34c
Add support for aquila (#663)
* add aquila

Signed-off-by: ftgreat <ftgreat@163.com>

* fix some bug

Signed-off-by: shunxing1234 <xw747777271@gmail.com>

* delete pdb

Signed-off-by: shunxing1234 <xw747777271@gmail.com>

* fix bugs

Signed-off-by: shunxing1234 <xw747777271@gmail.com>

* fix bugs

Signed-off-by: shunxing1234 <xw747777271@gmail.com>

* delete whitespace

Signed-off-by: shunxing1234 <xw747777271@gmail.com>

* format

* fix order

---------

Signed-off-by: ftgreat <ftgreat@163.com>
Signed-off-by: shunxing1234 <xw747777271@gmail.com>
Co-authored-by: ftgreat <ftgreat@163.com>
2023-08-22 00:13:36 -07:00
zhaoyang-star
4f8584756d
Fix mqa is false case in gpt_bigcode (#806) 2023-08-21 22:22:06 -07:00
WRH
462ae5220a
[Fix] unwantted bias in InternLM Model (#740) 2023-08-11 11:40:37 -07:00
Jia Guoqing
735ecfff61
add internlm model (#528) 2023-08-08 16:35:06 -07:00
Qing
a57d13cc96
add QWen-7b (#685)
Co-authored-by: wq.chu <wq.chu@tianrang-inc.com>
2023-08-08 13:50:38 -07:00
Zhuohan Li
f7389f4763
[Doc] Add Baichuan 13B to supported models (#656) 2023-08-02 16:45:12 -07:00
Zhuohan Li
1b0bd0fe8a
Add Falcon support (new) (#592) 2023-08-02 14:04:39 -07:00
Song
64f23c2900
fix baichuan for different position embedding for 7b and 13b models (#643) 2023-08-01 22:22:51 -07:00
Qing
d4c7755ca8
fix biachuan-7b tp (#598)
Co-authored-by: wq.chu <wq.chu@tianrang-inc.com>
2023-08-01 15:41:36 -07:00
Zhuohan Li
7d5a155e4a
[Fix] Fix GPTBigcoder for distributed execution (#503) 2023-07-24 18:36:33 -07:00
leegohi04517
1dde34e0f8
GPTJConfig has no attribute rotary. (#532) 2023-07-24 11:29:30 -07:00
Zhuohan Li
6fc2a38b11
Add support for LLaMA-2 (#505) 2023-07-20 11:38:27 -07:00
codethazine
20b0d88d16
Add support for baichuan (#365) 2023-07-17 13:50:55 -07:00
Zhuohan Li
96853af5a8
Optimize MQA Kernel (#452) 2023-07-14 20:06:40 -04:00
Wen Sun
dbed69058c
Fix the KeyError when loading bloom-based models (#441) 2023-07-13 21:58:09 -07:00
panda
7b6ae94059
add vocab padding for LLama(Support WizardLM) (#411) 2023-07-13 23:56:22 -04:00
Andre Slavescu
c894836108
[Model] Add support for GPT-J (#226)
Co-authored-by: woWoosuk Kwon <woosuk.kwon@berkeley.edu>
2023-07-08 17:55:16 -07:00
Woosuk Kwon
404422f42e
[Model] Add support for MPT (#334) 2023-07-03 16:47:53 -07:00
Zhuohan Li
42e0c1df78
[Quality] Add CI for formatting (#343) 2023-07-03 14:50:56 -07:00
Woosuk Kwon
e41f06702c
Add support for BLOOM (#331) 2023-07-03 13:12:35 -07:00
Zhuohan Li
d6fa1be3a8
[Quality] Add code formatter and linter (#326) 2023-07-03 11:31:55 -07:00
Zhuohan Li
598dc4b79a
[Fix] Weight loading for GPTBigCode (#313) 2023-06-29 22:14:17 -07:00
twaka
4026a049d3
expand coverage of gpt2 model loading (#271) 2023-06-27 06:27:41 -07:00
Michael Feil
298695b766
GPTBigCode (StarCoder, SantaCoder Support) (#209) 2023-06-23 01:49:27 +08:00
Woosuk Kwon
0b98ba15c7
Change the name to vLLM (#150) 2023-06-17 03:07:40 -07:00