Zhuohan Li
|
f04908cae7
|
[FIX] Minor bug fixes (#1035)
* [FIX] Minor bug fixes
* Address review comments
|
2023-09-13 16:38:12 -07:00 |
|
Jasmond L
|
ab019eea75
|
Add Model Revision Support (#1014)
Co-authored-by: Jasmond Loh <Jasmond.Loh@hotmail.com>
Co-authored-by: Zhuohan Li <zhuohan123@gmail.com>
|
2023-09-13 15:20:02 -07:00 |
|
Woosuk Kwon
|
e67b4f2c2a
|
Use FP32 in RoPE initialization (#1004)
Co-authored-by: One <imone@tuta.io>
|
2023-09-11 00:26:35 -07:00 |
|
Antoni Baum
|
a62de9ecfd
|
Fix wrong dtype in PagedAttentionWithALiBi bias (#996)
---------
Signed-off-by: Antoni Baum <antoni.baum@protonmail.com>
|
2023-09-09 14:58:35 -07:00 |
|
Robert Irvine
|
4b5bcf8906
|
faster startup of vLLM (#982)
* update
---------
Co-authored-by: Robert Irvine <robert@seamlessml.com>
|
2023-09-08 14:48:54 +09:00 |
|
Zhuohan Li
|
c957c741d9
|
Enable safetensors loading for all models (#974)
|
2023-09-07 15:49:52 -07:00 |
|
Antoni Baum
|
005ba458b5
|
Set torch default dtype in a context manager (#971)
Signed-off-by: Antoni Baum <antoni.baum@protonmail.com>
|
2023-09-07 15:39:37 +09:00 |
|
Woosuk Kwon
|
320a622ec4
|
[BugFix] Implement RoPE for GPT-J (#941)
|
2023-09-06 11:54:33 +09:00 |
|
Zhuohan Li
|
002800f081
|
Align vLLM's beam search implementation with HF generate (#857)
|
2023-09-04 17:29:42 -07:00 |
|
Dong-Yong Lee
|
e11222333f
|
fix: bug fix when penalties are negative (#913)
Co-authored-by: dongyong-lee <dongyong.lee@navercorp.com>
|
2023-09-01 00:37:17 +09:00 |
|
Aman Gupta Karmani
|
28873a2799
|
Improve _prune_hidden_states micro-benchmark (#707)
|
2023-08-31 13:28:43 +09:00 |
|
JFDuan
|
0d93f15694
|
Accelerate LLaMA model loading (#234)
|
2023-08-30 01:00:13 -07:00 |
|
Aman Gupta Karmani
|
75471386de
|
use flash-attn via xformers (#877)
|
2023-08-29 21:52:13 -07:00 |
|
Antoni Baum
|
4b6f069b6f
|
Add support for CodeLlama (#854)
|
2023-08-25 12:44:07 -07:00 |
|
Woosuk Kwon
|
94d2f59895
|
Set replacement=True in torch.multinomial (#858)
|
2023-08-25 12:22:01 +09:00 |
|
Woosuk Kwon
|
2a4ec90854
|
Fix for breaking changes in xformers 0.0.21 (#834)
|
2023-08-23 17:44:21 +09:00 |
|
Woosuk Kwon
|
d64bf1646c
|
Implement approximate GELU kernels (#828)
|
2023-08-23 07:43:21 +09:00 |
|
Wen Sun
|
eedac9dba0
|
fix: revert code to avoid no attribute problem (#827)
|
2023-08-22 11:55:16 -07:00 |
|
shunxing1234
|
ad5f2fe34c
|
Add support for aquila (#663)
* add aquila
Signed-off-by: ftgreat <ftgreat@163.com>
* fix some bug
Signed-off-by: shunxing1234 <xw747777271@gmail.com>
* delete pdb
Signed-off-by: shunxing1234 <xw747777271@gmail.com>
* fix bugs
Signed-off-by: shunxing1234 <xw747777271@gmail.com>
* fix bugs
Signed-off-by: shunxing1234 <xw747777271@gmail.com>
* delete whitespace
Signed-off-by: shunxing1234 <xw747777271@gmail.com>
* format
* fix order
---------
Signed-off-by: ftgreat <ftgreat@163.com>
Signed-off-by: shunxing1234 <xw747777271@gmail.com>
Co-authored-by: ftgreat <ftgreat@163.com>
|
2023-08-22 00:13:36 -07:00 |
|
zhaoyang-star
|
4f8584756d
|
Fix mqa is false case in gpt_bigcode (#806)
|
2023-08-21 22:22:06 -07:00 |
|
Xinyu Yang
|
73b3de79ea
|
explicitly del state (#784)
|
2023-08-17 12:56:04 -07:00 |
|
Abraham-Xu
|
d1744376ae
|
Align with huggingface Top K sampling (#753)
|
2023-08-15 16:44:33 -07:00 |
|
WRH
|
462ae5220a
|
[Fix] unwanted bias in InternLM Model (#740)
|
2023-08-11 11:40:37 -07:00 |
|
Jia Guoqing
|
735ecfff61
|
add internlm model (#528)
|
2023-08-08 16:35:06 -07:00 |
|
Qing
|
a57d13cc96
|
add QWen-7b (#685)
Co-authored-by: wq.chu <wq.chu@tianrang-inc.com>
|
2023-08-08 13:50:38 -07:00 |
|
Zhuohan Li
|
f7389f4763
|
[Doc] Add Baichuan 13B to supported models (#656)
|
2023-08-02 16:45:12 -07:00 |
|
Woosuk Kwon
|
55fe8a81ec
|
Refactor scheduler (#658)
|
2023-08-02 16:42:01 -07:00 |
|
Zhuohan Li
|
1b0bd0fe8a
|
Add Falcon support (new) (#592)
|
2023-08-02 14:04:39 -07:00 |
|
Song
|
64f23c2900
|
fix baichuan for different position embedding for 7b and 13b models (#643)
|
2023-08-01 22:22:51 -07:00 |
|
Qing
|
d4c7755ca8
|
fix baichuan-7b tp (#598)
Co-authored-by: wq.chu <wq.chu@tianrang-inc.com>
|
2023-08-01 15:41:36 -07:00 |
|
MoeedDar
|
2d867b55fa
|
fixed tensor parallel is not defined (#564)
|
2023-07-25 14:16:51 -07:00 |
|
Zhuohan Li
|
7d5a155e4a
|
[Fix] Fix GPTBigcoder for distributed execution (#503)
|
2023-07-24 18:36:33 -07:00 |
|
leegohi04517
|
1dde34e0f8
|
GPTJConfig has no attribute rotary. (#532)
|
2023-07-24 11:29:30 -07:00 |
|
Zhuohan Li
|
6fc2a38b11
|
Add support for LLaMA-2 (#505)
|
2023-07-20 11:38:27 -07:00 |
|
Song
|
bda41c70dd
|
hotfix attn alibi wo head mapping (#496)
Co-authored-by: oliveryuan <oliveryuan@basemind.com>
|
2023-07-18 11:31:48 -07:00 |
|
codethazine
|
20b0d88d16
|
Add support for baichuan (#365)
|
2023-07-17 13:50:55 -07:00 |
|
Zhuohan Li
|
96853af5a8
|
Optimize MQA Kernel (#452)
|
2023-07-14 20:06:40 -04:00 |
|
Wen Sun
|
dbed69058c
|
Fix the KeyError when loading bloom-based models (#441)
|
2023-07-13 21:58:09 -07:00 |
|
panda
|
7b6ae94059
|
add vocab padding for LLaMA (Support WizardLM) (#411)
|
2023-07-13 23:56:22 -04:00 |
|
Andre Slavescu
|
c894836108
|
[Model] Add support for GPT-J (#226)
Co-authored-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>
|
2023-07-08 17:55:16 -07:00 |
|
Fazlul Shahriar
|
75beba29b5
|
Don't try to load training_args.bin (#373)
|
2023-07-08 15:26:28 -07:00 |
|
Woosuk Kwon
|
404422f42e
|
[Model] Add support for MPT (#334)
|
2023-07-03 16:47:53 -07:00 |
|
Zhuohan Li
|
42e0c1df78
|
[Quality] Add CI for formatting (#343)
|
2023-07-03 14:50:56 -07:00 |
|
Woosuk Kwon
|
e41f06702c
|
Add support for BLOOM (#331)
|
2023-07-03 13:12:35 -07:00 |
|
Zhuohan Li
|
d6fa1be3a8
|
[Quality] Add code formatter and linter (#326)
|
2023-07-03 11:31:55 -07:00 |
|
Zhuohan Li
|
598dc4b79a
|
[Fix] Weight loading for GPTBigCode (#313)
|
2023-06-29 22:14:17 -07:00 |
|
Lily Liu
|
425040d4c1
|
remove floats == 0 comparison (#285)
|
2023-06-28 14:11:51 -07:00 |
|
twaka
|
4026a049d3
|
expand coverage of gpt2 model loading (#271)
|
2023-06-27 06:27:41 -07:00 |
|
BasicCoder
|
471a7a4566
|
Compatible with Decapoda Research llama hf version (#251)
|
2023-06-26 09:23:57 -07:00 |
|
Michael Feil
|
298695b766
|
GPTBigCode (StarCoder, SantaCoder Support) (#209)
|
2023-06-23 01:49:27 +08:00 |
|