Commit Graph

32 Commits

Author SHA1 Message Date
SangBin Cho
2e9a2227ec
[Lora] Support long context lora (#4787)
Currently we need to call rotary embedding kernel for each LoRA, which makes it hard to serve multiple long context length LoRA. Add batched rotary embedding kernel and pipe it through.

It replaces the rotary embedding layer to the one that is aware of multiple cos-sin-cache per scaling factors.

Follow up of https://github.com/vllm-project/vllm/pull/3095/files
2024-05-18 16:05:23 +09:00
Hao Zhang
ebce310b74
[Model] Snowflake arctic model implementation (#4652)
Co-authored-by: Dash Desai <1723932+iamontheinet@users.noreply.github.com>
Co-authored-by: Aurick Qiao <qiao@aurick.net>
Co-authored-by: Aurick Qiao <aurick.qiao@snowflake.com>
Co-authored-by: Aurick Qiao <aurickq@users.noreply.github.com>
Co-authored-by: Cody Yu <hao.yu.cody@gmail.com>
2024-05-09 22:37:14 +00:00
SangBin Cho
a88081bf76
[CI] Disable non-lazy string operation on logging (#4326)
Co-authored-by: Danny Guinther <dguinther@neuralmagic.com>
2024-04-26 00:16:58 -07:00
SangBin Cho
0ae11f78ab
[Mypy] Part 3 fix typing for nested directories for most of directory (#4161) 2024-04-22 21:32:44 -07:00
Michael Feil
c2b4a1bce9
[Doc] Add typing hints / mypy types cleanup (#3816)
Co-authored-by: Roger Wang <136131678+ywang96@users.noreply.github.com>
2024-04-11 17:17:21 -07:00
Megha Agarwal
e24336b5a7
[Model] Add support for DBRX (#3660) 2024-03-27 13:01:46 -07:00
SangBin Cho
01bfb22b41
[CI] Try introducing isort. (#3495) 2024-03-25 07:59:47 -07:00
Woosuk Kwon
c188ecb080
[Misc] Bump up transformers to v4.39.0 & Remove StarCoder2Config (#3551)
Co-authored-by: Roy <jasonailu87@gmail.com>
Co-authored-by: Roger Meier <r.meier@siemens.com>
2024-03-21 07:58:12 -07:00
Lalit Pradhan
4c07dd28c0
[🚀 Ready to be merged] Added support for Jais models (#3183) 2024-03-21 09:45:24 +00:00
Zhuohan Li
2f8844ba08
Re-enable the 80 char line width limit (#3305) 2024-03-10 19:49:14 -07:00
Seonghyeon
bfdcfa6a05
Support starcoder2 architecture (#3089) 2024-02-29 00:51:48 -08:00
Roy
d9f726c4d0
[Minor] Remove unused config files (#3039) 2024-02-26 17:25:22 -08:00
Isotr0py
ab3a5a8259
Support OLMo models. (#2832) 2024-02-18 21:05:15 -08:00
Roy
4efbac6d35
Migrate AquilaForCausalLM to LlamaForCausalLM (#2867) 2024-02-14 12:30:24 -08:00
Philipp Moritz
317b29de0f
Remove Yi model definition, please use LlamaForCausalLM instead (#2854)
Co-authored-by: Roy <jasonailu87@gmail.com>
2024-02-13 14:22:22 -08:00
Philipp Moritz
ea356004d4
Revert "Refactor llama family models (#2637)" (#2851)
This reverts commit 5c976a7e1a.
2024-02-13 09:24:59 -08:00
Roy
5c976a7e1a
Refactor llama family models (#2637) 2024-02-13 00:09:23 -08:00
Simon Mo
5ffc0d13a2
Migrate linter from pylint to ruff (#1665) 2023-11-20 11:58:01 -08:00
Megha Agarwal
b514d3c496
Revert MptConfig to MPTConfig (#1668) 2023-11-16 01:19:39 -08:00
GoHomeToMacDonal
1a2bbc9301
ChatGLM Support (#1261) 2023-11-06 16:09:33 -08:00
Roy
e7f579eb97
Support Yi model (#1567) 2023-11-06 15:26:03 -08:00
Woosuk Kwon
1fe0990023
Remove MPTConfig (#1529) 2023-11-01 15:29:05 -07:00
Lu Wang
de89472897
Fix the issue for AquilaChat2-* models (#1339) 2023-10-13 11:51:29 -07:00
Woosuk Kwon
e7c8555d06
Bump up transformers version & Remove MistralConfig (#1254) 2023-10-13 10:05:26 -07:00
Woosuk Kwon
a8e98aee0c
Fix Mistral model (#1220) 2023-09-28 10:44:05 -07:00
Chris Bamford
bb1ba58f06
[Mistral] Mistral-7B-v0.1 support (#1196)
Co-authored-by: timlacroix <t@mistral.ai>
2023-09-28 10:41:03 -07:00
Qing
28e616c4e3
fix qwen-14b model (#1173) 2023-09-27 16:33:16 -07:00
shunxing1234
ad5f2fe34c
Add support for aquila (#663)
* add aquila

Signed-off-by: ftgreat <ftgreat@163.com>

* fix some bug

Signed-off-by: shunxing1234 <xw747777271@gmail.com>

* delete pdb

Signed-off-by: shunxing1234 <xw747777271@gmail.com>

* fix bugs

Signed-off-by: shunxing1234 <xw747777271@gmail.com>

* fix bugs

Signed-off-by: shunxing1234 <xw747777271@gmail.com>

* delete whitespace

Signed-off-by: shunxing1234 <xw747777271@gmail.com>

* format

* fix order

---------

Signed-off-by: ftgreat <ftgreat@163.com>
Signed-off-by: shunxing1234 <xw747777271@gmail.com>
Co-authored-by: ftgreat <ftgreat@163.com>
2023-08-22 00:13:36 -07:00
Qing
a57d13cc96
add QWen-7b (#685)
Co-authored-by: wq.chu <wq.chu@tianrang-inc.com>
2023-08-08 13:50:38 -07:00
Zhuohan Li
1b0bd0fe8a
Add Falcon support (new) (#592) 2023-08-02 14:04:39 -07:00
codethazine
20b0d88d16
Add support for baichuan (#365) 2023-07-17 13:50:55 -07:00
Woosuk Kwon
404422f42e
[Model] Add support for MPT (#334) 2023-07-03 16:47:53 -07:00