| Author | Commit | Message | Date |
|---|---|---|---|
| Murali Andoorveedu | c5832d2ae9 | [Core] Pipeline Parallel Support (#4412); Signed-off-by: Muralidhar Andoorveedu <muralidhar.andoorveedu@centml.ai> | 2024-07-02 10:58:08 -07:00 |
| Stephanie Wang | dda4811591 | [Core] Refactor Worker and ModelRunner to consolidate control plane communication (#5408); Signed-off-by: Stephanie Wang <swang@cs.berkeley.edu>; Signed-off-by: Stephanie <swang@anyscale.com>; Co-authored-by: Stephanie <swang@anyscale.com> | 2024-06-25 20:30:03 -07:00 |
| rohithkrn | f5dda63eb5 | [LoRA] Add support for pinning lora adapters in the LRU cache (#5603) | 2024-06-21 15:42:46 -07:00 |
| Nick Hill | eb6d3c264d | [Core] Eliminate parallel worker per-step task scheduling overhead (#4894) | 2024-05-23 06:17:27 +09:00 |
| Cody Yu | bc8ad68455 | [Misc][Refactor] Introduce ExecuteModelData (#4540) | 2024-05-03 17:47:07 -07:00 |
| leiwen83 | 4bb53e2dde | [BugFix] fix num_lookahead_slots missing in async executor (#4165); Co-authored-by: Lei Wen <wenlei03@qiyi.com> | 2024-04-30 10:12:59 -07:00 |
| Nick Hill | ba4be44c32 | [BugFix] Fix return type of executor execute_model methods (#4402) | 2024-04-27 11:17:45 -07:00 |
| Nick Hill | 15e7c675b0 | [Core] Add shutdown() method to ExecutorBase (#4349) | 2024-04-25 16:32:48 -07:00 |
| Cade Daniel | e95cd87959 | [Speculative decoding 6/9] Integrate speculative decoding with LLMEngine (#3894) | 2024-04-16 13:09:21 -07:00 |
| Antoni Baum | 69e1d2fb69 | [Core] Refactor model loading code (#4097) | 2024-04-16 11:34:39 -07:00 |
| Nick Hill | eb46fbfda2 | [Core] Simplifications to executor classes (#4071) | 2024-04-15 13:05:09 -07:00 |
| Dylan Hawk | 5c2e66e487 | [Bugfix] More type hint fixes for py 3.8 (#4039) | 2024-04-12 21:07:04 -07:00 |
| youkaichao | 96b6a6d790 | [Bugfix] fix type hint for py 3.8 (#4036) | 2024-04-12 19:35:44 +00:00 |
| Cade Daniel | e7c7067b45 | [Misc] [Core] Implement RFC "Augment BaseExecutor interfaces to enable hardware-agnostic speculative decoding" (#3837) | 2024-04-09 11:44:15 -07:00 |
| Cade Daniel | 5757d90e26 | [Speculative decoding] Adding configuration object for speculative decoding (#3706); Co-authored-by: Lily Liu <lilyliupku@gmail.com> | 2024-04-03 00:40:57 +00:00 |
| xwjiang2010 | 64172a976c | [Feature] Add vision language model support. (#3042) | 2024-03-25 14:16:30 -07:00 |
| SangBin Cho | 01bfb22b41 | [CI] Try introducing isort. (#3495) | 2024-03-25 07:59:47 -07:00 |
| Zhuohan Li | 4c922709b6 | Add distributed model executor abstraction (#3191) | 2024-03-11 11:03:45 -07:00 |