Commit Graph

41 Commits

Author SHA1 Message Date
Nick Hill
4b377d6feb
[BugFix] Fix test breakages from transformers 4.45 upgrade (#8829) 2024-09-26 16:46:43 -07:00
Chen Zhang
770ec6024f
[Model] Add support for the multi-modal Llama 3.2 model (#8811)
Co-authored-by: simon-mo <xmo@berkeley.edu>
Co-authored-by: Chang Su <chang.s.su@oracle.com>
Co-authored-by: Simon Mo <simon.mo@hey.com>
Co-authored-by: Roger Wang <136131678+ywang96@users.noreply.github.com>
Co-authored-by: Roger Wang <ywang@roblox.com>
2024-09-25 13:29:32 -07:00
Isotr0py
4ca65a9763
[Core][Bugfix] Accept GGUF model without .gguf extension (#8056) 2024-09-02 08:43:26 -04:00
Patrick von Platen
6fc4e6e07a
[Model] Add Mistral Tokenization to improve robustness and chat encoding (#7739) 2024-08-27 12:40:02 +00:00
Cyrus Leung
9ba85bc152
[mypy] Misc. typing improvements (#7417) 2024-08-13 09:20:20 +08:00
Cyrus Leung
66d617e343
[Frontend] Gracefully handle missing chat template and fix CI failure (#7238)
Co-authored-by: Roger Wang <ywang@roblox.com>
2024-08-07 09:12:05 +00:00
Isotr0py
360bd67cf0
[Core] Support loading GGUF model (#5191)
Co-authored-by: Michael Goin <michael@neuralmagic.com>
2024-08-05 17:54:23 -06:00
Jiaxin Shan
42c7f66a38
[Core] Support dynamically loading Lora adapter from HuggingFace (#6234)
Co-authored-by: Antoni Baum <antoni.baum@protonmail.com>
2024-07-22 15:42:40 -07:00
Nick Hill
e2fbaee725
[BugFix][Frontend] Use LoRA tokenizer in OpenAI APIs (#6227)
Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com>
2024-07-18 15:13:30 +08:00
youkaichao
5b8a7c1cb0
[Misc] centralize all usage of environment variables (#4548) 2024-05-02 11:13:25 -07:00
fuchen.ljl
6ad58f42c5
fix_tokenizer_snapshot_download_bug (#4493) 2024-04-30 16:38:50 -07:00
Prashant Gupta
d6e520e170
[Core] Support offline use of local cache for models (#4374)
Signed-off-by: Prashant Gupta <prashantgupta@us.ibm.com>
Co-authored-by: Travis Johnson <tjohnson31415@gmail.com>
2024-04-27 09:59:55 -07:00
SangBin Cho
a88081bf76
[CI] Disable non-lazy string operation on logging (#4326)
Co-authored-by: Danny Guinther <dguinther@neuralmagic.com>
2024-04-26 00:16:58 -07:00
Cyrus Leung
a74dee9b62
[Bugfix] Fix parameter name in get_tokenizer (#4107) 2024-04-25 19:10:48 -07:00
Antoni Baum
69e1d2fb69
[Core] Refactor model loading code (#4097) 2024-04-16 11:34:39 -07:00
SangBin Cho
09473ee41c
[mypy] Add mypy type annotation part 1 (#4006) 2024-04-12 14:35:50 -07:00
Nick Hill
49782fcb76
[Misc] Some minor simplifications to detokenization logic (#3670)
Some simplifications made for clarity.

Also moves detokenization-related functions from tokenizer.py to detokenizer.py.
2024-04-01 13:22:06 -07:00
youkaichao
203d4f82ac
[Core][Bugfix] cache len of tokenizer (#3741) 2024-03-29 18:46:39 -07:00
Roy
6110c39dc8
[BugFix] Fix tokenizer out of vocab size (#3685) 2024-03-29 08:18:59 -07:00
SangBin Cho
01bfb22b41
[CI] Try introducing isort. (#3495) 2024-03-25 07:59:47 -07:00
Antoni Baum
bfdb1ba5c3
[Core] Improve detokenization performance for prefill (#3469)
Co-authored-by: MeloYang <meloyang05@gmail.com>
2024-03-22 13:44:12 -07:00
Antoni Baum
fb96c1e98c
Asynchronous tokenization (#2879) 2024-03-15 23:37:01 +00:00
Roy
9e8744a545
[BugFix] Fix get tokenizer when using ray (#3301) 2024-03-10 19:17:16 -07:00
Antoni Baum
9b945daaf1
[Experimental] Add multi-LoRA support (#1804)
Co-authored-by: Chen Shen <scv119@gmail.com>
Co-authored-by: Shreyas Krishnaswamy <shrekris@anyscale.com>
Co-authored-by: Avnish Narayan <avnish@anyscale.com>
2024-01-23 15:26:37 -08:00
Woosuk Kwon
3d1cfbfc74
[Minor] Delete Llama tokenizer warnings (#2146) 2023-12-16 22:05:18 -08:00
Woosuk Kwon
d06980dfa7
Fix Baichuan tokenizer error (#1874) 2023-11-30 18:35:50 -08:00
Dan Lord
7013a80170
Add support for spaces_between_special_tokens 2023-10-30 16:52:56 -07:00
Ricardo Lu
beac8dd461
fix: don't skip first special token. (#1497) 2023-10-29 04:26:36 -07:00
Antoni Baum
ec3b5ce9cc
Improve detokenization performance (#1338) 2023-10-13 09:59:07 -07:00
Federico Cassano
66d18a7fb0
add support for tokenizer revision (#1163)
Co-authored-by: Zhuohan Li <zhuohan123@gmail.com>
2023-10-02 19:19:46 -07:00
Woosuk Kwon
64ca424e75
Fix warning message on LLaMA FastTokenizer (#1037) 2023-09-14 17:33:32 -07:00
Antoni Baum
dd54a4b026
Fix detokenization leaving special tokens (#1044)
Signed-off-by: Antoni Baum <antoni.baum@protonmail.com>
2023-09-14 16:37:03 -07:00
Antoni Baum
9841d48a10
Use TGI-like incremental detokenization (#984) 2023-09-13 13:38:01 -07:00
Nelson Liu
e15932bb60
Only emit warning about internal tokenizer if it isn't being used (#939) 2023-09-05 00:50:55 +09:00
Ikko Eltociear Ashimine
805de738f6
Fix typo in tokenizer.py (#750)
conjuction -> conjunction
2023-08-14 22:26:36 -07:00
xcnick
c6dfc3cdbe
Fix handling of special tokens in decoding. (#418) 2023-07-12 11:14:56 -04:00
Woosuk Kwon
ddfdf470ae
Add trust_remote_code arg to get_config (#405) 2023-07-08 15:24:17 -07:00
codethazine
a945fcc2ae
Add trust-remote-code flag to handle remote tokenizers (#364) 2023-07-07 11:04:58 -07:00
Zhuohan Li
d6fa1be3a8
[Quality] Add code formatter and linter (#326) 2023-07-03 11:31:55 -07:00
Woosuk Kwon
998d9d1509
[Tokenizer] Add tokenizer mode (#298) 2023-06-28 14:19:22 -07:00
Woosuk Kwon
4338cc4750
[Tokenizer] Add an option to specify tokenizer (#284) 2023-06-28 09:46:58 -07:00