Author | Commit | Message | Date
Tri Dao | fd20f16a4e | Support cache_seqlens being integer | 2023-09-05 11:27:48 -07:00
Tri Dao | 913922cac5 | [Gen] Refactor decoding function | 2023-09-04 17:01:38 -07:00
dan_the_3rd | 011ec323d6 | Support MQA + MP for decoding (#490). Co-authored-by: danthe3rd <danthe3rd> | 2023-08-30 10:29:54 -07:00
Tri Dao | 9f42cb6e7a | [Gen] Clone logits before returning when cg=True | 2023-08-27 23:19:58 -07:00
Tri Dao | f8aea6ead0 | [GPT] Generalize last_token_only arg to num_last_tokens | 2023-08-26 20:47:53 -07:00
Tri Dao | 371e20658c | [GPT] Test generation when passing in multiple tokens | 2023-08-26 13:56:41 -07:00
Tri Dao | c000c3a2c0 | [GPT] Move more tests to test_gpt.py | 2023-08-26 13:00:40 -07:00
Tri Dao | 9b713872ea | [GPT] Move GPT and OPT generation tests to test_{gpt,opt}.py | 2023-08-26 12:55:02 -07:00
Tri Dao | 0e8c46ae08 | Run isort and black on test files | 2023-08-18 20:59:35 -07:00
Tri Dao | 4d87e4d875 | Implement GPT-J | 2023-03-22 16:16:58 -07:00
Tri Dao | 88173a1aaf | [FusedDense] Support relu, rename FusedDenseGeluDense -> FusedMLP | 2023-01-17 18:12:27 -08:00
Tri Dao | ff34123bd4 | Reorder LN in Block, support OPT | 2023-01-15 22:14:31 -08:00
Tri Dao | 63670fd84a | Implement generation for GPT | 2022-12-27 21:01:50 -08:00
Tri Dao | 9d797d8848 | Support loading GPT2 weights from Huggingface | 2022-12-27 11:22:48 -08:00