Kevin Hu
|
07005806ff
|
Add BigCode converters (#532)
|
2023-09-10 17:24:50 -07:00 |
|
Kevin Hu
|
4c91621a5e
|
Inverse state dict for BERT (#527)
|
2023-09-09 01:44:21 -07:00 |
|
Tri Dao
|
ef6d8c75d9
|
[GPT] Fix loading weights from HF hub
|
2023-08-21 22:56:02 -07:00 |
|
Tri Dao
|
0e8c46ae08
|
Run isort and black on test files
|
2023-08-18 20:59:35 -07:00 |
|
Tri Dao
|
88173a1aaf
|
[FusedDense] Support relu, rename FusedDenseGeluDense -> FusedMLP
|
2023-01-17 18:12:27 -08:00 |
|
Tri Dao
|
c6ecd40a59
|
Tweak CrossEntropyLoss to take process_group in init
|
2022-12-27 10:47:43 -08:00 |
|
Tri Dao
|
13cdceb377
|
Implement last_layer_subset optimization for BERT
|
2022-12-19 22:18:46 -08:00 |
|
Tri Dao
|
5fb6df0e04
|
Implement BERT
|
2022-12-18 21:47:27 -08:00 |
|