| Author | Commit | Message | Date |
|---|---|---|---|
| Ying Zhang | cdbbe844b1 | minor changes to unpad_input test util func | 2024-09-16 14:24:11 -07:00 |
| Tri Dao | abbc131173 | [LayerNorm] Switch from CUDA to Triton implementation | 2024-01-05 00:31:17 -08:00 |
| Kevin Hu | 07005806ff | Add BigCode converters (#532) | 2023-09-10 17:24:50 -07:00 |
| Kevin Hu | 4c91621a5e | Inverse state dict for BERT (#527) | 2023-09-09 01:44:21 -07:00 |
| Tri Dao | f1a73d0740 | Run isort and black on python files | 2023-08-18 14:22:11 -07:00 |
| Kiarash Jamali | 684196b8c5 | Allow rotary embeddings for Bert (#363) | 2023-07-23 00:21:45 -07:00 |
| Tri Dao | 96d10f6545 | Implement LLaMa | 2023-04-18 21:51:35 -07:00 |
| Tri Dao | 88173a1aaf | [FusedDense] Support relu, rename FusedDenseGeluDense -> FusedMLP | 2023-01-17 18:12:27 -08:00 |
| Tri Dao | ff34123bd4 | Reorder LN in Block, support OPT | 2023-01-15 22:14:31 -08:00 |
| Tri Dao | 714c1b4f0f | [Bert] Fix embedding layer norm before embedding dropout | 2023-01-01 10:38:05 -08:00 |
| Tri Dao | c6ecd40a59 | Tweak CrossEntropyLoss to take process_group in init | 2022-12-27 10:47:43 -08:00 |
| Tri Dao | dff68c2b22 | Add smoothing for CrossEntropyParallel, rename to CrossEntropyLoss | 2022-12-23 14:51:08 -08:00 |
| Tri Dao | e68ebbe89a | Simplify FusedDense | 2022-12-22 21:25:31 -08:00 |
| Tri Dao | 13cdceb377 | Implement last_layer_subset optimization for BERT | 2022-12-19 22:18:46 -08:00 |
| Tri Dao | 5fb6df0e04 | Implement BERT | 2022-12-18 21:47:27 -08:00 |