Tri Dao
|
08124c8f9c
|
[CrossEntropy] Implement logit_scale option
|
2023-12-16 18:39:37 -08:00 |
|
Tri Dao
|
aaa1474129
|
[CrossEntropy] Simplify the case of large vocab with Tensor Parallel
|
2023-11-19 23:19:36 -08:00 |
|
Tri Dao
|
5400fdc4ac
|
[CE] Implement CrossEntropyLoss in Triton
|
2023-09-15 20:05:28 -07:00 |
|
Tri Dao
|
0e8c46ae08
|
Run isort and black on test files
|
2023-08-18 20:59:35 -07:00 |
|
Tri Dao
|
c6ecd40a59
|
Tweak CrossEntropyLoss to take process_group in init
|
2022-12-27 10:47:43 -08:00 |
|
Tri Dao
|
b4018a5028
|
Implement Tensor Parallel for GPT model
|
2022-12-26 16:22:43 -08:00 |
|
Tri Dao
|
dff68c2b22
|
Add smoothing for CrossEntropyParallel, rename to CrossEntropyLoss
|
2022-12-23 14:51:08 -08:00 |
|
Tri Dao
|
343492ec30
|
Make nccl operations async in CrossEntropyLossParallel
|
2022-11-13 17:27:26 -08:00 |
|
Tri Dao
|
7c9953815a
|
Add fused cross entropy loss
|
2022-11-12 21:58:41 -08:00 |
|