Author          | Commit     | Message                                                           | Date
----------------|------------|-------------------------------------------------------------------|---------------------------
Tri Dao         | abbc131173 | [LayerNorm] Switch from CUDA to Triton implementation             | 2024-01-05 00:31:17 -08:00
Tri Dao         | f1a73d0740 | Run isort and black on python files                               | 2023-08-18 14:22:11 -07:00
Tri Dao         | 75e334d407 | [MLP] Add ParallelMLP                                             | 2023-07-22 23:45:51 -07:00
Tri Dao         | b3177dfaf6 | [GPT] Enable FlashAttention for GPT-J                             | 2023-07-21 17:29:10 -07:00
Tri Dao         | 6fc1e07da2 | [Block] Re-enable DropPath                                        | 2023-07-21 16:39:23 -07:00
Tri Dao         | 4f285b3547 | FlashAttention-2 release                                          | 2023-07-17 06:21:34 -07:00
ljss            | 8e44c0eefb | Fix a bug                                                         | 2023-06-02 13:46:19 +08:00
Federico Berto  | 3889ba168b | [BugFix] cannot unpack non-iterable NoneType object               | 2023-05-07 03:07:30 +09:00
Tri Dao         | ba2fe7f378 | [Gen] Move allocate_inference_cache to within the model           | 2023-04-20 18:15:12 -07:00
Tri Dao         | 96d10f6545 | Implement LLaMa                                                   | 2023-04-18 21:51:35 -07:00
Tri Dao         | 393882bc08 | [LayerNorm] Implement LN with parallel residual, support dim 8k   | 2023-03-31 14:23:45 -07:00
Tri Dao         | 4d87e4d875 | Implement GPT-J                                                   | 2023-03-22 16:16:58 -07:00
Tri Dao         | 88173a1aaf | [FusedDense] Support relu, rename FusedDenseGeluDense -> FusedMLP | 2023-01-17 18:12:27 -08:00
Tri Dao         | 780e8eeabb | [ViT] Support timm checkpoint, add tests                          | 2023-01-16 01:20:34 -08:00
Tri Dao         | ef085cfcda | [ViT] Fix extra norm_0, use new LN order in Block                 | 2023-01-15 22:58:56 -08:00
Tri Dao         | ff34123bd4 | Reorder LN in Block, support OPT                                  | 2023-01-15 22:14:31 -08:00
Tri Dao         | 93383bd55b | [TP] Implement TensorParallel without sequence parallel           | 2023-01-07 13:45:22 -08:00
Tri Dao         | a8cfe51551 | Implement Tensor Parallel for transformer Block                   | 2022-12-25 14:08:21 -08:00
Tri Dao         | 5fb6df0e04 | Implement BERT                                                    | 2022-12-18 21:47:27 -08:00
Tri Dao         | d4b320b31f | Add MLP, MHA, Block, Embedding modules                            | 2022-11-13 22:06:44 -08:00