ferdinand.mom | 8af19d0caa | picotron top level folder | 2024-11-04 15:29:26 +00:00
ferdinand.mom | f6c9a39d17 | fix splitting input twice for context parallel (done in dataloader) | 2024-10-30 15:43:42 +00:00
ferdinand.mom | 3c635092f9 | add assert in TensorParallel for num_attention_heads and key_values_heads | 2024-10-30 14:58:41 +00:00
ferdinand.mom | 46af5b0425 | some fixes | 2024-10-29 14:17:42 +00:00
zzhhjjj | b7f3e253be | add context parallel | 2024-10-29 13:42:38 +00:00
zzhhjjj | 6220892716 | refactor | 2024-10-28 20:44:15 +00:00
zzhhjjj | a6d79b07b5 | add cuda kernels | 2024-10-22 22:38:29 +00:00
ferdinand.mom | 134d48b658 | remove merged qkv | 2024-10-18 14:59:04 +00:00
zzhhjjj | 7377238741 | tensor parallel, will clean later | 2024-10-18 05:13:44 +00:00
zzhhjjj | 1aba6079e8 | model file change. This requires some change on PP | 2024-10-16 16:41:12 +00:00
ferdinand.mom | 770800b978 | add new modeling | 2024-10-10 14:57:17 +00:00