| src | use global pgm for ddp | 2024-10-18 14:59:26 +00:00 |
| .gitignore | tesnsor parallel, will clean later | 2024-10-18 05:13:44 +00:00 |
| convert_hf_to_picotron.py | refactor organisation | 2024-10-10 15:12:14 +00:00 |
| convert_picotron_to_hf.py | refactor organisation | 2024-10-10 15:12:14 +00:00 |
| generate.py | renaming | 2024-10-14 09:26:31 +00:00 |
| model.py | remove merged qkv | 2024-10-18 14:59:04 +00:00 |
| README.md | Initial commit | 2024-09-18 14:01:22 +02:00 |
| requirements.txt | add wandb support | 2024-09-25 14:19:16 +00:00 |
| setup.py | tesnsor parallel, will clean later | 2024-10-18 05:13:44 +00:00 |
| train.py | leave out CP integration at the very end | 2024-10-18 14:59:39 +00:00 |
| utils.py | add DDP | 2024-10-16 16:48:55 +00:00 |