picotron/template
2024-11-04 14:27:50 +00:00
base_config.json set num worker to 1 otherwise OS memory error 2024-11-04 14:27:50 +00:00
base_job.slurm fix multi-node training by using global rank instead of local rank to init process_group 2024-11-03 00:14:14 +00:00
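
The commit message on base_job.slurm describes a common multi-node pitfall: the local rank restarts at 0 on every node, so using it to initialize the process group gives several processes the same rank. Below is a minimal sketch of the difference, not the repository's actual code. The helper name `global_rank` and the parameters `node_rank`, `local_rank`, and `gpus_per_node` are hypothetical, and the layout assumes every node runs the same number of processes.

```python
def global_rank(node_rank: int, local_rank: int, gpus_per_node: int) -> int:
    """Return a rank that is unique across all nodes.

    Hypothetical helper: assumes each node runs exactly `gpus_per_node`
    processes, numbered 0..gpus_per_node-1 by `local_rank`.
    """
    if not 0 <= local_rank < gpus_per_node:
        raise ValueError("local_rank must be in [0, gpus_per_node)")
    return node_rank * gpus_per_node + local_rank


# Two nodes with four processes each.
nodes, gpus_per_node = 2, 4
local_ranks = [l for n in range(nodes) for l in range(gpus_per_node)]
global_ranks = [
    global_rank(n, l, gpus_per_node)
    for n in range(nodes)
    for l in range(gpus_per_node)
]

# Local ranks repeat on every node; global ranks do not.
print(local_ranks)
print(global_ranks)
```

The global value is the one to pass as the `rank` argument of `torch.distributed.init_process_group`. The local rank is still the right choice for selecting a GPU on the node. Launchers such as torchrun export both values as the `RANK` and `LOCAL_RANK` environment variables, so computing the rank by hand is usually only needed when the launcher does not provide it.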