2024-12-17 15:45:38 +00:00
picotron Merge remote-tracking branch 'origin/main' into loading_big_model 2024-12-17 15:45:38 +00:00
template small changes 2024-12-17 05:01:35 +00:00
tests Merge pull request #9 from huggingface/async_tp 2024-12-14 07:24:35 -05:00
.gitignore small changes 2024-12-17 05:01:35 +00:00
create_config.py raise Exception when there are not enough layers to distribute across ranks + rename variable 2024-12-03 13:17:52 +00:00
extract_metrics.py add mfu parsing 2024-12-04 13:08:28 +00:00
README.md Initial commit 2024-09-18 14:01:22 +02:00
requirements.txt fix requirements to avoid drop in throughput 2024-11-04 14:33:07 +00:00
setup.py tensor parallel, will clean later 2024-10-18 05:13:44 +00:00
submit_slurm_jobs.py can now load big model through safetensors (sharded and single file) 2024-12-01 19:39:16 +00:00
train.py Merge remote-tracking branch 'origin/main' into loading_big_model 2024-12-17 15:45:38 +00:00

picotron