flash-attention

Author	SHA1	Message	Date
Yuchao Dai	187c2a0635	Fix E1136 (#563 )	2023-09-21 11:48:23 -07:00
Tri Dao	0705d2718d	[Llama] Fix some tests, add tests for Llama 2 and CodeLlama	2023-09-20 23:36:46 -07:00
Kevin Hu	42832575d4	Fix Llama GQA/MQA (#546 ) * Fix llama MQA * Fix permute shape * Update llama.py	2023-09-19 22:15:59 -07:00
dan_the_3rd	c9d4a816fa	Support LLaMa2 and CodeLLaMa (#491 ) Co-authored-by: danthe3rd <danthe3rd>	2023-08-30 10:31:14 -07:00
Xuechen Li	7fcd3e6a04	map custom model state_dict back to huggingface format (#465 ) * fix name. * set inv function. * add map back function. * handle gqa. * add type annotation to avoid confusion. * fix docstr. * test inverse remap logic.	2023-08-18 20:51:39 -07:00
Tri Dao	f1a73d0740	Run isort and black on python files	2023-08-18 14:22:11 -07:00
Xuechen Li	0f7853c6a1	enable loading hf llama checkpoints for training (#446 ) * prelim. * add hf convertion fn. * mlp. * change name. * fix bug. * inverse permute. * change comment. * revert style changes. * fix. * add doc. * revert. * enable load safe. * fix safe load. * fix import. * fix typing-related lints. * fix ckpt loading logic. * make single gpu work. * test with parallel. * ckpt format. * enable pretrained state dict. * remove unused imports. * remove unused. * mark idea related.	2023-08-15 08:33:15 -07:00
Tri Dao	96d10f6545	Implement LLaMa	2023-04-18 21:51:35 -07:00