| Name | Last commit message | Last commit date |
| --- | --- | --- |
| DeepSeek-V2-Lite-Chat.yaml | [BugFix] Fix DeepSeek remote code (#7178) | 2024-08-06 08:16:53 -07:00 |
| Meta-Llama-3-8B-Instruct-Channelwise-compressed-tensors.yaml | [ Misc ] fbgemm checkpoints (#6559) | 2024-07-20 09:36:57 -07:00 |
| Meta-Llama-3-8B-Instruct-FBGEMM-nonuniform.yaml | [ Misc ] fbgemm checkpoints (#6559) | 2024-07-20 09:36:57 -07:00 |
| Meta-Llama-3-8B-Instruct-FP8-compressed-tensors.yaml | [Kernel] Turn off CUTLASS scaled_mm for Ada Lovelace (#6384) | 2024-07-14 13:37:19 +00:00 |
| Meta-Llama-3-8B-Instruct-FP8.yaml | [Kernel] Turn off CUTLASS scaled_mm for Ada Lovelace (#6384) | 2024-07-14 13:37:19 +00:00 |
| Meta-Llama-3-8B-Instruct-INT8-compressed-tensors.yaml | [ Misc ] Support Fp8 via llm-compressor (#6110) | 2024-07-07 20:42:11 +00:00 |
| Meta-Llama-3-8B-Instruct-nonuniform-compressed-tensors.yaml | [ Misc ] non-uniform quantization via compressed-tensors for Llama (#6515) | 2024-07-18 22:39:18 -04:00 |
| Meta-Llama-3-8B-Instruct.yaml | [ CI/Build ] LM Eval Harness Based CI Testing (#5838) | 2024-06-29 13:04:30 -04:00 |
| Meta-Llama-3-8B-QQQ.yaml | [Kernel] Replaced blockReduce[...] functions with cub::BlockReduce (#7233) | 2024-08-21 20:18:00 -04:00 |
| Meta-Llama-3-70B-Instruct-FBGEMM-nonuniform.yaml | [ Kernel ] Enable fp8-marlin for fbgemm-fp8 models (#6606) | 2024-07-20 18:50:10 +00:00 |
| Meta-Llama-3-70B-Instruct.yaml | [ CI/Build ] LM Eval Harness Based CI Testing (#5838) | 2024-06-29 13:04:30 -04:00 |
| Minitron-4B-Base-FP8.yaml | [Model] Align nemotron config with final HF state and fix lm-eval-small (#7611) | 2024-08-16 15:56:34 -07:00 |
| Mixtral-8x7B-Instruct-v0.1-FP8.yaml | [ Misc ] Refactor MoE to isolate Fp8 From Mixtral (#5970) | 2024-07-02 21:54:35 +00:00 |
| Mixtral-8x7B-Instruct-v0.1.yaml | [ CI/Build ] LM Eval Harness Based CI Testing (#5838) | 2024-06-29 13:04:30 -04:00 |
| Mixtral-8x22B-Instruct-v0.1-FP8-Dynamic.yaml | [ Misc ] Refactor MoE to isolate Fp8 From Mixtral (#5970) | 2024-07-02 21:54:35 +00:00 |
| models-large.txt | [ Kernel ] Enable fp8-marlin for fbgemm-fp8 models (#6606) | 2024-07-20 18:50:10 +00:00 |
| models-small.txt | [Model] Align nemotron config with final HF state and fix lm-eval-small (#7611) | 2024-08-16 15:56:34 -07:00 |
| Qwen2-1.5B-Instruct-FP8W8.yaml | [ Misc ] fp8-marlin channelwise via compressed-tensors (#6524) | 2024-07-25 09:46:04 -07:00 |
| Qwen2-1.5B-Instruct-INT8-compressed-tensors.yaml | [ Misc ] Support Models With Bias in compressed-tensors integration (#6356) | 2024-07-12 11:11:29 -04:00 |
| Qwen2-1.5B-Instruct-W8A16-compressed-tensors.yaml | [ Misc ] Support Models With Bias in compressed-tensors integration (#6356) | 2024-07-12 11:11:29 -04:00 |
| Qwen2-57B-A14-Instruct.yaml | [ Misc ] Refactor MoE to isolate Fp8 From Mixtral (#5970) | 2024-07-02 21:54:35 +00:00 |
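Each `.yaml` entry above is a per-model config for the LM Eval Harness based CI testing introduced in #5838, while `models-small.txt` and `models-large.txt` presumably enumerate which configs the small and large CI jobs run. The exact schema of these files is not shown in this listing, so the sketch below is an assumption for illustration only: the field names (`model_name`, `tasks`, `limit`, `num_fewshot`) and the baseline scores are hypothetical, not copied from any file above.

```yaml
# Illustrative sketch of a plausible lm-eval CI config (assumed schema,
# not the verbatim contents of any file in the listing above).
model_name: "meta-llama/Meta-Llama-3-8B-Instruct"   # HF model ID under test
tasks:
- name: "gsm8k"                                     # lm-eval-harness task name
  metrics:
  - name: "exact_match,strict-match"                # metric key reported by the task
    value: 0.75                                     # assumed expected baseline score
  - name: "exact_match,flexible-extract"
    value: 0.75                                     # assumed expected baseline score
limit: 250       # number of eval samples per task
num_fewshot: 5   # few-shot examples per prompt
```

Under this assumed layout, a CI job would load the config, evaluate the named model on each task, and fail if a reported metric drops below its recorded `value`, which is why quantization changes (FP8, INT8, compressed-tensors) in the commit messages above touch these files.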