Robert Shaw | 73030b7dae | [ Misc ] Enable Quantizing All Layers of DeekSeekv2 (#6423) | 2024-07-14 21:38:42 +00:00 |
| Tyler Michael Smith | 9dad5cc859 | [Kernel] Turn off CUTLASS scaled_mm for Ada Lovelace (#6384) | 2024-07-14 13:37:19 +00:00 |
| Robert Shaw | fb6af8bc08 | [ Misc ] Apply MoE Refactor to Deepseekv2 To Support Fp8 (#6417) | 2024-07-13 20:03:58 -07:00 |
| Robert Shaw | babf52dade | [ Misc ] More Cleanup of Marlin (#6359); Co-authored-by: Robert Shaw <rshaw@neuralmagic.com> | 2024-07-13 10:21:37 +00:00 |
| Robert Shaw | aea19f0989 | [ Misc ] Support Models With Bias in compressed-tensors integration (#6356) | 2024-07-12 11:11:29 -04:00 |
| Robert Shaw | abfe705a02 | [ Misc ] Support Fp8 via llm-compressor (#6110); Co-authored-by: Robert Shaw <rshaw@neuralmagic> | 2024-07-07 20:42:11 +00:00 |
| Robert Shaw | 7c008c51a9 | [ Misc ] Refactor MoE to isolate Fp8 From Mixtral (#5970); Co-authored-by: Robert Shaw <rshaw@neuralmagic>; Co-authored-by: Michael Goin <michael@neuralmagic.com> | 2024-07-02 21:54:35 +00:00 |
| Robert Shaw | 75aa1442db | [ CI/Build ] LM Eval Harness Based CI Testing (#5838); Co-authored-by: Robert Shaw <rshaw@neuralmagic> | 2024-06-29 13:04:30 -04:00 |
|