From 77549ae6c8cf31c7ac4c8b88180a8708a8683da4 Mon Sep 17 00:00:00 2001
From: Haicheng Wu <57973641+hwu36@users.noreply.github.com>
Date: Sat, 25 Mar 2023 21:17:05 -0400
Subject: [PATCH] Update PUBLICATIONS.md

msft moe paper
---
 PUBLICATIONS.md | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/PUBLICATIONS.md b/PUBLICATIONS.md
index a2b2d90a..e8959f8f 100644
--- a/PUBLICATIONS.md
+++ b/PUBLICATIONS.md
@@ -8,6 +8,8 @@
 
 - ["GPU Load Balancing"](https://arxiv.org/abs/2212.08964). Muhammad Osama. _Doctoral dissertation, University of California, Davis_, December 2022.
 
+- ["Who Says Elephants Can't Run: Bringing Large Scale MoE Models into Cloud Scale Production"](https://arxiv.org/abs/2211.10017). Young Jin Kim, Rawn Henry, Raffy Fahim, Hany Hassan Awadalla. _Proceedings of the Third Workshop on Simple and Efficient Natural Language Processing_, December 2022.
+
 - ["Bolt: Bridging the Gap between Auto-tuners and Hardware-native Performance"](https://arxiv.org/abs/2110.15238). Jiarong Xing, Leyuan Wang, Shang Zhang, Jack Chen, Ang Chen, Yibo Zhu. _Proceedings of the 5th MLSys Conference_, August 2022.
 
 - ["Recovering single precision accuracy from Tensor Cores while surpassing the FP32 theoretical peak performance"](https://arxiv.org/abs/2203.03341). Hiroyuki Ootomo, Rio Yokota. _International Journal of High Performance Computing_, March 2022.