Add Faster Neighborhood Attention to pubs (#1471)

2024-07-10 11:09:13 -04:00 · 2024-07-10 11:09:13 -04:00 · c5239d8312
commit c5239d8312
parent d6580c3dc0
1 changed file with 4 additions and 0 deletions
--- a/PUBLICATIONS.md
+++ b/PUBLICATIONS.md
@@ -1,5 +1,9 @@
 # Publications Using Cutlass
 ## 2024
 - ["Faster Neighborhood Attention: Reducing the O(n^2) Cost of Self Attention at the Threadblock Level"](https://arxiv.org/abs/2403.04690). Ali Hassani, Wen-Mei Hwu, Humphrey Shi. _arXiv_, March 2024.
 ## 2023
 - ["A Case Study in CUDA Kernel Fusion: Implementing FlashAttention-2 on NVIDIA Hopper Architecture using the CUTLASS Library"](https://arxiv.org/abs/2312.11918). Ganesh Bikshandi, Jay Shah. _arXiv_, December 2023.