From c5239d83123297e28d4cf7bb51bbf307bb4eaf4e Mon Sep 17 00:00:00 2001
From: Ali Hassani <68103095+alihassanijr@users.noreply.github.com>
Date: Wed, 10 Jul 2024 11:09:13 -0400
Subject: [PATCH] Add Faster Neighborhood Attention to pubs (#1471)

---
 PUBLICATIONS.md | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/PUBLICATIONS.md b/PUBLICATIONS.md
index 32b76e5f..65d1f08e 100644
--- a/PUBLICATIONS.md
+++ b/PUBLICATIONS.md
@@ -1,5 +1,9 @@
 # Publications Using Cutlass
 
+## 2024
+
+- ["Faster Neighborhood Attention: Reducing the O(n^2) Cost of Self Attention at the Threadblock Level"](https://arxiv.org/abs/2403.04690). Ali Hassani, Wen-Mei Hwu, Humphrey Shi. _arXiv_, March 2024.
+
 ## 2023
 
 - ["A Case Study in CUDA Kernel Fusion: Implementing FlashAttention-2 on NVIDIA Hopper Architecture using the CUTLASS Library"](https://arxiv.org/abs/2312.11918). Ganesh Bikshandi, Jay Shah. _arXiv_, December 2023.