From 42f54d884053fc0b01eaad9d649c2edb8a17bb75 Mon Sep 17 00:00:00 2001
From: Tri Dao
Date: Mon, 11 Jul 2022 17:02:29 -0700
Subject: [PATCH] Edit mention of Triton implementation

Phil Tillet suggests calling it "experimental".
---
 README.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/README.md b/README.md
index 5fff041..3d7cdba 100644
--- a/README.md
+++ b/README.md
@@ -10,7 +10,7 @@ IEEE Spectrum [article](https://spectrum.ieee.org/mlperf-rankings-2022) about ou
 
 #### Triton implementation of FlashAttention
 
-Phil Tillet (OpenAI) has an implementation of FlashAttention in Triton:
+Phil Tillet (OpenAI) has an experimental implementation of FlashAttention in Triton:
 https://github.com/openai/triton/blob/master/python/tutorials/06-fused-attention.py
 
 As Triton is a higher-level language than CUDA, it might be easier to understand