More explanation

Dan Fu 2022-06-14 11:55:14 -07:00
parent 2d5b2483b8
commit 765741c1ee

@@ -77,7 +77,8 @@ As a result, FlashAttention can scale to much longer sequence lengths.
We show speedup with head dimension 128.
Here we show batch size 16 with 12 heads.
-Speedup is less than with the smaller head sizes, but speedup is still significant -- especially with a causal mask.
+Speedup is less than with the smaller head sizes, since we have to make the block size smaller in the tiling.
+But speedup is still significant, especially with a causal mask.
### RTX 3090
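
The new wording attributes the reduced speedup at head dimension 128 to the smaller tile size forced by the tiling. A minimal sketch of that trade-off, following the block-size choice in Algorithm 1 of the FlashAttention paper (B_c = ceil(M / (4d)), B_r = min(B_c, d), with M the on-chip SRAM budget in elements); the `block_sizes` helper and the 100 KB SRAM figure are illustrative assumptions, not values taken from the actual kernels:

```python
import math

def block_sizes(sram_elems: int, head_dim: int) -> tuple[int, int]:
    """Return (B_r, B_c) tile sizes for a given SRAM budget and head dimension.

    Follows the paper's choice: B_c = ceil(M / (4 d)), B_r = min(B_c, d).
    """
    b_c = math.ceil(sram_elems / (4 * head_dim))
    b_r = min(b_c, head_dim)
    return b_r, b_c

# Assumed SRAM budget of 100 KB of fp32 elements (purely illustrative).
M = 100 * 1024 // 4
for d in (64, 128):
    print(f"head_dim={d}: (B_r, B_c) = {block_sizes(M, d)}")
# Doubling the head dimension from 64 to 128 halves B_c, so each tile covers
# fewer keys/values for the same on-chip memory, which is why the measured
# speedup at head dimension 128 is smaller than at the smaller head sizes.
```

The actual CUDA kernels tune their tile sizes per architecture rather than using this formula directly; the sketch only illustrates why a larger head dimension forces smaller tiles for a fixed amount of on-chip memory.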