README typo
parent dc6d130088
commit 4decc3c166

@@ -35,6 +35,7 @@ We display FlashAttention speedup using these parameters (similar to BERT-base):
* Batch size 8
* Head dimension 64
* 12 attention heads
Our graphs show sequence lengths between 128 and 4096 (when standard attention runs out of memory on an A100), but FlashAttention can scale up to sequence length 64K.
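Why standard attention runs out of memory at these lengths can be seen with a back-of-envelope calculation: it materializes a batch × heads × N × N score matrix, so memory grows quadratically in sequence length. A minimal sketch, using the benchmark parameters above and assuming fp16 (2 bytes per element); the helper name and the exact byte accounting are illustrative, not a measurement from the repository:

```python
# Memory for the N x N attention score matrix that standard attention
# materializes (FlashAttention avoids storing it in HBM).
# Benchmark parameters from above; fp16 = 2 bytes per element (assumption).
batch, heads, bytes_per_el = 8, 12, 2

def attn_matrix_gib(seq_len):
    """GiB needed for one batch x heads x N x N score matrix."""
    return batch * heads * seq_len * seq_len * bytes_per_el / 2**30

for n in (128, 512, 2048, 4096, 65536):
    print(f"N={n:>6}: {attn_matrix_gib(n):10.3f} GiB")
```

At N=4096 a single score matrix is already 3 GiB (before softmax copies and other activations), and at 64K it would be 768 GiB, far beyond an A100's 40–80 GB, which is consistent with standard attention failing past 4096 while FlashAttention, which never stores the full matrix, scales to 64K.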
#### Speedup