Make png images into jpg for dark mode
This commit is contained in: parent 4decc3c166 · commit 7025a092d1
@ -40,14 +40,14 @@ Our graphs show sequence lengths between 128 and 4096 (when standard attention r
#### Speedup
[Figure: FlashAttention speedup graph (images/flashattn_speedup.jpg)]
We generally see a 2-4X speedup at sequence lengths between 128 and 4K, and we see a larger speedup when using dropout and masking, since those kernels are fused.
At sequence lengths popular with language models, such as 512 and 1K, we see speedups of up to 4X when using dropout and masking.
#### Memory
[Figure: FlashAttention memory savings graph (images/flashattn_memory.jpg)]
We show the memory savings in this graph (note that the memory footprint is the same whether or not you use dropout or masking).
Memory savings are proportional to sequence length, since standard attention uses memory quadratic in sequence length, whereas FlashAttention uses memory linear in sequence length.
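As a minimal back-of-the-envelope sketch of that scaling (my own illustration, not code from this repo; the fp16 element size, 16 heads, and function names are assumptions), standard attention materializes an N x N score matrix, while FlashAttention only needs O(N) extra storage for per-row softmax statistics:

```python
# Illustrative memory comparison: standard attention vs. FlashAttention.
# Assumptions (not from the repo): fp16 activations (2 bytes), 16 heads, batch 1.

def standard_attn_matrix_bytes(n, n_heads=16, batch=1, bytes_per_el=2):
    # Standard attention materializes the full N x N score matrix per head.
    return batch * n_heads * n * n * bytes_per_el

def flash_attn_extra_bytes(n, n_heads=16, batch=1, bytes_per_el=2):
    # FlashAttention keeps only per-row softmax statistics (running max and sum),
    # i.e. 2 scalars per row: O(N) rather than O(N^2).
    return batch * n_heads * n * 2 * bytes_per_el

for n in (128, 512, 1024, 4096):
    ratio = standard_attn_matrix_bytes(n) / flash_attn_extra_bytes(n)
    print(f"N={n:5d}  standard / flash extra memory = {ratio:.0f}x")
```

The ratio grows linearly with N (it is N/2 under these assumptions), which is why the savings in the graph increase with sequence length.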
BIN images/flashattn_memory.jpg (new file; binary file not shown; after: 83 KiB, before: 64 KiB)
BIN images/flashattn_speedup.jpg (new file; binary file not shown; after: 113 KiB, before: 80 KiB)