From 4KB per buffer to 2KB per buffer. This saves 8KB of shared memory (smem), since Q and dO each have 2 buffers.

| Name |
|---|
| .. |
| cutlass@319a389f42 |
| src |
| fmha_api.cpp |
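
The 8KB figure in the commit message follows from simple arithmetic: 2KB saved per buffer, two buffers per tensor, two tensors (Q and dO). A minimal sketch of that check is below. The constant and function names are hypothetical and do not come from the repository; only the sizes and buffer counts are taken from the commit message.

```cpp
#include <cstddef>

// Hypothetical names for illustration; only the numbers come from the commit message.
constexpr std::size_t kOldBytesPerBuffer = 4 * 1024;  // 4KB per buffer before the change
constexpr std::size_t kNewBytesPerBuffer = 2 * 1024;  // 2KB per buffer after the change
constexpr std::size_t kBuffersPerTensor  = 2;         // each tensor has 2 buffers
constexpr std::size_t kNumTensors        = 2;         // Q and dO

// Total shared memory saved across all buffers, in bytes.
constexpr std::size_t smem_saved_bytes() {
    return (kOldBytesPerBuffer - kNewBytesPerBuffer) * kBuffersPerTensor * kNumTensors;
}

// Compile-time check that the saving matches the 8KB quoted in the commit message.
static_assert(smem_saved_bytes() == 8 * 1024, "expected an 8KB shared memory saving");
```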