diff --git a/README.md b/README.md
index 5d881baf..67e6d300 100644
--- a/README.md
+++ b/README.md
@@ -59,6 +59,7 @@ CUTLASS 3.1 is an update to CUTLASS adding:
 - [Batched B2B GEMM](examples/13_two_tensor_op_fusion) now can run multiple Back-to-Back GEMM with the same problem size in parallel.
 - [Batched Strided GEMV](test/unit/gemm/device/gemv.cu) support both row major and column major input matrix.
 - [Permute + GEMM fusion](examples/39_gemm_permute) can fuse Permute with following GEMM now. Before, we only support fusing GEMM with Permute in the epilogue.
+- [Row Broadcast](include/cutlass/epilogue/threadblock/predicated_tile_iterator_row_broadcast.h) can be fused in the epilogue.
 - *Announcement*:
   - The GitHub branch is renamed from `master` to `main` in this release.