cutlass/include/cutlass/gemm/kernel
ANIKET SHIVAM 90d3b0fb18
CUTLASS 3.2.1 (#1113)
* Updates for 3.2.1 release.

* Minor fix in gemm op profiler for raster order.

* Add scheduler mapping for raster order in the kernels.
2023-09-26 17:24:26 -04:00
default_ell_gemm.h New updates for 2.11 (#775) 2023-01-20 16:32:57 -05:00
default_gemm_complex.h New updates for 2.11 (#775) 2023-01-20 16:32:57 -05:00
default_gemm_grouped_softmax_mainloop_fusion.h New updates for 2.11 (#775) 2023-01-20 16:32:57 -05:00
default_gemm_grouped.h New updates for 2.11 (#775) 2023-01-20 16:32:57 -05:00
default_gemm_layernorm_mainloop_fusion.h New updates for 2.11 (#775) 2023-01-20 16:32:57 -05:00
default_gemm_planar_complex_universal.h New updates for 2.11 (#775) 2023-01-20 16:32:57 -05:00
default_gemm_sparse_row_broadcast.h Add support for sparse GEMM with row broadcasted bias vector (#951) 2023-05-24 10:25:05 -04:00
default_gemm_sparse.h New updates for 2.11 (#775) 2023-01-20 16:32:57 -05:00
default_gemm_splitk_parallel.h New updates for 2.11 (#775) 2023-01-20 16:32:57 -05:00
default_gemm_streamk_with_broadcast.h Stream-K with broadcast (#892) 2023-05-22 19:05:06 -04:00
default_gemm_universal_with_visitor.h CUTLASS 3.2.1 (#1113) 2023-09-26 17:24:26 -04:00
default_gemm_universal.h [doc] fix: fix typos in the comment (#1049) 2023-08-16 11:39:25 -04:00
default_gemm_with_broadcast.h CUTLASS 3.1 (#915) 2023-04-14 23:19:34 -04:00
default_gemm_with_k_reduction.h New updates for 2.11 (#775) 2023-01-20 16:32:57 -05:00
default_gemm_with_reduction.h CUTLASS 3.1 (#915) 2023-04-14 23:19:34 -04:00
default_gemm.h CUTLASS 3.1 (#915) 2023-04-14 23:19:34 -04:00
default_gemv.h New updates for 2.11 (#775) 2023-01-20 16:32:57 -05:00
default_rank_2k_complex.h New updates for 2.11 (#775) 2023-01-20 16:32:57 -05:00
default_rank_2k_grouped.h New updates for 2.11 (#775) 2023-01-20 16:32:57 -05:00
default_rank_2k_universal.h New updates for 2.11 (#775) 2023-01-20 16:32:57 -05:00
default_rank_2k.h New updates for 2.11 (#775) 2023-01-20 16:32:57 -05:00
default_rank_k_complex.h New updates for 2.11 (#775) 2023-01-20 16:32:57 -05:00
default_rank_k_universal.h New updates for 2.11 (#775) 2023-01-20 16:32:57 -05:00
default_rank_k.h New updates for 2.11 (#775) 2023-01-20 16:32:57 -05:00
default_symm_complex.h New updates for 2.11 (#775) 2023-01-20 16:32:57 -05:00
default_symm_universal.h New updates for 2.11 (#775) 2023-01-20 16:32:57 -05:00
default_symm.h New updates for 2.11 (#775) 2023-01-20 16:32:57 -05:00
default_trmm_complex.h New updates for 2.11 (#775) 2023-01-20 16:32:57 -05:00
default_trmm_universal.h New updates for 2.11 (#775) 2023-01-20 16:32:57 -05:00
default_trmm.h New updates for 2.11 (#775) 2023-01-20 16:32:57 -05:00
ell_gemm.h New updates for 2.11 (#775) 2023-01-20 16:32:57 -05:00
gemm_array.h CUTLASS 3.2 (#1024) 2023-08-07 20:50:32 -04:00
gemm_batched.h CUTLASS 3.2 (#1024) 2023-08-07 20:50:32 -04:00
gemm_grouped_problem_visitor.h CUTLASS 3.2.1 (#1113) 2023-09-26 17:24:26 -04:00
gemm_grouped_softmax_mainloop_fusion.h New updates for 2.11 (#775) 2023-01-20 16:32:57 -05:00
gemm_grouped.h CUTLASS 3.2 (#1024) 2023-08-07 20:50:32 -04:00
gemm_layernorm_mainloop_fusion.h CUTLASS 3.2 (#1024) 2023-08-07 20:50:32 -04:00
gemm_params.h New updates for 2.11 (#775) 2023-01-20 16:32:57 -05:00
gemm_pipelined.h CUTLASS 3.2 (#1024) 2023-08-07 20:50:32 -04:00
gemm_planar_complex_array.h CUTLASS 3.2 (#1024) 2023-08-07 20:50:32 -04:00
gemm_planar_complex.h CUTLASS 3.2 (#1024) 2023-08-07 20:50:32 -04:00
gemm_splitk_parallel.h New updates for 2.11 (#775) 2023-01-20 16:32:57 -05:00
gemm_streamk_with_fused_epilogue.h Stream-K with broadcast (#892) 2023-05-22 19:05:06 -04:00
gemm_transpose_operands.h New updates for 2.11 (#775) 2023-01-20 16:32:57 -05:00
gemm_universal_streamk.h Fix for dangling pointers (#885) 2023-03-25 01:15:14 -04:00
gemm_universal_with_visitor_streamk.h CUTLASS 3.2.1 (#1113) 2023-09-26 17:24:26 -04:00
gemm_universal_with_visitor.h CUTLASS 3.2.1 (#1113) 2023-09-26 17:24:26 -04:00
gemm_universal.h CUTLASS 3.2.1 (#1113) 2023-09-26 17:24:26 -04:00
gemm_universal.hpp CUTLASS 3.2 (#1024) 2023-08-07 20:50:32 -04:00
gemm_with_fused_epilogue.h CUTLASS 3.2 (#1024) 2023-08-07 20:50:32 -04:00
gemm_with_k_reduction.h CUTLASS 3.2 (#1024) 2023-08-07 20:50:32 -04:00
gemm.h CUTLASS 3.2 (#1024) 2023-08-07 20:50:32 -04:00
gemv_batched_strided.h CUTLASS 3.2 (#1024) 2023-08-07 20:50:32 -04:00
gemv.h CUTLASS 3.2 (#1024) 2023-08-07 20:50:32 -04:00
grouped_problem_visitor.h CUTLASS 3.2 (#1024) 2023-08-07 20:50:32 -04:00
params_universal_base.h CUTLASS 3.2.1 (#1113) 2023-09-26 17:24:26 -04:00
rank_2k_grouped_problem_visitor.h CUTLASS 3.1 (#915) 2023-04-14 23:19:34 -04:00
rank_2k_grouped.h CUTLASS 3.2 (#1024) 2023-08-07 20:50:32 -04:00
rank_2k_transpose_operands.h New updates for 2.11 (#775) 2023-01-20 16:32:57 -05:00
rank_2k_universal.h CUTLASS 3.2 (#1024) 2023-08-07 20:50:32 -04:00
rank_k_universal.h CUTLASS 3.2 (#1024) 2023-08-07 20:50:32 -04:00
sm70_gemm.hpp CUTLASS 3.2.1 (#1113) 2023-09-26 17:24:26 -04:00
sm90_gemm_tma_warpspecialized_cooperative.hpp CUTLASS 3.2.1 (#1113) 2023-09-26 17:24:26 -04:00
sm90_gemm_tma_warpspecialized_pingpong.hpp CUTLASS 3.2.1 (#1113) 2023-09-26 17:24:26 -04:00
sm90_gemm_tma_warpspecialized.hpp CUTLASS 3.2.1 (#1113) 2023-09-26 17:24:26 -04:00
sm90_gemm_tma.hpp CUTLASS 3.2.1 (#1113) 2023-09-26 17:24:26 -04:00
sm90_tile_scheduler_stream_k.hpp CUTLASS 3.2.1 (#1113) 2023-09-26 17:24:26 -04:00
sm90_tile_scheduler.hpp CUTLASS 3.2.1 (#1113) 2023-09-26 17:24:26 -04:00
sparse_gemm_row_broadcast.h Add support for sparse GEMM with row broadcasted bias vector (#951) 2023-05-24 10:25:05 -04:00
sparse_gemm.h CUTLASS 3.2 (#1024) 2023-08-07 20:50:32 -04:00
symm_universal.h CUTLASS 3.2 (#1024) 2023-08-07 20:50:32 -04:00
tile_scheduler_params.h CUTLASS 3.2.1 (#1113) 2023-09-26 17:24:26 -04:00
tile_scheduler.hpp CUTLASS 3.2 (#1024) 2023-08-07 20:50:32 -04:00
trmm_universal.h CUTLASS 3.2 (#1024) 2023-08-07 20:50:32 -04:00