[examples] Fix typos in SYRK and TRMM examples (#507)
parent 858c735856
commit 0abaac84ea
@@ -37,7 +37,7 @@
 the symmetric rank-k update (SYRK) using double-precision floating-point arithmetic and assumes
 all matrices have column-major layout.
 
-The threadblock tile size is chosen as 128x128x8 which offers good performance for large matrices.
+The threadblock tile size is chosen as 16x32x16 which offers good performance for large matrices.
 See the CUTLASS Parallel for All blog post for more exposition on the tunable parameters available
 in CUTLASS.
 
@@ -83,7 +83,7 @@ cudaError_t CutlassSsyrkNN(
 int ldc) {
 
 // Define type definition for double-precision CUTLASS SYRK with column-major
-// input matrices and 128x128x8 threadblock tile size (chosen by default).
+// input matrices and 16x32x16 threadblock tile size (chosen by default).
 //
 // To keep the interface manageable, several helpers are defined for plausible compositions
 // including the following example for double-precision SYRK. Typical values are used as
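For readers checking the SYRK example against a known-good result, the operation the comments describe can be sketched on the host. This is a minimal reference, not code from this commit: the name `ReferenceSyrkLower` is hypothetical, and updating only the lower triangle is an assumption, since the hunks above do not show which fill mode the example uses. It computes the standard SYRK form C = alpha * A * A^T + beta * C with column-major storage.

```cpp
#include <cassert>
#include <vector>

// Hypothetical host-side reference (not part of the CUTLASS example).
// Computes C = alpha * A * A^T + beta * C on the lower triangle only.
// A is n x k and C is n x n, both column-major with leading dimensions lda, ldc.
void ReferenceSyrkLower(int n, int k, double alpha,
                        const std::vector<double>& A, int lda,
                        double beta, std::vector<double>& C, int ldc) {
  assert(lda >= n && ldc >= n);
  for (int j = 0; j < n; ++j) {
    // Assumption: lower fill mode, so only rows i >= j are written.
    for (int i = j; i < n; ++i) {
      double acc = 0.0;
      for (int p = 0; p < k; ++p) {
        // Element (i, p) of A times element (j, p) of A, i.e. (A * A^T)(i, j).
        acc += A[i + p * lda] * A[j + p * lda];
      }
      C[i + j * ldc] = alpha * acc + beta * C[i + j * ldc];
    }
  }
}
```

Entries above the diagonal are left untouched, so a comparison against device output should look only at the triangle that was updated.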
@@ -37,7 +37,7 @@
 the triangular matrix product (TRMM) using double-precision floating-point arithmetic and assumes
 all matrices have column-major layout.
 
-The threadblock tile size is chosen as 128x128x8 which offers good performance for large matrices.
+The threadblock tile size is chosen as 64x64x16 which offers good performance for large matrices.
 See the CUTLASS Parallel for All blog post for more exposition on the tunable parameters available
 in CUTLASS.
 
@@ -84,7 +84,7 @@ cudaError_t CutlassStrmmNN(
 int ldc) {
 
 // Define type definition for double-precision CUTLASS TRMM with column-major
-// input matrices and 128x128x8 threadblock tile size (chosen by default).
+// input matrices and 64x64x16 threadblock tile size (chosen by default).
 //
 // To keep the interface manageable, several helpers are defined for plausible compositions
 // including the following example for double-precision TRMM. Typical values are used as
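To make the corrected tile sizes concrete, the sketch below counts how many MxNxK tiles cover a problem, reading a tile such as 64x64x16 as a 64x64 output block advanced 16 steps along K per main-loop iteration. The names `TileShape`, `TileGrid`, `CeilDiv`, and `CountTiles` are hypothetical, and the count is only an upper-level estimate: the actual CUTLASS launch may differ, for example by skipping tiles outside the triangular region or by remapping threadblock indices.

```cpp
#include <cassert>

// Hypothetical helper types, not CUTLASS types.
struct TileShape { int m; int n; int k; };
struct TileGrid { int rows; int cols; int k_iterations; };

// Integer division rounded up, for positive operands.
constexpr int CeilDiv(int a, int b) { return (a + b - 1) / b; }

// Number of output tiles along M and N, and main-loop steps along K,
// needed to cover an m x n x k problem with the given tile shape.
TileGrid CountTiles(int m, int n, int k, TileShape tile) {
  assert(tile.m > 0 && tile.n > 0 && tile.k > 0);
  return TileGrid{CeilDiv(m, tile.m), CeilDiv(n, tile.n), CeilDiv(k, tile.k)};
}
```

Under these assumptions, a 1000x1000x1000 problem needs a 16x16 grid of 64x64x16 tiles, or a 63x32 grid of 16x32x16 tiles, with 63 K-steps in both cases.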