diff --git a/README.md b/README.md
index 05a0d3a3..b0f9028a 100644
--- a/README.md
+++ b/README.md
@@ -27,7 +27,7 @@ primitives for loading data, computing predicate masks, streaming data at each l
 the GEMM hierarchy, and updating the output matrix.
 
 CUTLASS 1.0 is described in the [Doxygen documentation](https://github.com/NVIDIA/cutlass/docs)
-and our talk at the GPU Technology Conference 2018 (login required).
+and our talk at the [GPU Technology Conference 2018](http://on-demand.gputechconf.com/gtc/2018/presentation/s8854-cutlass-software-primitives-for-dense-linear-algebra-at-all-levels-and-scales-within-cuda.pdf) (login required).
 
 # Performance
 
@@ -162,6 +162,8 @@ Program usage:
  --append= If true, appends output to existing CSV file. If false, overwrites.
  --alpha= Value for alpha to be used in GEMM experiments
  --beta= Value for beta to be used in GEMM experiments
+ --dist= Describes the random distribution of each of the input matrix operands.
+ --execution_mode= Specifies execution mode: profile, verify, single
  --output= Writes summary of profiling to specified .csv file
  --iterations= maximum number of iterations to execute when profiling
  --m=[:max height[:step]] Height of GEMM problem (number of rows of C). May specify a range with optional step size.
@@ -169,8 +171,9 @@ Program usage:
  --k=[:max depth[:step]] Size of inner dimension of A and B. May specify a range with optional step size.
  --kernels=<{s|d|h|i|wmma}gemm_{nn,nt,tn,tt}> Select GEMM datatype and layout to use for tests
  --peak= If true, only reports peak performance per kernel after profiling specified problem space.
+ --save_workspace={*never,incorrect,always} Specifies when to save the GEMM inputs and results to the filesystem.
  --seed= Random seed used by the random number generator in initializing input matrices.
- --tags= Inserts leading columns in output table and uniform values for each column. Useful for generating pivot tables.
+ --tags= Inserts leading columns in output table and uniform values for each column. Example usage: