Haicheng Wu
|
69abafb85a
|
Merge pull request #306 from NVIDIA/fix-profiler-cmd-doc
Fix profiler cmd doc
|
2021-07-30 14:36:54 -04:00 |
|
Haicheng Wu
|
68a078fbbf
|
cleanup
|
2021-07-30 11:27:21 -07:00 |
|
Haicheng Wu
|
10709dbb64
|
clean profiler cmd and doc
|
2021-07-30 11:02:17 -07:00 |
|
Manish Gupta
|
1227351079
|
Merge pull request #305 from NVIDIA/fix_epilogue_spill
fix epilogue register spill
|
2021-07-29 14:30:11 -07:00 |
|
Haicheng Wu
|
a77c658439
|
fix epilogue register spill
|
2021-07-29 14:25:48 -07:00 |
|
Haicheng Wu
|
4516b833ce
|
Merge pull request #303 from Peter9606/doc_typo
Doc typo
|
2021-07-28 20:49:06 -04:00 |
|
Peter Han
|
64dd1e1915
|
Doc typo
Signed-off-by: Peter Han <fujun.han@iluvatar.ai>
|
2021-07-29 08:45:59 +08:00 |
|
Manish Gupta
|
1ac4559d12
|
Cutlass 2.6 Update 1 (#301)
* cutlass 2.6 update
* remove debug prints
|
2021-07-27 17:58:30 -07:00 |
|
Manish Gupta
|
e5d51840e8
|
CUTLASS 2.6 (#298)
CUTLASS 2.6
|
2021-07-23 00:40:53 -04:00 |
|
Haicheng Wu
|
6c29fe20ba
|
Merge pull request #285 from tjingrant/patch-1
Typo Fixes
|
2021-07-05 22:51:19 -04:00 |
|
Tian Jin
|
e3c56b0d6b
|
Update predicated_tile_iterator.h
|
2021-07-05 12:11:53 -04:00 |
|
Tian Jin
|
4647c57243
|
Update predicated_tile_iterator.h
|
2021-07-05 12:06:41 -04:00 |
|
Haicheng Wu
|
856d4db3fb
|
Update basic_gemm.cu
fix the matrix malloc size
|
2021-06-15 09:08:36 -04:00 |
|
Haicheng Wu
|
6a1064093f
|
Merge pull request #274 from mani-ananth/master
Some pending Bug fixes
|
2021-06-02 13:17:39 -04:00 |
|
Manikandan Ananth
|
c5f1ef4dff
|
update contributors
|
2021-06-02 10:11:42 -07:00 |
|
Manikandan Ananth
|
47ebfccbec
|
bug fixes
|
2021-06-02 10:08:25 -07:00 |
|
Haicheng Wu
|
ad9486684f
|
Merge pull request #272 from BernardoCovas/master
Bug in reference conv3d
|
2021-05-28 17:18:27 -04:00 |
|
Bernardo Covas
|
1d8372a8e2
|
fix typo in reference conv3d
|
2021-05-28 21:06:59 +01:00 |
|
Haicheng Wu
|
9cb7d63424
|
Merge pull request #266 from mani-ananth/master
Fixes for public issue #265
|
2021-05-19 15:15:22 -04:00 |
|
Manikandan Ananth
|
da2f110906
|
Fixes for public issue #265
|
2021-05-19 10:16:52 -07:00 |
|
Haicheng Wu
|
b68113f5be
|
Merge pull request #264 from zheng95z/patch-3
Adds `NoBetaScaling` for `LinearCombination`
|
2021-05-17 10:03:30 -04:00 |
|
Zheng Zeng
|
a68d7cd6f1
|
Adds NoBetaScaling for LinearCombination
|
2021-05-12 22:23:55 +08:00 |
|
Haicheng Wu
|
38e8b29f56
|
Merge pull request #259 from hzfan/ignore_pr
Add gitignore
|
2021-05-10 20:07:53 -04:00 |
|
Haozheng Fan
|
ee7349c94f
|
fix
|
2021-05-10 16:39:04 +08:00 |
|
Haozheng Fan
|
8cdd4293d4
|
add gitignore
|
2021-05-10 16:37:59 +08:00 |
|
Haicheng Wu
|
f58b843951
|
Merge pull request #239 from KeDengMS/kedeng/gelu
Fixes to Gelu for half and fusion
|
2021-05-08 12:51:42 -04:00 |
|
Haicheng Wu
|
5fc142296f
|
Merge pull request #237 from Peter9606/issue_236_typo
Typo fix issue#236
|
2021-05-08 07:51:19 -04:00 |
|
Haicheng Wu
|
233d69aa6d
|
Merge pull request #235 from Peter9606/issue_233_tranpose_update
tranpose.h update based on issue#233
|
2021-05-07 07:14:30 -04:00 |
|
Haicheng Wu
|
9840d25269
|
Merge pull request #256 from zheng95z/patch-2
Fixes some typos in utilities.md
|
2021-05-06 11:02:49 -04:00 |
|
Zheng Zeng
|
b878c96421
|
Fixes some typos in utilities.md
|
2021-05-06 22:37:37 +08:00 |
|
Haicheng Wu
|
8f8a80cad5
|
Merge pull request #251 from zheng95z/patch-1
add a missing 'device_memory::' before a function
|
2021-04-25 22:09:44 -04:00 |
|
Zheng Zeng
|
a8f6f8eb07
|
add a missing 'device_memory::' before a function
|
2021-04-25 20:05:39 +08:00 |
|
Haicheng Wu
|
7c783adf53
|
Merge pull request #247 from xue-fc/patch-1
fix a wrong description
|
2021-04-22 09:27:40 -04:00 |
|
xue-fc
|
4000df9567
|
fix a wrong description
|
2021-04-22 20:28:28 +08:00 |
|
KeDengMS
|
0b74c8f473
|
Address CR
|
2021-04-19 23:36:06 +00:00 |
|
KeDengMS
|
83036ed646
|
More clean up
|
2021-04-18 04:29:20 +00:00 |
|
KeDengMS
|
b7e43f5eb9
|
Clean up
|
2021-04-18 04:24:25 +00:00 |
|
KeDengMS
|
5c62d892fa
|
Add test
|
2021-04-18 04:09:34 +00:00 |
|
KeDengMS
|
41a31b404b
|
Fixes to Gelu for half and fusion
|
2021-04-17 22:10:19 +00:00 |
|
Peter Han
|
7320aee17d
|
Typo fix issue#236
Signed-off-by: Peter Han <fujun.han@iluvatar.ai>
|
2021-04-15 15:08:35 +08:00 |
|
Peter Han
|
2142a05d9d
|
tranpose.h update based on issue#233
1. Add 'pragma once' preprocess directive
2. Replace prmt PTX with __byte_perm intrinsic
Signed-off-by: Peter Han <fujun.han@iluvatar.ai>
|
2021-04-14 19:58:00 +08:00 |
|
Haicheng Wu
|
c77a524459
|
Merge pull request #230 from mani-ananth/master
Fix for issue #221
|
2021-04-09 14:45:55 -04:00 |
|
Manikandan Ananth
|
fac6680f31
|
Merge branch 'master' of github.com:NVIDIA/cutlass
|
2021-04-09 11:36:31 -07:00 |
|
Manikandan Ananth
|
08993707da
|
fixing functional bug in fused epilogue
|
2021-04-09 11:36:03 -07:00 |
|
Haicheng Wu
|
c805593ebe
|
Merge pull request #228 from mani-ananth/master
Fix for issue#224 and issue#225
|
2021-04-08 10:08:13 -04:00 |
|
Manikandan Ananth
|
26556d7206
|
fix a broken sparse gemm example. found by the community.
|
2021-04-07 13:32:55 -07:00 |
|
Manikandan Ananth
|
4839b6cb61
|
add 2stage fprop 3d into default file
|
2021-04-07 13:29:32 -07:00 |
|
Haicheng Wu
|
d97214987a
|
Merge pull request #220 from Peter9606/wrong-stride-array-definition
Bugfix: typo, make reduction device cases passed
|
2021-04-02 08:43:52 -04:00 |
|
Haicheng Wu
|
b0bbc6d548
|
Merge pull request #219 from mani-ananth/master
Fix for issue #211
|
2021-04-02 08:42:09 -04:00 |
|
Peter Han
|
7074047a54
|
Bugfix: typo, make reduction device cases passed
Signed-off-by: Peter Han <fujun.han@iluvatar.ai>
|
2021-04-02 09:35:23 +08:00 |
|