Andrew Kerr
|
12f4108ac2
|
CUTLASS 2.9 (#468)
|
2022-04-23 15:02:38 -04:00 |
|
Bing Xu
|
d0d941efc7
|
[hardswish] correct implementation (#403)
* [hardswish] correct implementation
* seems working
* hardswish fp32/fp16x2 optimization
* [relu] half2 support
* add relu0; add multiply_add_relu0;
* cleanup
Co-authored-by: Bing Xu <bingxu@fb.com>
Co-authored-by: Haicheng Wu <haichengw@nvidia.com>
|
2022-02-09 14:28:53 -05:00 |
|
Manish Gupta
|
1ac4559d12
|
Cutlass 2.6 Update 1 (#301)
* cutlass 2.6 update
* remove debug prints
|
2021-07-27 17:58:30 -07:00 |
|
Manish Gupta
|
e5d51840e8
|
CUTLASS 2.6 (#298)
CUTLASS 2.6
|
2021-07-23 00:40:53 -04:00 |
|
Andrew Kerr
|
0e13748649
|
CUTLASS 2.5
|
2021-02-26 09:58:26 -05:00 |
|