| File | Last commit | Date |
| --- | --- | --- |
| __init__.py | [CI/Build] Move test_utils.py to tests/utils.py (#4425) | 2024-05-13 23:50:09 +09:00 |
| allclose_default.py | [ROCm] Fix some kernels failed unit tests (#2498) | 2024-02-05 14:25:36 -08:00 |
| conftest.py | [Kernel] Use flashinfer for decoding (#4353) | 2024-05-03 15:51:27 -07:00 |
| quant_utils.py | [ Kernel ] FP8 Dynamic-Per-Token Quant Kernel (#6511) | 2024-07-18 01:38:35 +00:00 |
| test_activation.py | [Misc] Add CustomOp interface for device portability (#5255) | 2024-06-05 09:18:19 -07:00 |
| test_attention_selector.py | [Kernel] Correctly invoke prefill & decode kernels for cross-attention (towards eventual encoder/decoder model support) (#4888) | 2024-07-08 17:12:15 +00:00 |
| test_attention.py | [Kernel][Attention] Separate Attention.kv_scale into k_scale and v_scale (#6081) | 2024-07-16 15:31:32 -07:00 |
| test_blocksparse_attention.py | [Kernel][Attention] Separate Attention.kv_scale into k_scale and v_scale (#6081) | 2024-07-16 15:31:32 -07:00 |
| test_cache.py | [Kernel][Attention] Separate Attention.kv_scale into k_scale and v_scale (#6081) | 2024-07-16 15:31:32 -07:00 |
| test_cutlass.py | [hardware][misc] introduce platform abstraction (#6080) | 2024-07-02 20:12:22 -07:00 |
| test_encoder_decoder_attn.py | [Kernel] Correctly invoke prefill & decode kernels for cross-attention (towards eventual encoder/decoder model support) (#4888) | 2024-07-08 17:12:15 +00:00 |
| test_flash_attn.py | [mypy] Enable type checking for test directory (#5017) | 2024-06-15 04:45:31 +00:00 |
| test_flashinfer.py | [Kernel][Model] logits_soft_cap for Gemma2 with flashinfer (#6051) | 2024-07-04 16:35:51 -07:00 |
| test_fp8_quant.py | [ Kernel ] FP8 Dynamic-Per-Token Quant Kernel (#6511) | 2024-07-18 01:38:35 +00:00 |
| test_int8_quant.py | [ Kernel ] FP8 Dynamic-Per-Token Quant Kernel (#6511) | 2024-07-18 01:38:35 +00:00 |
| test_layernorm.py | [Misc] Add CustomOp interface for device portability (#5255) | 2024-06-05 09:18:19 -07:00 |
| test_marlin_gemm.py | [ Misc ] Refactor Marlin Python Utilities (#6082) | 2024-07-11 15:40:11 +00:00 |
| test_moe.py | [ Misc ] Refactor MoE to isolate Fp8 From Mixtral (#5970) | 2024-07-02 21:54:35 +00:00 |
| test_pos_encoding.py | [mypy] Enable type checking for test directory (#5017) | 2024-06-15 04:45:31 +00:00 |
| test_prefix_prefill.py | [Bugfix][Kernel] allow non-power-of-2 for prefix prefill with alibi (#4573) | 2024-05-08 09:19:58 -07:00 |
| test_rand.py | [CI] Try introducing isort. (#3495) | 2024-03-25 07:59:47 -07:00 |
| test_sampler.py | [CI] Try introducing isort. (#3495) | 2024-03-25 07:59:47 -07:00 |
| utils.py | [Kernel] Correctly invoke prefill & decode kernels for cross-attention (towards eventual encoder/decoder model support) (#4888) | 2024-07-08 17:12:15 +00:00 |