| .. |
|
| File | Last commit | Date |
| --- | --- | --- |
| __init__.py | [CI/Build] Move test_utils.py to tests/utils.py (#4425) | 2024-05-13 23:50:09 +09:00 |
| allclose_default.py | [ROCm] Fix some kernels failed unit tests (#2498) | 2024-02-05 14:25:36 -08:00 |
| conftest.py | [Kernel] Use flashinfer for decoding (#4353) | 2024-05-03 15:51:27 -07:00 |
| test_activation.py | [Misc] Add CustomOp interface for device portability (#5255) | 2024-06-05 09:18:19 -07:00 |
| test_attention_selector.py | [Bugfix]: During testing, use pytest monkeypatch for safely overriding the env var that indicates the vLLM backend (#5210) | 2024-06-03 20:32:57 -07:00 |
| test_attention.py | [Model] Support MAP-NEO model (#5081) | 2024-05-30 19:24:41 -07:00 |
| test_blocksparse_attention.py | [Kernel][Backend][Model] Blocksparse flash attention kernel and Phi-3-Small model (#4799) | 2024-05-24 22:00:52 -07:00 |
| test_cache.py | [Model] Support MAP-NEO model (#5081) | 2024-05-30 19:24:41 -07:00 |
| test_cutlass.py | [Kernel] Update Cutlass fp8 configs (#5144) | 2024-06-01 08:46:07 +00:00 |
| test_flash_attn.py | [Kernel] Add flash-attn back (#4907) | 2024-05-19 18:11:30 -07:00 |
| test_int8_quant.py | [Kernel] Dynamic Per-Token Activation Quantization (#5037) | 2024-06-07 09:36:26 -07:00 |
| test_layernorm.py | [Misc] Add CustomOp interface for device portability (#5255) | 2024-06-05 09:18:19 -07:00 |
| test_marlin_gemm.py | Marlin 24 prefill performance improvement (about 25% better on average) (#4983) | 2024-05-23 02:39:27 -04:00 |
| test_moe.py | [Kernel] Support MoE Fp8 Checkpoints for Mixtral (Static Weights with Dynamic/Static Activations) (#4527) | 2024-05-04 11:45:16 -07:00 |
| test_pos_encoding.py | [Misc] Add CustomOp interface for device portability (#5255) | 2024-06-05 09:18:19 -07:00 |
| test_prefix_prefill.py | [Bugfix][Kernel] allow non-power-of-2 for prefix prefill with alibi (#4573) | 2024-05-08 09:19:58 -07:00 |
| test_rand.py | [CI] Try introducing isort. (#3495) | 2024-03-25 07:59:47 -07:00 |
| test_sampler.py | [CI] Try introducing isort. (#3495) | 2024-03-25 07:59:47 -07:00 |
| utils.py | [Bugfix]: During testing, use pytest monkeypatch for safely overriding the env var that indicates the vLLM backend (#5210) | 2024-06-03 20:32:57 -07:00 |