vllm/tests/models
Latest commit: 6979ade384 by Alexander Matveev, 2024-05-16 12:56:15 -04:00
Add GPTQ Marlin 2:4 sparse structured support (#4790)
Co-authored-by: Robert Shaw <rshaw@neuralmagic.com>
__init__.py [CI/Build] Move test_utils.py to tests/utils.py (#4425) 2024-05-13 23:50:09 +09:00
test_aqlm.py AQLM CUDA support (#3287) 2024-04-23 13:59:33 -04:00
test_big_models.py Revert "[Kernel] Use flash-attn for decoding (#3648)" (#4820) 2024-05-15 11:52:45 +09:00
test_embedding.py [Model][Misc] Add e5-mistral-7b-instruct and Embedding API (#3734) 2024-05-11 11:30:37 -07:00
test_fp8.py Revert "[Kernel] Use flash-attn for decoding (#3648)" (#4820) 2024-05-15 11:52:45 +09:00
test_gptq_marlin_24.py Add GPTQ Marlin 2:4 sparse structured support (#4790) 2024-05-16 12:56:15 -04:00
test_gptq_marlin.py [Kernel] add bfloat16 support for gptq marlin kernel (#4788) 2024-05-16 09:55:29 -04:00
test_llava.py [Test] Make model tests run again and remove --forked from pytest (#3631) 2024-03-28 21:06:40 -07:00
test_marlin.py [CI/Build] Move test_utils.py to tests/utils.py (#4425) 2024-05-13 23:50:09 +09:00
test_mistral.py [CI/Build] Move test_utils.py to tests/utils.py (#4425) 2024-05-13 23:50:09 +09:00
test_models.py [Misc]Add customized information for models (#4132) 2024-04-30 21:18:14 -07:00
test_oot_registration.py [Core] enable out-of-tree model register (#3871) 2024-04-06 17:11:41 -07:00
utils.py [Kernel] Marlin Expansion: Support AutoGPTQ Models with Marlin (#3922) 2024-04-29 09:35:34 -07:00