vllm/models at 7fc23be81c55ca0570f551871a3adc994aaefc05 - vllm

Mor Zusman 7fc23be81c [Kernel] W8A16 Int8 inside FusedMoE (#7415)	2024-08-16 10:06:51 -07:00
__init__.py	[CI/Build] Move `test_utils.py` to `tests/utils.py` (#4425)	2024-05-13 23:50:09 +09:00
test_aqlm.py	[CI/Build][REDO] Add is_quant_method_supported to control quantization test configurations (#5466)	2024-06-13 15:18:08 +00:00
test_bart.py	[Core] Support serving encoder/decoder models (#7258)	2024-08-09 10:39:41 +08:00
test_big_models.py	[CI/Build] Reuse code for checking output consistency (#5988)	2024-06-30 11:44:25 +08:00
test_blip2.py	[VLM][Core] Support profiling with multiple multi-modal inputs per prompt (#7126)	2024-08-14 17:55:42 +00:00
test_chameleon.py	[Bugfix] Fix weight loading for Chameleon when TP>1 (#7410)	2024-08-13 05:33:41 +00:00
test_danube3_4b.py	[Model] H2O Danube3-4b (#6451)	2024-07-26 20:47:50 -07:00
test_embedding.py	[CI/Test] improve robustness of test (vllm_runner) (#5357)	2024-06-08 08:59:20 +00:00
test_fp8.py	[mypy] Enable type checking for test directory (#5017)	2024-06-15 04:45:31 +00:00
test_fuyu.py	[VLM][Core] Support profiling with multiple multi-modal inputs per prompt (#7126)	2024-08-14 17:55:42 +00:00
test_gguf.py	[Core] Support loading GGUF model (#5191)	2024-08-05 17:54:23 -06:00
test_gptq_marlin_24.py	[CI/Build][REDO] Add is_quant_method_supported to control quantization test configurations (#5466)	2024-06-13 15:18:08 +00:00
test_gptq_marlin.py	add gptq_marlin test for bug report https://github.com/vllm-project/vllm/issues/5088 (#5145)	2024-06-15 13:38:16 -04:00
test_internvl.py	[VLM][Core] Support profiling with multiple multi-modal inputs per prompt (#7126)	2024-08-14 17:55:42 +00:00
test_jamba.py	[Kernel] W8A16 Int8 inside FusedMoE (#7415)	2024-08-16 10:06:51 -07:00
test_llava_image_embeds.py	[Core][VLM] Support image embeddings as input (#6613)	2024-08-12 16:16:06 +08:00
test_llava_next.py	[VLM][Core] Support profiling with multiple multi-modal inputs per prompt (#7126)	2024-08-14 17:55:42 +00:00
test_llava.py	[VLM][Core] Support profiling with multiple multi-modal inputs per prompt (#7126)	2024-08-14 17:55:42 +00:00
test_marlin.py	[CI/Build][REDO] Add is_quant_method_supported to control quantization test configurations (#5466)	2024-06-13 15:18:08 +00:00
test_minicpmv.py	[VLM][Core] Support profiling with multiple multi-modal inputs per prompt (#7126)	2024-08-14 17:55:42 +00:00
test_mistral.py	[CI/Test] improve robustness of test (vllm_runner) (#5357)	2024-06-08 08:59:20 +00:00
test_models.py	[CI/Build] Reuse code for checking output consistency (#5988)	2024-06-30 11:44:25 +08:00
test_oot_registration.py	[misc][ci] fix cpu test with plugins (#7489)	2024-08-13 19:27:46 -07:00
test_paligemma.py	[VLM][Core] Support profiling with multiple multi-modal inputs per prompt (#7126)	2024-08-14 17:55:42 +00:00
test_phi3v.py	[VLM][Core] Support profiling with multiple multi-modal inputs per prompt (#7126)	2024-08-14 17:55:42 +00:00
test_registry.py	[Model] Support SigLIP encoder and alternative decoders for LLaVA models (#7153)	2024-08-06 16:55:31 +08:00
utils.py	[Core] Support serving encoder/decoder models (#7258)	2024-08-09 10:39:41 +08:00