vllm/cpu at 1a95f10ee7d2ffa538a6d210b53bf363e039feee - vllm

Li, Jiang a6f332d0d9 [Hardware][CPU][bugfix] Fix half dtype support on AVX2-only target (#10108) Signed-off-by: jiang1.li <jiang1.li@intel.com>	2024-11-07 18:42:50 +08:00
activation.cpp	[Kernel][CPU] Add Quick `gelu` to CPU (#5717)	2024-06-21 06:39:40 +00:00
attention.cpp	[Hardware][CPU] Update torch 2.5 (#9911)	2024-11-07 04:43:08 +00:00
cache.cpp	[Kernel][Attention] Separate `Attention.kv_scale` into `k_scale` and `v_scale` (#6081)	2024-07-16 15:31:32 -07:00
cpu_types_vsx.hpp	Support CPU inference with VSX PowerPC ISA (#5652)	2024-06-26 21:53:04 +00:00
cpu_types_x86.hpp	[Hardware][CPU][bugfix] Fix half dtype support on AVX2-only target (#10108)	2024-11-07 18:42:50 +08:00
cpu_types.hpp	Support CPU inference with VSX PowerPC ISA (#5652)	2024-06-26 21:53:04 +00:00
dnnl_helper.hpp	[Hardware][CPU] Update torch 2.5 (#9911)	2024-11-07 04:43:08 +00:00
layernorm.cpp	[Kernel][Misc] Use TORCH_LIBRARY instead of PYBIND11_MODULE for custom ops (#5047)	2024-06-09 16:23:30 -04:00
pos_encoding.cpp	[Kernel][Misc] Use TORCH_LIBRARY instead of PYBIND11_MODULE for custom ops (#5047)	2024-06-09 16:23:30 -04:00
quant.cpp	[Hardware][CPU] Update torch 2.5 (#9911)	2024-11-07 04:43:08 +00:00
torch_bindings.cpp	[Hardware][CPU] compressed-tensor INT8 W8A8 AZP support (#9344)	2024-10-17 12:21:04 -04:00
utils.cpp	[Hardware][Intel] Support compressed-tensor W8A8 for CPU backend (#7257)	2024-09-11 09:46:46 -07:00