youkaichao
|
eebad39f26
|
[torch.compile] support all attention backends (#10558)
Signed-off-by: youkaichao <youkaichao@gmail.com>
|
2024-11-22 14:04:42 -08:00 |
|
youkaichao
|
a111d0151f
|
[platforms] absorb worker cls difference into platforms folder (#10555)
Signed-off-by: youkaichao <youkaichao@gmail.com>
Co-authored-by: Nick Hill <nhill@redhat.com>
|
2024-11-21 21:00:32 -08:00 |
|
Mengqing Cao
|
9d827170a3
|
[Platforms] Add device_type in Platform (#10508)
Signed-off-by: MengqingCao <cmq0113@163.com>
|
2024-11-21 04:44:20 +00:00 |
|
Li, Jiang
|
63f1fde277
|
[Hardware][CPU] Support chunked-prefill and prefix-caching on CPU (#10355)
Signed-off-by: jiang1.li <jiang1.li@intel.com>
|
2024-11-20 10:57:39 +00:00 |
|
Mengqing Cao
|
8c1fb50705
|
[Platform][Refactor] Extract func get_default_attn_backend to Platform (#10358)
Signed-off-by: Mengqing Cao <cmq0113@163.com>
|
2024-11-19 11:22:26 +08:00 |
|
youkaichao
|
8d74b5aee9
|
[platforms] refactor cpu code (#10402)
Signed-off-by: youkaichao <youkaichao@gmail.com>
|
2024-11-16 23:14:23 -08:00 |
|
Cyrus Leung
|
26a68d5d7e
|
[CI/Build] Add test decorator for minimum GPU memory (#8925)
|
2024-09-29 02:50:51 +00:00 |
|
Cyrus Leung
|
6ffa3f314c
|
[CI/Build] Avoid CUDA initialization (#8534)
|
2024-09-18 10:38:11 +00:00 |
|
Li, Jiang
|
0b952af458
|
[Hardware][Intel] Support compressed-tensor W8A8 for CPU backend (#7257)
|
2024-09-11 09:46:46 -07:00 |
|