Remove block manager v1. This is the first step toward a prefix-caching-centric design. To get there, we need to simplify the code path so that only the v2 block manager is used, since it performs much better on prefix caching.

| Name |
|---|
| lm-eval-harness |
| nightly-benchmarks |
| check-wheel-size.py |
| release-pipeline.yaml |
| run-amd-test.sh |
| run-benchmarks.sh |
| run-cpu-test-ppc64le.sh |
| run-cpu-test.sh |
| run-multi-node-test.sh |
| run-neuron-test.sh |
| run-openvino-test.sh |
| run-tpu-test.sh |
| run-xpu-test.sh |
| test-pipeline.yaml |