Remove block manager v1. This is the first piece of the prefix-caching-centric design. To achieve that design, we need to simplify the code path so that only the v2 block manager is used (it has much higher performance on prefix caching).

| Name |
|---|
| e2e |
| __init__.py |
| test_batch_expansion.py |
| test_dynamic_spec_decode.py |
| test_metrics.py |
| test_multi_step_worker.py |
| test_ngram_worker.py |
| test_scorer.py |
| test_spec_decode_worker.py |
| test_utils.py |
| utils.py |