From d7263a1bb837648bec67d99ed35db56c58832d3f Mon Sep 17 00:00:00 2001
From: Rafael Vasquez
Date: Thu, 7 Nov 2024 02:50:35 -0500
Subject: [PATCH] Doc: Improve benchmark documentation (#9927)

Signed-off-by: Rafael Vasquez
---
 docs/source/dev/profiling/profiling_index.rst |  5 +--
 docs/source/index.rst                         |  4 +--
 docs/source/performance/benchmarks.rst        | 33 +++++++++++++++++++
 .../performance_benchmark/benchmarks.rst      | 23 -------------
 4 files changed, 38 insertions(+), 27 deletions(-)
 create mode 100644 docs/source/performance/benchmarks.rst
 delete mode 100644 docs/source/performance_benchmark/benchmarks.rst

diff --git a/docs/source/dev/profiling/profiling_index.rst b/docs/source/dev/profiling/profiling_index.rst
index 9e8b2f18..a422b1fc 100644
--- a/docs/source/dev/profiling/profiling_index.rst
+++ b/docs/source/dev/profiling/profiling_index.rst
@@ -1,5 +1,6 @@
-Profiling vLLM
-=================================
+==============
+Profiling vLLM
+==============
 
 We support tracing vLLM workers using the ``torch.profiler`` module. You can enable tracing by setting the ``VLLM_TORCH_PROFILER_DIR`` environment variable to the directory where you want to save the traces: ``VLLM_TORCH_PROFILER_DIR=/mnt/traces/``
 
diff --git a/docs/source/index.rst b/docs/source/index.rst
index 51add1fd..38dad25e 100644
--- a/docs/source/index.rst
+++ b/docs/source/index.rst
@@ -126,9 +126,9 @@ Documentation
 
 .. toctree::
    :maxdepth: 1
-   :caption: Performance benchmarks
+   :caption: Performance
 
-   performance_benchmark/benchmarks
+   performance/benchmarks
 
 .. toctree::
    :maxdepth: 2
diff --git a/docs/source/performance/benchmarks.rst b/docs/source/performance/benchmarks.rst
new file mode 100644
index 00000000..6d4d7b54
--- /dev/null
+++ b/docs/source/performance/benchmarks.rst
@@ -0,0 +1,33 @@
+.. _benchmarks:
+
+================
+Benchmark Suites
+================
+
+vLLM contains two sets of benchmarks:
+
++ :ref:`Performance benchmarks `
++ :ref:`Nightly benchmarks `
+
+
+.. _performance_benchmarks:
+
+Performance Benchmarks
+----------------------
+
+The performance benchmarks are used for development to confirm whether new changes improve performance under various workloads. They are triggered on every commit with both the ``perf-benchmarks`` and ``ready`` labels, and when a PR is merged into vLLM.
+
+The latest performance results are hosted on the public `vLLM Performance Dashboard `_.
+
+More information on the performance benchmarks and their parameters can be found `here `__.
+
+.. _nightly_benchmarks:
+
+Nightly Benchmarks
+------------------
+
+These compare vLLM's performance against alternatives (``tgi``, ``trt-llm``, and ``lmdeploy``) when there are major updates of vLLM (e.g., bumping up to a new version). They are primarily intended for consumers to evaluate when to choose vLLM over other options and are triggered on every commit with both the ``perf-benchmarks`` and ``nightly-benchmarks`` labels.
+
+The latest nightly benchmark results are shared in major release blog posts such as `vLLM v0.6.0 `_.
+
+More information on the nightly benchmarks and their parameters can be found `here `__.
\ No newline at end of file
diff --git a/docs/source/performance_benchmark/benchmarks.rst b/docs/source/performance_benchmark/benchmarks.rst
deleted file mode 100644
index e5c8d6a5..00000000
--- a/docs/source/performance_benchmark/benchmarks.rst
+++ /dev/null
@@ -1,23 +0,0 @@
-.. _benchmarks:
-
-Benchmark suites of vLLM
-========================
-
-
-
-vLLM contains two sets of benchmarks:
-
-+ **Performance benchmarks**: benchmark vLLM's performance under various workloads at a high frequency (when a pull request (PR for short) of vLLM is being merged). See `vLLM performance dashboard `_ for the latest performance results.
-
-+ **Nightly benchmarks**: compare vLLM's performance against alternatives (tgi, trt-llm, and lmdeploy) when there are major updates of vLLM (e.g., bumping up to a new version). The latest results are available in the `vLLM GitHub README `_.
-
-
-Trigger a benchmark
--------------------
-
-The performance benchmarks and nightly benchmarks can be triggered by submitting a PR to vLLM, and label the PR with `perf-benchmarks` and `nightly-benchmarks`.
-
-
-.. note::
-
-   Please refer to `vLLM performance benchmark descriptions `_ and `vLLM nightly benchmark descriptions `_ for detailed descriptions on benchmark environment, workload and metrics.