vLLM releases offer a reliable version of the code base, packaged into a binary format that can be conveniently accessed via [PyPI](https://pypi.org/project/vllm). These releases also serve as key milestones for the development team to communicate with the community about newly available features, improvements, and upcoming changes that could affect users, including potential breaking changes.
We aim to publish a regular release every 2 weeks. Since v0.12.0, regular releases increment the minor version rather than the patch version. The list of past releases can be found [on GitHub](https://github.com/vllm-project/vllm/releases).
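If you want to confirm which release is installed in your environment, here is a minimal sketch using only the Python standard library (assuming vllm was installed from PyPI):

```python
# Minimal sketch: report the installed vLLM release, assuming vllm was
# installed from PyPI into the current environment.
from importlib.metadata import PackageNotFoundError, version

try:
    print(f"Installed vLLM release: {version('vllm')}")
except PackageNotFoundError:
    print("vLLM is not installed in this environment.")
```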
Our version numbers are expressed in the form vX.Y.Z, where X is the major version, Y is the minor version, and Z is the patch version, and each component is incremented according to the rules below.
This versioning scheme is similar to SemVer for compatibility purposes, except that backwards compatibility is only guaranteed for a limited number of minor releases (see our deprecation policy for details).
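As an illustration of the vX.Y.Z scheme, here is a minimal Python sketch that parses a version tag and checks whether an upgrade stays within an assumed compatibility window; the window size (MAX_MINOR_SKEW) is a placeholder for illustration, not the value defined by our deprecation policy:

```python
# Hypothetical sketch: parse vX.Y.Z version strings and flag upgrades that
# jump more than an assumed number of minor releases. The window size below
# is a placeholder, not the actual value from vLLM's deprecation policy.
from dataclasses import dataclass

MAX_MINOR_SKEW = 2  # assumption for illustration only


@dataclass(frozen=True)
class Version:
    major: int
    minor: int
    patch: int

    @classmethod
    def parse(cls, tag: str) -> "Version":
        # Accept either "0.12.0" or "v0.12.0"; ignore any "-rcN" suffix.
        core = tag.lstrip("v").split("-")[0]
        major, minor, patch = (int(part) for part in core.split("."))
        return cls(major, minor, patch)


def upgrade_is_within_window(current: str, target: str) -> bool:
    """Return True if the upgrade stays within the assumed compatibility window."""
    cur, tgt = Version.parse(current), Version.parse(target)
    if cur.major != tgt.major:
        return False
    return (tgt.minor - cur.minor) <= MAX_MINOR_SKEW


if __name__ == "__main__":
    print(upgrade_is_within_window("v0.12.0", "v0.14.0"))  # True under this assumed window
    print(upgrade_is_within_window("v0.12.0", "v0.15.0"))  # False under this assumed window
```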
Each release is built from a dedicated release branch.
Release candidate builds are triggered by tags of the form vX.Y.Z-rc1, which enables us to build and test multiple RCs for each release. The final vX.Y.Z tag does not trigger a build; it is used for the release notes and assets. After the branch cut, we approach finalizing the release branch with clear criteria on which cherry-picks are allowed in. Note: a cherry-pick is the process of landing a PR on the release branch after the branch cut. Cherry-picks are typically limited to ensure that the team has sufficient time to complete a thorough round of testing on a stable code base.
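For illustration only (this is not the actual release tooling), a small sketch of how the two tag forms described above can be told apart:

```python
# Illustrative sketch: classify a git tag as a release candidate
# (vX.Y.Z-rcN, which would trigger a build) or a final release tag
# (vX.Y.Z, used only for release notes and assets).
import re

RC_TAG = re.compile(r"^v\d+\.\d+\.\d+-rc\d+$")
FINAL_TAG = re.compile(r"^v\d+\.\d+\.\d+$")


def classify_tag(tag: str) -> str:
    if RC_TAG.match(tag):
        return "release candidate (triggers a build)"
    if FINAL_TAG.match(tag):
        return "final release (release notes and assets only)"
    return "not a release tag"


if __name__ == "__main__":
    for tag in ("v0.9.2-rc1", "v0.9.2", "nightly"):
        print(f"{tag}: {classify_tag(tag)}")
```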
Please note: no feature work is allowed in cherry-picks. All PRs considered for cherry-picking must first be merged on trunk; the only exception is release-branch-specific changes.
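As a rough illustration of the "merged on trunk first" rule, the sketch below uses plain git via Python's subprocess module to check whether a candidate commit is already reachable from main before proposing a cherry-pick; the branch name and commit hash are assumptions for illustration, not part of the policy:

```python
# Hedged sketch: before proposing a cherry-pick, confirm the commit already
# landed on trunk. The branch name ("main") and the commit hash below are
# assumptions for illustration; git itself provides the containment check.
import subprocess


def commit_is_on_branch(commit: str, branch: str = "main") -> bool:
    """Return True if `commit` is an ancestor of `branch` (i.e., already merged)."""
    result = subprocess.run(
        ["git", "merge-base", "--is-ancestor", commit, branch],
        capture_output=True,
    )
    return result.returncode == 0


if __name__ == "__main__":
    commit = "abc1234"  # placeholder commit hash
    if commit_is_on_branch(commit, "main"):
        print(f"{commit} is on trunk; eligible to be cherry-picked to the release branch.")
    else:
        print(f"{commit} is not on trunk; merge it to main first.")
```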
Before each release, we perform end-to-end performance validation to ensure no regressions are introduced. This validation uses the vllm-benchmark workflow on PyTorch CI.
Current Coverage:
Performance Validation Process:
Step 1: Get Access
Request write access to the pytorch/pytorch-integration-testing repository to run the benchmark workflow.
Step 2: Review Benchmark Setup
Familiarize yourself with the benchmark configurations used by the workflow.
Step 3: Run the Benchmark
Navigate to the vllm-benchmark workflow and configure:
- vLLM branch: the release branch to benchmark (e.g., releases/v0.9.2)
- vLLM commit: the release candidate commit to validate
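The steps above use the Actions web UI. If you prefer to script the dispatch, a hedged sketch using GitHub's workflow_dispatch REST endpoint might look like the following; the workflow file name and input keys are assumptions and should be checked against the actual workflow definition:

```python
# Hedged sketch: trigger the benchmark workflow via GitHub's workflow_dispatch
# REST endpoint instead of the Actions web UI. The workflow file name
# ("vllm-benchmark.yml") and the input keys ("vllm_branch", "vllm_commit") are
# assumptions for illustration; verify them against the actual workflow file.
import json
import os
import urllib.request

OWNER_REPO = "pytorch/pytorch-integration-testing"
WORKFLOW_FILE = "vllm-benchmark.yml"  # assumed name


def dispatch_benchmark(vllm_branch: str, vllm_commit: str) -> None:
    url = f"https://api.github.com/repos/{OWNER_REPO}/actions/workflows/{WORKFLOW_FILE}/dispatches"
    payload = {
        "ref": "main",  # branch of the workflow repo itself, not the vLLM branch
        "inputs": {"vllm_branch": vllm_branch, "vllm_commit": vllm_commit},
    }
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode(),
        method="POST",
        headers={
            "Authorization": f"Bearer {os.environ['GITHUB_TOKEN']}",  # token with write access
            "Accept": "application/vnd.github+json",
        },
    )
    with urllib.request.urlopen(request) as response:
        # GitHub returns 204 No Content when the dispatch is accepted.
        print(f"Dispatch status: {response.status}")


if __name__ == "__main__":
    dispatch_benchmark("releases/v0.9.2", "<rc-commit-sha>")
```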
Step 4: Review Results
Once the workflow completes, benchmark results will be available on the vLLM benchmark dashboard under the corresponding branch and commit.
Step 5: Performance Comparison
Compare the current results against the previous release to verify no performance regressions have occurred. Here is an example of v0.9.1 vs v0.9.2.
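To make the comparison concrete, here is an illustrative sketch of a regression check between two releases; the benchmark names, numbers, and tolerance are hypothetical and are not taken from the dashboard:

```python
# Illustrative sketch only: compare per-benchmark throughput numbers from two
# releases and flag drops beyond a tolerance. The data layout, metric values,
# and threshold are hypothetical; real results live on the benchmark dashboard.
TOLERANCE = 0.05  # flag drops larger than 5% (assumed threshold)

# Hypothetical throughput results (requests/s), keyed by benchmark name.
baseline = {"benchmark-a": 42.0, "benchmark-b": 17.5}   # e.g., previous release
candidate = {"benchmark-a": 41.2, "benchmark-b": 15.9}  # e.g., release candidate


def find_regressions(base: dict, cand: dict, tol: float) -> dict:
    """Return benchmarks whose throughput dropped by more than `tol` (fractional)."""
    regressions = {}
    for name, base_value in base.items():
        cand_value = cand.get(name)
        if cand_value is None:
            continue  # benchmark not present in the candidate run
        drop = (base_value - cand_value) / base_value
        if drop > tol:
            regressions[name] = drop
    return regressions


if __name__ == "__main__":
    for name, drop in find_regressions(baseline, candidate, TOLERANCE).items():
        print(f"Possible regression in {name}: throughput down {drop:.1%}")
```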