* add pullrequest faq Co-authored-by: Scott Beddall <scbedd@microsoft.com> Co-authored-by: Wes Haggard <weshaggard@users.noreply.github.com> Co-authored-by: Patrick Hallisey <pahallis@microsoft.com>
5.1 KiB
Pull Request FAQ
The pullrequest pipeline is a single public definition that handles all pull request changes to an azure-sdk-for-X repository. This document is intended to answer some common questions that users may have about the pullrequest definition.
Can I get a bit more context first?
Let's get some basic repo structure discussion out of the way. The azure-sdk team maintains a consistent repo structure for all shipping packages to package managers (Read NPM, Nuget, pypi, Maven, etc)
sdk/
storage
Azure.Storage.Blobs
Azure.Storage.Queues
...
<service>
<service-package-1>
..
<service-package-N>
// the ci.yml is what AZDO build defs are based upon
ci.yml
This necessitates that release definitions on the internal azure devops exist for each service in a repository. However, each build definition can only build and ship packages within the service it was created for.
This service-directory also applied to public build definitions that triggered on pull requests in our repos. Due to this, large changesets that touched multiple service directories would incur a build for every service directory that was touched. The azure-sdk EngSys calls this situation a build storm.
The <language> - pullrequest definitions entirely replace service-specific build definitions. It has the ability to expand and contract the targeted packages for build according to a git diff of the actual changes made. Because of this, any repo that has cut over to pullrequest will enjoy no longer incurring build storms on large cross-cutting changes. While the individual build run will be very long running and batch up tests across a bunch of agents, it will eventually complete. It will be impossible to exhaust GitHub or Azure DevOps token utilization as well, given that it is a single definition triggering checks.
The pullrequest pipeline is currently deployed in the following azure-sdk repositories:
| Pipeline Def | Completed? |
|---|---|
| Java | ❌ |
| JS | ✅ |
| .NET | ❌ |
| Python | ✅ |
| Rust | ✅ |
Only repos that appear in the above list are enabled with a single unified pullrequest pipeline. All other azure-sdk shipping repositories ship and PR using a build definition per service directory.
Pullrequest pipeline order of operations
- Generate a PR diff
- Save Package Properties using the
diff - Run
buildandanalyzesteps only against artifacts that come out of the package-properties folder- The primary change between service build and the pullrequest build is the scoping mechanism. For a service build, a specific service directory is examined for packages. For a pullrequest build, the entire repository is considered before being scoped down to only packages that were actually changed.
- Tests are run against
indirectly anddirectly changed packages separately in batches.
What is a direct vs indirect change?
- A
directly changed package is one whose actual package code has changed. - An
indirectchanged package is a package that has been added for verification of code that is not directly within the package itself.- For example, in
java, when theeng/package is changed, we triggerazure-coreindirectly.
- For example, in
Why do I see jobs with bX or ibY suffixes?
As mentioned above, direct and indirect packages are batched separately. Batching is best explained by the following pseudocode
batchSize = configurable # of packages in each test batch, defaults to 10
directPackages = the list of packages with directly changed code in the PR
group the direct packages by matrix configuration
- each matrix contribution
- group by batch size
- assign the matrix to the full batch
- if multiple batches exist, add suffix
Notice that packages are grouped initially by the matrix associated with their ci.yml. In the pullrequest pipeline, the service directory of a package no longer matters, only what matrix it belongs to.
indirect batching works the same way, but doesn't use the full test matrix by default. It instead deterministically selects a single item from the resolved test matrix and assigns the batch of packages to it.
The suffixes b1 or ib1 or are added automatically as needed by the job pull request matrix creation..
Can I disable this matrix batching?
Yes! Users can entirely disable the batching for a specific matrix by setting PRBatching to false in the matrix configuration.
Example:
MatrixConfigs:
- Name: version_overrides_tests
Path: sdk/core/version-overrides-matrix.json
Selection: all
PRBatching: false # the new key
GenerateVMJobs: true