Commit Graph

69 Commits

Author SHA1 Message Date
James E. Blair 1f026bd49c Finish circular dependency refactor
This change completes the circular dependency refactor.

The principal change is that queue items may now include
more than one change simultaneously in the case of circular
dependencies.

In dependent pipelines, the two-phase reporting process is
simplified because it happens during processing of a single
item.

In independent pipelines, non-live items are still used for
linear depnedencies, but multi-change items are used for
circular dependencies.

Previously changes were enqueued recursively and then
bundles were made out of the resulting items.  Since we now
need to enqueue entire cycles in one queue item, the
dependency graph generation is performed at the start of
enqueing the first change in a cycle.

Some tests exercise situations where Zuul is processing
events for old patchsets of changes.  The new change query
sequence mentioned in the previous paragraph necessitates
more accurate information about out-of-date patchsets than
the previous sequence, therefore the Gerrit driver has been
updated to query and return more data about non-current
patchsets.

This change is not backwards compatible with the existing
ZK schema, and will require Zuul systems delete all pipeline
states during the upgrade.  A later change will implement
a helper command for this.

All backwards compatability handling for the last several
model_api versions which were added to prepare for this
upgrade have been removed.  In general, all model data
structures involving frozen jobs are now indexed by the
frozen job's uuid and no longer include the job name since
a job name no longer uniquely identifies a job in a buildset
(either the uuid or the (job name, change) tuple must be
used to identify it).

Job deduplication is simplified and now only needs to
consider jobs within the same buildset.

The fake github driver had a bug (fakegithub.py line 694) where
it did not correctly increment the check run counter, so our
tests that verified that we closed out obsolete check runs
when re-enqueing were not valid.  This has been corrected, and
in doing so, has necessitated some changes around quiet dequeing
when we re-enqueue a change.

The reporting in several drivers has been updated to support
reporting information about multiple changes in a queue item.

Change-Id: I0b9e4d3f9936b1e66a08142fc36866269dc287f1
Depends-On: https://review.opendev.org/907627
2024-02-09 07:39:40 -08:00
Simon Westphahl b6564e42ce
Fix Github protected check for renamed branches
When a branch is renamed in Github the REST API will redirect the
request to the endpoint for the new branch name. So far the the Github
client automatically followed those redirects and we did not check if
the branch name in the response matched our request.

This lead to cases where an old branch was added to the branch cache as
protected even though the branch no longer existed. This is not a
problem for the schedulers, but since there won't be any cached config
in Zookeeper, zuul-web will display a warning about missing config files
for the branch.

Since the REST endpoint also returns the (new) name of the branch we can
validate this against the requested branch name in addition to disabling
redirects.

However, this fix is not enough as the 'cachecontrol' adapter that we
use also caches the HTTP 301 redirects which is a problem when a new
branch with the same name as the renamed branch is created. To fix this
we will use a cache busting header (no-cache) to not return a cached
response in those cases.

Change-Id: I2670f951cac1bf41c6569f5495a60e9de262d4a4
2024-01-16 09:48:59 +01:00
Ian Wienand 3c2e518c52 github: fallback to api_token when can't find installation
graphql queries (I77be4f16cf7eb5c8035ce0312f792f4e8d4c3e10) require
authentication. Enqueueing changes from GitHub (including Depends-On)
requires we run a graphql query. This means that Zuul must be able to
authenticate either via an application or api_token to support features
like Depends-On.

If the app is setup (app_id in config) but we aren't installed with
permissions on the project we're looking up, then fall back to using a
specified api_token. This will make Depends-On work.

Logging is updated to reflect whether or not we are able to fallback to
the api_token if the application is not installed. We log the lack of an
application installation at info level if we can fallback to the token,
and log at error level if we're falling back to anonymous access.

For backward compatibility we continue to fallback to anonymous access
if neither an application install or api_token are present. The reason
for this is features like Job required-projects: work fine anonymously,
and there may be Zuul installations that don't need additional
functionality.

Keep in mind that authenticated requests to GitHub get larger API rate
limits. Zuul installations should consider setting an API token even
when using an application for this reason. This gives Zuul the best
chance that fallback requests will not be rate limited.

Documentation is updated, a changelog added and several test
configuration files are padded with the required info.

Story: #2008940
Change-Id: I2107aeafc55591eea790244701567569fa6e80d4
2023-09-18 09:29:38 -07:00
James E. Blair 57a9c13197 Use the GitHub default branch as the default branch
This supplies a per-project default value for Zuul's default-branch
based on what the default branch is set to in GitHub.  This means
that if users omit the default-branch setting on a Zuul project
stanza, Zuul will automatically use the correct value.

If the value in GitHub is changed, an event is emitted which allows
us to automatically reconfigure the tenant.

This could be expanded to other drivers that support an indication
of which branch is default.

Change-Id: I660376ecb3f382785d3bf96459384cfafef200c9
2023-08-23 11:07:08 -07:00
Simon Westphahl e3d6cb0724
Don't add PR title in commit message on squash
Github will use the PR title as the commit subject for squash merges, so
we don't need include the title once again in the commit description.

Change-Id: Id5a00701c236235f5a49abd025bcfad1b2da916c
2023-03-20 09:19:08 +01:00
Zuul 934846b9b3 Merge "Report a config error for unsupported merge mode" 2022-12-19 23:11:50 +00:00
Simon Westphahl c8aac6a118
Check if Github detected a merge conflict for a PR
Github uses libgit2 to compute merges without requiring a worktree [0].
In some cases this can lead to Github detecting a merge conflict while
for Zuul the PR merges fine.

To prevent such changes from entering dependent pipelines and e.g. cause
a gate reset, we'll also check if Github detected any merge conflicts.

[0] https://github.blog/2022-10-03-highlights-from-git-2-38/

Change-Id: I22275f24c903a8548bb0ef6c32a2e15ba9eadac8
2022-11-18 11:59:32 +01:00
James E. Blair 640059a67a Report a config error for unsupported merge mode
This updates the branch cache (and associated connection mixin)
to include information about supported project merge modes.  With
this, if a project on github has the "squash" merge mode disabled
and a Zuul user attempts to configure Zuul to use the "squash"
mode, then Zuul will report a configuration syntax error.

This change adds implementation support only to the github driver.
Other drivers may add support in the future.

For all other drivers, the branch cache mixin simply returns a value
indicating that all merge modes are supported, so there will be no
behavior change.

This is also the upgrade strategy: the branch cache uses a
defaultdict that reports all merge modes supported for any project
when it first loads the cache from ZK after an upgrade.

Change-Id: I3ed9a98dfc1ed63ac11025eb792c61c9a6414384
2022-11-11 09:53:28 -08:00
Dong Zhang 79b6252370 Fix bug in getting changed files
The fix including 2 parts:
1. For Gtihub, we use the base_sha instead of target branch to
   be passed as "tosha" parameter to get precise changed files
2. In method getFilesChanges(), use diff() result to filter out
   those files that changed and reverted between commits.

The reason we do not direcly use diff() is that for those
drivers other than github, the "base_sha" is not available yet,
using diff() may include unexpected files when target branch
has diverged from the feature branch.

This solution works for  99.9% of the caseses, it may still get
incorrect list of changed files in following corner case:
1. In non-github connection, whose base_sha is not implented, and
2. Files changed and reverted between commits in the change, and
3. The same file has also diverged in target branch.

The above corner case can be fixed by making base_sha available in
other drivers.

Change-Id: Ifae7018a8078c16f2caf759ae675648d8b33c538
2022-04-25 15:05:48 -07:00
Simon Westphahl c379691533 Include original path of renamed file for a PR
When a file is moved/renamed Github will only return an entry for the
file with the new name. However, the previous path also needs to be
included in the list of files. This is especially important when a Zuul
config file is renamed but also when `job.[irrelevant-]files` is used.

Change-Id: Ieba250bed57c8a9c2e7811476c202d530f2b30f1
2022-03-09 08:20:52 +01:00
Ian Wienand 403d1c2882 github: more complete mocking for app setup
This mocks the /app/installations and /installations/repositories
GitHub API calls to better validate the GitHub project initalization
in the driver.

It implements enough that we can use
GithubClientManager:_prime_installation_map() directly and better
tests the token issuing, etc.

Change-Id: I608f1540ef33b1a95595393e546afba308fef66a
2021-09-23 19:53:48 +10:00
Tobias Henkel ba6d86ada2
Save superfluous api requests in check run reporting
When reporting the check run result zuul is currently doing four
github api requests:
1. Get repository
2. Get head commit of pr
3. Get check runs of head commit
4. (optional) Create initial version of check run in case start
   reporting is disabled
5. Update check run that has been created in start reporting

The first three requests are basically unneeded if we craft a direct
PATCH request and remember the check run id that has been created in
the start reporting.

This is needed since those run in the critical path of the event
processing and can cause large event processing delays in a very busy
zuul system.

Change-Id: I55df1cc28279bb6923e51686dde8809421486c6a
2020-11-04 08:52:28 +01:00
Tobias Henkel 418bf6b462
Remove an unneeded api call when creating check_runs
During the start reporting the check run is being created in status
in_progress. This currently does two api requests due to the
github3.py object model. The first one is to get the repository
object, the second one to actually create the check run. However we
already have all information available that is required to directly
craft a post request to create that check run.

This is a problem because this start reporting runs within the main
loop of zuul and needs to be as fast as possible. In a very busy zuul
api requests are becoming a bottleneck so we need to remove any
request that is unneeded.

Change-Id: Idd166c61a4d0081c284dee5677b4cc2e48932307
2020-11-04 08:52:07 +01:00
Tobias Henkel 42f58abf02
Restore correct reason reporting of merge failures
The optimization of merges in GitHub [1] has a side effect that the
merge failure reason always is 'Method not allowed'. This has been
unnoticed because the test framework did not distinguish between
status message and response body. GitHub reports such failures with
405 Method not allowed and a detailed reason in the response body.

[1] I48dbb5fc1b7e7dec4cb09c779b9661d82935d9df

Change-Id: Ic2e47ee14c7f97defba0905db68ef8706108c081
2020-10-05 14:46:52 +02:00
Zuul 212ed78502 Merge "Optimize GitHub requests on PR merge" 2020-10-02 11:12:58 +00:00
Simon Westphahl eed5820fb3 Optimize GitHub requests on PR merge
Using github3py for merging a pull request there is always one extra
request to get the pull request before the merge can be done due to the
github3py data model.

Since merging is done synchronously in in the scheduler main loop, it
makes sense to save the extra API request by building and executing the
merge call directly.

Change-Id: I48dbb5fc1b7e7dec4cb09c779b9661d82935d9df
2020-09-30 07:24:56 +02:00
Zuul 8a357bfcf8 Merge "Fix multiple prs found when commit is not head" 2020-09-28 00:56:38 +00:00
Zuul 7815953797 Merge "Fetch can-merge info when updating a pull-request" 2020-09-25 10:20:13 +00:00
Simon Westphahl 83281e3356 Fetch can-merge info when updating a pull-request
Instead of synchronously checking via the GitHub API if a pull request
is mergeable, we can already get the necessary information during Github
event processing so the canMerge() check operates on cached data similar
to how this is done for the Gerrit driver already.

With this we can also remove the additional API requests for the commit
status and check runs since this info can be retrieved via the same GraphQL
request.

Change-Id: I6b4848dcb5d27071deeb6a59642c0e3c03a799da
2020-09-23 12:03:55 +02:00
vass 2be96e7a85 Add commit id and owner to Change for mqtt reporter
Change-Id: I3d197996597c70165edbf2e26f621391c87873e6
2020-09-22 10:26:15 +02:00
Tobias Henkel 0093aabd4e
Fix multiple prs found when commit is not head
When receiving a status or checks event we need to find the
corresponding pull request to that event. Github stores the status on
the head commit of a pr and not on the pr itself which leads to
problems if multiple prs exist with the same head but different target
branches. Therefore zuul errors out when more than one pr is found for
that event. However the github search also returns all prs that
contain the sha we search for but have a different head which leads to
the same error as if we had two conflicting prs. This can be solved by
delaying the error and filtering the pr bodies for the head sha first.

Change-Id: Iadafd3cf68a742941e9189e84fca594bc3394084
2020-09-04 13:47:54 +02:00
Tobias Henkel b2f6d48cc5
Handle review requirements in canMerge
Since GitHub 2.21 we now can query the reviewDecision flag on a PR
which tells us if a review is required and if it's approved or not
[1]. This can be leveraged in the canMerge check so we now finally can
accurately check if a change is allowed to enter the gate.

[1] https://developer.github.com/enterprise/2.21/v4/object/pullrequest/

Change-Id: I792a28b9f3c7d40ac21e22438bd7c09d3174cbb2
2020-09-03 19:08:20 +02:00
Tobias Henkel 0986673990
Exercise github auth handling in tests
So far almost all of the github auth handling has been overridden in
the fake test classes. We can increase the test coverage and gain more
confidence when doing changes related to this by not replacing
getGithubClient. Instead call the super method and enhance the
resulting client with the relevant test data.

Change-Id: Icaa9cfb9f28f8faec2dc86f7e6393f5b23765cac
2020-09-03 13:50:33 +02:00
Tobias Henkel 9fc87bbdf5
Remove unneeded api requests when commenting in github
Currently when commenting on a pr in github we first get the repo
object and then the issue object. After that we call create_comment on
this object. This issues two unnecessary api calls to get project and
issue. Instead directly build the correct url and do the post manually
to comment.

Change-Id: I9afb19e4918109e98882aa9d5b716d34c65936f2
2020-08-31 12:12:25 +02:00
Tobias Henkel 688ecf6233
Move reports from FakeGithubConnection to github data
This is needed to handle comment pull as direct post request in the
followup. Further the reports are better located in the github data
since this needs to be global data.

Further this will be needed as preparation for the HA scheduler as
well since there will be more than one connection and the test data
needs to be centralized then.

Change-Id: I3fa96d43f0ffe62d50630856b23972221bf2c69a
2020-08-31 12:12:24 +02:00
Benjamin Schanzel 1ed3992809 GitHub Reporter: Fix User Email in Merge Commit Message
Re-introduce change https://review.opendev.org/#/c/738590/

There was a bug in fetching a GitHub user while loading pull requests.
`getUser` was given a project object instead of a project name string,
so no installation id was found for GitHub app authentication. That lead
to unauthenticated requests to the user API which resulted in HTTP 401
errors.

This change fixes this by passing the correct `project.name` to
`getUser` and also refactors the parameter names of the involved
methods from `project` to `project_name` to be more precise there.

This reverts commit 56f355ab72.

Change-Id: Ib25f0ef5d27115626792890d0ec282ed0d562485
2020-08-26 13:57:51 +02:00
Felix Edel f5790f2d63
Dequeue changes via github checks API
The Github checks API allows us to define custom actions on a check run.
With that, we can define a custom "abort" action, that enables users to
directly abort a running change by clicking on a respective button in
Github's checks tab.

A click on this button will emit a Github webhook event that is now
handleded by Zuul to dequeue the respective change.

To uniquely identify a change represented by a check run, each check run
now comes with an external ID which is a combination of tenant,
pipeline, queue and PR number.

Change-Id: I574479cd06abd1ea4f3daa7850cf35ea6715a23d
2020-08-06 09:02:39 +02:00
Guillaume Chauvel 39425550ff Fix github branch protection while already unprotected
It was causing the branch to be protected, tests were failing

Change-Id: Idb7b3957159684787fe272181eee92af1abe0da1
2020-07-29 12:54:39 +02:00
Gonéri Le Bouder 6623f8316a github: use change.message in squahsed commit message
So far, github use the subject of the first change in the commit
message body. This is most of the time redundant with the PR title.
Ths idea is to use instead the change body.

Change-Id: I9a2bc8aa3ff60a28b2ed62a4ee5fa9f7a2c9dcda
2020-07-10 11:04:14 +02:00
Clark Boylan e255397eb4 Simplify FakeGithubClient and FakeGithubSession
These two fakes are related to each other. Before this change
FakeGithubClient would create a FakeGithubSession and FakeGithubSession
would create FakeGithubClients. THis recursion is confusing and
unnecessary.

Under normal operation the FakeGithubClient exists first so we pass it
into its FakeGithubSessions and the fake session is able to use this
client object rather than recursively creating a new client.

Change-Id: I9ff52061d62e844ca3f2185ea752ca39962b2a9f
2020-04-28 15:45:51 -07:00
Zuul 89e42edeb3 Merge "Improve error reporting when pr merge fails" 2020-03-06 19:12:51 +00:00
Zuul 80bdb338c1 Merge "Optimize canMerge using graphql" 2020-03-06 18:08:03 +00:00
Zuul 66d3b42503 Merge "Refactor branch protection test infrastructure" 2020-03-02 22:07:51 +00:00
Tobias Henkel 4c972f00bd
Optimize canMerge using graphql
The canMerge check is executed whenever zuul tests if a change can
enter a gate pipeline. This is part of the critical path in the event
handling of the scheduler and therefore must be as fast as
possible. Currently this takes five requests for doing its work and
also transfers large amounts of data that is unneeded:

* get pull request
* get branch protection settings
* get commits
* get status of latest commit
* get check runs of latest commit

Especially when Github is busy this can slow down zuul's event
processing considerably. This can be optimized using graphql to only
query the data we need with a single request. This reduces requests
and load on Github and speeds up event processing in the scheduler.

Since this is the first usage of graphql this also sets up needed
testing infrastructure using graphene to mock the github api with real
test data.

Change-Id: I77be4f16cf7eb5c8035ce0312f792f4e8d4c3e10
2020-02-28 09:43:56 +01:00
Tobias Henkel df2e3af642
Refactor branch protection test infrastructure
Branch protection rules are not just about required status so refactor
it in the test framework so we can extend them in the future.

Change-Id: I1e32ebb57b27bce898b074e419299a1803853608
2020-02-25 21:01:25 +01:00
Tobias Henkel 793bb738a9
Improve error reporting when pr merge fails
When merging a pull request in GitHub fails we currently report the
standard message when a merge confict happened. This is actually
rarely the case because when using proper gating a merge should almost
never fail because of a merge conflict. Instead report the error we
get from GitHub so the user knows why the merge failed. In most cases
this is because branch protection refuses to merge because of a
condition zuul doesn't know about. This will then be reported to the
user so he can do something about it without asking a zuul admin for
the internal logs.

Change-Id: I1c3e82a2ac20e2177352b1a4d078839d3e114467
2020-02-23 19:14:43 +01:00
Felix Edel fe3b5e3bae
Support file comments via Github checks API
Zuul is already providing these file comments via the zuul_return value.
So far, the Github reporter wasn't able to use those, but with the help
of the checks API we can add so called "annotations" to each check run.

Change-Id: Iff10172f95dc0430bec8e4dafb9a6c09bbe06077
2020-02-19 14:01:41 +01:00
Felix Edel 33f87bea9c
Implement basic github checks API workflow
Utilizing the checks API to report the build state to Github provides
some additional functionality that is not supported by the status API.

Those are:
 - Defining custom actions to e.g. cancel a running build
 - Report line-based file annotations

This change implements the basic checks API workflow. Once this is in
place, the additional features could simply be added on top.

Change-Id: I7e790783ee35971085863b5487ff094fa0b23d65
Story: 2007268
Task: 38672
2020-02-19 13:31:49 +01:00
Tobias Henkel e78a31e85b
Handle draft pull requests in canMerge
Github now supports draft pull requests which are blocked from merging
by github. Currently those can end up in a gate loop if all
requirements for the gate pipeline are fulfilled except for
un-drafting it.  Zuul needs to handle those in the canMerge check in
order to handle them correctly.

Change-Id: Ie9164ad5127d19ca3d75660b9718979da7cc344c
2020-01-28 09:37:09 +01:00
Tobias Henkel 232b47fbf7
Fix occasionally wrong change url with github
Github returns different urls for pull requests. One for api use and
one for browser use. The change url we're interested in is the one for
browser usage. However zuul takes the wrong one from the pull request
data structure but most of the time overwrites the change url
generated from event metadata. This leads to the issue that the change
url is sometimes not correct.

This can be fixed by taking the html_url from the pull request data
structure. Further due to the fact that the url is always set on a pr
and stays static throughout the whole lifecycle of the pr, be safe and
just don't overwrite the url from event metadata.

Change-Id: I2030b9dddd5bc618231b73d73ae64e2552231769
2020-01-14 10:24:53 +01:00
Clark Boylan 9453df6936 Look for depends-on lines in dependency searches
Prior to this change we looked for the current change/PR's url in any
other change/PR's message body. This meant any cross referencing of urls
would create further lookups to determine if there was a real dependency
there. Restrict this a bit more to require the Depends-On string too
when searching to limit the number of spidering queries that must be
done.

This is particularly useful for the github driver because queries are
expensive there and may be rate limited.

Change-Id: Ie49fe1a72dc844b14003d942684fd3d2a9478d21
2019-12-06 12:07:41 -08:00
Simon Westphahl 945bf41af3 Fix issue search in FakeGithubClient
Mocked issue search was broken in multiple ways:

- tokenize() was wrongly splitting search modifieres (e.g. type:pr)
  into tokens (e.g. [type, pr])
- code for removing search modifiers (type, is, in) from terms was
  never reached and also used set(...) in a wrong way
- set intersection of search terms and body doesn't make any sense since
  this will almost always have ANY overlap

Simply extracting the URLs from the query and checking for in PR body
should make the mock work for most of the tests.

Change-Id: I9f896af85e21770bba80857511aae8505f3e5b84
2019-11-15 15:15:45 +01:00
Tobias Henkel 6ee03a4408
Eliminate two github requests per _updateChange
When processing _updateChange we fetch the statuses of the commit. Due
the object model of Github3.py we currently get the repository
(request 1), then the commit (request 2) and finally the commit
statuses (request 3). However we already have all information needed
to directly perform the request 3. Further we're in any case only
using the json structure of the statuses so we can easily do a direct
request and avoid the first two. This essentially saves us two api
requests per incoming events.

Change-Id: I4185e15dcaf2b66bc1e6629e725dfd508368140a
2019-06-17 22:39:36 +02:00
Tobias Henkel a3e4e83974
Switch getPullBySha to using the search api
We currently cache the sha to pull request mapping locally in order to
be able to map status events to pull requests. This cache is not
effective in some cases. For instance on merge a pull request changes
its sha so it's not known by the cache. Some projects like ansible run
post merge jobs on a pr which in turn emit status events for the now
unknown sha's. This invalidates the cache and we need to sweep through
all pull requests of the project to find the correct one. This can
better be solved by using the search api.

However the search api is more restricted and allows 30 requests per
minute per installation. Based on local testing it's hard but not
impossible to reach the limit so we need to handle it
gracefully. Luckily we get all information we need to handle it when
we reached the rate limit.

Change-Id: Idb6c64caa9f948111a58135754174d898ca1528c
2019-06-17 22:39:35 +02:00
Zuul c58b410cb4 Merge "Add support for submitting reviews on GitHub" 2019-05-16 21:32:46 +00:00
Clint Byrum 6d5615dd60 Add support for submitting reviews on GitHub
GitHub has added a lot of controls around the review object, so it is
useful to be able to run tests and then submit a review rather than
simply merge. One use-case is also to be able to self-approve with a
comment, such as is done in the test code added.

Change-Id: I16872062e627b385f78023878bea348555ec5348
2019-04-30 09:43:08 -07:00
Paul Belanger 75dc1172f9 Use user.html_url for github reporter messages
There is no need to dynamically generate the github users url with zuul,
we can rely on html_url from github. This fixes an issue where we
constructed the wrong url for the merge commit when using a github app.

Change-Id: I65028be07636aeaad48ccb6de80b3e2a0d29b436
Signed-off-by: Paul Belanger <pabelanger@redhat.com>
2019-04-23 12:17:23 -04:00
Zuul 24581a8360 Merge "Switch to LRU based sha to PR cache" 2019-02-19 18:51:37 +00:00
Clark Boylan cee7a4e248 Switch to LRU based sha to PR cache
This updates the github driver to cache PRs by sha using cachetool's
LRUCache.

We make this change because we need to cache closed PRs so can't rely on
the action of closing a PR to remove the PR from the cache. Since we
don't have a good method of evicting entries we fall back to LRU with
a reasonable cache size (2k commits).

Change-Id: I5fb6c8b33f9eed221a8b84e537f02e7dccf2d2df
2019-02-19 09:31:54 -08:00
Clark Boylan 5cfebfa3ed Don't request PR issue data
The github api docs show that labels is now part of the PR dict returned
in pulls API responses. github3 doesn't directly expose this as an
attribute but we can use the as_dict representation to access any json
field. Use this field on the PR dict repr to access the labels of a PR
and stop fetching the issue of a PR for that info.

Change-Id: Ib14bb387dbd233ff252cf57be7f0517770ade037
2019-02-18 16:59:38 -08:00