Commit Graph

78 Commits

James E. Blair 1f026bd49c Finish circular dependency refactor
This change completes the circular dependency refactor.

The principal change is that queue items may now include
more than one change simultaneously in the case of circular
dependencies.

In dependent pipelines, the two-phase reporting process is
simplified because it happens during processing of a single
item.

In independent pipelines, non-live items are still used for
linear dependencies, but multi-change items are used for
circular dependencies.

Previously changes were enqueued recursively and then
bundles were made out of the resulting items.  Since we now
need to enqueue entire cycles in one queue item, the
dependency graph generation is performed at the start of
enqueuing the first change in a cycle.

Some tests exercise situations where Zuul is processing
events for old patchsets of changes.  The new change query
sequence mentioned in the previous paragraph necessitates
more accurate information about out-of-date patchsets than
the previous sequence, therefore the Gerrit driver has been
updated to query and return more data about non-current
patchsets.

This change is not backwards compatible with the existing
ZK schema, and will require that Zuul systems delete all pipeline
states during the upgrade.  A later change will implement
a helper command for this.

All backwards compatibility handling for the last several
model_api versions that were added to prepare for this
upgrade has been removed.  In general, all model data
structures involving frozen jobs are now indexed by the
frozen job's uuid and no longer include the job name since
a job name no longer uniquely identifies a job in a buildset
(either the uuid or the (job name, change) tuple must be
used to identify it).
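
As a rough illustration of that indexing change (a minimal sketch,
not Zuul's actual model classes), a buildset keyed by frozen-job uuid
can hold two jobs with the same name for different changes:

  import uuid
  from dataclasses import dataclass, field

  @dataclass
  class FrozenJob:
      name: str
      change: str
      uuid: str = field(default_factory=lambda: uuid.uuid4().hex)

  class BuildSet:
      def __init__(self):
          self.jobs = {}  # keyed by frozen job uuid, not by job name

      def addJob(self, job):
          self.jobs[job.uuid] = job

      def getJob(self, name, change):
          # name alone is ambiguous; the (name, change) tuple still works
          for job in self.jobs.values():
              if job.name == name and job.change == change:
                  return job

  bs = BuildSet()
  bs.addJob(FrozenJob('build-image', 'change-A'))
  bs.addJob(FrozenJob('build-image', 'change-B'))  # same name, other change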

Job deduplication is simplified and now only needs to
consider jobs within the same buildset.

The fake github driver had a bug (fakegithub.py line 694) where
it did not correctly increment the check run counter, so our
tests that verified that we closed out obsolete check runs
when re-enqueuing were not valid.  This has been corrected, and
in doing so has necessitated some changes around quiet dequeuing
when we re-enqueue a change.

The reporting in several drivers has been updated to support
reporting information about multiple changes in a queue item.

Change-Id: I0b9e4d3f9936b1e66a08142fc36866269dc287f1
Depends-On: https://review.opendev.org/907627
2024-02-09 07:39:40 -08:00
Simon Westphahl c963526560
Add Zuul event id to merge completed events
Return the Zuul event ID that is already part of the merge request with
the merge result event so logs can be correlated.

Change-Id: I018709cd4d4afa562e6851d0d52c1ddd7583dc62
2023-08-08 12:02:36 +02:00
Simon Westphahl b17dfc13ed
Cleanup leaked git index.lock files on checkout
When the git command crashes or is aborted due to a timeout, we might end
up with a leaked index.lock file in the affected repository.

This has the effect that all subsequent git operations that try to
create the lock will fail. Since Zuul maintains a separate lock for
serializing operations on a repository, we can be sure that the lock
file was leaked in a previous operation and can be removed safely.

Unable to checkout 8a87ff7cc0d0c73ac14217b653f9773a7cfce3a7
Traceback (most recent call last):
  File "/opt/zuul/lib/python3.10/site-packages/zuul/merger/merger.py", line 1045, in _mergeChange
    repo.checkout(ref, zuul_event_id=zuul_event_id)
  File "/opt/zuul/lib/python3.10/site-packages/zuul/merger/merger.py", line 561, in checkout
    repo.head.reset(working_tree=True)
  File "/opt/zuul/lib/python3.10/site-packages/git/refs/head.py", line 82, in reset
    self.repo.git.reset(mode, commit, '--', paths, **kwargs)
  File "/opt/zuul/lib/python3.10/site-packages/git/cmd.py", line 542, in <lambda>
    return lambda *args, **kwargs: self._call_process(name, *args, **kwargs)
  File "/opt/zuul/lib/python3.10/site-packages/git/cmd.py", line 1005, in _call_process
    return self.execute(call, **exec_kwargs)
  File "/opt/zuul/lib/python3.10/site-packages/git/cmd.py", line 822, in execute
    raise GitCommandError(command, status, stderr_value, stdout_value)
git.exc.GitCommandError: Cmd('git') failed due to: exit code(128)
  cmdline: git reset --hard HEAD --
  stderr: 'fatal: Unable to create '/var/lib/zuul/merger-git/github/foo/foo%2Fbar/.git/index.lock': File exists.
  Another git process seems to be running in this repository, e.g.
  an editor opened by 'git commit'. Please make sure all processes
  are terminated then try again. If it still fails, a git process
  may have crashed in this repository earlier:
  remove the file manually to continue.'
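
A minimal sketch of the cleanup (not the actual merger code), assuming
Zuul's own per-repo lock is already held so any existing index.lock must
be left over from a crashed or aborted git process:

  import os

  def remove_leaked_index_lock(repo_path, log):
      lock_path = os.path.join(repo_path, '.git', 'index.lock')
      try:
          os.unlink(lock_path)
          log.warning("Removed leaked git index.lock: %s", lock_path)
      except FileNotFoundError:
          pass  # nothing leaked; the normal case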

Change-Id: I97334383df476809c39e0d03b1af50cb59ee0cc7
2022-11-15 07:03:21 +01:00
James E. Blair 8a8502f661 Fix race in merger shutdown
We can disconnect from ZK while the merger is still running which
can have some adverse effects and cause tests to never exit.

This moves the zk disconnect in the merger to the join method so
that we ensure that we have exited the main loop.

It also adds some improved logging so that not everything just
says "Stopped".

Change-Id: I459af85ac70ecf1f61645466d0eddc63c7e61ff9
2022-11-08 15:12:22 -08:00
James E. Blair e68f2bfdb3 Don't trace merge jobs that we don't lock
We get a trace from every merger (including executors) for every
merge job because we start the trace before attempting the lock.
So essentially, we get one trace from the merger that runs the job,
and one trace from every other merger indicating that it did not
run the job.

This is perhaps too much detail for us.  While it's true that we
can see the response times of every system component here, it may
be sufficient to have only the response time of the first merger.
This will reduce the noise in trace visualizations significantly.
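
A sketch of the intended ordering with the OpenTelemetry API (lock() and
execute() are hypothetical helpers, not Zuul's real functions): the span
is only started once the merge request has actually been locked:

  from opentelemetry import trace

  tracer = trace.get_tracer(__name__)

  def run_merge_job(api, request):
      if not api.lock(request):   # hypothetical lock helper
          return                  # lost the race: no span at all
      with tracer.start_as_current_span('merge-job'):
          api.execute(request)    # hypothetical execute helper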

Change-Id: I88c56f00c060eae9316473f4a4e222a0db97e510
2022-10-05 11:16:18 -07:00
Simon Westphahl f1e3d67608
Trace merge requests and merger operations
The span info for the different merger operations is stored on the
request and will be returned to the scheduler via the result event.

This also adds the request UUID to the "refstat" job so that we can
attach that as a span attribute.

Change-Id: Ib6ac7b5e7032d168f53fe32e28358bd0b87df435
2022-09-19 11:25:49 +02:00
James E. Blair 458ba317fd Add pipeline-based merge op metrics
So that operators can see in aggregate how long merge, files-changes,
and repo-state merge operations take in certain pipelines, add
metrics for the merge operations themselves (these exclude the
overhead of pipeline processing and job dispatching).
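
An illustrative timer around just the merge operation (the statsd key
shown here is made up, not Zuul's documented metric name):

  import time
  import statsd

  client = statsd.StatsClient('localhost', 8125)

  def timed_merge(tenant, pipeline, do_merge):
      start = time.monotonic()
      result = do_merge()
      elapsed_ms = (time.monotonic() - start) * 1000
      # excludes pipeline processing and job dispatch overhead
      client.timing('zuul.tenant.%s.pipeline.%s.merge' % (tenant, pipeline),
                    elapsed_ms)
      return result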

Change-Id: I8a707b8453c7c9559d22c627292741972c47c7d7
2022-07-12 10:25:59 -07:00
James E. Blair 61cb275480 Report which repo failed initial merge ops
When the initial merge job for a queue item fails, users typically
see a message saying "this project or one of its dependencies failed
to merge".  To help users and/or administrators more quickly identify
the problem, include connection project and change information in
a warning message posted to the code review system.

Change-Id: If1bced80b87b908f63867083efb306ebe02ed1ee
2022-02-20 13:06:39 -08:00
James E. Blair a160484a86 Add zuul-scheduler tenant-reconfigure
This is a new reconfiguration command which behaves like full-reconfigure
but only for a single tenant.  This can be useful after connection issues
with code hosting systems, or potentially with Zuul cache bugs.

Because this is the first command-socket command with an argument, some
command-socket infrastructure changes are necessary.  Additionally, this
includes some minor changes to make the services more consistent around
socket commands.
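
A rough sketch of issuing a command that takes an argument over the
command socket (the wire format shown here is an assumption, not
necessarily Zuul's exact protocol):

  import socket

  def send_command(socket_path, command, *args):
      line = ' '.join([command, *args]) + '\n'
      with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as s:
          s.connect(socket_path)
          s.sendall(line.encode('utf8'))

  # e.g. send_command('/var/lib/zuul/scheduler.socket',
  #                   'tenant-reconfigure', 'example-tenant')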

Change-Id: Ib695ab8e7ae54790a0a0e4ac04fdad96d60ee0c9
2022-02-08 14:14:17 -08:00
Clark Boylan 1d4a6e0b71 Add a merger graceful command
This command is an alias for merger stop as merger stop is already a
graceful stop. We add this command to make this more clear and
consistent with the executor.

Change-Id: Iffba56b0127575eaadf31753e2a64dfd95f12fa6
2022-02-07 09:39:44 -08:00
James E. Blair 704fef6cb9 Add readiness/liveness probes to prometheus server
To facilitate automation of rolling restarts, configure the prometheus
server to answer readiness and liveness probes.  We are 'live' if the
process is running, and we are 'ready' if our component state is
either running or paused (not initializing or stopped).

The prometheus_client library doesn't support this directly, so we need
to handle this ourselves.  We could create yet another HTTP server that
each component would need to start, or we could take advantage of the
fact that the prometheus_client is a standard WSGI service and just
wrap it in our own WSGI service that adds the extra endpoints needed.
Since that is far simpler and less resource intensive, that is what
this change does.

The prometheus_client will actually return the metrics on any path
given to it.  In order to reduce the chances of an operator configuring
a liveness probe with a typo (eg '/healthy/ready') and getting the
metrics page served with a 200 response, we restrict the metrics to
only the '/metrics' URI which is what we specified in our documentation,
and also '/' which is very likely accidentally used by users.
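
A simplified sketch of the wrapper (endpoint paths and the component
object are illustrative): serve metrics only on '/' and '/metrics',
always answer liveness, and answer readiness from the component state:

  from prometheus_client import make_wsgi_app

  class MonitoringApp:
      def __init__(self, component):
          self.component = component
          self.metrics_app = make_wsgi_app()

      def __call__(self, environ, start_response):
          path = environ['PATH_INFO']
          if path in ('/', '/metrics'):
              return self.metrics_app(environ, start_response)
          if path == '/health/live':
              start_response('200 OK', [('Content-Type', 'text/plain')])
              return [b'OK']
          if path == '/health/ready':
              if self.component.state in ('running', 'paused'):
                  start_response('200 OK', [('Content-Type', 'text/plain')])
                  return [b'OK']
              start_response('503 Service Unavailable',
                             [('Content-Type', 'text/plain')])
              return [b'not ready']
          start_response('404 Not Found', [('Content-Type', 'text/plain')])
          return [b'not found']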

Change-Id: I154ca4896b69fd52eda655209480a75c8d7dbac3
2021-12-09 07:37:29 -08:00
James E. Blair b7e2e49f7f Use sort_keys with json almost everywhere we write to ZK
For almost any data we write to ZK (except for long-standing nodepool
classes), add sort_keys=True so that we can more easily determine
whether an update is required.

This is in service of zkobject, and is not strictly necessary because
the json module follows dict insertion order, and our serialize methods
are obviously internally consistent (at least, if they're going to produce
the same data, which is all we care about).  But that hasn't always been
true and might not be true in the future, so this is good future-proofing.

Based on a similar thought, the argument is also added to several places
which do not use zkobject but which do write to ZK, in case we perform
a similar check in the future.  This seems like a good habit to use
throughout the code base.
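
A tiny example of why this matters: with sort_keys=True, semantically
equal data serializes to identical bytes, so a byte comparison is enough
to decide whether a ZK write is needed.

  import json

  old = json.dumps({"b": 2, "a": 1}, sort_keys=True).encode('utf8')
  new = json.dumps({"a": 1, "b": 2}, sort_keys=True).encode('utf8')
  assert old == new  # no spurious ZooKeeper update for unchanged data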

Change-Id: Idca67942c057ab0e6b629b50b9b3367ccc0e4ad7
2021-11-12 15:50:02 -08:00
Felix Edel 220534c0f7 Store version information in component registry
This stores the zuul version of each component in the component
registry and updates the API endpoint.

Change-Id: I1855b2a6db2bd330343cad69d9d6cf21ea35a1f5
2021-10-20 17:17:02 +02:00
James E. Blair 6fcde31c9e Try harder to unlock failed build requests
An OpenDev executor lost the ZK connection while trying to start
a build, specifically at the stage of reading the params from ZK.
In this case, it was also unable to unlock the build request
after the initial exception was raised.  The ZK connection
was resumed without losing the lock, which means that the build
request stayed in running+locked, so the cleanup method leaves
it alone.  There is no recovery path from this situation.

To correct this, we will try indefinitely to unlock a build request
after we are no longer working on it.  Further, we will also try
indefinitely to report the result to Zuul.  There is still a narrow
race condition noted inline, but this change should be a substantial
improvement until we can address that.

Also, fix a race that could run merge jobs twice and break their result

There is a race condition in the merger run loop that allows a merge job
to be run twice whereby the second run breaks the result because the job
parameters were deleted during the first run.

This can occur because the merger run loop is operating on cached data.
It could be that a merge request is taken into account because it's
unlocked but was already completed in a previous run.

To avoid running the request a second time, the lock() method now
updates the local request object with the current data from ZooKeeper
and the merger checks the request's state again after locking it.

This change also fixes the executor run loop as it uses the same
methods. Although we've never seen this issue there, it might be
hidden by other circumstances, as the executor API differs in some
aspects from the merger API (e.g. dealing with node requests and node
locking, no synchronous results).
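
A sketch of the lock-then-recheck pattern (names are illustrative, not
the real MergerApi): since the run loop iterates over cached requests,
the state is checked again after locking, using data refreshed from
ZooKeeper:

  def process_requests(api):
      for request in api.cached_requests():
          if not api.lock(request):   # lock() refreshes the request from ZK
              continue
          try:
              if request.state != 'requested':
                  continue            # already completed in an earlier run
              api.run_merge(request)
          finally:
              api.unlock(request)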

Change-Id: I167c0ceb757e50403532ece88a534c4412d11365
Co-Authored-By: Felix Edel <felix.edel@bmw.de>
2021-09-07 09:34:44 -07:00
James E. Blair 6a0b5c419c Several merger cleanups
This change contains several merger-related cleanups which seem
distinct but are intertwined.

* Ensure that the merger API space in ZK is empty at the end of all
  tests.  This assures us that we aren't leaking anything.
* Move some ZK utility functions into the base test class.
* Remove the extra zk_client created in the component registry test
  since we can use the normal ZK client.
* The null result value in the merger server is initialized earlier to
  make sure that it is initialized for use in the exception handler.
* The test_branch_delete_full_reconfiguration leaked a result node
  because one of the cat jobs fails, and later cat jobs are run but
  ignored.

To address the last point, we need to make a change to the cat job
handling.  Currently, when a cat job fails, the exception bubbles up
and we simply ignore all the remaining jobs.  The mergers will run
them, write results to ZK, but no one will see those results.  That
would be fine, except that we created a "waiter" node in ZK to
indicate we want to see those results, and as long as it exists, the
results won't be deleted by the garbage collector, yet we are no
longer waiting for them, so we won't delete them either.

To correct that, we store the merge job request path on the job
future.  Then, when the first cat job fails, we "cancel" all the cat
jobs.  That entails deleting the merge job request if we are able (to
save the mergers from having to do useless work), and regardless of
whether that succeeds, we delete the waiter node in ZK.  If a cat job
happens to be running (and if there's more than one, like in this test
case, it likely is), it will eventually complete and write its result
data.  But since we have removed the waiter node, the periodic cleanup
task will detect it as leaked data and delete it.
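
A sketch of the cancellation idea (the kazoo calls are real; the paths
stored on the job future are illustrative): try to delete the request to
spare the mergers the work, and drop the waiter node regardless so any
leaked result is garbage collected:

  from contextlib import suppress
  from kazoo.exceptions import NoNodeError

  def cancel_cat_jobs(zk_client, pending_futures):
      for future in pending_futures:
          with suppress(NoNodeError):
              zk_client.delete(future.request_path)  # may already be running
          with suppress(NoNodeError):
              zk_client.delete(future.waiter_path)   # stop waiting either way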

Change-Id: I49a459debf5a6c032adc60b66bbd8f6a5901bebe
2021-08-19 15:01:49 -07:00
James E. Blair 9fa3c6ec6e Send merge completed events even in case of error
The scheduler depends on merge completed events in order to advance
the lifecycle of a queue item.  Without them, items can be stuck in
the queue indefinitely.

In the case of certain merge errors, we may not have submitted a
result to the event queue.  This change corrects that.

Change-Id: I9527c79868ede31f1fa68faf93ff113ac786462b
2021-08-19 10:21:21 -07:00
James E. Blair 15b589c1e4 Merger related cleanup
* Include the merge request job uuid in the MergeCompletedEvent so
  that it can be associated with the originating request.
* repr() the MergeCompletedEvent with interesting information so
  the logs are more useful.
* Remove some unused methods from the scheduler that are no longer
  needed since merge complete events are submitted directly from
  the merge server.

Change-Id: I94db0d1cecfdcdb3745151f66b11749cd9850955
2021-08-19 09:58:02 -07:00
James E. Blair e79493c519 Streamline unlocking in merger and builder run loops
To help make the lock/unlock cycle a little easier to follow,
keep the unlock call as close to the lock call as possible
in the merger and executor run loops.

Change-Id: Ia4b86d2d23cf0f5e7102714adcf1be6d28d89d47
2021-08-06 15:40:47 -07:00
James E. Blair a729d6c6e8 Refactor Merger/Executor API
The Merger and executor APIs have a lot in common, but they behave
slightly differently.  A merger needs to sometimes return results.
An executor needs to have separate queues for zones and be able to
pause or cancel jobs.

This refactors them both into a common class which can handle job
state changes (like pause/cancel) and return results if requested.

The MergerApi can subclass this fairly trivially.

The ExecutorApi adds an intermediate layer which uses a
DefaultKeyDict to maintain a distinct queue for every zone and then
transparently dispatches method calls to the queue object for
that zone.

The ZK paths for both are significantly altered in this change.
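
A toy version of the per-zone dispatch (the real change uses a
DefaultKeyDict; this stand-in just creates a queue object per zone on
demand and forwards calls to it):

  class JobRequestQueue:
      def __init__(self, zone):
          self.zone = zone
          self.requests = []

      def submit(self, request):
          self.requests.append(request)

  class ExecutorApi:
      def __init__(self):
          self.zone_queues = {}

      def _queue(self, zone):
          if zone not in self.zone_queues:
              self.zone_queues[zone] = JobRequestQueue(zone)
          return self.zone_queues[zone]

      def submit(self, request, zone=None):
          self._queue(zone).submit(request)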

Change-Id: I3adedcc4ea293e43070ba6ef0fe29e7889a0b502
2021-08-06 15:40:46 -07:00
Felix Edel 8038f9f75c Execute merge jobs via ZooKeeper
This is the second part of I767c0b4c5473b2948487c3ae5bbc612c25a2a24a.
It uses the MergerAPI.

Note: since we no longer have a central gearman server where we can
record all of the merge jobs, some tests now consult the merger api
to get the list of merge jobs which were submitted by that scheduler.
This should generally be equivalent, but something to keep in mind
as we add multiple schedulers.

Change-Id: I1c694bcdc967283f1b1a4821df7700d93240690a
2021-08-06 15:40:41 -07:00
Simon Westphahl bd2aeec5eb Log result payload size of merger jobs
Change-Id: Ifb611c899edbc4978333a4da79248791816586cd
2021-07-21 08:42:08 +02:00
Felix Edel 040f403e7f Improve component registry
This improves the usage of the component registry in various ways:

1. It adds a tree cache to the registry. The cache is eventually
   consistent, which should be sufficient for most use cases like
   calculating stats in the scheduler and getting a list of components
   without the need to ask ZooKeeper every time for the list of
   components.

2. Components can now be used as classes rather than dictionaries, which
   makes using and updating them much easier and nicer.

3. Components can be used without a registry. This makes registering
   components easier and you only need to instantiate a registry when
   you need the registry itself (e.g. in the scheduler).

With that change the registry itself is not used anywhere in the
production code because it's not required at this point. I will add this
in the next commit.

Change-Id: Ia8efba26114119eecffb9a89264083e4b8a80de0
2021-05-17 16:47:13 -07:00
James E. Blair b9a6190a45 Support overlapping repos and a flat workspace scheme
This adds the concept of a 'scheme' to the merger.  Up to this point,
the merger has used the 'golang' scheme in all cases.  However it is
possible with Gerrit to create a set of git repositories which collide
with each other using that scheme:

  root/example.com/component
  root/example.com/component/subcomponent

The users who brought this to our attention intend to use their repos
in a flat layout, like:

  root/component
  root/subcomponent

To resolve this we need to do two things: avoid collisions in all cases
in the internal git repo caches of the mergers and executors, and give
users options to resolve collisions in workspace checkouts.

In this change, mergers are updated to support three schemes:

  * golang (the current behavior)
  * flat (new behavior described above)
  * unique

The unique scheme is not intended to be user-visible.  It produces a
truly unique and non-conflicting name by using urllib.quote_plus.  It
sacrifices legibility in order to obtain uniqueness.

The mergers and executors are updated to use the unique scheme in their
internal repo caches.

A new job attribute, 'workspace-scheme', is added to allow the user to
select between 'golang' and 'flat' when Zuul prepares the repos for
checkout.

There is one more kind of repo that Zuul prepares: the playbook repo.
Each project that supplies a playbook to a job gets a copy of its repo
checked out into a dedicated directory (with no sibling repos).  In that
case there is no risk of collision, and so we retain the current behavior
of using the golang scheme for these checkouts.  This allows the playbook
paths to continue to be self-explanatory.  For example:

  trusted/project_0/example.com/org/project/playbooks/run.yaml

Documentation and a release note are added as well.
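
A sketch of how the three schemes lay out a canonical project name such
as example.com/component/subcomponent (the exact path handling in Zuul
is more involved):

  import os
  import urllib.parse

  def workspace_path(root, canonical_name, scheme):
      hostname, project = canonical_name.split('/', 1)
      if scheme == 'golang':
          return os.path.join(root, hostname, project)
      if scheme == 'flat':
          return os.path.join(root, os.path.basename(project))
      if scheme == 'unique':
          return os.path.join(root, urllib.parse.quote_plus(canonical_name))
      raise ValueError('unknown scheme: %s' % scheme)

  # workspace_path('/ws', 'example.com/component/subcomponent', 'flat')
  #   -> '/ws/subcomponent'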

Change-Id: I3fa1fd3c04626bfb7159aefce0f4dcb10bbaf5d9
2021-04-29 17:56:24 -07:00
James E. Blair d4c7d29360 Clarify merger updates and resets
Several changes in an attempt to clarify exactly when updates and
resets should and do happen:

* Remove the repo_state argument from Merger.getRepo()

It was unclear under what circumstances the low-level repo object
honored repo_state (not much).  Remove it entirely and rely on
high-level Merger methods to deal with repo_state.

* Have merger.setRepoState() operate on one project instead of a
  list of items

Part of the reason we were passing repo_state to low-level
methods was to reset the state for required projects in the
executor.  Essentially there were three cases: projects of change
items, projects of non-change items, and projects of neither but
in required-projects.  The low-level repo_state usage only
handled the last, the first is easy, and the second we handled by
creating a list of non-change items and passing it to
setRepoState on the merger.

A simpler method of handling all of that is to reduce it to two
cases: projects of change items (which need to be merged) and the
rest (which need to be restored).  If we do that, we can maintain
a set of projects we've seen while merging in the first case,
then iterate over all the remaining projects and call
setRepoState on each in the second.

* Remove the update call from Repo.reset()

This lets us call Repo.reset() frequently (i.e., at the start of
any operation that writes to the merger's git repo working dir)
without performing a git fetch.  We need to make sure we call
Repo.update() where necessary.

* Remove the reset call from Merger.updateRepo()

This will now only call repo.update(), and even that will only
happen if the repo_state says we should.  So we can safely call
this before any significant operations and know that it will
update the repo if necessary.

* Add an update() call to getRepoState()

Because we removed the update() call from Repo.reset(), we need
to add one here next to the existing call to reset().

* Add a reset call to getFiles()

It relied on the reset in updateRepo.

* Set execution_context to False on the executor's main merger

The execution_context parameter determines whether we manipulate the
origin remotes to point at the previous commit.  This should be set
for mergers that operate on the build work dir, but it should not
be set for the main merger within the executor (so the main merger
behaves just like a standalone merger).  It was previously erroneously
set for the executor's main merger and this change corrects that.

* Add Merger.updateRepo() calls in the merger server merge method

The merger needs to update and reset each repo before merging changes.
Currently _mergeItem resets the repo the first time it encounters it.
But we still need to update the repo.  We don't want to update within
the merger method because the executor performs batch updates in
parallel before starting a merge and we don't want to re-do that work.
So instead we add it to the merger server invocation, so it's only
used in the merger:merge gearman function code path.

Change-Id: I740e958357dc7bf0a6506474c5991da12ab6264e
2021-04-21 14:53:54 -07:00
James E. Blair f7f689c87d Revert "Revert "Make repo state buildset global""
This reverts commit 02ca9aeb8f.

This makes a couple of changes to make sure we're passing in the
full repo_state to updateRepo rather than the project repo state.

Change-Id: Ifca2cd48f24b9cf8eec718034c879ffe75fb6ecc
2021-04-21 14:53:54 -07:00
Tobias Henkel 02ca9aeb8f
Revert "Make repo state buildset global"
We discovered a regression in the global repo state that can lead to
wrong commits checked out on required projects. Further, a fix for this
needs a slight re-design of the reconfiguration process. In order to
have some more time to do this, revert it for now.

This reverts commit 175990ec42.

Change-Id: Ibcf3758ab886a01468095a8c588cf78db209529e
2021-04-08 16:42:22 +02:00
Zuul ab9e808def Merge "Component Registry in ZooKeeper" 2021-03-13 14:35:06 +00:00
Jan Kubovy 22935c1177 Component Registry in ZooKeeper
This change adds a component registry which can be used by different
components, such as executors, mergers and others to register
themselves, report their state and store arbitrary runtime information.

This is needed, e.g., to monitor components or to share the
"accepting_work" state of executors later on.

Change-Id: I4b7197d6cb399513e30d314f8a5f4f55ad9266f8
2021-03-12 13:51:48 -08:00
Zuul 591f6c40dc Merge "Make repo state buildset global" 2021-03-09 18:22:26 +00:00
Felix Edel 2dfb34a818 Initialize ZooKeeper connection in server rather than in cmd classes
Currently, the ZooKeeper connection is initialized directly in the cmd
classes like zuul.cmd.scheduler or zuul.cmd.merger and then passed to
the server instance.

Although this makes it easy to reuse a single ZooKeeper connection for
multiple components in the tests, it's not very realistic.
A better approach would be to initialize the connection directly in the
server classes so that each component has its own connection to
ZooKeeper.

Those classes already get all necessary parameters, so we could get rid
of the additional "zk_client" parameter.

Furthermore it would allow us to use a dedicated ZooKeeper connection
for each component in the tests which is more realistic than sharing a
single connection between all components.

Change-Id: I12260d43be0897321cf47ef0c722ccd74599d43d
2021-03-08 07:15:32 -08:00
Jonas Sticha 175990ec42
Make repo state buildset global
Store repo state globally for whole buildset
including inherited and required projects.
This is necessary to avoid inconsistencies in case,
e.g., a required projects HEAD changes between two
dependent jobs executions in the same buildset.

Change-Id: I872d4272d8a594b2a40dee0c627f14c990399dd5
2021-03-05 13:28:22 +01:00
Guillaume Chauvel c0d46c2b37 merger cat: remove self._update duplicates
similar to https://review.opendev.org/c/zuul/zuul/+/776842
self._update is called in the try section.

Change-Id: I4d2991aac74b8ae4e5b8dc2c520c716ae9db645f
2021-02-24 17:48:08 +01:00
Guillaume Chauvel 73093e6d4b merger fileschanges: remove self._update duplicates
self._update is called in the try section.

Change-Id: I8347e1fb964f86a99c118452145ea10f776387e7
2021-02-21 23:10:04 +01:00
Jan Kubovy 7ae2805a5a Connect merger to Zookeeper
Part of point 5 in https://etherpad.openstack.org/p/zuulv4

Connection is idle for now.

Also update component documentation.

Change-Id: I97a97f61940fab2a555c3651e78fa7a929e8ebfb
2021-02-15 14:44:18 +01:00
Clark Boylan 0f7982fee0 Clean up stale git index.lock files on merger startup
We've noticed that if zuul executors (and presumably mergers) don't shut
down gracefully, they may leak git index.lock files in the .git dirs
of the merger repos. Since these repos should be dedicated to zuul's use
without outside interference we can reasonably safely remove any present
index.lock files when starting zuul mergers (and executors).

This implementation does an os.walk under the merger repos root looking
for .git dirs and once it has found them checks for any index.lock
files. This happens before starting the gearman worker which should
avoid any races with these resources.
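
The approximate shape of that startup cleanup (not the exact Zuul code):

  import os

  def clean_stale_index_locks(merger_root, log):
      for dirpath, dirnames, filenames in os.walk(merger_root):
          if os.path.basename(dirpath) != '.git':
              continue
          lock_path = os.path.join(dirpath, 'index.lock')
          if os.path.exists(lock_path):
              log.info("Removing stale git index.lock: %s", lock_path)
              os.unlink(lock_path)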

Change-Id: Ie043453bcdf4500a3718da6f705c882431acafdf
2020-09-17 15:19:16 -07:00
Simon Westphahl 48a64cfaa2 Correctly fail cat/fileschanges when update fails
cat and fileschanges jobs were reported as updated even in cases where
the repo update failed. The `Merger.updateRepo()` method will now let the
Exception bubble up so it can be dealt with in the merger server handlers.

The `ExecutorServer._innerUpdateLoop()` already handles exceptions
properly.

Change-Id: If2e44dc0449d427d16d6995b7cae9f4482984f48
2020-07-16 15:35:18 +02:00
Guillaume Chauvel ab08ae3c7a Fix quickstart gating, Add git name and email to executor
When using quickstart tutorial to test gating, a merge commit cannot be
created because git user.name and user.email are not set.

Change-Id: I62df8839e9637c10d3fd656cf6a3cb02cae40af1
Story: 2007603
Task: 39586
2020-05-31 15:01:18 +02:00
James E. Blair 04ac8287b6 Match tag items against containing branches
To try to approach a more intuitive behavior for jobs which apply
to tags but are defined in-repo (or even for centrally defined
jobs which should behave differently on tags from different branches),
look up which branches contain the commit referenced by a tag and
use that list in branch matchers.

If a tag item is enqueued, we look up the branches which contain
the commit referenced by the tag.  If any of those branches match a
branch matcher, the matcher is considered to have matched.

This means that if a release job is defined on multiple branches,
the branch variant from each branch the tagged commit is on will be
used.

A typical case is for a tagged commit to appear in exactly one branch.
In that case, the most intuitive behavior (the version of the job
defined on that branch) occurs.

A less typical but perfectly reasonable case is that there are two
identical branches (ie, stable has just branched from master but not
diverged).  In this case, if an identical commit is merged to both
branches, then both variants of a release job will run.  However, it's
likely that these variants are identical anyway, so the result is
apparently the same as the previous case.  However, if the variants
are defined centrally, then they may differ while the branch contents
are the same, causing unexpected behavior when both variants are
applied.

If two branches have diverged, it will not be possible for the same
commit to be added to both branches, so in that case, only one of
the variants will apply.  However, tags can be created retroactively,
so that even if a branch has diverged, if a commit in the history of
both branches is tagged, then both variants will apply, possibly
producing unexpected behavior.

Considering that the current behavior is to apply all variants of
jobs on tags all the time, the partial reduction of scope in the most
typical circumstances is probably a useful change.
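
A hedged sketch of the branch lookup (the real driver goes through its
own git layer): list the branches containing the commit a tag points at,
then let a branch matcher match if any of those branches matches:

  import subprocess

  def branches_containing(repo_path, commit_sha):
      out = subprocess.run(
          ['git', 'branch', '--format=%(refname:short)',
           '--contains', commit_sha],
          cwd=repo_path, capture_output=True, text=True, check=True)
      return [b for b in out.stdout.splitlines() if b]

  def tag_matches(repo_path, tagged_sha, branch_matcher):
      return any(branch_matcher(b)
                 for b in branches_containing(repo_path, tagged_sha))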

Change-Id: I5734ed8aeab90c1754e27dc792d39690f16ac70c
Co-Authored-By: Tobias Henkel <tobias.henkel@bmw.de>
2020-03-06 13:29:18 -08:00
Tobias Henkel 1d1da5ae50
Centralize merge handling
We have quite some duplicated code to support the merge functions on
the executor and merger. Merge the common functionality into a
BaseMergeServer class that can be used as a base class of MergeServer
and ExecuteServer.

Change-Id: I86d7053a5095baf32fc0da76af639667fb760c33
2020-02-14 13:20:55 +01:00
Tobias Henkel 130708b43c
Support pausing merge jobs
Currently an executor still executes merge jobs even when it's
paused. This is surprising to the user and an operational problem when
an executor misbehaves for some reason. Further, the merger can now
also be paused explicitly.

Change-Id: I7ebf2df9d6648789e6bb2d797edd5b67a0925cfc
2020-02-14 13:20:15 +01:00
Tobias Henkel 5d35195b65
Unify gearman worker handling
We currently have five gearman workers in the system which are all
similar but different.  In preparation for adding a sixth worker,
refactor them to all re-use a central class and the same config and
dispatch mechanism.

Change-Id: Ifbb4c5aec28fe5b044569d365a4e3fe31150eb3b
2019-07-15 10:09:15 +02:00
Tobias Henkel 5f423346aa
Filter out unprotected branches from builds if excluded
When working with GitHub Enterprise the recommended working model is
branch&pull within the same repo. This is especially necessary for
workflows that combine multiple repos in a single workspace. This has
the side effect that those repos can contain a large number of
branches that never will be part of a job. Having many branches in a
repo can have a large impact on executor performance, so exclude
them from the repo state if we exclude them in the tenant config. This
change only affects branches, not tags or other references.

Change-Id: Ic8e75fa8bf76d2e5a0b1779fa3538ee9a5c43411
2019-06-25 20:49:54 +02:00
Tobias Henkel 7639053905
Annotate merger logs with event id
If we have an event, we should also submit its id to the merger so
we're able to trace merge operations via an event id.
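
A minimal illustration with a stock logging.LoggerAdapter (Zuul has its
own helper for this); the formatter can then include the event id on
every merger log line:

  import logging

  def get_event_logger(logger, zuul_event_id):
      return logging.LoggerAdapter(logger, {'event': zuul_event_id})

  log = get_event_logger(logging.getLogger('zuul.Merger'), 'abc123')
  log.info("Updating repo")  # a '%(event)s' formatter field shows the id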

Change-Id: I12b3ab0dcb3ec1d146803006e0ef644e485a7afe
2019-05-17 06:11:04 +02:00
Tobias Henkel e69c9fe97b
Make git clone timeout configurable
When dealing with large repos or slow connections to the SCM, the
default clone timeout of 5 minutes may not be sufficient. Thus, a
configurable clone/fetch timeout makes it possible to handle those
repos.
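
A sketch of a configurable clone timeout using a plain subprocess call
(the merger's own git layer differs): the clone is killed if it exceeds
the configured number of seconds:

  import subprocess

  def clone_with_timeout(url, dest, timeout=300):
      # raises subprocess.TimeoutExpired if the clone takes too long
      subprocess.run(['git', 'clone', url, dest], check=True, timeout=timeout)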

Change-Id: I0711895806b7cbcc8b9fa3ba085bcf79d7fb6665
2019-01-31 11:17:05 +01:00
Zuul 91e7e680a1 Merge "Use gearman client keepalive" 2019-01-28 20:09:30 +00:00
Paul Belanger 47aa6b12b2 Ensure command_socket is last thing to close
This updates all services to match how zuul-scheduler works: we close the
command_socket at the last possible moment. This also means we can now
use the command socket on the filesystem as an indicator that zuul
shut down properly.

Change-Id: I5fe1bc96c87e1177a2b94d73a9cbe505a7807202
Signed-off-by: Paul Belanger <pabelanger@redhat.com>
2019-01-07 10:19:48 -05:00
Tobias Henkel fb4c6402a4
Use gearman client keepalive
If the gearman server vanishes (e.g. due to a VM crash) some clients
like the merger may not notice that it is gone. They just wait forever
for data to be received on an inactive connection. In our case the VM
containing the zuul-scheduler crashed and after the restart of the
scheduler all mergers were waiting for data on the stale connection
which blocked a successful scheduler restart.  Using tcp keepalive we
can detect that situation and let broken inactive connections be
killed by the kernel.
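
For illustration, these are the underlying Linux TCP keepalive knobs the
gear client option turns on (parameter names and values here are
examples, not gear's API):

  import socket

  def enable_keepalive(sock, idle=60, interval=30, count=5):
      sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
      # Linux-specific options: start probing after `idle` seconds of
      # inactivity, probe every `interval` seconds, and let the kernel
      # kill the connection after `count` failed probes.
      sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPIDLE, idle)
      sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPINTVL, interval)
      sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPCNT, count)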

Depends-On: I8589cd45450245a25539c051355b38d16ee9f4b9
Change-Id: I30049d59d873d64f3b69c5587c775827e3545854
2018-12-11 21:28:59 +01:00
James E. Blair 4e70bebafb Map file comment line numbers
After a build finishes, if it returned file comments, the executor
will use the repo in the workspace (if it exists) to map the
supplied line numbers to the original lines in the change (in case
an intervening change has altered the files).

A new facility for reporting warning messages is added, and if the
executor is unable to perform the mapping, or the file comment syntax
is incorrect, a warning is reported.

Change-Id: Iad48168d41df034f575b66976744dbe94ec289bc
2018-08-15 14:38:03 -07:00
Fabien Boucher 194a2bf237 Git driver
This patch improves the existing git driver by adding
a refs watcher thread. This refs watcher looks at
refs added, deleted, or updated and triggers a ref-updated
event.

When a ref is updated and the related commits
from oldrev to newrev include a change to .zuul.yaml/zuul.yaml
or zuul.d/*.yaml, then tenants including that ref are reconfigured.

Furthermore, the patch includes a triggering model. Events are
sent to the scheduler so jobs can be attached to a pipeline and
run.
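
A very rough sketch of such a refs watcher loop (polling interval and
event shape are illustrative): poll the refs, diff against the previous
snapshot, and emit a ref-updated event for anything added, deleted, or
changed:

  import subprocess
  import time

  def poll_refs(url):
      out = subprocess.run(['git', 'ls-remote', url],
                           capture_output=True, text=True, check=True)
      refs = {}
      for line in out.stdout.splitlines():
          sha, ref = line.split('\t', 1)
          refs[ref] = sha
      return refs

  def watch(url, emit, interval=60):
      previous = poll_refs(url)
      while True:
          time.sleep(interval)
          current = poll_refs(url)
          for ref in set(previous) | set(current):
              if previous.get(ref) != current.get(ref):
                  emit({'ref': ref,
                        'oldrev': previous.get(ref),
                        'newrev': current.get(ref)})
          previous = current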

Change-Id: I529660cb20d011f36814abe64f837945dd3f1f33
2017-12-15 14:32:40 +01:00
Paul Belanger 765061143d
Add command socket support to zuul-merger
Like we have in zuul-executor, add command socket support for
zuul-merger.
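
A rough sketch of a command socket listener (the real implementation is
shared infrastructure in Zuul): listen on a UNIX socket and dispatch
simple text commands such as 'stop' or 'graceful':

  import socket

  def command_loop(socket_path, handlers):
      with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as server:
          server.bind(socket_path)
          server.listen(1)
          while True:
              conn, _ = server.accept()
              with conn:
                  command = conn.recv(1024).decode('utf8').strip()
                  handler = handlers.get(command)
                  if handler:
                      handler()
                  if command == 'stop':
                      return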

Change-Id: I66a2cb2ba3f55bdd03e884f47648278e30d2f6ab
Signed-off-by: Paul Belanger <pabelanger@redhat.com>
2017-12-06 16:05:27 -05:00