When the dependency graph exceeds the configured size, we raise an
exception. Currently we don't handle those exceptions and let them
bubble up to the pipeline processing loop in the scheduler.
When this happens during trigger event processing, it only aborts
the current pipeline handling run, and the next scheduler run will
continue processing the pipeline as usual.
However, when the item is already enqueued, this exception can
block the pipeline processor and lead to a hanging pipeline:
ERROR zuul.Scheduler: Exception in pipeline processing:
Traceback (most recent call last):
  File "/opt/zuul/lib/python3.11/site-packages/zuul/scheduler.py", line 2370, in _process_pipeline
    while not self._stopped and pipeline.manager.processQueue():
                                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/zuul/lib/python3.11/site-packages/zuul/manager/__init__.py", line 1800, in processQueue
    item_changed, nnfi = self._processOneItem(
                         ^^^^^^^^^^^^^^^^^^^^^
  File "/opt/zuul/lib/python3.11/site-packages/zuul/manager/__init__.py", line 1624, in _processOneItem
    self.getDependencyGraph(item.changes[0], dependency_graph, item.event,
  File "/opt/zuul/lib/python3.11/site-packages/zuul/manager/__init__.py", line 822, in getDependencyGraph
    self.getDependencyGraph(needed_change, dependency_graph,
  File "/opt/zuul/lib/python3.11/site-packages/zuul/manager/__init__.py", line 822, in getDependencyGraph
    self.getDependencyGraph(needed_change, dependency_graph,
  File "/opt/zuul/lib/python3.11/site-packages/zuul/manager/__init__.py", line 822, in getDependencyGraph
    self.getDependencyGraph(needed_change, dependency_graph,
  [Previous line repeated 8 more times]
  File "/opt/zuul/lib/python3.11/site-packages/zuul/manager/__init__.py", line 813, in getDependencyGraph
    raise Exception("Dependency graph is too large")
Exception: Dependency graph is too large
To fix this, we'll handle the exception and remove the affected item.
We'll also handle the exception during enqueue and ignore the trigger
event in this case.
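A rough sketch of this handling, with simplified stand-ins for the
manager API (the class and function names below are illustrative, not
Zuul's actual code):

```python
class DependencyGraphTooLarge(Exception):
    """Illustrative stand-in for the "Dependency graph is too large" error."""


def build_dependency_graph(change, graph, limit):
    # Stand-in for getDependencyGraph: refuse to grow past the limit.
    if len(graph) >= limit:
        raise DependencyGraphTooLarge("Dependency graph is too large")
    graph.append(change)


def process_one_item(item, graph, limit):
    # For an already-enqueued item, catch the exception and remove the
    # item instead of letting it abort the whole pipeline run.
    try:
        for change in item["changes"]:
            build_dependency_graph(change, graph, limit)
    except DependencyGraphTooLarge:
        return "removed"
    return "kept"
```

During enqueue, the same exception would instead cause the trigger
event to be ignored.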
Change-Id: I210c5fa4c568f2bf03eedc18b3e9c9a022628dc3
This error class is used in some places, but not all. Correct that
to improve the structured data attached to config errors.
Change-Id: Ice4fbee679ff8e7ab05042452bbd4f45ca8f1122
This exception class went unused, likely due to complications
from circular imports.
To resolve this, move all of the configuration error exceptions
into the exceptions.py file so they can be imported in both
model.py and configloader.py.
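The resulting layout can be sketched as a dependency-free leaf module;
the subclass name below is illustrative, not necessarily Zuul's:

```python
# exceptions.py (sketch): exception types live in a leaf module that
# imports neither model.py nor configloader.py, so both of those can
# import from here without creating an import cycle.

class ConfigurationError(Exception):
    """Base class for configuration errors."""


class ConfigurationSyntaxError(ConfigurationError):
    """Hypothetical subclass; Zuul's actual names may differ."""
```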
Change-Id: I19b0f078f4d215a2e14c2c7ed893ab225d1e1084
Using a badly formatted token resulted in an HTTP 500 error from
zuul-web. Return a more precise error message and an HTTP 401 error
from zuul-web when this occurs.
Also fix a typo in default messages for some auth-related exceptions.
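The 401 behaviour might be sketched like this (the helper and
exception names are hypothetical, not zuul-web's actual code):

```python
import base64
import json


class AuthTokenInvalid(Exception):
    """Hypothetical exception that zuul-web would map to HTTP 401."""


def decode_jwt_claims_unverified(token):
    # A JWT has three base64url segments separated by dots; a badly
    # formatted token raises a clear error instead of crashing later.
    parts = token.split(".")
    if len(parts) != 3:
        raise AuthTokenInvalid("Malformed token: expected 3 segments")
    try:
        padded = parts[1] + "=" * (-len(parts[1]) % 4)
        return json.loads(base64.urlsafe_b64decode(padded))
    except ValueError as e:
        raise AuthTokenInvalid("Malformed token payload") from e


def handle_request(token):
    # Return (status, body) so a bad token yields a 401, not a 500.
    try:
        claims = decode_jwt_claims_unverified(token)
    except AuthTokenInvalid as e:
        return 401, str(e)
    return 200, claims
```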
Change-Id: I4abe013e76ac51c3dad7ccd969ffe79f5cb459e3
This removes the RPC call (Gearman) in zuul-web to look up the live log
streaming address from the build objects in the scheduler and instead
uses the build requests stored in ZooKeeper.
As the address lookup is implemented as a shared library function which
is used by zuul-web and the fingergw, the fingergw is also switched from
RPC to ZooKeeper. The implementation itself was moved from
zuul.rpclistener to zuul.lib.streamer_utils.
To make the lookup via ZooKeeper work, the executor now stores its
worker information (hostname, log_port) on the build request when it
locks the request.
Additionally, the RPC client was removed from the fingergw as it's
no longer needed. Instead, the fingergw now has access to the
component registry and the executor API in ZooKeeper, as both are
needed to look up the streaming address.
To avoid creating unnecessary watches for build requests in each
fingergw and zuul-web component, the executor API (or rather the
job_request_queue base class) now provides a "use_cache" flag. The
cache is enabled by default, but if the flag is set to False, no
watches will be created.
Overall this should reduce the load on the scheduler as it doesn't need
to handle the related RPC calls anymore.
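The flag might look roughly like this (FakeZK and the method names are
illustrative stand-ins, not the real ZooKeeper client or Zuul's
job_request_queue API):

```python
class FakeZK:
    """Minimal stand-in for a ZooKeeper client."""

    def __init__(self):
        self.watches_registered = False
        self.reads = 0

    def read(self, request_id):
        self.reads += 1
        return {"id": request_id}


class JobRequestQueue:
    def __init__(self, zk, use_cache=True):
        # Cache-enabled consumers set up watches; fingergw and
        # zuul-web pass use_cache=False and skip watch creation.
        self.zk = zk
        self.use_cache = use_cache
        self._cache = {}
        if use_cache:
            self._register_watches()

    def _register_watches(self):
        self.zk.watches_registered = True

    def get(self, request_id):
        # Without the cache, every lookup goes straight to ZooKeeper.
        if self.use_cache and request_id in self._cache:
            return self._cache[request_id]
        data = self.zk.read(request_id)
        if self.use_cache:
            self._cache[request_id] = data
        return data
```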
Change-Id: I359b70f2d5700b0435544db3ce81d64cb8b73464
A user with the right JSON Web Token (JWT) can trigger an autohold,
or re-enqueue or dequeue a buildset, from the web API.
The token is expected to include a key called "zuul.admin" that
contains a list of the tenants on which the user is allowed to
perform these actions.
The token must be passed as a bearer token in an Authorization header.
The token is validated against the authenticator declarations in
Zuul's configuration file.
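Once the token has been validated, the tenant check itself is simple;
a minimal sketch (the function name is illustrative):

```python
def is_authorized(claims, tenant):
    # "zuul.admin" contains a literal dot, so it is read as a plain
    # dict key from the decoded claims, not attribute-style.
    allowed = claims.get("zuul.admin", [])
    return tenant in allowed


claims = {"iss": "zuul_operator", "sub": "alice",
          "zuul.admin": ["tenant-one"]}
```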
Change-Id: Ief9088812f44368f14234ddfa25ba872526b8735
Currently zuul-cloner does not support post jobs, as it does not know
what to check out. This adds the ability to specify, on a per-project
basis, a revision to be checked out. When specified, zuul-cloner
will check out the same revision as gerrit-git-prep.sh does
in post jobs.
Sample usage:

  clonemap:
    - name: openstack/neutron
      dest: ./neu
    - name: openstack/requirements
      dest: ./reqs

  export ZUUL_PROJECT="openstack/neutron"
  export ZUUL_NEWREV="a2Fhc2Rma2FzZHNkZjhkYXM4OWZhc25pb2FzODkK"
  export ZUUL_BRANCH="stable/liberty"

  zuul-cloner -m map.yaml git://git.openstack.org $ZUUL_PROJECT \
      openstack/requirements

This results in openstack/neutron checked out at rev a2Fhc2 and
openstack/requirements at 'heads/stable/liberty'.
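The selection logic amounts to: use ZUUL_NEWREV only for the project
it names, and fall back to the branch otherwise. A minimal sketch (the
function name is hypothetical):

```python
def pick_ref(project, zuul_project, newrev, branch):
    # Check out the exact revision only for the project the event is
    # about; every other repo in the clonemap falls back to the branch.
    if newrev and project == zuul_project:
        return newrev
    return "heads/%s" % branch
```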
Change-Id: Ie9b03508a44f04adfbe2696cde136439ebffb9a6
This is a large refactor, though as small as I could feasibly make
it while keeping the tests working. I'll do the documentation and
touch-ups in the next commit to make it easier to digest.
Change-Id: Iac5083996a183d1d8a9b6cb8f70836f7c39ee910