This uses the abide in zuul-web to directly access the tenant list
instead of asking the scheduler via RPC.
Relevant RPC listener methods are removed as they are not used
anywhere else.
Change-Id: I918cc3b9b8a6e0ba95a24dc55bf06976c6e2cfb5
This uses the abide in zuul-web to directly access the project listing
instead of asking the scheduler via RPC.
Relevant RPC listener methods are removed as they are not used anywhere
else.
Change-Id: Ie91d04b98a5e6cb6d54f80b71534daca050c02f8
This uses the abide in zuul-web to directly access the job
listing instead of asking the scheduler via RPC.
To allow encoding of MappingProxyType objects within cherrypy's response
handling, this implements a custom cherrypy JSON encoder based on
the ZuulJSONEncoder.
Relevant RPC listener methods are removed as they are not used
anywhere else.
Change-Id: I69d580876cdd290618e5c7b81c5fd0a35488cd12
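The encoder idea described above can be sketched with a small JSONEncoder subclass; this is an illustration modeled on the description, not Zuul's actual ZuulJSONEncoder or its cherrypy integration:

```python
import json
from types import MappingProxyType


class MappingProxyEncoder(json.JSONEncoder):
    """Serialize read-only mapping proxies as plain dicts so they can
    appear in JSON responses (sketch; the real encoder handles more)."""

    def default(self, o):
        if isinstance(o, MappingProxyType):
            return dict(o)
        return super().default(o)


data = {"job": MappingProxyType({"name": "base", "voting": True})}
print(json.dumps(data, cls=MappingProxyEncoder))
# → {"job": {"name": "base", "voting": true}}
```

The stock encoder raises TypeError on MappingProxyType, so only the `default` hook needs to be overridden.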
Ensure that the provided claims are authorized by directly checking the
admin rules from the abide in zuul-web.
Change-Id: I00df437e3086706f481ae9fe18b7ecd1e2bb6c0b
Replace the RPC call for listing pipelines in zuul-web by generating the
JSON response directly from the abide.
Change-Id: Ia19f04c397e727824a0ccdf5cf7ce65df464bc54
With this change zuul-web will generate the status JSON on its own by
directly using the data from Zookeeper. This includes the event queue
lengths as well as the pipeline summary.
Change-Id: Ib80d9c019a15dd9de9d694cb62fd34030016c311
This implements the job freezing API directly in zuul-web, so there is
no need to call the scheduler via RPC.
Change-Id: Ibc7737a51fe5428bacdcb4763b3e6155cea29036
This adds (de)serialization to some helper classes in some cases,
but in others we simplify the attributes we store on the FrozenJob
so that we don't have to deal with objects any more.
Notably, the process of serializing playbooks is incorporated into
creating the frozen job (rather than creating the executor params)
since it uses quite a few helper classes each of which would need
to be (de)serialized as well.
We could probably turn the rest of these helper classes into
dictionaries too (which would be more efficient), but as they are
simple and only one level deep, it seemed easier to implement
serialization for them right now.
Change-Id: I924c94ede1b0b6747b1f5cf92e658a45e6956a93
As part of the work to freeze jobs into ZooKeeper, it was noted that
the freeze job API called setBase on a frozen job. This is not what
happens during actual job freezing and therefore the API produces
output that differs from what would actually be run. The setBase
method is already called at the start of the job freezing process and
sets up frozen playbooks and the initial inheritance path for the
first variant that is applied. Subsequent variants add their
playbooks using applyVariant. It is not clear why setBase was being
called a second time after all variants are applied. By removing this
call (which should not be necessary on a frozen job) the only change
in the test output is to remove a nonsensical item from the end of
the inheritance path (job project-test1 on common-config/zuul.yaml
line 53 -- that line is actually the 'base' job; it correctly appears
as the first entry in the list).
To resolve this, the apparently unnecessary setBase call is removed
from the freeze job API.
Change-Id: Ibb2bb8d17bf1a665467affe583f2b7780fb54322
These are serialized as attributes of the BuildSet. Essentially,
the JobGraph is a logic encapsulation class with a small amount of
data. These extra data are stored on the BuildSet object, and
the JobGraph is reconstructed from them as necessary. This saves
extra ZK network traffic for a small amount of data that has the
same lifecycle as the BuildSet.
It might be structurally simpler to just move all of this into the
BuildSet directly, but the logic is fairly complex and I think
benefits from staying in its own class.
The ProjectMetadata object needs to be serializable to complete
this.
Change-Id: Id451d2b24556f62927120ccdd2657772f12c787b
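The reconstruct-on-demand pattern described above can be sketched as follows; class and attribute names here are illustrative, not Zuul's exact API:

```python
class JobGraph:
    """Logic-only wrapper, rebuilt from data that lives on the
    BuildSet (illustrative sketch)."""

    def __init__(self, jobs, dependencies):
        self.jobs = jobs
        self.dependencies = dependencies


class BuildSet:
    def __init__(self):
        # The serialized state: plain data sharing the BuildSet
        # lifecycle, so no extra ZK records are needed.
        self.jobs = {}
        self.job_dependencies = {}
        self._job_graph = None

    @property
    def job_graph(self):
        # Reconstruct the logic wrapper as necessary.
        if self._job_graph is None:
            self._job_graph = JobGraph(self.jobs, self.job_dependencies)
        return self._job_graph


bs = BuildSet()
bs.jobs["test-job"] = {"name": "test-job"}
print(bs.job_graph.jobs["test-job"]["name"])  # → test-job
```

Keeping the graph logic in its own class while storing its data on the BuildSet is the trade-off the commit describes.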
Currently the job_graph is an attribute of a QueueItem. However,
it's really a characteristic of a BuildSet, with which it has a
1:1 correspondence.
This doesn't make a big difference in the scheme of things, but
I think it does help keep our mental models in sync with the data
models. The only practical outcome of this is that we don't need
to explicitly zero-out the job_graph attribute on QueueItem when
we create a new BuildSet. Otherwise, this change is effectively
a no-op.
But since we're about to serialize the jobgraph data, I think it
makes sense to attach it to the object it's closest to.
Every remaining attribute of the QueueItem is independent of the
BuildSet.
Change-Id: Iaa8dd37d73471223459f3179bf38c87f95a6fd38
This is a ZKContext that can be used to tell the ZKObjects that they
shouldn't actually save to ZK. This makes dealing with transient
ZKObjects for job freezing (and unit testing) easier.
Change-Id: I66ec9a0417645a72e18311ac207fcd22c0849662
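A minimal sketch of such a "no ZooKeeper" context, assuming that persistence is gated on the presence of a client connection (names here are illustrative, not Zuul's actual classes):

```python
class LocalZKContext:
    """A context carrying no ZooKeeper client, which signals objects
    not to persist (illustrative sketch)."""
    client = None


class FrozenJob:
    """Minimal stand-in for a ZK-backed object."""

    def updateAttributes(self, context, **attrs):
        self.__dict__.update(attrs)
        if context.client is not None:
            self._save(context)  # only reached with a real connection

    def _save(self, context):
        raise RuntimeError("would write to ZooKeeper")


job = FrozenJob()
job.updateAttributes(LocalZKContext(), name="test-job")
print(job.name)  # the update stayed purely in memory
```

This keeps job freezing and unit tests free of real ZooKeeper writes while reusing the normal update path.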
Build sets are stored in the following Zookeeper path:
/zuul/<tenant>/pipeline/<pipeline>/item/<item-uuid>/buildset/<buildset-uuid>
Since we need a build set's UUID for the path in Zookeeper we will no
longer use the UUID to determine if the build set was already
configured. Instead we'll use a simple flag.
Change-Id: If788d07ace51a01c5310c5c2d66ba1c4aba76d31
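The path layout above can be expressed as a small helper; this is an illustrative function, not Zuul's actual code:

```python
def buildset_path(tenant, pipeline, item_uuid, buildset_uuid):
    """Build the ZooKeeper path for a build set as described above."""
    return (f"/zuul/{tenant}/pipeline/{pipeline}"
            f"/item/{item_uuid}/buildset/{buildset_uuid}")


print(buildset_path("example", "check", "a1b2", "c3d4"))
# → /zuul/example/pipeline/check/item/a1b2/buildset/c3d4
```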
The change queue state is stored in the following Zookeeper path:
/zuul/<tenant>/pipeline/<pipeline>/queue/<layout-uuid>/<queue-uuid>
Change-Id: I0a64bd9adc7b9f8f7a775280bb7a01ace22baac4
Save the state of a queue item in Zookeeper. The queue item is stored in
the following path:
/zuul/<tenant>/pipeline/<pipeline>/items/<item-uuid>
Since items can move between queues during a reconfiguration, we store
them directly below the pipeline instead of the queue.
Change-Id: I50e8ae66026a099c0148d910359249653b3ca16d
This removes the RPC call (Gearman) in zuul-web to look up the live log
streaming address from the build objects in the scheduler and instead
uses the build requests stored in ZooKeeper.
As the address lookup is implemented as a shared library function which
is used by zuul-web and the fingergw, the fingergw is also switched from
RPC to ZooKeeper. The implementation itself was moved from
zuul.rpclistener to zuul.lib.streamer_utils.
To make the lookup via ZooKeeper work, the executor now stores its
worker information (hostname, log_port) on the build request when it
locks the request.
Additionally, the rpc client was removed from the fingergw as it's not
needed anymore. Instead, the fingergw now has access to the component
registry and the executor api in ZooKeeper as both are needed to look up
the streaming address.
To not create unnecessary watches for build requests in each fingergw
and zuul-web component, the executor api (resp. the job_request_queue
base class) now provides a "use_cache" flag. The cache is enabled by
default, but if the flag is set to False, no watches will be created.
Overall this should reduce the load on the scheduler as it doesn't need
to handle the related RPC calls anymore.
Change-Id: I359b70f2d5700b0435544db3ce81d64cb8b73464
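The "use_cache" flag described above can be sketched like this; the class and attribute names are illustrative, not Zuul's actual job_request_queue API:

```python
class JobRequestQueue:
    """Sketch: register ZooKeeper watches only when the cache is
    enabled, so passive readers like zuul-web and fingergw stay cheap."""

    def __init__(self, zk_client, use_cache=True):
        self.zk_client = zk_client
        self.use_cache = use_cache
        self.watches_registered = False
        if use_cache:
            self._register_watches()

    def _register_watches(self):
        # A real implementation would install ZooKeeper children/data
        # watches here; we just record that it happened.
        self.watches_registered = True


passive = JobRequestQueue(zk_client=None, use_cache=False)
print(passive.watches_registered)  # → False
```

Components that only look up individual requests can skip the watch overhead entirely.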
To make things simpler for schedulers to handle node provisioned
events for node requests which they may not have in their local
pipeline state, we need to make the pipeline storage of node requests
simpler. That starts by removing the nodeset object as an attribute
of the NodeRequest object. This means that the scheduler can work
with a node request object without relying on having the associated
nodeset. It also simplifies the ZooKeeper code that deserializes
NodeRequests (as it doesn't have to create fake NodeSet objects too).
And finally, it simplifies what must be stored in the pipeline
and queue item structures, which will also come in handy later.
Two tests designed to verify that the request->nodeset magic
deserialization worked have been removed since they are no longer
applicable.
Change-Id: I70ae083765d5cd9a4fd1afc2442bf22d6c52ba0b
If we're performing internal routing to a zoned fingergw, choose
one at random if there's more than one in order to load balance.
Change-Id: I46e32426d3a6dc2b2ebac7fdc54b5a61ad231d20
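The random selection amounts to a one-line load balancer; this sketch assumes the caller has already filtered the gateway list down to the target zone:

```python
import random


def pick_gateway(gateways):
    """Pick one zoned fingergw at random to spread the load
    (illustrative helper)."""
    if not gateways:
        return None
    return random.choice(gateways)


gws = [("fingergw1.zone-a", 7900), ("fingergw2.zone-a", 7900)]
print(pick_gateway(gws))
```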
When using fingergw for inter-region log streaming it can be desirable
to support SSL-encrypted connections with client auth, just like we do
with gearman. This will also make it easy to route traffic to the
finger gateway via an OpenShift route using SNI and pass-through.
Docs and release note added in a subsequent change.
Change-Id: Ia5c739a3fcf229da140c4e2ebbe1a771c63b0489
The serialized nodeset is now supplied as a build parameter,
which makes the synthetic hosts and groups parameters which are
derived from it redundant.
Update the executor to rely entirely on the deserialized nodeset.
We also rename the method which creates the parameters since they
are not used for gearman any more.
A subsequent change can remove the hosts and nodes parameters.
Change-Id: Ied7f78c332485e5c66b5721c1007c25660d4238e
Since we no longer create a new abide instance during reconfigurations
and just replace the tenant instance, the tenants are no longer sorted
in the order they appear in the config; instead, the order depends on
which tenant was most recently reconfigured.
From a user's perspective it makes sense that the list of tenants is
sorted alphabetically.
Change-Id: Ie9ae0700fd3928cc1f8f93d1de77537ce97518cd
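The fix reduces to sorting the tenant names at listing time; a minimal illustration (the function name is ours, not Zuul's):

```python
def sorted_tenant_names(tenants):
    """Return tenant names alphabetically, independent of the order
    in which tenants were last reconfigured."""
    return sorted(tenants)


# Reconfiguration order no longer leaks into the listing:
print(sorted_tenant_names(["zuul", "opendev", "airship"]))
# → ['airship', 'opendev', 'zuul']
```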
This uses the component registry rather than gearman to perform
fingergw routing lookups. It also adjusts the logic for routing
to match the latest version of the spec, where unzoned fingergw
processes are expected to route to zoned fingergws if they exist
(because the unzoned fingergw might be a public gateway outside
of the zone).
Change-Id: I2f9fed03159db59cc4e496802b9dab05f746e1a2
In some distributed deployments we need to route traffic via single
entry points that need to dispatch the traffic. For this use case make
all components aware of their zone so it is possible to compute if
traffic needs to go via an intermediate finger gateway or not.
Therefore we register the gearman function 'fingergw:info:<zone>' if
the fingergw is zoned. That way the scheduler will be able to route
streams from different zones via finger gateways that are responsible
for their zone.
Change-Id: I655427283205ea02de6f0f271b4aa5092ac05278
We want to restrict layout access to job freezing so that fewer
actions (like creating a build request after a job is frozen) require
access to it.
To prevent components other than the pipeline manager from accessing the
layout, we will remove the layout as an attribute from the queue item
and store it in an internal cache of the manager.
Queue items will reference their (dynamic) layout via the layout's UUID.
An item's layout UUID will change if any of the inputs to the layout
creation change. Those inputs are the tenant layout and list of items
ahead. This means that during a re-enqueue and in case of a gate reset
we will set the layout UUID of an item back to None.
This also prepares us for the scale-out scheduler where we have to
re-calculate a layout on a scheduler if it is not available in the
cache, as we are not storing the layout in Zookeeper due to its size.
Since the project metadata is needed by the executor client, which
should not access the layout anymore, the metadata is now available via
the job graph.
Change-Id: I93c7a3932fbf9ad8915d1d3d1fff9682778b28f8
Clean up the implementation of enqueue and dequeue events in order to
simplify the event processing in the scheduler.
This removes the associated trigger event from the enqueue event and
introduces a common base class for de-/enqueue events.
Change-Id: I9b3edd1e6c45ecb1ce5ab9251bafe807bbefbbf7
Move the param generation for an execution job into a library for reuse.
The param preparation takes care of determining projects and connections
for dependent roles which saves a zuul-runner from needing to understand
canonical names for which it would need to query a scheduler.
Implement a basic API to freeze and grab these job params. These could
then be passed to another zuul-executor or other runner.
Change-Id: I681f2a3384c9a65ae0acc3fce966e8ec47005b64
Co-Authored-By: Tristan Cacqueray <tdecacqu@redhat.com>
On the way towards a fully scale out scheduler we need to move the
times database from the local filesystem into the SQL
database. Therefore we need to make at least one SQL connection
mandatory.
SQL reporters are now required: an implied sql reporter is added to
every pipeline, and explicit sql reporters are ignored.
Change-Id: I30723f9b320b9f2937cc1d7ff3267519161bc380
Depends-On: https://review.opendev.org/621479
Story: 2007192
Task: 38329
We don't have any logging of status_get requests that are handled
within the scheduler. This can be a bottleneck in larger Zuul
deployments, so add logging of them, along with timing and payload
sizes, so we can judge further optimization efforts.
Change-Id: I50971b89959b26e60b754198f5f6de96e7ffacbd
The RPCListener processes all jobs serially within a single
thread. However, we have some RPC calls, like enqueue and dequeue,
that wait until the action is done, which can take multiple minutes if
a reconfiguration is in progress. During this time all of zuul-web is
blocked since it relies on the RPC mechanism to get information from
the scheduler. This can be solved generally when working towards the
HA scheduler by using ZooKeeper. Until then we can separate
long-running RPC calls into a different listener so status calls are
not blocked by enqueue/dequeue.
Change-Id: Ie5b07b7913d3c88bd267801b2edf09c39fedbe79
Operators can use the "{tenant.name}" special word when setting conditions'
values. This special word will be replaced at evaluation time by the
name of the tenant for which the authorization check is being done.
Change-Id: I6f1cf14ad29e775d9090e54b4a633384eef61085
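The substitution described above can be sketched with a simple string replacement at evaluation time (an illustrative helper, not the actual rule-evaluation code):

```python
def evaluate_condition_value(value, tenant_name):
    """Replace the "{tenant.name}" special word with the tenant for
    which the authorization check is being done."""
    return value.replace("{tenant.name}", tenant_name)


print(evaluate_condition_value("group-{tenant.name}-admin", "opendev"))
# → group-opendev-admin
```

This lets one rule definition serve every tenant without hard-coding names.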
To allow a tenant to use any labels *except* some pattern, add the
disallowed-labels tenant option. Both this and the allowed-labels
option use re2, and therefore lookahead assertions are not supported.
A complementary option to allowed-labels is the only way to support
this use case.
Change-Id: Ic722b1d2b0b609ec7de583dab159094159f00630
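The combined check can be sketched as follows; the stdlib `re` module stands in here for the re2 bindings Zuul actually uses, and the function name is illustrative:

```python
import re  # stand-in for re2; neither supports lookahead in Zuul's usage


def label_allowed(label, allowed=None, disallowed=None):
    """A label must match some allowed pattern (if any are configured)
    and must not match any disallowed pattern."""
    if allowed and not any(re.match(p, label) for p in allowed):
        return False
    if disallowed and any(re.match(p, label) for p in disallowed):
        return False
    return True


print(label_allowed("gpu-large", disallowed=["gpu-.*"]))  # → False
print(label_allowed("small-node", disallowed=["gpu-.*"]))  # → True
```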
The patchset or ref, pipeline and project should be enough to trigger an
enqueue. The trigger argument is not validated or used anymore when
enqueueing via RPC.
Change-Id: I9166e6d44291070f01baca9238f04feedcee7f5b
An authenticated user can query this endpoint to get an authorization
tree, letting her know which actions are available to her. This is
useful for frontends.
Change-Id: Ibda4eabe496f2c37a17a8ce2a2acfcf3e4cb97e3
New command for the zuul CLI client to retrieve autohold details.
Currently, the only new information included is the 'current_count'
field, but this will later be extended to include held nodes.
Change-Id: Ieae2aea73123b5467d825d4738be07481bb15348
Storing autohold requests in ZooKeeper, rather than in-memory,
allows us to remember requests across restarts, and is a necessity
for future work to scale out the scheduler.
Future changes to build on this will allow us to store held node
information with the change for easy node identification, and to
delete any held nodes for a request using the zuul CLI.
A new 'zuul autohold-delete' command is added since hold requests
are no longer automatically deleted.
This makes the autohold API:
zuul autohold: Create a new hold request
zuul autohold-list: List current hold requests
zuul autohold-delete: Delete a hold request
Change-Id: I6130175d1dc7d6c8ce8667f9b14ae9377737d280
Add an "authorize_user" RPC call that tests a set of claims
against the rules of a given tenant. Make zuul-web use this call
to authorize access to tenant-scoped privileged actions.
Change-Id: I50575f25b6db06f56b231bb47f8ad675febb9d82
Sometimes, e.g. during reconfiguration, it can take quite some time
between the trigger event and when a change is enqueued.
This change allows tracking the time it takes from receiving the event
until it is processed by the scheduler.
Change-Id: I347acf56bc8d7671d96f6be444c71902563684be
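The latency tracking amounts to stamping each event on arrival and measuring the delta at processing time; a minimal sketch with illustrative names (the real implementation carries the timestamp on Zuul's event objects):

```python
import time


class TriggerEvent:
    """Stamp arrival time so the scheduler can report how long the
    event waited before being processed (sketch)."""

    def __init__(self):
        self.arrived_at = time.monotonic()

    def processing_delay(self):
        # Called when the scheduler finally processes the event.
        return time.monotonic() - self.arrived_at


event = TriggerEvent()
time.sleep(0.01)  # e.g. a reconfiguration delays processing
print(f"event waited {event.processing_delay():.3f}s")
```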