In some distributed deployments, traffic must be routed through single
entry points that dispatch it onward. For this use case, make all
components aware of their zone so that it is possible to compute
whether traffic needs to go via an intermediate finger gateway.
Therefore we register the gearman function 'fingergw:info:<zone>' if
the fingergw is zoned. That way the scheduler will be able to route
streams from different zones via finger gateways that are responsible
for their zone.
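
A minimal sketch of the naming and routing decision described above.
The helper names, the unzoned fallback name 'fingergw:info', and the
rule that a gateway is needed only when the zones differ are
assumptions for illustration, not Zuul's actual code.

```python
from typing import Optional


def fingergw_function_name(zone: Optional[str] = None) -> str:
    """Return the gearman function name a finger gateway would register.

    The zoned form 'fingergw:info:<zone>' comes from the commit message;
    the unzoned fallback 'fingergw:info' is an assumption.
    """
    if zone is None:
        return "fingergw:info"
    return f"fingergw:info:{zone}"


def needs_gateway(source_zone: Optional[str], target_zone: Optional[str]) -> bool:
    """Assumed rule: go via a finger gateway only when the zones differ."""
    return source_zone != target_zone
```

For example, a gateway in a hypothetical zone "eu" would register
'fingergw:info:eu', and a stream from zone "us" to zone "eu" would be
routed through it.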
Change-Id: I655427283205ea02de6f0f271b4aa5092ac05278
A Gearman client can set a client id which is then used on the server
side to identify the connection. Lack of a client_id makes it harder to
follow the flow when looking at logs:
  gear.Connection.b'unknown' INFO Connected to 127.0.0.1 port 4730
  gear.Server Accepted connection
    <gear.ServerConnection ... name: None ...>
                                     ^^^^
In RPCClient, introduce a client_id argument which is passed to
gear.Client().
Update callers to set a meaningful client_id.
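
A rough sketch of how the argument could be threaded through. The class
name RPCClientSketch and the client_factory parameter are hypothetical
(the factory exists only so the sketch can be exercised without a
gearman server); connecting to the server is left out.

```python
class RPCClientSketch:
    """Illustrative stand-in for RPCClient; not the real implementation."""

    def __init__(self, server, port, client_id=None, client_factory=None):
        if client_factory is None:
            # Third-party 'gear' library; imported lazily so the sketch
            # can be tested with a fake factory when gear is not installed.
            import gear
            client_factory = gear.Client
        self.server = server
        self.port = port
        if client_id:
            # Pass the id through so the server can name the connection.
            self.gearman = client_factory(client_id)
        else:
            # No id given: the library default is used.
            self.gearman = client_factory()
```

A caller would then pick an id that describes itself, for example
RPCClientSketch('127.0.0.1', 4730, client_id='zuul-cli'), where the id
string is only an example.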
Change-Id: Idbd63f15b0cde3d77fe969c7650f4eb18aec1ef6
The patchset or ref, pipeline and project should be enough to trigger an
enqueue. The trigger argument is not validated or used anymore when
enqueueing via RPC.
Change-Id: I9166e6d44291070f01baca9238f04feedcee7f5b
New command for the zuul CLI client to retrieve autohold details.
Currently, the only new information included is the 'current_count'
field, but this will later be extended to include held nodes.
Change-Id: Ieae2aea73123b5467d825d4738be07481bb15348
Storing autohold requests in ZooKeeper, rather than in-memory,
allows us to remember requests across restarts, and is a necessity
for future work to scale out the scheduler.
Future changes to build on this will allow us to store held node
information with the change for easy node identification, and to
delete any held nodes for a request using the zuul CLI.
A new 'zuul autohold-delete' command is added since hold requests
are no longer automatically deleted.
This makes the autohold API:
zuul autohold: Create a new hold request
zuul autohold-list: List current hold requests
zuul autohold-delete: Delete a hold request
Change-Id: I6130175d1dc7d6c8ce8667f9b14ae9377737d280
Users can set the [webclient] section in their zuul.conf file so that the CLI
relies on REST calls rather than RPC. The CLI accepts a new --auth-token
argument allowing remote users to use privileged REST endpoints.
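
A sketch of the selection logic, under assumptions: the option name
'url' inside [webclient], the helper names, and the use of a bearer
Authorization header for the token are illustrative and not confirmed
by this commit message.

```python
import configparser


def select_transport(conf_text):
    """Return ('rest', url) if a [webclient] section exists, else ('rpc', None).

    The 'url' option name is an assumption for illustration.
    """
    config = configparser.ConfigParser()
    config.read_string(conf_text)
    if config.has_section("webclient"):
        return "rest", config.get("webclient", "url", fallback=None)
    return "rpc", None


def auth_headers(token=None):
    """Build request headers; the bearer scheme is an assumption."""
    if not token:
        return {}
    return {"Authorization": "Bearer %s" % token}
```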
Change-Id: I5a07fccfd787246c4c494db592b575fbdf90ddb1
If the gearman server vanishes (e.g. due to a VM crash) some clients
like the merger may not notice that it is gone. They just wait forever
for data to be received on an inactive connection. In our case the VM
containing the zuul-scheduler crashed and after the restart of the
scheduler all mergers were waiting for data on the stale connection
which blocked a successful scheduler restart. Using TCP keepalive we
can detect that situation and let the kernel kill broken inactive
connections.
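
For illustration, enabling keepalive on a socket with the Python
standard library looks roughly like this. The function name and the
timing values are assumptions, and the per-socket tuning options are
platform specific, so they are applied only where available.

```python
import socket


def enable_keepalive(sock, idle=60, interval=10, count=5):
    """Turn on TCP keepalive; the timing values are illustrative only."""
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    # Tuning options differ per platform, so set only those that exist.
    tuning = (
        ("TCP_KEEPIDLE", idle),       # seconds of idle time before probing
        ("TCP_KEEPINTVL", interval),  # seconds between probes
        ("TCP_KEEPCNT", count),       # failed probes before the drop
    )
    for name, value in tuning:
        option = getattr(socket, name, None)
        if option is not None:
            sock.setsockopt(socket.IPPROTO_TCP, option, value)
    return sock
```

With these example values, a dead peer would be detected after roughly
idle + interval * count seconds instead of never.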
Depends-On: I8589cd45450245a25539c051355b38d16ee9f4b9
Change-Id: I30049d59d873d64f3b69c5587c775827e3545854
Add the ability for an operator to dequeue a change from a pipeline.
Change-Id: I4524291807c8b97b62cfaa31fb5d46dc48adbac9
Signed-off-by: Paul Belanger <pabelanger@redhat.com>
Add the option --node-hold-expiration to `zuul autohold`. This parameter
allows an operator to specify how long a node set should remain in
HOLD state after a build failure.
Change-Id: I25020d1722de97426e6699653ff72eba03c46b16
Depends-On: I9a09728e5728c537ee44721f5d5e774dc0dcefa7
Add support for "scoped" autohold requests, making it possible
to create requests that match a given ref filter and hold nodes
for builds that match the filter.
Introduce two new arguments to `zuul autohold`: --ref and --change,
with --ref accepting a regex that builds are matched against, and
--change being a shortcut that creates a filter for the specific change.
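
A sketch of how the --change shortcut could expand into a ref filter.
It relies on Gerrit's ref layout, refs/changes/<last two digits>/
<change>/<patchset>; the helper names and the use of re.match are
assumptions, not necessarily what the client does.

```python
import re


def change_to_ref_filter(change: str) -> str:
    """Build a regex matching every patchset ref of a Gerrit change."""
    shard = change[-2:].zfill(2)  # last two digits, zero padded
    return r"refs/changes/%s/%s/.*" % (shard, change)


def build_matches(ref_filter: str, ref: str) -> bool:
    """Assumed rule: the regex must match from the start of the ref."""
    return re.match(ref_filter, ref) is not None
```

Under these assumptions, change 1234 yields the filter
refs/changes/34/1234/.* which matches patchset 2 of that change but
no patchset of change 1235.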
Change-Id: I801ba1af4b1bda46abff07791dd96828f2738621
Adds the 'autohold' client option, the scheduler implementation
of it, and a unit test for it.
The autohold is automatically removed from the in-memory data
structure once we've reached the number of requested runs of
the job.
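
The countdown behaviour can be sketched as follows. The class name,
the (project, job) key, and the method names are hypothetical, and the
sketch ignores build results for brevity.

```python
class AutoholdRegistry:
    """In-memory autohold requests, keyed by (project, job)."""

    def __init__(self):
        self._requests = {}  # (project, job) -> remaining run count

    def add(self, project, job, count):
        self._requests[(project, job)] = count

    def consume(self, project, job):
        """Return True if this run is covered by a hold request.

        The request is removed once the requested number of runs
        has been reached.
        """
        key = (project, job)
        remaining = self._requests.get(key)
        if remaining is None:
            return False
        if remaining <= 1:
            del self._requests[key]
        else:
            self._requests[key] = remaining - 1
        return True
```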
Story: 2000905
Change-Id: Ieac0b5fee6801313fa23cce69520eb348735ad99
zuul now provides socket-based console streaming, which is super cool.
In order to have Jenkins parity with web streaming, we need to provide a
websocket (JavaScript in browsers can't really connect to random ports
on servers).
After surveying the existing python websocket options, basically all of
them are based around twisted, eventlet, gevent or asyncio. This is not
something we can easily handle within our current webob/paste
structure, because it is a change to the fundamental HTTP handling.
While we could write our own websocket server implementation that was
threaded like the rest of zuul, that's a pretty giant amount of work.
Instead, we can run an async-based server that's just for the
websockets, so that we're not all of a sudden putting async code into
the rest of zuul and winding up frankensteined. Since this is new code,
using asyncio and python3 seems like an excellent starting place.
aiohttp supports running a websocket server in a thread. It also
supports doing other HTTP/REST calls, so by going aiohttp we can set
ourselves up for a single answer for the HTTP tier.
In order to keep us from being an open socket relay, we'll expect two
parameters as the first message on the websocket: the zuul build uuid
and the log file we want to stream. (The second parameter, selecting
among multiple log files, isn't supported yet by the rest of zuul, but
one can imagine a future where we'd like to support that too, so it's
in the protocol.) The websocket server will then ask zuul over gearman
for the IP and port associated with the build and logfile and will
start streaming it to the socket.
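
A sketch of validating that first message. The JSON encoding and the
function name are assumptions; the commit message only says that the
build uuid and the log file are expected.

```python
import json


def parse_stream_request(message: str) -> dict:
    """Validate the first websocket message and return its parameters.

    Raises ValueError when the message is not a JSON object or when a
    parameter is missing or empty, so the server can refuse to relay.
    """
    request = json.loads(message)  # malformed JSON raises a ValueError subclass
    if not isinstance(request, dict):
        raise ValueError("expected a JSON object")
    for key in ("uuid", "logfile"):
        value = request.get(key)
        if not isinstance(value, str) or not value:
            raise ValueError("missing parameter: %s" % key)
    return {"uuid": request["uuid"], "logfile": request["logfile"]}
```

Rejecting anything that does not name a known build is what keeps the
server from acting as an open relay.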
Ultimately we'll want the status page to make links of the form:
  /console.html?uuid=<uuid>&logfile=console.log
and we'll want to have apache map the websocket server to something like
/console.
Co-Authored-By: Monty Taylor <mordred@inaugust.com>
Change-Id: Idd0d3f9259e81fa9a60d7540664ce8d5ad2c298f
Enable SSL support for gearman. We also created a new SSLZuulBaseTest
class to provide a simple way to use SSL end to end where possible. A
future patch will enable support in zookeeper.
Change-Id: Ia8b89bab475d758cc6a021988f8d79ead8836a9d
Signed-off-by: Paul Belanger <pabelanger@redhat.com>
This makes the transition to python3 much smoother.
Change-Id: I9d8638dd98502bdd91cbe6caf3d94ce197f06c6f
Depends-On: If6bfc35d916cfb84d630af59f4fde4ccae5187d4
Depends-On: I93bfe33f898294f30a82c0a24a18a081f9752354
Here we are adding tenant support and re-enabling unit tests for
enqueue and promote.
Change-Id: I384128b9b14be1dc3c4a0c914dcaf13d30f1792f
Signed-off-by: Paul Belanger <pabelanger@redhat.com>
In practice we are seeing that geard can occasionally get disrupted
and then temporarily backlogged enough that it exceeds the 30 second
timeout for submitting a job. To make Zuul less fragile in this case,
increase the timeouts for any requests submitted to gearman.
Change-Id: I12741bb259c1a78fa2446d764318f84df34bac67
Takes one or more changes and promotes them to the head of the queue.
Also, change the command line syntax for the enqueue command to accept
change IDs in the form 'change,patchset' in order to match the syntax
of promote, as well as be potentially more compatible with future
triggers.
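
Parsing the 'change,patchset' form can be sketched like this. The
function name and the requirement that both parts are numeric (as with
Gerrit) are assumptions for illustration.

```python
def parse_change(value: str):
    """Split a 'change,patchset' argument into its two parts."""
    parts = value.split(",")
    if len(parts) != 2 or not all(part.isdigit() for part in parts):
        raise ValueError(
            "change must be in the form 'change,patchset': %r" % value)
    change, patchset = parts
    return change, patchset
```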
Change-Id: Ic7ded9587c68217c060328bf4c3518e32fe659e3
Add a command line client called 'zuul' that supports one command
to start with: 'enqueue'. It allows an operator (one with access
to the gearman server) to enqueue an arbitrary change in a specified
pipeline. It uses gearman to communicate with the Zuul server, which
now has an added RPC listener component to answer such requests via
gearman.
Add tests for the client RPC interface.
Raise an exception if a Gerrit query does not produce a change. Unlike
events from Gerrit, user (or admin) submitted events over the RPC bus
are more likely to reference invalid changes. To validate those, the
Gerrit trigger will raise an exception for changes which prove to be
invalid and remove them from its cache.
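
A sketch of that validation. All names here are hypothetical, the query
callable stands in for the real Gerrit query, and caching a placeholder
before querying is an assumption made to illustrate the removal step.

```python
class ChangeNotFound(Exception):
    """Raised when a Gerrit query returns no data for a change."""


class ChangeCacheSketch:
    def __init__(self, query):
        self._query = query  # callable: (number, patchset) -> dict or None
        self._cache = {}

    def get_change(self, number, patchset):
        key = (number, patchset)
        if key in self._cache:
            return self._cache[key]
        # Assumed flow: reserve a cache slot, then fill it from the query.
        self._cache[key] = None
        data = self._query(number, patchset)
        if not data:
            # Invalid change: drop the slot and report the failure.
            del self._cache[key]
            raise ChangeNotFound("change %s,%s not found" % key)
        self._cache[key] = data
        return data
```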
Change-Id: Ife07683a736c15f4db44a0f9881f3f71b78716b2