Reorganizing docs as recommended in:
https://www.divio.com/blog/documentation/
This is simply a reorganization of the existing documents and changes
no content EXCEPT to correct the location of sphinx doc references.
Expect followup changes to change document names (to reflect the new
structure) and to move content from existing guides (e.g., to move the
pipeline/project/job structure definitions out of the "Project Configuration"
reference guide into their own reference documents for easier locatability).
All documents are now located in either the "overview", "tutorials",
"discussions", or "references" subdirectories to reflect the new structure
presented to the user. Code examples and images are moved to "examples" and
"images" root-level directories.
Developer specific documents are located in the "references/developer"
directory.
Change-Id: I538ffd7409941c53bf42fe64b7acbc146023c1e3
Currently we can only modify the tenant configuration by triggering a
full reconfiguration. However, with many large tenants this can take a
long time to finish. Zuul is stalled during this process. Especially
when the system is at quota this can lead to long job queues that
build up just after the reconfiguration. This adds support for a smart
reconfiguration that only reconfigures tenants that changed their
config. This can speed up the reconfiguration a lot in large
multi-tenant systems.
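As a rough illustration of the idea (not the actual Zuul implementation), a
smart reconfiguration can be sketched as a comparison of the old and new
per-tenant configuration. The function name and the dictionary layout below
are hypothetical.

```python
def changed_tenants(old, new):
    """Return the sorted names of tenants whose configuration differs.

    `old` and `new` map tenant names to their parsed configuration.
    This layout is an assumption made for the example only.
    """
    names = set(old) | set(new)
    return sorted(name for name in names if old.get(name) != new.get(name))


old = {"alpha": {"projects": ["a"]}, "beta": {"projects": ["b"]}}
new = {
    "alpha": {"projects": ["a"]},
    "beta": {"projects": ["b", "c"]},
    "gamma": {"projects": []},
}
# Only "beta" (modified) and "gamma" (added) would need to be reconfigured.
print(changed_tenants(old, new))
```

Tenants that are unchanged ("alpha" here) are skipped entirely, which is
where the time saving comes from.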
Change-Id: I6240b2850d8961a63c17d799f9bec96705435f19
Under Ansible 2.8, ansible_python_interpreter now defaults to auto. This
could break some users' playbooks. Let's add an upgrade note.
Change-Id: Ia7c36ed364f553e4c30347c72cde59879f5ac5d9
Signed-off-by: Paul Belanger <pabelanger@redhat.com>
The Zuul admin can configure authenticators with an optional
"max_validity_time" field, which is the maximum age in seconds
for a valid authentication token. By default there is no
maximum age set for tokens, except the one deduced from
the token's "exp" claim.
If "max_validity_time" is set, tokens without an "iat" claim will
be rejected.
This is meant as an extra security measure to avoid accidentally issuing
very long-lived tokens through the CLI.
The "skew" field can be used to mitigate clock discrepancies
between Zuul and a JWT emitter.
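The checks described above can be sketched roughly as follows. This is an
illustration only: the function name is hypothetical, and the exact way
"skew" is applied to each comparison is an assumption, not taken from the
Zuul code.

```python
import time


def token_age_is_valid(claims, max_validity_time=None, skew=0, now=None):
    """Check the "exp" and "iat" claims of an already decoded token.

    Hypothetical helper: `claims` is the decoded JWT payload as a dict,
    `skew` is the tolerated clock difference in seconds.
    """
    now = time.time() if now is None else now
    exp = claims.get("exp")
    if exp is not None and now > exp + skew:
        return False  # expired according to the token itself
    if max_validity_time is None:
        return True  # no maximum age configured
    iat = claims.get("iat")
    if iat is None:
        return False  # a maximum age is set but the token has no "iat"
    if iat > now + skew:
        return False  # issued in the future, beyond the allowed skew
    return now - iat <= max_validity_time + skew


claims = {"iat": 1000, "exp": 5000}
print(token_age_is_valid(claims, max_validity_time=600, now=1500))
print(token_age_is_valid(claims, max_validity_time=600, now=1700))
```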
Change-Id: I9351ca016b60050b5f3b3950b840d5f719e919ce
In some cases, especially on systems under heavy load, it is
helpful to start executors in paused mode. Preventing executors
from accepting new jobs until they are unpaused manually makes it
possible to test new features or configuration, or to analyse
production problems.
Change-Id: I64c39e3b58c802577201280c855fdf7f13cc7538
Updates the environment variable processing to only affect variables
prefixed with ZUUL_.
Adds a test showing the os.environ with % in it.
This reverts commit b3929b5633.
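A minimal sketch of the filtering idea, assuming the configuration is read
with Python's configparser and that environment values are exposed as
interpolation defaults. The helper name, the section and the variable names
are made up for the example.

```python
import configparser


def read_config(text, environ):
    """Parse `text`, exposing only ZUUL_-prefixed variables to interpolation.

    Hypothetical helper: unrelated variables (which may contain a bare "%")
    never reach the parser.
    """
    zuul_env = {k: v for k, v in environ.items() if k.startswith("ZUUL_")}
    parser = configparser.ConfigParser(zuul_env)
    parser.read_string(text)
    return parser


environ = {"ZUUL_DB_PASSWORD": "s3cret", "PROMPT": "100%"}
config = read_config("[database]\npassword=%(ZUUL_DB_PASSWORD)s\n", environ)
print(config.get("database", "password"))
```

Note that a ZUUL_-prefixed value that itself contains "%" would still need
escaping in this sketch.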
Change-Id: Ic6c3dd0327ef70dc1375486827e4503a4cea9bfc
In prod for OpenDev we're seeing things like this:
http://paste.openstack.org/show/785704/
which leads us to believe this is somehow connected.
This reverts commit f2229705f3.
Change-Id: I0b73b71f72483e6c6e511411c3c59729761cec9b
Add the correct libre2 package name for Debian buster, and also
update the quickstart playbook and documentation to deal with
the change in default rsa key encoding format from newer
versions of ssh-keygen.
Change-Id: I6ada88cd896d844c1171f7bcaf4691dea023d51f
This will allow users to set environment variables with sensitive
strings like passwords, but keep a single config file. This comes
in handy when using Kubernetes in particular, as it wants to
handle sensitive data and templated config files in a very different
manner.
Change-Id: I38f6c4da82e1647ad197908f19ea6df23e04fc32
The nodepool "python-path" config variable makes its way through from
the node arguments and ends up as the "ansible_python_interpreter"
variable for the inventory when running the job.
Notably, Python 3-only distributions require this to be set to
/usr/bin/python3 to avoid what can often be confusing red-herring
errors (e.g. dnf packages incorrectly appearing to be missing on
Fedora [1]).
Upstream is aware of this often confusing behaviour and has added an
"ansible_python_interpreter" value of "auto" to, essentially, "do the
right thing" [2] and choose the right Python for the target
environment. This is available in Ansible >=2.8 and will become
the default in 2.12.
This change allows, and defaults to, an interpreter value of "auto" when
running with Ansible >=2.8. On the supported prior Ansible releases,
"auto" will be translated into "/usr/bin/python2" to maintain
backwards compatibility. Of course, a node that already sets
"python-path" explicitly will override this.
Nodepool is updated to set this by default with
I02a1a618c8806b150049e91b644ec3c0cb826ba4.
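The translation rule can be summarised with a small sketch. The function
name and the tuple representation of the Ansible version are assumptions
for illustration; this is not the code added by this change.

```python
def resolve_interpreter(python_path="auto", ansible_version=(2, 8)):
    """Return the value to use for ansible_python_interpreter.

    Hypothetical helper: `python_path` is the node's "python-path" and
    `ansible_version` is a (major, minor) tuple.
    """
    if python_path != "auto":
        return python_path  # an explicit python-path always wins
    if ansible_version >= (2, 8):
        return "auto"  # let Ansible discover the interpreter
    return "/usr/bin/python2"  # older Ansible does not understand "auto"


print(resolve_interpreter("auto", (2, 8)))              # auto
print(resolve_interpreter("auto", (2, 7)))              # /usr/bin/python2
print(resolve_interpreter("/usr/bin/python3", (2, 7)))  # /usr/bin/python3
```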
I think this is much more user friendly as it puts the work of
figuring out what platform has what interpreter into Ansible. It
alleviates the need for admins to know anything at all about
"python-path" for node configurations unless they are actually doing
something out of the ordinary like using a virtualenv. At the moment,
if you put a modern Python 3-only distro into nodepool, Zuul always
does the wrong thing by selecting /usr/bin/python2; you are left to
debug the failures and need to know to go and manually update the
python-path to Python 3.
Documentation is updated. Detailed discussion is moved into the
executor section; the README is simplified a bit to avoid confusion.
A release note is added.
A test-case is added. Note that it is also self-testing in that jobs
using Ansible 2.8 use the updated value
(c.f. I7cdcfc760975871f7fa9949da1015d7cec92ee67)
[1] https://bugzilla.redhat.com/show_bug.cgi?id=1696404
[2] https://docs.ansible.com/ansible/2.8/reference_appendices/interpreter_discovery.html
Change-Id: I2b3bc6d4f873b7d653cfaccd1598464583c561e7
This adds max_hold_expiration and default_hold_expiration as
scheduler options.
max_hold_expiration sets the absolute maximum time, in seconds, that
a node placed in the hold state will remain available. This
defaults to 0, which means there is no maximum.
default_hold_expiration sets the default value used if no value
is supplied. This defaults to max_hold_expiration.
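How the two options interact can be sketched as below. The function name is
hypothetical, and clamping an over-long or unlimited request to the maximum
is an assumption about the intended behaviour rather than a description of
the actual code.

```python
def effective_hold_expiration(requested=None, max_hold_expiration=0,
                              default_hold_expiration=None):
    """Return the hold expiration, in seconds, to apply to a request.

    A value of 0 means "no expiration".
    """
    if default_hold_expiration is None:
        # The default falls back to the maximum when it is not configured.
        default_hold_expiration = max_hold_expiration
    expiration = default_hold_expiration if requested is None else requested
    if max_hold_expiration > 0 and (
        expiration == 0 or expiration > max_hold_expiration
    ):
        # Assumed behaviour: never exceed the configured maximum.
        expiration = max_hold_expiration
    return expiration


print(effective_hold_expiration())                                  # 0
print(effective_hold_expiration(7200, max_hold_expiration=3600))    # 3600
print(effective_hold_expiration(None, 3600, 600))                   # 600
```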
Change-Id: Ia483ac664e0a2adcec9efb29d3d701f6d315ef3b
This change adds a zuul.conf option to disable the global merge
jobs from running on an executor node.
Change-Id: Icd6374a6c97a404662b39de9df54f4b7c5ab36aa
An executor accepts up to twice as many starting builds as defined
by the load_multiplier option. On systems with a high CPU/vCPU count, an
executor may accept too many starting builds. This can be overridden
using a new max_starting_builds option.
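A back-of-the-envelope sketch of the limit, assuming that the load limit is
the CPU count multiplied by load_multiplier and that a load_multiplier of
2.5 is in use. Both are assumptions for the example, as is the function
name.

```python
def starting_builds_limit(cpu_count, load_multiplier=2.5,
                          max_starting_builds=None):
    """Return how many builds may be in the starting phase at once."""
    if max_starting_builds is not None:
        return max_starting_builds  # explicit override
    # Assumed default: twice the load limit derived from the CPU count.
    return int(2 * cpu_count * load_multiplier)


print(starting_builds_limit(8))                           # 40
print(starting_builds_limit(64))                          # 320
print(starting_builds_limit(64, max_starting_builds=50))  # 50
```

The second call shows why a high CPU count can lead to an unreasonably
large number of simultaneously starting builds.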
Change-Id: Ic7c121e795e4e3cecec25b2b06dd1a26aa798439
This adds a tenant option to use the Zuul web build page as the
URL reported to the code review system when a build completes.
The setting is per-tenant (because it requires that the tenant
have a working SQL reporter configured in all pipelines) and
defaults to false, since we can't guarantee that. In the future,
we expect to make SQL reporting implicit, then this can default
to true and eventually be deprecated.
A new zuul.conf option is added and marked required to supply
the root web URL. As we perform further integration with the web
app, we may be able to deprecate other similar settings, such
as "status_url".
Change-Id: Iaa3be10525994722d020d2aa5a7dcf141f2404d9
Add an "authorize_user" RPC call that tests a set of claims
against the rules of a given tenant. Make zuul-web use this call
to authorize access to tenant-scoped privileged actions.
Change-Id: I50575f25b6db06f56b231bb47f8ad675febb9d82
Users can set the [webclient] section in their zuul.conf file so that the CLI
relies on REST calls rather than RPC. The CLI accepts a new --auth-token
argument allowing remote users to use privileged REST endpoints.
Change-Id: I5a07fccfd787246c4c494db592b575fbdf90ddb1
A user with the right JSON Web Token (JWT) can trigger an autohold, or
re-enqueue or dequeue a buildset, from the web API.
The token is expected to include a key called "zuul.admin" that
contains a list of the tenants the user is allowed to perform
these actions on.
The token must be passed as a bearer token in an Authorization header.
The token is validated using the authenticator declarations in Zuul's
configuration file.
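The authorization step can be sketched as follows, assuming the token has
already been verified and decoded into a dict of claims. Both function
names are hypothetical, and treating "zuul.admin" as a literal top-level
key is an assumption based on the wording above.

```python
def bearer_token(headers):
    """Extract the token from an "Authorization: Bearer <token>" header."""
    scheme, _, token = headers.get("Authorization", "").partition(" ")
    if scheme.lower() != "bearer" or not token.strip():
        return None
    return token.strip()


def may_administer(claims, tenant):
    """Return True if the decoded claims grant admin rights on `tenant`."""
    return tenant in claims.get("zuul.admin", [])


headers = {"Authorization": "Bearer abc.def.ghi"}
claims = {"sub": "alice", "zuul.admin": ["tenant-one"]}
print(bearer_token(headers))                 # abc.def.ghi
print(may_administer(claims, "tenant-one"))  # True
print(may_administer(claims, "tenant-two"))  # False
```

Signature and expiry validation are deliberately left out of this sketch.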
Change-Id: Ief9088812f44368f14234ddfa25ba872526b8735
To increase the chances that job_dir and git_dir are on the
same filesystem in the default configuration, set the default
job_dir to /var/lib/zuul/builds.
Also, all zuul container images specify a volume at /var/lib/zuul,
therefore the docker-compose file does not need to specify the
same for the scheduler container.
Due to this, we were inadvertently running the executor with the
git_dir and job_dir on different filesystems in the quick-start.
That should no longer be the case.
Change-Id: I2fe5eea588006da7181c3ea8ad2637598764e8f1
Update our diagram to show the connections needed if running a database.
Change-Id: I67e47b1916ac1c3ad1f06b9b65c4b1e78aa6a55f
Signed-off-by: Paul Belanger <pabelanger@redhat.com>
Show how zuul-web and zuul-fingergw need to connect to zuul-executors
for log streaming.
Change-Id: Ia985979c16d8276c13b1ba7ffbbb5a2224ccff01
Signed-off-by: Paul Belanger <pabelanger@redhat.com>
I found myself looking into source code to see which services I needed
to open firewall ports for. It seems only the executors and scheduler
send traffic to statsd today.
Change-Id: If7a02bb2658435d3ce7435e5ad061cd1224eb3da
Signed-off-by: Paul Belanger <pabelanger@redhat.com>
To make running Zuul easier, we want the executor to be able to
install the supported Ansible versions during startup. This
adds this functionality as well as a config switch to disable it and a
config option to optionally specify the install location. The default
location is <state_dir>/ansible-bin.
Change-Id: I1858e4fb40190626d001e20b48cf7e69ad35d634
Currently the default Ansible version is selected by the version of
Zuul itself. However, we want to make this configurable per deployment
(zuul.conf), tenant and job.
Change-Id: Iccbb124ac7f7a8260c730fbc109ccfc1dec09f8b
This default is unlikely to be correct and has caused confusion
for us in the past. Remove it (which matches the documentation).
Change-Id: I3453b0e918fb1c6783514c470f40f4e973fd683a
Our current approach to enforcing the disk limit per job can be very
expensive because it runs 'du' in a loop. With many repos in the
cache and many running jobs, this can poison the cache and induce a
large amount of IO load. This can affect overall performance,
especially if Zuul is running on shared storage like Ceph.
Change-Id: Ic03168e30e0cba4a4adb42eebf4709ceba0d8c3e
We're presuming, by default, that a user named "zuul" exists, which
is not the case in some environments. Change the default to avoid
dropping privileges, and require that this be explicitly set in order
to do so.
Change-Id: Ia677d2615dd9292a809df4c8859a60b7f4df6243
When dealing with large repos or slow connections to the SCM, the
default clone timeout of 5 minutes may not be sufficient. Thus a
configurable clone/fetch timeout can make it possible to handle those
repos.
Change-Id: I0711895806b7cbcc8b9fa3ba085bcf79d7fb6665
In some environments the setup playbook can sometimes take more than
60 seconds. In order to avoid retry limits in these cases, make this
timeout configurable.
Change-Id: Ib45957df12b34ddaeec79eb10f7b2e99091dcad1
This adds a reno note for upgrading to 3.4.0 about zuul-web needing
access to ZooKeeper now. Also update our components diagram.
Change-Id: I60e9eaa6cc78306e71869602e330b4bec435d158
Signed-off-by: Paul Belanger <pabelanger@redhat.com>
When calculating relative_priority for independent pipelines,
use shared change queues just as is done for dependent pipelines.
To implement this, we now calculate shared change queues for all
pipelines, not just dependent ones, though we don't use those
queues for any purpose other than this.
Change-Id: I59b1090ca1f4fcc72276445e6ff4c5cf4f2f5030
Add a relative_priority field to node requests and continuously
adjust it for each queue item based on the contents of queues.
This allows for a more fair distribution of build resources between
different projects. The first item in a pipeline from a given
project (or, in the case of a dependent pipeline, group of projects)
has equal priority to all other first-items of other projects in
the same pipeline. Second items have a lower priority, etc.
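A small sketch of the ordering idea, assuming a numeric scheme in which 0 is
the highest priority and each further item from the same project gets the
next number. The function name and the numeric scheme are assumptions for
illustration.

```python
from collections import defaultdict


def relative_priorities(projects_in_queue_order):
    """Assign a relative priority to each queue item in a pipeline.

    The input lists the project of each item, in queue order. The Nth
    item of a project gets priority N - 1, so first items all share 0.
    """
    seen = defaultdict(int)
    priorities = []
    for project in projects_in_queue_order:
        priorities.append(seen[project])
        seen[project] += 1
    return priorities


items = ["nova", "nova", "neutron", "nova", "neutron"]
print(relative_priorities(items))  # [0, 1, 0, 2, 1]
```

Here the first neutron item is served on equal footing with the first nova
item, even though two nova items were enqueued before it.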
Depends-On: https://review.openstack.org/620954
Change-Id: Id3799aeb2cec6d96a662bfa394a538050f7ea947
Create a new config setting to allow zuul executors to be grouped into
zones. By default, this setting is disabled (set to None), to keep
backwards compatibility.
Story: 2001125
Task: 4817
Change-Id: I345ee2d0c004afa68858eb195189b56de3d41e97
Signed-off-by: Paul Belanger <pabelanger@redhat.com>
Both the operation and the read timeout can now be configured in the
zuul-executor main configuration file. If the network is flaky,
increasing these numbers from their defaults might help to lower the
rate of aborted Windows builds.
Change-Id: I4c25ca6027fc4150ec1c9c49ed286e7b4f20d4dd
This makes a number of changes to the installation/configuration
documentation in the admin manual.
Remove quick-start guide. The process of quick-starting is
covered by the installation and setup tutorial, which is now the
first of the installation sections. The reference material from
quick-start is now in the tutorial.
Rename the tutorial quick-start. It's nice to have something
named quick-start, and the tutorial fits the bill.
Rename the installation section "Installation Reference". This
now has more detailed information about installation and
deployment choices, but has very little procedural documentation.
Make zuul-from-scratch more internally consistent in style (use
code-block:: shell and heredocs wherever possible).
Change-Id: I7e4714ce5e775dc9ac0988c3470eef1f74fb36d6