This commit in Ansible:
9142be2f6c
now allows Python modules to specify their interpreter with the shebang.
We expect our roles to use the discovered python interpreter on remote
nodes, and on the executor, we need them to use the virtualenv. Removing
the specific shebang accomplishes this under Ansible 6, and has no effect
under older versions of Ansible.
Without this, for example, the log upload roles would not have access to
their cloud libraries.
Also update our ansible/cli check in our module files. Many of our modules
can be run from the command line for ease of testing, but the check that we
perform to determine if the module is being invoked from the command line
or Ansible fails on Ansible 5. Update it to a check that should work in
all 4 versions of Ansible that Zuul uses.
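A minimal sketch of one way to make that distinction, assuming the
check is based on whether standard input is a terminal. The commit
message does not spell out the exact test, and `invoked_from_cli`,
`cli_main` and `ansible_main` are hypothetical names:

```python
import sys


def invoked_from_cli(stdin=None):
    """Return True when stdin looks like an interactive terminal.

    Assumption: a person testing the module runs it from a terminal,
    while Ansible executes it without one attached to stdin.
    """
    stream = sys.stdin if stdin is None else stdin
    try:
        return bool(stream.isatty())
    except (AttributeError, ValueError):
        # stdin is missing or closed: treat as non-interactive.
        return False


def cli_main():
    # Hypothetical entry point for manual testing.
    return "cli"


def ansible_main():
    # Hypothetical entry point used when Ansible runs the module.
    return "ansible"


def main(stdin=None):
    """Dispatch to the CLI or the Ansible entry point."""
    return cli_main() if invoked_from_cli(stdin) else ansible_main()
```

A check of this kind does not depend on any Ansible internals, which
is why it could plausibly behave the same across versions.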
Change-Id: I4e6e85156459cca032e6c3e1d8a9284be919ccca
Include calls to `df -i` (inode counts) and `df -m` (megabytes data)
in validate-host, to aid in troubleshooting build failures where the
builds start out with too little free space. This way the initial
capacity and utilization of all available filesystems will be
recorded with other basic node diagnostic data.
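As an illustration only (the role itself may run these as plain
Ansible tasks), the two invocations could be collected from Python
like this. `collect_df_output` and its `run` parameter are
hypothetical names:

```python
import subprocess

# The two invocations named in the commit message.
DF_COMMANDS = (
    ["df", "-i"],  # inode counts
    ["df", "-m"],  # space in megabytes
)


def collect_df_output(run=subprocess.check_output):
    """Run each df command and return its output keyed by command line."""
    results = {}
    for cmd in DF_COMMANDS:
        key = " ".join(cmd)
        try:
            raw = run(cmd, stderr=subprocess.STDOUT)
            results[key] = raw.decode("utf-8", "replace")
        except (OSError, subprocess.CalledProcessError) as exc:
            # Diagnostics should not fail the build on their own.
            results[key] = "failed: %s" % exc
    return results
```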
Change-Id: Iba195e7c5cec372c6ba4daf7059da5b6fb6740ec
Make it possible for a site to demand that the validate-host role
finds IPv4 and/or IPv6 routes, making one or both explicitly
mandatory, instead of the default behavior of succeeding as long as
at least one is available. This allows a site to, for example,
discard nodes during a pre playbook if they lack IPv4 connectivity.
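The decision described above can be sketched as a small predicate.
The function and parameter names are hypothetical and are not the
role's actual variable names:

```python
def routes_acceptable(has_ipv4, has_ipv6,
                      require_ipv4=False, require_ipv6=False):
    """Decide whether the discovered routes satisfy the site's policy.

    With no requirements set, one working address family is enough
    (the default behavior). Each require_* flag makes that family
    mandatory.
    """
    if require_ipv4 and not has_ipv4:
        return False
    if require_ipv6 and not has_ipv6:
        return False
    return has_ipv4 or has_ipv6
```

For example, a site that requires IPv4 would reject an IPv6-only node
even though the default policy would accept it.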
Change-Id: Icaa82212468a659a3756ed51cac442de33065b55
The argument here is an integer "limit", not the exception.
I think that we only notice this on Python 3 because of exception
chaining. It causes a real failure though because the exception
handler that is meant to fall into "pass" raises another exception
when ipv6 doesn't work.
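A small sketch of the corrected pattern, assuming the call in
question is `traceback.format_exc` (the message above does not name
it). Its optional first parameter is `limit`, so the exception should
not be passed in. `describe_failure` is a hypothetical helper:

```python
import traceback


def describe_failure(func):
    """Call func; return a formatted traceback if it raises, else None."""
    try:
        func()
    except Exception:
        # format_exc() reads the exception currently being handled.
        # Its optional argument is an integer frame limit, not the
        # exception object.
        return traceback.format_exc()
    return None
```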
Change-Id: I0908a0a3dbb2356caabbffd062379751a0b61c41
Split the network testing component of the validate-host role into a
separate task, so it can be retried a couple of times in case
external networking is slow to come up. On failure, collect unbound
logs if they are present in some common locations (as they will be on
infra nodes).
Change-Id: Id12f1ba064fa2e5f75b9a5cfba76d238d23d3f57
These traceroutes currently fail in a very opaque way. We are
occasionally seeing Fedora nodes fail in here (which is odd, because
obviously networking is up enough for zuul to connect) and don't have
much to go on.
When an exception is caught, add the output, return code and basic
traceback to the return attributes, and include them in the failure
case.
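A minimal sketch of that idea, not the module's actual code. The
helper name `run_diagnostic` and the keys of the returned dict are
hypothetical:

```python
import subprocess
import traceback


def run_diagnostic(cmd):
    """Run cmd (a list of arguments) and describe the outcome in a dict."""
    ret = {"cmd": cmd}
    try:
        out = subprocess.check_output(cmd, stderr=subprocess.STDOUT)
        ret["rc"] = 0
        ret["output"] = out.decode("utf-8", "replace")
    except subprocess.CalledProcessError as exc:
        # Keep everything needed to debug the failure later.
        ret["rc"] = exc.returncode
        ret["output"] = (exc.output or b"").decode("utf-8", "replace")
        ret["traceback"] = traceback.format_exc()
    return ret
```

The caller can then pass the whole dict along with the failure, so
the command output and return code are visible in the job log.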
Depends-On: https://review.openstack.org/563702
Change-Id: I047bf2b1daa22a5b6bfc12b3f42b108975097409
The zuul_debug_info library calls traceroute, which is in /usr/sbin
and not in /sbin on SUSE (and those two are not linked to each other).
Also capture the OSError that occurs when the binary isn't there.
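One way to handle both points, sketched under the assumption that the
binary is looked up in a list of candidate directories (the commit
message does not say how the path is chosen). `find_binary` and
`run_if_present` are hypothetical helpers:

```python
import os
import subprocess

# Directories named in the commit message.
CANDIDATE_DIRS = ("/sbin", "/usr/sbin")


def find_binary(name, dirs=CANDIDATE_DIRS):
    """Return the first executable called name in dirs, or None."""
    for directory in dirs:
        path = os.path.join(directory, name)
        if os.path.isfile(path) and os.access(path, os.X_OK):
            return path
    return None


def run_if_present(cmd):
    """Run cmd and return its output, or None if the binary is missing."""
    try:
        return subprocess.check_output(cmd, stderr=subprocess.STDOUT)
    except OSError:
        # Raised when the executable does not exist or cannot be run.
        return None
```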
Change-Id: Ic5e31a417415f830d7697abfbb2ae71f2ae20935
Change the default parameters to the role to be zuul site variables.
Because of variable precedence, having these not be site variables means
someone could override them in a job. Since one of the actions is to
read and log the contents of a file, we likely don't want to give people
the ability to do that with an arbitrary file.
It is less important for the traceroute host to be a site variable,
but it is still not something that jobs should override: it is a
feature of the deployment.
Both variables are optional, so deployers should still be able to use
this role without defining site variables. But it should be made clear
to them that if they want those features, they should define the
locations in a site variable and not in a normal job variable.
configure-mirror similarly allows in-job override, but maybe that's ok
for now and leaving the site-variable value as a default is fine?
Finally, add a new zuul_site_image_manifest_files list, so that we can
specify more than one file to read. Its default is the set of files
that the dib nodepool elements emit. We'll also look into pushing
those manifest files up a level into dib so that expecting nodepool
nodes to have them is even more reasonable.
Change-Id: I632a32fdfac4bfe57eb269ac8e183fb8df34d48f