zuul/zuul - zuul - OpenDev: Free Software Needs Free Tools

Commit Graph

Author	SHA1	Message	Date
James E. Blair	179fa02ed0	Build a new skopeo for the zuul-executor container image New versions of docker are no longer compatible with old versions of skopeo. To correct this, build a new version of skopeo for the container images. We need 1.14+ which is not available in debian yet, so we build 1.15 (the latest tagged release) from source. Change-Id: I5a5c351e90b06d3acdd02f3117aa29eafb72445e	2024-03-21 12:48:32 -07:00
Simon Westphahl	e3104f3e5c	Prevent exception when getting namespace PIDs ERROR zuul.AnsibleJob: [e: ...] [build: ...] Unable to list namespace pids Traceback (most recent call last): File "/opt/zuul/lib/python3.11/site-packages/zuul/executor/server.py", line 2868, in runAnsible ns, pids = context.getNamespacePids(self.proc) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/opt/zuul/lib/python3.11/site-packages/zuul/driver/bubblewrap/__init__.py", line 89, in getNamespacePids for child in pid_to_child_list.get(proc.pid): TypeError: 'NoneType' object is not iterable Change-Id: Ic8f4daac064da20b921189774d31859424a21fa0	2023-09-15 10:22:29 +02:00
James E. Blair	1c92165ab7	List process ids in bwrap namespace If the kernel kills a process due to an out of memory error, it can be difficult to track the process back to the build that triggered it. The kernel error just gives us a PID, but we don't know any of the Ansible process ids. Further, since they are in bwrap, Ansible only knows its namespaced pid rather than the host pid, so we can't simply output it in one of our callback plugins. To aid in debugging, output all of the process ids within a namespace right at the start of an ansible-playbook execution. At this time, it is certain that the Ansible process will have started, and it is very likely that it is still running. That should provide a way to map from an OOM message back to an Ansible process id. (Note that Ansible forks and this is unlikely to catch any forked processes, so we will only see the main Ansible process id. Typically this is what the kernel should elect to kill, but if it does not, we may need a further change to repeat this process each time Ansible forks. Since that is more costly, let's see if we can avoid it.) Change-Id: I9f262c3a3c5410427b0fb301cb4f1697b033ba2f	2023-06-28 13:31:06 -07:00
Clark Boylan	ff166e3ea9	Document the source of the afs 0x40084301 ioctl magic number During debugging of ioctl failures one of the things we explored was that this magic number may no longer be correct. Turns out it is correct, but documenting the source of this value may aid future debugging. Change-Id: I87504ee5763bbdc819e68f9defee3df5277eec51	2023-06-05 10:54:46 -07:00
James E. Blair	eb550597b0	Use os.open with setpag When we open the ioctl file to run the openafs setpag syscall, we previously used the high-level open method, which apparently issues an unwanted TCGETS ioctl which crashes the program with a kernel error under certain versions of python+openafs+linux (3.10.6, 1.8.9, 5.15.0, respectively). Switch to a low-level open to avoid this call. Change-Id: I5e08a6020cf6cd4ad2a0084effb697aa39dae9c6	2023-06-05 10:25:00 -07:00
Clark Boylan	c1b0a00c60	Only check bwrap execution under the executor The reason for this is that containers for zuul services need to run privileged in order to successfully run bwrap. We currently only expect users to run the executor as privileged and the new bwrap execution checks have broken other services as a result. (Other services load the bwrap system because it is a normal zuul driver and all drivers are loaded by all services). This works around this by adding a check_bwrap flag to connection setup and only setting it to true on the executor. A better longer term followup fixup would be to only instantiate the bwrap driver on the executor in the first place. This can probably be accomplished by overriding the ZuulApp configure_connections method in the executor and dropping bwrap creation in ZuulApp. Temporarily stop running the quick-start job since it's apparently not using speculative images. Change-Id: Ibadac0450e2879ef1ccc4b308ebd65de6e5a75ab	2023-05-17 13:45:23 -07:00
Clark Boylan	0937872119	Use bwrap --disable-userns if possible Newer bwrap has added the ability to disable additional nested user namespace creation from within the bwrap execution context. Take advantage of this feature in Zuul if we are able to in order to fortify Zuul's security position. In particular we need two conditions to take advantage of this. 1) bwrap must be new enough to support the feature (>=0.8.0) and 2) we must be running with user namespaces enabled. We explicitly check for both conditions and add the appropriate invocation flags to bwrap when the conditions are met. Change-Id: Idf933a0847cb8570b551892186ca9c0057be127f	2023-05-16 10:12:21 -07:00
Clark Boylan	4ea5c621b9	Set default SSH_AUTH_SOCK in zuul-bwrap command The zuul-bwrap command is useful for debugging things under the zuul bwrap environment. Unfortunately, the way things are written it assumes there will be an SSH_AUTH_SOCK. For much of the debugging you might do manually in this environment, an SSH_AUTH_SOCK is unnecessary. Instead of throwing an obtuse error, simply set the value to /dev/null if not otherwise set. Change-Id: Iec0ee93c6e6b1b647a27c9a7fdf280d14d5d2596	2022-09-29 08:41:56 -07:00
James E. Blair	a190e35bb8	Add a note about bwrap and setsid https://github.com/containers/bubblewrap/issues/142 is relevant to us, however our use of start_new_session in popen effectively avoids the issue. Add a note to that effect so that we don't accidentally open a vulnerability later. Also, clean up some py2-only code. Change-Id: Icd4adee32f35c478661dc2d657cf6c9e55e1f7b5	2022-03-28 15:44:19 -07:00
Albin Vass	39305393c0	Drop ambient capabilities when running bwrap Having ambient capabilities causes bwrap to error on start [1] unless the bwrap executable also has the setuid bit set or is run as root. This can cause issues in openshift or podman unless ambient capabilities are dropped [2]. [1] - `bae85baf72/bubblewrap.c (L742)` [2] - https://github.com/containers/bubblewrap/issues/380 Change-Id: I15455fb400448d7672638f911d6cf045fa683a9b	2021-11-01 19:13:37 +01:00
Paul Belanger	927857082b	Stop bind mounting zuul dir into bwrap Once we landed the multi-ansible spec, we no longer need to include the zuul directory where zuul-executor is run from. This is because we now install ansible into its own virtualenv. Change-Id: I35c66d7249841e32478b26b60d6e840fe3f2750d Signed-off-by: Paul Belanger <pabelanger@redhat.com>	2019-06-22 18:36:21 -04:00
Tobias Henkel	74c1ba73ba	Mount tmpfs on ansible tmp dir We explicitly set the ansible local_tmp dir to {work}/tmp. Since ansible writes many small files in there we should mount a tmpfs there to save iops. Change-Id: Ia17d9dac8e7f5d8fb8e294c37a7b0a6621ee7c7c	2019-06-04 14:09:15 +02:00
Tristan Cacqueray	6fd6b6b57d	bubblewrap: bind mount /etc/subuid This file may be required by recent containers tool when doing unshare actions. Also make the image build jobs non-voting temporarily since they are broken by the issue this change fixes. Also, pin docker image to 2.16.8 for quick-start (squashed in here to be able to merge again): The new version 3.0.0 needs some configuration adjustment, git-review is failing with: remote: error: branch refs/publish/master: remote: You need 'Create' rights to create new references. remote: User: user remote: Contact an administrator to fix the permissions Change-Id: Iab45bf2322edf8a10d2d41a1fc9a098e17a39ea7	2019-05-16 09:33:16 +02:00
Monty Taylor	7fe0e780cf	Build zuul containers with dockerfile not pbrx While pbrx is nice and all, it's quite the divergence from how the rest of the container ecosystem works. Switch to using Dockerfile and the python-builder image. Bind mount ld.so.cache into bwrap context When using images based on the python:slim base image, python is installed in /usr/local and the linker needs to know to look in /usr/local/lib for shared libraries. Depends-On: https://review.openstack.org/632187 Change-Id: I84f6dd2a8e3222f7807103dcbb61bdadedfdd22d	2019-01-24 16:11:31 +00:00
Andreas Jaeger	d9059524e0	Fix flake 3.6.0 warnings flake 3.6.0 introduces a couple of new tests, handle them in the zuul base: * Disable "W504 line break after binary operator", this is a new warning with different coding style. * Fix "F841 local variable 'e' is assigned to but never used" * Fix "W605 invalid escape sequence" - use raw strings for regexes. * Fix "F901 'raise NotImplemented' should be 'raise NotImplementedError'" * Ignore "E252 missing whitespace around parameter equals" since it reports on parameters like: def makeNewJobs(self, old_job, parent: Job=None): Change "flake8: noqa" to "noqa" since "flake8: noqa" is a file level noqa and gets ignored with flake 3.6.0 if it's not at beginning of line - this results in many warnings for files ./zuul/driver/bubblewrap/__init__.py and ./zuul/cmd/migrate.py. Fix any issues there. Change-Id: Ia79bbc8ac0cd8e4819f61bda0091f4398464c5dc	2018-10-28 16:39:30 +01:00
Tobias Henkel	5a4db84e5a	Log cpu times of ansible executions We need to be able to compare and discover ansible performance regressions or improvements of ansible. Currently we have no way of detecting changes there other than observing the overall system load of executors. One way to get some metrics is to log the cpu times used by individual ansible runs and the sum of them over the whole job execution. With this one could grab that data from the log and analyse them. Change-Id: Ib0b62299c741533f0d1615f67eced9601498f00d	2018-07-14 10:32:06 +02:00
Fabien Boucher	0e01048069	Add /etc/localtime to bubblewrap default ro bind This change lets programs running on the executor discover the system default timezone. Change-Id: Icc28d2103fe663b27a0842cd36efc6eeb38caa2b	2018-06-26 13:41:32 +02:00
Tobias Henkel	ee9c392b40	Add standard ca certificate paths When using the uri module in a base job it cannot validate ssl certs unless you add the ca certificate paths to (un)trusted-ro-paths. This seems a common use case so it makes sense to mount them into the bwrap context by default if they are existing. Change-Id: I2277374cdb8455dd9e39222ef0ecbab4c8ac786e	2018-03-16 16:34:18 +01:00
James E. Blair	1b22179d20	Add /etc/alternatives to bwrap On some systems, some fairly fundamental binaries route through here. Change-Id: I6258fbe8e7a4728bf85a6b918cf6518d2643d5ed	2017-08-31 10:10:38 -07:00
James E. Blair	d5f7b74588	Add proc to bubblewrap And set the AFS pag. We would like to use AFS within our playbooks (generally in trusted jobs on the executor). Ideally, such usage should be, like everything else in bubblewrap, completely separate from any other processes. However, by default OpenAFS stores authentication credentials by UID, meaning that once any process obtained tokens, any other process on the executor would be able to use them. Fortunately, the concept of a PAG (process authentication group) helps us out here. That scopes tokens to a single process and its children. Normally this is done by PAM when a user logs in, but there is an ioctl that we can use to request a new PAG at any time. It is this method that we use to ensure each ansible process runs in its own PAG. When a new PAG is created, it is actually bound to the thread that created it. Because of this, we don't need to be concerned with thread synchronization around PAG creation. This is useful in the executor which has potentially hundreds of threads in various stages of preparing to execute a subprocess. It is sufficient to request the new PAG at any time before the Popen call, and that thread will use it during the next invocation. The --proc argument is added to the bubblewrap invocation in order to permit aklog to run (it needs to access /proc/fs/openafs/afs_ioctl in order to store the tokens). Change-Id: I2687629f964af11c9da261875f2ec735082b8836	2017-08-24 16:37:54 -07:00
James E. Blair	d6a71ca2b4	Write secrets to tmpfs So that we may avoid writing the decrypted contents of secrets to disk, write them to a file in a tmpfs. Change-Id: I7c029b67d0fc2fa3827dc811137dd4f3a90706d8	2017-08-19 08:08:19 -07:00
James E. Blair	ce56ff9756	Add wrapper driver execution context We recently began altering the mount map used by the wrapper driver for each execution run (so that we can only include the current playbook). However, the setMountsMap method operates on the global driver object rather than an object more closely bound to the lifetime of the playbook run. The fact that this works at all is just luck (executing process is slow enough that hitting a race condition where the wrong directories are mounted is unlikely). To correct this, add a new layer which contains the context for the current playbook execution. Change-Id: I3a06f19e88435a49c7b9aea4e1221b812f5a43d0	2017-08-18 16:35:12 -07:00
Paul Belanger	5d993ed71d	Bindmount /etc/lsb-release into bubblewrap Things like pip use lsb_release, so it is helpful to include this in bubblewrap. This conditionally includes similar files on both debuntu and fedora. Change-Id: Ibfed3ace26163da6484966e348e757f7268811f0 Signed-off-by: Paul Belanger <pabelanger@redhat.com> Co-Authored-By: James E. Blair <jeblair@redhat.com>	2017-08-10 16:45:31 -07:00
James E. Blair	892cca6afa	Bind secrets to their playbooks Secrets are proving less useful than originally hoped because they can not be effectively used in any jobs with untrusted children. This change binds the secrets to the playbooks which use them, so that child jobs are unable to access the secrets. This allows us to create jobs with pre/post playbooks which use secrets which are suitable for other jobs to inherit from. Change-Id: I67dd12563f3abd242d6356675afed1de0cb144cf	2017-08-10 09:13:46 -07:00
Monty Taylor	01380dd885	Change name and document the bind_mount config paths The content in these can be a file or a directory - so _dirs is confusing. Change it to _paths and document it. Change-Id: Ida38766cd3d440d75a6dc55035a54e0804e03760	2017-07-28 17:30:45 -05:00
Monty Taylor	b41a5d9e8f	Replace singleton lists with None defaults Hit an issue trying to run zuul-bwrap locally and I didn't pass --ro-bind or --rw-bind which meant setMountsMap was being passed None values. In fixing that, it seems to me that perhaps the list values were not intentionally singletons. However - it's possible they were and I'm breaking some intended logic here. Change-Id: I3a30bd3d4439c27483c45f86d3d9ae1741a40a38	2017-07-28 16:04:19 -05:00
Jenkins	6d9385829b	Merge "Use mypy to do static type checking" into feature/zuulv3	2017-07-28 03:58:33 +00:00
Monty Taylor	fb8f5a44bd	Use mypy to do static type checking python3 includes support for optional type annotations which can be used by static analysis tools to perform type checking. The mypy tool is a static type checking tool that can also infer type information in many cases, but which will use explicit type information if it is present. Add mypy to test-requirements and to the pep8 job so that our pep8 job can do more analysis work and less with the code style. To support this, there were a few places in the current codebase that needed an explicit type hint. For variables/attributes in 3.5 this is done via comments. There is a conditional import that was confusing that just got marked with an 'ignore'. Our ansible action and lookup plugins confuse mypy with the way they import the ansible base classes. That's ok - they confuse us with that too. The .pyi files are 'typeshed' files, which are a way that one can provide static type annotations without putting the information into the file itself. mypy will always prefer a .pyi file over a .py file (since the point of them is to be external annotation/interface description). So in order to get mypy to not barf on the ansible import weirdness, just add a corresponding empty .pyi file. We could potentially actually put interface descriptions in them - but I don't think there is very much value in that. It should be amusing to at least someone that we have to flake8: noqa an import from typing that was done to provide a type hint in a comment. Change-Id: I6c4ac3dcfc6fd990e6c6886749de147ad28389d1	2017-07-27 14:34:07 -05:00
Jamie Lennox	7655b5550f	Allow loading additional variables file for site config It would be useful to allow deployment specific configuration that can be fed into the project-config deployments so that we can customize things like host ip without having to change job definitions for each site. Also, add a method to display the build log from a failed assertion in the Ansible test (this was used in the development of the tests for this change). Change-Id: I87e8bffc540bcafab543c46244f3d5327b56fcae Co-Authored-By: James E. Blair <jeblair@redhat.com>	2017-07-25 07:27:19 -07:00
James E. Blair	69eab24d1d	Remove state_dir from setMountsMap The setMountsMap command required the state_dir argument, presumably so that the zuul ansible path (ie, our custom modules) is available. Unfortunately, it set it as a read-write bind, not read-only. We certainly don't want jobs (even trusted jobs) modifying the ansible code that we run. Switch it to a read-only bind mount. Also, remove it from special handling inside of the setMountsMap method and instead, handle it on the executor site for increased visibility. Finally, add options to the zuul-bwrap command to set the ro and rw binds to make interactive testing easier. Change-Id: I4a0fdae546a2307d78a5c29b5a62a6d223ecb9e9	2017-07-24 14:45:31 -07:00
Tristan Cacqueray	a19e8c57c7	Add /etc/hosts and /etc/nsswitch.conf to the bubblewrap This change adds dns resolution helpers to the bubblewrap so that hosts locally defined are resolvable in executor playbooks. Change-Id: I5efad8749ff25cdbe6a142f9616422d96b7bbf33	2017-07-13 06:29:34 +00:00
Tobias Henkel	7206a511e4	Optionally bind /lib64 On some systems like alpine the /lib64 directory doesn't exist. Bind that conditionally. Change-Id: I504f140524421770b2512182e83c7da1e89e3378	2017-07-07 15:33:54 +02:00
James E. Blair	2ee4770337	Don't automatically mount user home in executor We're starting to treat the work directory as a substitute home directory (we put .ssh/ into it, for example), and we set $HOME to that directory. Complete this process by updating our bwrap passwd entry to point to that as the home directory and stop mounting the real home dir. Change-Id: I0fdb1913634d3902cac58112c5d683f12675c6f7	2017-06-28 17:39:18 -07:00
Jenkins	34de171669	Merge "executor: run trusted playbook in a bubblewrap" into feature/zuulv3	2017-06-26 21:25:48 +00:00
Jenkins	a7516afe58	Merge "bubblewrap: adds --die-with-parent option" into feature/zuulv3	2017-06-26 21:25:22 +00:00
Jenkins	e9c12ee0ce	Merge "Remove use of six library" into feature/zuulv3	2017-06-20 16:33:21 +00:00
Tobias Henkel	88e0305d52	Add linebreak to generated passwd/group file For running in bwrap the /etc/passwd and /etc/group files are generated on the fly to only show the executing user. This needs to add a linebreak at the end. Otherwise ssh (as well as getent) cannot read the file. In case of ssh this results in the error 'No user exists for uid x'. Change-Id: I0e75dd423f2ffb93da1de4dfc064ff22991f1793	2017-06-20 12:09:06 +02:00
Monty Taylor	b934c1a052	Remove use of six library It exists only for py2/py3 compat. We do not need it any more. This will explicitly break Zuul v3 for python2, which is different than simply ceasing to test it and no longer declaring we support it. Since we're not testing it any longer, it's bound to degrade over time without us noticing, so hopefully a clean and explicit break will prevent people from running under python2 and it working for a minute, then breaking later. Change-Id: Ia16bb399a2869ab37a183f3f2197275bb3acafee	2017-06-19 10:34:57 -05:00
Tristan Cacqueray	44aef15d6e	executor: run trusted playbook in a bubblewrap This change renames untrusted_wrapper to execution_wrapper and uses bubblewrap for both trusted and untrusted playbooks by default. This change adds new options to the zuul.conf executor section to let operators define what directories to mount ro or rw for both context: * trusted_ro_dirs/trusted_rw_dirs, and * untrusted_ro_dirs/untrusted_rw_dirs Change-Id: I9a8a74a338a8a837913db5e2effeef1bd949a49c Story: 2001070 Task: 4687	2017-06-17 02:43:19 +00:00
Tristan Cacqueray	2438860823	bubblewrap: adds --die-with-parent option This change ensures that no processes leak from the bubblewrap driver. Change-Id: Ica388ad2595cbd237d074fd54cc99d1685f6e729	2017-06-17 02:43:19 +00:00
Jenkins	c95cf7fb80	Merge "Default bubblewrap to work_root" into feature/zuulv3	2017-06-15 17:10:57 +00:00
Jamie Lennox	1ef9ca67ef	Show debug logging when running zuul-bwrap If you've gotten to the point of running zuul-bwrap manually you're almost certainly debugging a problem and so having the debug output here helps a lot. Change-Id: I770b5466ad15356570572b50dd64a0252ebb3b06	2017-06-14 11:08:16 +10:00
Paul Belanger	bcdc4d0939	Default bubblewrap to work_root Default chdir to jobdir.work_dir for bubblewrap and start running our commands from there. Change-Id: Ied3d13bc4257c669a6bbb30750f154dcf5e3b970 Signed-off-by: Paul Belanger <pabelanger@redhat.com>	2017-06-12 17:22:40 -04:00
Paul Belanger	9d9023f254	Add untrusted-projects ansible test We want to properly flex our bubblewrap implementation, this job does so. Change-Id: I6647d71434a8d8f6621d3fd34883683ef149775a Signed-off-by: Paul Belanger <pabelanger@redhat.com>	2017-06-01 18:47:18 -07:00
Clint Byrum	5870ccae62	Add support for bwrap This will be the minimum "batteries included" bubblewrap driver. It does not do any MAC configuration, since these vary by system. Operators may wish to wrap it further in a MAC wrapper driver. Because we set bubblewrap as the default wrapper, test_playbooks tests it. However, it lacks a negative test, so we won't know if we're not actually containing things. Users who don't have bubblewrap or don't wish to use it can set the untrusted_wrapper to 'nullwrap' which will just execute things as they're done before this change. Change-Id: I84dd7c8cc55d2110b58609784007ffda0d135716 Story: 2000910 Task: 3540 Signed-off-by: Paul Belanger <pabelanger@redhat.com>	2017-06-01 09:26:45 -07:00
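
Commit b41a5d9e8f ("Replace singleton lists with None defaults") describes setMountsMap receiving None when zuul-bwrap is run without --ro-bind or --rw-bind. The sketch below shows one way None defaults can be normalised into fresh lists before building bwrap bind arguments. It is an illustration only, not the code from that commit: the function name `build_bind_args` and its parameters are hypothetical, and only the bwrap flags `--ro-bind` and `--bind` are taken from bubblewrap itself.

```python
def build_bind_args(ro_paths=None, rw_paths=None):
    """Build bwrap bind-mount arguments (hypothetical helper, not Zuul's API)."""
    # Using None as the default and creating a new list on every call avoids
    # sharing one mutable default list between unrelated invocations.
    ro_paths = list(ro_paths) if ro_paths is not None else []
    rw_paths = list(rw_paths) if rw_paths is not None else []

    args = []
    for path in ro_paths:
        # Read-only bind mount: source and destination are the same path.
        args.extend(["--ro-bind", path, path])
    for path in rw_paths:
        # Read-write bind mount.
        args.extend(["--bind", path, path])
    return args
```

Called with no arguments, the helper returns an empty list rather than failing on None, which matches the situation the commit message describes.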

45 Commits
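
Commit e3104f3e5c quotes a `TypeError: 'NoneType' object is not iterable` raised by iterating over `pid_to_child_list.get(proc.pid)` when the process has no recorded children. A minimal sketch of the general pattern follows, assuming a plain `{pid: parent_pid}` mapping as input. The function name `descendant_pids` and the input shape are assumptions made for illustration; this is not the implementation of Zuul's getNamespacePids.

```python
def descendant_pids(root_pid, pid_to_ppid):
    """Return every PID below root_pid, given a {pid: parent_pid} mapping.

    Hypothetical helper for illustration; not Zuul's getNamespacePids.
    """
    # Invert the mapping: parent pid -> list of direct children.
    children = {}
    for pid, ppid in pid_to_ppid.items():
        children.setdefault(ppid, []).append(pid)

    found = []
    stack = [root_pid]
    while stack:
        current = stack.pop()
        # Supplying [] as the default means a childless process yields an
        # empty loop instead of an attempt to iterate over None.
        for child in children.get(current, []):
            found.append(child)
            stack.append(child)
    return sorted(found)
```

For a process with no children the function returns an empty list, so the caller can log "no pids found" instead of hitting an exception.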