This change adds the ability to use the k8s (and friends) drivers
to create pods with custom specs. This will allow nodepool admins
to define labels that create pods with options not otherwise supported
by Nodepool, as well as pods with multiple containers.
This can be used to implement the versatile sidecar pattern, which is
useful for running jobs that depend on a background system process
(such as a database server or container runtime) in environments where
backgrounding such a process is otherwise difficult.
It is still the case that a single resource is returned to Zuul, so
a single pod will be added to the inventory. Therefore, the expectation
that it should be possible to shell into the first container in the
pod is documented.
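For illustration, a label carrying a custom pod spec might look like
the following sketch (the `spec` pass-through key and the image names
are assumptions, not necessarily the final syntax):

```
labels:
  - name: fedora-with-db
    type: pod
    spec:
      containers:
        - name: fedora
          image: docker.io/fedora:28
        - name: db
          image: docker.io/library/mariadb:10
```

Zuul still receives a single resource for such a label, and shelling
in is expected to land in the first container (`fedora` here).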
Change-Id: I4a24a953a61239a8a52c9e7a2b68a7ec779f7a3d
Just like the OpenStack/AWS/Azure drivers, allow configuring
dynamic metadata (labels) for kubernetes resources with information
about the corresponding node request.
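A configuration sketch, assuming a `dynamic-labels` option analogous
to the dynamic tags of the other drivers (the option name and the
substitution syntax are assumptions):

```
labels:
  - name: pod-fedora
    type: pod
    image: docker.io/fedora:28
    dynamic-labels:
      zuul-tenant: "{request.tenant_name}"
      zuul-requestor: "{request.requestor}"
```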
Change-Id: I5d174edc6b7a49c2ab579a9a0b1b560389d6de82
This adds the option to request GPUs for kubernetes and openshift pods.
Since the resource name depends on the GPU vendor and the cluster
installation, this option is left for the user to define in the
node pool.
To leverage the ability of some schedulers to use fractional GPUs,
the actual GPU value is read as a string.
For GPUs, requests and limits cannot be decoupled (cf.
https://kubernetes.io/docs/tasks/manage-gpus/scheduling-gpus/),
so the same value will be used for requests and limits.
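For example, a label requesting half a GPU might be written as in this
sketch (the `gpu` and `gpu-resource-name` option names are
assumptions; the resource name depends on the vendor and cluster):

```
labels:
  - name: gpu-pod
    type: pod
    image: docker.io/fedora:28
    gpu-resource-name: nvidia.com/gpu
    # given as a string so fractional values survive parsing
    gpu: "0.5"
```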
Change-Id: Ibe33b06c374a431f164080edb34c3a501c360df7
This allows adding key/value pairs under
metadata.annotations in the kubernetes
resource specification.
This information can be used by different tools
to govern the handling of resources.
One particular use case is the runai-scheduler, which
uses annotations to allocate fractional GPU resources
to a pod.
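A sketch of such a label (the annotation key shown is a placeholder
for whatever the scheduler in use expects):

```
labels:
  - name: fractional-gpu-pod
    type: pod
    image: docker.io/fedora:28
    annotations:
      gpu-fraction: "0.5"
```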
Change-Id: Ib319caffe51e00bedda2861e8e1f2bbe04340322
We currently allow users to specify pod resource requests and limits
for cpu, ram, and ephemeral storage. But if a user specifies one of
these, the value is used for both the request and the limit.
This updates the specification to allow the use of separate request
and limit values.
It also normalizes related behavior across all 3 pod drivers,
including adding resource reporting to the openshift drivers.
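A sketch of the decoupled form, assuming `*-limit` counterparts to the
existing request options (the exact option names are assumptions):

```
labels:
  - name: pod-fedora
    type: pod
    image: docker.io/fedora:28
    cpu: 2              # request
    cpu-limit: 4        # limit may now differ from the request
    memory: 1024
    memory-limit: 2048
```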
Change-Id: I49f918b01f83d6fd0fd07f61c3e9a975aa8e59fb
This adds support for specifying the scheduler name, volumes (and
volume mounts), and additional metadata labels to the Kubernetes
and OpenShift (and OpenShift pods) drivers.
This also extends the k8s and openshift test frameworks so that we
can exercise the new code paths (as well as some previous similar
settings). Tests and assertions for both a minimal (mostly defaults)
configuration as well as a configuration that uses all the optional
settings are added.
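A sketch combining the new settings (the option names are assumptions
based on the k8s pod spec fields they map to):

```
labels:
  - name: pod-fedora
    type: pod
    image: docker.io/fedora:28
    scheduler-name: my-scheduler
    labels:
      environment: ci
    volumes:
      - name: cache
        emptyDir: {}
    volume-mounts:
      - name: cache
        mountPath: /cache
```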
Change-Id: I648e88a518c311b53c8ee26013a324a5013f3be3
To allow users to run docker-in-docker style workloads on k8s
and openshift clusters, add support for adding the privileged
flag to containers created in k8s and openshift pods.
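A sketch of a label using the flag (the docker-in-docker image is a
placeholder):

```
labels:
  - name: dind-pod
    type: pod
    image: docker.io/library/docker:dind
    privileged: true
```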
Change-Id: I349d61bf200d7fb6d1effe112f7505815b06e9a8
This change adds an idle state to driver providers which is used to
indicate that the provider should stop performing actions that are not
safe to perform while we bootstrap a second newer version of the
provider to handle a config update.
This is particularly interesting for the static driver because it
manages all of its state internally to nodepool and does not rely on
external cloud systems to track resources. This means it is important
for the static provider not to have an old provider object update
zookeeper at the same time as a new provider object. This was previously
possible and created situations where the resources in zookeeper did
not reflect our local config.
Since all other drivers rely on external state the primary update here
is to the static driver. We simply stop performing config
synchronization if the idle flag is set on a static provider. This will
allow the new provider to take over reflecting the new config
consistently.
Note, we don't take other approaches and essentially create a system
specific to the static driver because we're trying to avoid modifying
the nodepool runtime significantly to fix a problem that is specific to
the static driver.
Change-Id: I93519d0c6f4ddf8a417d837f6ae12a30a55870bb
To the greatest extent possible within the limitation of each provider,
this adds cloud, region, az, and host_id to nodes.
The AWS, Azure, GCE, and IBMVPC drivers each have the cloud name
hard-coded to a value that makes sense for that driver, given that
each of these is a singleton cloud. Their region and az values are
added as appropriate.
The k8s, openshift, and openshiftpods all have their cloud names set
to the k8s context name, which is the closest approximation of what
the "cloud" attribute means in its existing usage in the OpenStack
driver. If pods are launched, the host_id value is set to the k8s
host node name, which is an approximation of the existing usage in
the OpenStack driver (where it is typically an opaque uuid that
uniquely identifies the hypervisor).
Change-Id: I53765fc3914a84d2519f5d4dda4f8dc8feda72f2
This adds QuotaSupport to all the drivers that don't have it, and
also updates their tests so there is at least one test which exercises
the new tenant quota feature.
Since this is expected to work across all drivers/providers/etc, we
should start including at least rudimentary quota support in every
driver.
Change-Id: I891ade226ba588ecdda835b143b7897bb4425bd8
This change moves the kubernetes client creation to a common
function to re-use the exception handling logic.
Change-Id: I5bdd369f6c9a78e5f79a926d8690f285fda94af9
The launcher implements deletes using threads, and unlike with
launches, does not give drivers an opportunity to override that
and handle them without threads (as we want to do in the state
machine driver).
To correct this, we move the NodeDeleter class from the launcher
to driver utils, and add a new driver Provider method that returns
the NodeDeleter thread. This is added in the base Provider class
so all drivers get this behavior by default.
In the state machine driver, we override the method so that instead
of returning a thread, we start a state machine and add it to a list
of state machines that our internal state machine runner thread
should drive.
Change-Id: Iddb7ed23c741824b5727fe2d89c9ddbfc01cd7d7
The openshift library has been completely redesigned with recent
releases, so bump the dep and adapt to the new API. The update is
necessary in order to fix a urllib3 version conflict [1].
[1] Trace:
ERROR: nodepool 3.14.1.dev3 has requirement urllib3<1.26,>=1.25.4, but you'll have urllib3 1.24 which is incompatible.
ERROR: kubernetes 8.0.2 has requirement urllib3>=1.24.2, but you'll have urllib3 1.24 which is incompatible.
ERROR: botocore 1.19.30 has requirement urllib3<1.27,>=1.25.4; python_version != "3.4", but you'll have urllib3 1.24 which is incompatible.
Change-Id: Ia4d09fd0a4a49d644bb575b74184de930c62ce89
Co-Authored-By: Tobias Henkel <tobias.henkel@bmw.de>
Story: 2008427
Task: 41373
For users to be able to specify a custom working dir for their
container nodes, this change removes the hard-coded /tmp workingDir
attribute from the container specs.
The user-specified WORKDIR from the respective Dockerfile is then used.
Change-Id: I0e2c0ca5be0af2360f54336340a40fa37ffe1001
This adds support for specifying node selectors on Pod node labels.
They are used by the k8s scheduler to place a Pod on specific nodes with
corresponding labels.
This allows placing a build node/Pod on k8s nodes with certain
capabilities (e.g. storage types, number of CPU cores, etc.)
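A sketch of a label with a node selector (the `node-selector` option
name and the selector keys are assumptions):

```
labels:
  - name: pod-fedora
    type: pod
    image: docker.io/fedora:28
    node-selector:
      disktype: ssd
```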
Change-Id: Ic00a84181c8ef66189e4259ef6434dc62b81c3c6
For the OpenShift and Kubernetes drivers, allow passing env vars to the
Pod nodes via their label config.
It is not possible to set persistent env vars in containers at runtime
because there is no login shell available. Thus, we need to pass in any
env vars during node launch. This allows setting, e.g., ``http_proxy``
variables.
The env vars are passed as a list of dicts with ``name`` and ``value``
fields as per the k8s Pod YAML schema. [1]
```
- name: pod-fedora
  type: pod
  image: docker.io/fedora:28
  env:
    - name: foo
      value: bar
```
[1] https://kubernetes.io/docs/tasks/inject-data-application/define-environment-variable-container/
Change-Id: Ibbd9222fcd8f7dc5be227e7f5c8d8772a4c594e2
Kubernetes does not allow setting a ca_cert in a kubeconfig if TLS
certificate verification is disabled. Doing so results in an error
message:
`error: specifying a root certificates file with the insecure flag is not allowed`
This change makes sure we skip the ca_cert option that nodepool-launcher
generates for the Zuul executor if nodepool's kubeconfig is set to
skip TLS cert verification.
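For reference, a kubeconfig cluster entry with verification disabled
must therefore look like this sketch (the server address is a
placeholder):

```
clusters:
  - name: my-cluster
    cluster:
      server: https://k8s.example.com:6443
      insecure-skip-tls-verify: true
      # no certificate-authority / certificate-authority-data here
```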
Change-Id: I458c054fc9fae340d187ce40ea1236efdf65d50f
Currently the Kubernetes and OpenShift providers set the entrypoint
of their build node pods to `/bin/bash`, which then requires `bash`
to be available in the respective container image. This might not
always be the case (e.g. with Alpine based images).
This change makes sure the entrypoint is set to `/bin/sh`, which we
can more reliably assume to be available in the container image.
Change-Id: I799ea95b715e50d9c22e66cc80579cf119db8f38
This change decodes the kubernetes secret and also uses
a similar token for the openshift project: secret.data.token instead
of token-secret.value.
Change-Id: Ie846d362a648268e52b5f56e29567cbff9c84930
This change implements a single project OpenShift pod provider usable by a
regular user service account, without the need for a self-provisioner role.
Change-Id: I84e4bdda64716f9dd803eaa89e576c26a1667809
This change implements an OpenShift resource provider. The driver
currently supports project requests and pod requests to enable both the
containers-as-machines and the native container workflows.
Depends-On: https://review.openstack.org/608610
Change-Id: Id3770f2b22b80c2e3666b9ae5e1b2fc8092ed67c