Some drivers were missing docs and/or validation for options that
they actually support. This change:
* adds launch-timeout to:
  * metastatic docs and validation
  * aws validation
  * gce docs and validation
* adds post-upload-hook to:
  * aws validation
* adds boot-timeout to:
  * metastatic docs and validation
* adds launch-retries to:
  * metastatic docs and validation
Change-Id: Id3f4bb687c1b2c39a1feb926a50c46b23ae9df9a
This change adds the ability to use the k8s (and friends) drivers
to create pods with custom specs. This will allow nodepool admins
to define labels that create pods with options not otherwise supported
by Nodepool, as well as pods with multiple containers.
This can be used to implement the versatile sidecar pattern, which is
useful for running jobs that depend on a background system process
(such as a database server or container runtime) in environments where
backgrounding such a process is otherwise difficult.
It is still the case that a single resource is returned to Zuul, so
a single pod will be added to the inventory. Therefore, the expectation
that it should be possible to shell into the first container in the
pod is documented.
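As a sketch of how such a label might look (the nodepool-side layout and
the "spec" key are assumptions for illustration; the container fields
follow the k8s Pod schema):

```yaml
labels:
  - name: custom-pod
    type: pod
    # Assumed key: a user-supplied pod spec, allowing multiple
    # containers (e.g. a job container plus a sidecar database).
    spec:
      containers:
        - name: main
          image: docker.io/fedora:28
        - name: sidecar-db
          image: docker.io/mariadb:10
```

Zuul would shell into the first container ("main" here) as documented.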
Change-Id: I4a24a953a61239a8a52c9e7a2b68a7ec779f7a3d
This adds support for specifying the scheduler name, volumes (and
volume mounts), and additional metadata labels to the Kubernetes
and OpenShift (and OpenShift pods) drivers.
This also extends the k8s and openshift test frameworks so that we
can exercise the new code paths (as well as some previous similar
settings). Tests and assertions for both a minimal (mostly defaults)
configuration as well as a configuration that uses all the optional
settings are added.
Change-Id: I648e88a518c311b53c8ee26013a324a5013f3be3
To allow users to run docker-in-docker style workloads on k8s
and openshift clusters, add support for adding the privileged
flag to containers created in k8s and openshift pods.
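A label using this might look like the following sketch (the flag name
and placement are assumptions):

```yaml
labels:
  - name: priv-fedora
    type: pod
    image: docker.io/fedora:28
    # Assumed flag; maps to securityContext.privileged on the
    # created container.
    privileged: true
```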
Change-Id: I349d61bf200d7fb6d1effe112f7505815b06e9a8
Ansible needs to know which shell type the node uses in order to operate
correctly, especially for ssh connections to Windows nodes, because
otherwise Ansible defaults to trying bash.
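A minimal sketch of how a label might declare this (attribute name and
placement assumed):

```yaml
labels:
  - name: windows-server
    # Assumed attribute: tells Ansible to use cmd semantics instead of
    # defaulting to bash for this node.
    shell-type: cmd
```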
Change-Id: I71abfefa57aaafd88f199be19ee7caa64efda538
This adds support for specifying node selectors on Pod labels.
They are used by the k8s scheduler to place a Pod on specific nodes with
corresponding labels.
This allows placing a build node/Pod on k8s nodes with certain
capabilities (e.g. storage types, number of CPU cores, etc.)
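For example (the config key is an assumption; the selector content is
standard k8s nodeSelector syntax):

```yaml
labels:
  - name: pod-fedora-ssd
    type: pod
    image: docker.io/fedora:28
    # Assumed key; copied into the Pod's nodeSelector field so the
    # scheduler only places it on matching k8s nodes.
    node-selector:
      disktype: ssd
```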
Change-Id: Ic00a84181c8ef66189e4259ef6434dc62b81c3c6
For the OpenShift and Kubernetes drivers, allow passing env vars to the
Pod nodes via their label config.
It is not possible to set persistent env vars in containers at run time
because there is no login shell available. Thus, we need to pass in any
env vars during node launch. This allows setting, e.g., ``http_proxy``
variables.
The env vars are passed as a list of dicts with ``name`` and ``value``
fields as per the k8s Pod YAML schema. [1]
```
- name: pod-fedora
  type: pod
  image: docker.io/fedora:28
  env:
    - name: foo
      value: bar
```
[1] https://kubernetes.io/docs/tasks/inject-data-application/define-environment-variable-container/
Change-Id: Ibbd9222fcd8f7dc5be227e7f5c8d8772a4c594e2
In the OpenShift and OpenShiftPods drivers, it is possible to configure
resource requests and limits for the container via label attributes.
This feature was missing from the Kubernetes driver, so this change
introduces it analogously to the OpenShift driver.
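A label using these settings might look like this sketch (attribute
names assumed to mirror the OpenShift driver):

```yaml
labels:
  - name: pod-fedora
    type: pod
    image: docker.io/fedora:28
    # Assumed per-label resource attributes, as in the OpenShift
    # driver; translated into container requests/limits.
    cpu: 2
    memory: 512
```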
Change-Id: I7e67aebf892d10939672bdf76b8b3eb543124f9a
This change enables setting configuration values through
environment variables. This is useful for managing user-defined
configuration, such as a user password, in a Kubernetes deployment.
Change-Id: Iafbb63ebbb388ef3038f45fd3a929c3e7e2dc343
There are several scenarios where it can be useful to hook into nodepool
after an image has been uploaded but before it is taken into use by the
launchers. One use case is to be able to run validations on the image
(e.g. image size, boot test, etc.) before nodepool tries to use that
image, potentially causing node failures. Another more advanced use
case is to be able to pre-distribute an image to all compute nodes in
a cloud before the image is used at scale.
To facilitate these use cases this adds a new config option,
post-upload-hook, to the provider config. It takes the path to a user
defined executable which can then perform various tasks. If the
process exits with a nonzero return code, the image is deleted again
and the upload fails.
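A provider stanza using this might look like the following sketch (the
script path is of course illustrative):

```yaml
providers:
  - name: example-cloud
    # Runs after each image upload; a nonzero exit deletes the upload
    # again and fails it.
    post-upload-hook: /usr/local/bin/check-image
```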
Change-Id: I099cf1243b1bd262b8ee96ab323dbd34c7578c10
There are two edge cases where the port cleanup logic is too
aggressive. This change attempts to address both of them in one commit:
* Some providers might spawn instances very slowly. In the past this was
  handled by hardcoding the timeout to 10 minutes. This change allows a
  user to tweak the timeout in the config.
* In the esoteric combination of using Ironic without the Ironic Neutron
  agent, it's normal for ports to remain DOWN indefinitely. Setting the
  timeout to 0 will work around that edge case.
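As a sketch, assuming an option shaped like the following (the name is
illustrative, not taken from this change):

```yaml
providers:
  - name: example-cloud
    # Assumed option name; 0 disables cleanup of DOWN ports entirely,
    # for setups (e.g. Ironic without the Neutron agent) where DOWN
    # ports are normal.
    port-cleanup-interval: 0
```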
Change-Id: I120d79c4b5f209bb1bd9907db172f94f29b9cb5d
This change implements a single project OpenShift pod provider usable by a
regular user service account, without the need for a self-provisioner role.
Change-Id: I84e4bdda64716f9dd803eaa89e576c26a1667809
This change adds a new python_path Node attribute so that the zuul
executor can remove the default hard-coded ansible_python_interpreter.
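A sketch of how an image/label config might set this (attribute name
and placement assumed):

```yaml
labels:
  - name: bionic
    # Assumed attribute; stored on the Node so zuul can use it as
    # ansible_python_interpreter instead of a hard-coded default.
    python-path: /usr/bin/python3
```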
Change-Id: Iddf2cc6b2df579636ec39b091edcfe85a4a4ed10
We have a use case where we have a single pool, due to quota reasons,
but need the ability to selectively choose which network a label will
use. Now a nodepool operator will be able to define which networks are
attached to labels (in our case network appliances).
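A sketch of the intended layout (the exact placement of the per-label
network list is an assumption here):

```yaml
pools:
  - name: main
    labels:
      - name: appliance-node
        # Assumed per-label setting, overriding the pool-wide
        # network selection for this label only.
        networks:
          - management-net
```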
Change-Id: I3bfa32473c76b9fd59deee7d05b492e7cf67f69d
Signed-off-by: Paul Belanger <pabelanger@redhat.com>
The AWS term is instance-type, not flavor-name. Rename this while
we don't have a huge userbase.
While we're in there, rename a variable from image_name to image_id
since we use image_id everywhere else.
Change-Id: I1f7f16d2873982626d2434cf5ca1f6280adf739c
This reverts commit ccf40a462a.
The previous version would not work properly when daemonized
because there was no stdout. This version maintains stdout and
uses select/poll with non-blocking stdout to capture the output
to a log file.
Depends-On: https://review.openstack.org/634266
Change-Id: I7f0617b91e071294fe6051d14475ead1d7df56b7
This change adds an experimental AWS driver. It lacks some of the deeper
features of the openstack driver, such as quota management and image
building, but is highly functional for running tests on a static AMI.
Note that the test base had to be refactored to allow fixtures to be
customized in a more flexible way.
Change-Id: I313f9da435dfeb35591e37ad0bec921c8b5bc2b5
Co-Authored-By: Tristan Cacqueray <tdecacqu@redhat.com>
Co-Authored-By: David Moreau-Simard <dmsimard@redhat.com>
Co-Authored-By: Clint Byrum <clint@fewbar.com>
A builder thread can wedge if the build process wedges. Add a timeout
to the subprocess. Since it was the call to readline() that would block,
we change the process to have DIB write directly to the log. This allows
us to set a timeout in the Popen.wait() call, and we kill the dib
subprocess as well.
The timeout value can be controlled in the diskimage configuration and
defaults to 8 hours.
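A diskimage stanza using this might look like the following sketch (the
option name and its units, assumed seconds here, are illustrative):

```yaml
diskimages:
  - name: fedora
    # Assumed option; kill the dib subprocess if the build exceeds
    # this limit (28800s = the 8-hour default mentioned above).
    build-timeout: 28800
```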
Change-Id: I188e8a74dc39b55a4b50ade5c1a96832fea76a7d
This change implements an OpenShift resource provider. The driver
currently supports project requests and pod requests to enable both
the containers-as-machines and native-container workflows.
Depends-On: https://review.openstack.org/608610
Change-Id: Id3770f2b22b80c2e3666b9ae5e1b2fc8092ed67c
This config option, available under each provider pool section, can
contain static key-value pairs that will be stored in ZooKeeper on
each Node znode. This will allow us to pass along arbitrary data from
nodepool to any user of nodepool (specifically, zuul). Initially, this
will be used to pass along zone information to zuul executors.
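A pool using this might look like the following sketch (the key/value
pairs are illustrative):

```yaml
providers:
  - name: example-cloud
    pools:
      - name: main
        # Static key/value pairs stored on each Node znode and passed
        # through to consumers such as zuul.
        node-attributes:
          executor-zone: us-east
```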
Change-Id: I126d37a8c0a4f44dca59c11f76a583b9181ab653
This change implements a Kubernetes resource provider.
The driver supports namespace requests and pod requests to enable both
the containers-as-machines and native-container workflows.
Depends-On: https://review.openstack.org/605823
Change-Id: I81b5dc5abe92b71cc98b0d71c8a2863cddff6027
This allows us to set parameters for server boot on various images.
This is the equivalent of the "--property" flag when using "openstack
server create". Various tools on the booted servers can then query
the config-drive metadata to get this value.
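A sketch of a label carrying such properties (the config key and its
placement are assumptions):

```yaml
labels:
  - name: centos
    # Assumed key; equivalent to "openstack server create --property",
    # and readable from the config-drive metadata on the booted server.
    instance-properties:
      build_env: nodepool
```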
Needed-By: https://review.openstack.org/604193/
Change-Id: I99c1980f089aa2971ba728b77adfc6f4200e0b77
In some installations it might be impractical to rely on the default
security group (due to security concerns). To also make it possible to
share one tenant between Zuul and other resources, support for
specifying security_groups on the driver.openstack.pool level is added.
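A pool using this might look like the following sketch (the YAML key
spelling is assumed):

```yaml
providers:
  - name: example-cloud
    pools:
      - name: main
        # Assumed pool-level option; nodes get these groups instead of
        # relying on the default security group.
        security-groups:
          - zuul-nodes
```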
Change-Id: I63240049cba295e15f7cfe75b7e7a7d53aa4e37d
The static driver currently assumes ssh connectivity. Add a
connection-type parameter and rename the ssh-port to connection-port to
match the diskimages setting name.
Keep the old 'ssh-port' setting for backwards compat.
Change-Id: I1a96f03f9845b0d99d9ce89d2213be4d483afdd9
When using 'rate: 1' in the OpenStack driver, the validation fails with:
MultipleInvalid: expected float for dictionary value @ data['rate']
This change fixes that issue by auto converting rate to float.
Change-Id: Id1e95127014ad24807d629d358ae340e5720bb89
The connection port should be included in the provider diskimage.
This makes it possible to define images that use other ports for
connections, e.g. winrm for Windows, which runs on a different port
than 22.
Change-Id: Ib4b335ffbcc4dc71704c06387377675a4206c663
In some cases nodepool-launcher uses the public API to launch nodes, but
doesn't have access to the private networks of the nodes it launches.
Rather than failing, expose an option for operators to disable
ssh-keyscan and allow nodes to become ready.
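A sketch of a pool with the check disabled (the option name is an
assumption):

```yaml
pools:
  - name: main
    # Assumed option; skips the ssh-keyscan readiness check so nodes
    # on unreachable private networks can still become READY.
    host-key-checking: false
```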
Change-Id: I764398aa21461ef44048e9e6565d2ee3e01aaaf8
Signed-off-by: Paul Belanger <pabelanger@redhat.com>
The connection type should be included in the provider diskimage. This
makes it possible to define images using connection methods other than
ssh, such as winrm for Windows.
Change-Id: Ica0b9afe39d347028eb66c069b8dbd56a8c0ec8c
When we deploy nodepool and zuul instances in virtual machines of a
cloud provider, the provisioned nodes may be on the same internal
network as the nodepool and zuul instances. In that case we don't
have to allocate floating ips for nodes; zuul can talk with nodes
via the fixed ips of the virtual machines. Being able to customize
this behavior saves floating ip quota.
Note: Although the option "floating_ip_source: None" in clouds.yaml can
decide whether to apply floating ips for a specified cloud, that
impacts all the SDKs and tools that use the clouds.yaml; we should
control nodepool behavior flexibly and independently.
This patch adds a bool option "auto-floating-ip" into each pool of the
"provider" section in nodepool.conf
Change-Id: Ia9a1bed6dd4f6e39015bde660f52e4cd6addb26e
This change adds webapp settings to nodepool.yaml to enable custom
settings for port and listen_address.
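A sketch of the resulting config section (values illustrative):

```yaml
webapp:
  port: 8005
  listen_address: 127.0.0.1
```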
Change-Id: I0f41a0b131bc2a09c47a448c65471e052c0a9e88
For example, a cloud may get better performance from a cinder volume
than from the local compute drive. As a result, give nodepool the
option to choose whether the server should boot from volume or not.
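A label using this might look like the following sketch (attribute
names assumed):

```yaml
labels:
  - name: centos
    # Assumed attributes; boot the server from a cinder volume of the
    # given size instead of the local compute drive.
    boot-from-volume: true
    volume-size: 80
```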
Change-Id: I3faefe99096fef1fe28816ac0a4b28c05ff7f0ec
Depends-On: If58cd96b0b9ce4569120d60fbceb2c23b2f7641d
Signed-off-by: Paul Belanger <pabelanger@redhat.com>
Sadly, I missed this on our previous commit. Also update coverage from
nodepool dsvm job.
Change-Id: I6966957ac8162a588531c38bd69a93fb58a15258
Signed-off-by: Paul Belanger <pabelanger@redhat.com>
This adds the max-ready-age setting to the label config. With this one
can specify how long nodes may live unused in the READY state. This
enables the following use cases:
- When switching nodepool between a 'working-hours' and a
  'non-working-hours' configuration with high or low min-ready
  settings, this can trigger a (delayed) scale down of unused
  resources. This can be important when using a cloud provider with
  an on-demand billing model.
- Renewing old nodes without having to run a job on them. This can be
  useful for capping the age of the cached data inside the nodes.
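A label using this might look like the following sketch (the value's
units, assumed seconds here, are illustrative):

```yaml
labels:
  - name: centos-7
    min-ready: 2
    # Assumed units (seconds): recycle unused READY nodes after an
    # hour so cached data inside them never gets older than that.
    max-ready-age: 3600
```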
Change-Id: Id705f0a5e478ab658ed3a396f92d6eb6694c1c8f
We require clouds.yaml files now. It's just the way it is. If we don't
have one, os-client-config will become unpleased - but it will do so in
a hard to understand error message (that's the best we can do there for
$reasons) ... so make sure that we present a config validation error and
not "keystoneauth1.exceptions.auth_plugins.MissingRequiredOptions: Auth
plugin requires parameters which were not given: auth_url"
Change-Id: I84e36400f38eecd5d798b772c09d768002f535f5