49 Commits

Author SHA1 Message Date
James E. Blair e097731339 Remove hostname-format option
This option has not been used since at least the migration to the
statemachine framework.

Change-Id: I7a0e928889f72606fcbba0c94c2d49fbb3ffe55f
2024-02-08 09:40:41 -08:00
James E. Blair 3b434098c6 Add an image upload timeout to the openstack driver
Some uploads in opendev are taking hours.

We used to wait 6 hours for this, but we ended up using the SDK
default of 1 hour in recent versions.  Since we're seeing so much
disparity in time, make it user configurable.

Remove the unused 6 hour constant.

Change-Id: I9ca5fdbf7c66f176eb4f650fd287514708f46c16
2023-09-06 08:04:51 -07:00
James E. Blair de02ac5a20 Add OpenStack volume quota
This adds support for staying within OpenStack volume quota limits
on instances that utilize boot-from-volume.

Change-Id: I1b7bc177581d23cecd9443a392fb058176409c46
2023-02-13 06:56:03 -08:00
James E. Blair be3edd3e17 Convert openstack driver to statemachine
This updates the OpenStack driver to use the statemachine framework.

The goal is to revise all remaining drivers to use the statemachine
framework for two reasons:

1) We can dramatically reduce the number of threads in Nodepool which
is our biggest scaling bottleneck.  The OpenStack driver already
includes some work in that direction, but in a way that is unique
to it and not easily shared by other drivers.  The statemachine
framework is an extension of that idea implemented so that every driver
can use it.  This change further reduces the number of threads needed
even for the openstack driver.

2) By unifying all the drivers with a simple interface, we can prepare
to move them into Zuul.

There are a few updates to the statemachine framework to accommodate some
features that only the OpenStack driver used to date.

A number of tests need slight alteration since the openstack driver is
the basis of the "fake" driver used for tests.

Change-Id: Ie59a4e9f09990622b192ad840d9c948db717cce2
2023-01-10 10:30:14 -08:00
James E. Blair 6320b06950 Add support for dynamic tags
This allows users to create tags (or properties in the case of OpenStack)
on instances using string interpolation values.  The use case is to be
able to add information about the tenant* which requested the instance
to cloud-provider tags.

* Note that ultimately Nodepool may not end up using a given node for
the request which originally prompted its creation, so care should be
taken when using information like this.  The documentation notes that.

This feature uses a new configuration attribute on the provider-label
rather than the existing "tags" or "instance-properties" because existing
values may not be safe for use as Python format strings (e.g., an
existing value might be a JSON blob).  This could be solved with YAML
tags (like !unsafe) but the most sensible default for that would be to
assume format strings and use a YAML tag to disable formatting, which
doesn't help with our backwards-compatibility problem.  Additionally,
Nodepool configuration does not use YAML anchors (yet), so this would
be a significant change that might affect people's use of external tools
on the config file.
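
The backwards-compatibility concern above can be illustrated with a minimal sketch in plain Python. The names `Request`, `tenant_name`, `interpolate`, and the tag values are hypothetical and are not taken from Nodepool's code; the sketch only shows that a literal value such as a JSON blob is not safe to pass through `str.format`.

```python
from dataclasses import dataclass


@dataclass
class Request:
    # Hypothetical stand-in for the request that prompted a launch.
    tenant_name: str


def interpolate(template: str, request: Request) -> str:
    """Expand a format-string tag value using attributes of the request."""
    return template.format(request=request)


request = Request(tenant_name="example-tenant")

# A value written as a format string expands as expected.
dynamic_tag = interpolate("{request.tenant_name}", request)

# A pre-existing literal value, such as a JSON blob, is not a valid
# format string: its braces are parsed as replacement fields.
legacy_value = '{"team": "infra"}'
try:
    interpolate(legacy_value, request)
    legacy_is_safe = True
except (KeyError, ValueError, IndexError):
    legacy_is_safe = False
```

Keeping format strings in a separate attribute means existing literal values are never run through `format`, so they keep working unchanged.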

Testing this was beyond the ability of the AWS test framework as written,
so some redesign for how we handle patching boto-related methods is
included.  The new approach is simpler, more readable, and flexible
in that it can better accommodate future changes.

Change-Id: I5f1befa6e2f2625431523d8d94685f79426b6ae5
2022-08-23 11:06:55 -07:00
Joshua Watt a6bb2fff42 openstack: Remove metadata limit checks
The limit of 5 metadata items is 8 years old and outdated for modern
OpenStack. Instead of trying to guess what the OpenStack limits are
(which may depend on quota settings), remove all the checks and let
OpenStack reject the image at upload time.

Change-Id: Ifa2e429db3bac2e3cad73dce09e01c901ea133c4
Signed-off-by: Joshua Watt <JPEWhacker@gmail.com>
2022-05-23 11:44:19 -05:00
James E. Blair 94fcc70a59 Azure: reconcile config objects
The config objects in the Azure driver have drifted a bit.  This
updates them to match the actual used configuration.  It also
reorganizes them to be a little easier to maintain by moving the
initializers into the individual objects.

Finally, the verbose __eq__ methods are removed in favor of a
simpler __eq__ method in the superclass.

Since the OpenStack, k8s, and OpenShift drivers call super() in
__eq__ methods, they need to be updated at the same time.
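
As a rough illustration of a single generic `__eq__` in a superclass, the following sketch compares the type and all instance attributes. The class names and attributes are hypothetical and may not match the actual Nodepool implementation.

```python
class ConfigValue:
    """Hypothetical base class with one generic equality check."""

    def __eq__(self, other):
        # Equal only when both objects have the same type and every
        # instance attribute matches.
        if type(self) is not type(other):
            return NotImplemented
        return vars(self) == vars(other)


class PoolConfig(ConfigValue):
    def __init__(self, name, max_servers):
        self.name = name
        self.max_servers = max_servers


class ExampleDriverPool(PoolConfig):
    def __init__(self, name, max_servers, networks):
        super().__init__(name, max_servers)
        self.networks = networks

    def __eq__(self, other):
        # A driver-specific override can still defer to the superclass.
        return super().__eq__(other)


a = ExampleDriverPool("main", 10, ["private"])
b = ExampleDriverPool("main", 10, ["private"])
c = ExampleDriverPool("main", 10, ["public"])
```

With this approach a newly added attribute takes part in the comparison automatically, which avoids the drift that hand-written per-attribute comparisons are prone to.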

This also corrects an unrelated error with a misnamed parameter
in the fake k8s used in the k8s tests.

Change-Id: Id6971ca002879d3fb056fedc7e4ca6ec35dd7434
2021-03-22 10:39:53 -07:00
Albin Vass 0c84b7fa4e Add shell-type config
Ansible needs to know which shell type the node uses to operate
correctly, especially for ssh connections for windows nodes because
otherwise ansible defaults to trying bash.

Change-Id: I71abfefa57aaafd88f199be19ee7caa64efda538
2021-03-05 15:14:29 +01:00
Tobias Henkel 0dc40d33e4 Support optional post upload hooks
There are several scenarios where it can be useful to hook into nodepool
after an image has been uploaded but before it is put into use by the
launchers. One use case is to be able to run validations on the image
(e.g. image size, boot test, etc.) before nodepool tries to use that
image and potentially causes node_failures. Another more advanced use
case is to be able to pre-distribute an image to all compute nodes in
a cloud before an image is used at scale.

To facilitate these use cases this adds a new config option
post-upload-hook to the provider config. This takes a path to a user
defined executable script which then can perform various tasks. If the
process fails with an rc != 0 the image gets deleted again and the
upload fails.
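
The hook behavior described above can be sketched as follows. This is an illustration only: `run_post_upload_hook`, `hook_argv`, and `delete_image` are hypothetical names, and the arguments Nodepool actually passes to the script are not shown here.

```python
import subprocess
import sys


def run_post_upload_hook(hook_argv, delete_image):
    """Run the hook; on a nonzero return code, delete the image.

    Returns True when the upload should be kept, False otherwise.
    """
    result = subprocess.run(hook_argv)
    if result.returncode != 0:
        delete_image()
        return False
    return True


deleted = []

# The Python interpreter stands in for a user-defined executable script.
failing_hook = [sys.executable, "-c", "import sys; sys.exit(3)"]
passing_hook = [sys.executable, "-c", "pass"]

failed_kept = run_post_upload_hook(
    failing_hook, lambda: deleted.append("image-1"))
passed_kept = run_post_upload_hook(
    passing_hook, lambda: deleted.append("image-2"))
```

Only the image whose hook exited with a nonzero code ends up in `deleted`.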

Change-Id: I099cf1243b1bd262b8ee96ab323dbd34c7578c10
2019-11-25 13:37:28 +01:00
Zuul b72a9195e1 Merge "Set default python-path to "auto"" 2019-10-17 05:26:10 +00:00
Ian Wienand db87a0845f Set default python-path to "auto"
The "python-path" configuration option makes its way through to Zuul
where it sets the "ansible_interpreter_path" in the inventory.
Currently this defaults to "/usr/bin/python2" which is wrong for
Python 3-only distributions.

Ansible >=2.8 provides for automated discovery of the interpreter to
avoid runtime errors choosing an invalid interpreter [1].  Using this
should mean that "python-path" doesn't need to be explicitly set for any
common case.  As more distributions become Python 3 only, this should
"do the right thing" without further configuration.

This switches the default python-path to "auto".  The dependent change
updates Zuul to accept this and use it when running with Ansible
>=2.8, or default back to "/usr/bin/python2" for earlier Ansible
versions.

Testing and documentation are updated, and a release note is added.

[1] https://docs.ansible.com/ansible/2.8/reference_appendices/interpreter_discovery.html

Depends-On: https://review.opendev.org/682275
Change-Id: I02a1a618c8806b150049e91b644ec3c0cb826ba4
2019-10-17 09:17:50 +11:00
Jan Gutter 6789c4b618 Add port-cleanup-interval config option
There are some edge cases where the port cleanup logic is too
aggressive. This change attempts to address both of them in one commit:

* Some providers might spawn instances very slowly. In the past this was
  handled by hardcoding the timeout to 10 minutes. This allows a user to
  tweak the timeout in config.
* In the esoteric combination of using Ironic without the Ironic Neutron
  agent, it's normal for ports to remain DOWN indefinitely. Setting the
  timeout to 0 will work around that edge case.
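
A minimal sketch of the timeout check, assuming that an interval of 0 means DOWN ports are never cleaned up (that reading of the second bullet is an assumption, as are the function and parameter names):

```python
import time


def should_cleanup_port(down_since, interval, now=None):
    """Return True if a DOWN port has exceeded the cleanup interval.

    Assumption: an interval of 0 disables the cleanup entirely.
    """
    if interval == 0:
        return False
    if now is None:
        now = time.monotonic()
    return (now - down_since) >= interval


# Ten-minute interval, matching the previously hardcoded timeout.
expired = should_cleanup_port(down_since=0, interval=600, now=601)
recent = should_cleanup_port(down_since=0, interval=600, now=10)
disabled = should_cleanup_port(down_since=0, interval=0, now=10**9)
```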

Change-Id: I120d79c4b5f209bb1bd9907db172f94f29b9cb5d
2019-10-09 17:06:48 +02:00
Paul Belanger 3a5cabedcb Toggle host-key-checking for openstack provider.labels
This adds the ability for a pool.label to override the
pool.host-key-checking value, while having a label exist in the same
pool.  This is helpful because it is possible for 1 pool to mix network
configuration, and some nodes may be missing a default gateway (making
them unroutable by default).

Change-Id: I934d42b8e48aedb0ebc03137b7305eb2af390fc7
Signed-off-by: Paul Belanger <pabelanger@redhat.com>
2019-06-06 14:15:53 -04:00
Tristan Cacqueray 76aa62230c Add python-path option to node
This change adds a new python_path Node attribute so that zuul executor
can remove the default hard-coded ansible_python_interpreter.

Change-Id: Iddf2cc6b2df579636ec39b091edcfe85a4a4ed10
2019-05-07 02:22:45 +00:00
Paul Belanger aaf36db8c6 Allow openstack provider labels to configure networks
We have a use case where we have a single pool, due to quota reasons,
but need the ability to selectively choose which network a label will
use. Now a nodepool operator will be able to define which networks are
attached to labels (in our case network appliances).

Change-Id: I3bfa32473c76b9fd59deee7d05b492e7cf67f69d
Signed-off-by: Paul Belanger <pabelanger@redhat.com>
2019-04-29 17:31:50 -04:00
Sagi Shnaidman d5027ff6a9 Support userdata for instances in openstack
Use "userdata" from Nova API to pass cloud-init config to nova
instances in openstack.

Change-Id: I1c6a1cbc5377d268901210631a376ca26f4887d8
2019-01-22 19:14:52 +02:00
David Shrewsbury d6ef934b70 Extract common config parsing for ProviderConfig
Adds a ProviderConfig class method that can be called to get
the config schema for the common config options in a Provider.
Drivers are modified to call this method.

Change-Id: Ib67256dddc06d13eb7683226edaa8c8c10a73326
2019-01-07 12:34:05 +00:00
David Shrewsbury a19dffd916 Extract out common config parsing for ConfigPool
Our driver code is in a less-than-ideal situation where each driver
is responsible for parsing config options that are common to all
drivers. This change begins to correct that, starting with ConfigPool.
It changes the driver API in the following ways:

1) Forces objects derived from ConfigPool to implement a load() method
   that should call super's method, then handle loading driver specific
   options from the config.

2) Adds a ConfigPool class method that can be called to get the config
   schema for the common config options, so that drivers only have to
   define the schema for their own config options.

Other base config objects will be modeled after this pattern in
later changes.
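
The pattern described in the two points above might look like the sketch below. It is not the actual Nodepool code: the method names `get_common_schema` and `get_schema`, the option names, and the use of plain dicts as schemas are assumptions made for illustration.

```python
from abc import ABC, abstractmethod


class ConfigPool(ABC):
    """Hypothetical base class owning the options common to all drivers."""

    def __init__(self):
        self.max_servers = None

    @classmethod
    def get_common_schema(cls):
        # Schema for the common options; a plain name -> type dict here.
        return {"max-servers": int}

    @abstractmethod
    def load(self, pool_config):
        # Abstract, so subclasses must implement load(), but it still
        # carries the common parsing for them to call through super().
        self.max_servers = pool_config.get("max-servers")


class ExamplePool(ConfigPool):
    def __init__(self):
        super().__init__()
        self.networks = []

    @classmethod
    def get_schema(cls):
        # Driver-specific options merged with the common ones.
        schema = dict(cls.get_common_schema())
        schema["networks"] = list
        return schema

    def load(self, pool_config):
        super().load(pool_config)
        self.networks = pool_config.get("networks", [])


pool = ExamplePool()
pool.load({"max-servers": 10, "networks": ["private"]})
```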

Change-Id: I41620590c355cacd2c4fbe6916acfe80f20e3216
2019-01-03 11:05:26 -05:00
David Shrewsbury 16325d5c4c Add arbitrary node attributes config option
This config option, available under each provider pool section, can
contain static key-value pairs that will be stored in ZooKeeper on
each Node znode. This will allow us to pass along arbitrary data from
nodepool to any user of nodepool (specifically, zuul). Initially, this
will be used to pass along zone information to zuul executors.

Change-Id: I126d37a8c0a4f44dca59c11f76a583b9181ab653
2018-11-29 09:35:59 -05:00
Ian Wienand 7015bd9af4 Add instance boot properties
This allows us to set parameters for server boot on various images.
This is the equivalent of the "--property" flag when using "openstack
server create".  Various tools on the booted servers can then query
the config-drive metadata to get this value.

Needed-By: https://review.openstack.org/604193/

Change-Id: I99c1980f089aa2971ba728b77adfc6f4200e0b77
2018-09-21 16:29:16 +10:00
Artem Goncharov fc1f80b6d1 Replace shade and os-client-config with openstacksdk.
os-client-config is now just a wrapper around openstacksdk. The shade
code has been imported into openstacksdk. To reduce complexity, just use
openstacksdk directly.

openstacksdk's TaskManager has had to grow some features to deal with
SwiftService. Making nodepool's TaskManager a subclass of openstacksdk's
TaskManager ensures that we get the thread pool set up properly.

Change-Id: I3a01eb18ae31cc3b61509984f3817378db832b47
2018-07-14 08:44:03 -05:00
Jesse Pretorius 4c8b5f4f99 Add ability to ignore provider quota for a pool
In some circumstances it is useful to tell the launcher to
ignore the provider quota and just trust the max-* settings
for that pool instead.

This particular need arises when using Rackspace public cloud
for both instances and OnMetal nodes. In this situation the
quota for instances and for OnMetal nodes is different, but
nodepool only queries the quota for instances. When trying to
build OnMetal nodes, the quota check fails - but should not.

In this circumstance, instead of making shade/nodepool
complicated by figuring out how to calculate disparate quota
types, it makes sense to rather just allow nodepool to ignore
the quota for a pool to try executing the build instead.

While this is our use-case, it may also be useful to others
for other reasons.

Change-Id: I232a1ab365795381ab180aceb48e8c87843ac713
2018-07-11 17:37:55 +00:00
David Shrewsbury d39cc6d7ce Fix for referencing cloud image by ID
For pre-existing cloud images (not managed by nodepool), referencing
them by ID was failing since they could not be found with this data,
only by name.

Current code expects the shade get_image() call to accept a dict with
an 'id' key, which will return that same dict without any provider API
calls. This dict can then be used in createServer() to bypass looking
up the image to get the image ID. However, shade does not accept a dict
for this purpose, but an object with an 'id' attribute. This is
possibly a bug in shade to not accept a dict. But since nodepool knows
whether or not it has an ID (image-id) vs. an image name (image-name),
it can bypass shade altogether when image-id is used in the config.

Note: There is currently no image ID validation done before image
creation when an image-id value is supplied. Not even shade validated
the image ID with a passed in object. Server creation will fail with
an easily identifiable message about this, though.

Change-Id: I732026d1a305c71af53917285f4ebb2beaf3341d
Story: 2002013
Task: 19653
2018-07-03 15:26:33 -04:00
David Shrewsbury c08eb291f7 Fix for pools with different labels
Pools within the same provider that have different labels are
attempting to handle requests for ALL labels defined within the provider.
This is due to change I61263f51be5e957de71d1e2dabaa7391bbe7bddf which
incorrectly changed from getting supported labels from per-pool to
per-provider. This modifies the getSupportedLabels() API to support
pools.

Change-Id: I7a0d472928c6b528f6faa6dd3b9cf1479874cb22
2018-06-28 15:04:35 -04:00
James E. Blair 82d8c51250 Create a base Driver class
This is, like the drivers in zuul, designed to be a single instance
per driver that survives for the life of the process.  It is used
to further instantiate driver-specific interfaces.

Here we have it return the config object for the driver (replacing
the previous system which loaded it from specific config files).

We also move the reset method from the ProviderConfig to the Driver
object.  It's currently only used to clear a global os client config
object, so this better matches its lifecycle.

Change-Id: I1f5a7be9c597be842bfe4dbea8f153d7a96d7b9a
2018-06-06 14:11:43 -04:00
Artem Goncharov 674c9516dc Add support for specifying security_group in nodepool
In some installations it might be unrealistic to rely on the default
security group (security concerns). To also make it possible to share
one tenant between zuul and other resources, support for specifying
security_groups at the driver.openstack.pool level is added.

Change-Id: I63240049cba295e15f7cfe75b7e7a7d53aa4e37d
2018-06-05 10:00:06 +02:00
David Shrewsbury 814160fb71 Fix ConfigValue comparisons
Our code was terribly inconsistent with implementing the __eq__()
method for ConfigValue based objects (if we did at all). If this
was implemented, not all of the attributes were used in comparison.

This change forces a ConfigValue derived object to implement the
comparison method, and fixes the places we neglected attributes.

Adds some very simple tests to at least exercise this code.

Change-Id: Ia2f600a9a4f3770087372bbc9f07531d5ea569e1
2018-05-11 15:17:33 -04:00
David Shrewsbury 6edb78a36b Make manage_images a property of ProviderConfig
The current driver API does not make it clear that a driver must
set the 'manage_images' property on the driver of the ProviderConfig
to indicate that it can manage external images. Since this is really
a function of the provider, not the driver, make this an abstract
property on the ProviderConfig that must be defined by each driver
implementation.

Change-Id: Iefe257f943aaa5740d0326ca6150632c21a2a8cf
2018-05-09 11:35:36 -04:00
David Shrewsbury 2332724205 Force driver provider configs to define pool attr
There is currently nothing forcing a driver's ProviderConfig
to define a 'pools' attribute, yet the code clearly expects
it to be defined (e.g., PoolWorker.getPoolConfig() method).
Force this by defining an abstract property, which helps define
the interface more clearly, too.
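
A minimal sketch of forcing an attribute through an abstract property, using Python's `abc` module. The subclass names are hypothetical; only `ProviderConfig`, `pools`, and `manage_images` come from the commit messages.

```python
from abc import ABC, abstractmethod


class ProviderConfig(ABC):
    @property
    @abstractmethod
    def pools(self):
        """Mapping of pool name to pool config."""

    @property
    @abstractmethod
    def manage_images(self):
        """Whether the provider can manage external images."""


class CompleteProvider(ProviderConfig):
    def __init__(self):
        self._pools = {}

    @property
    def pools(self):
        return self._pools

    @property
    def manage_images(self):
        return True


class IncompleteProvider(ProviderConfig):
    # Omits 'pools', so the class cannot be instantiated.
    @property
    def manage_images(self):
        return False


try:
    IncompleteProvider()
    instantiated = True
except TypeError:
    instantiated = False
```

The missing property is reported when the object is created, not later when some caller first reads `pools`.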

Change-Id: I2d6a0ee98076540cc722835ab6b5c3a077f83edc
2018-05-08 14:15:49 -04:00
Zuul 20c6646bf0 Merge "Refactor run_handler to be generic" 2018-04-18 15:38:11 +00:00
Zuul 7ddee72b51 Merge "Add connection-port to provider diskimage" 2018-04-13 15:20:58 +00:00
Tristan Cacqueray f42f65d7f5 openstack: convert rate to float
When using 'rate: 1' in the OpenStack driver, the validation fails with:
MultipleInvalid: expected float for dictionary value @ data['rate']

This change fixes that issue by auto converting rate to float.
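
The difference between a strict type check and coercion can be sketched in plain Python, without the schema validation library. The function names are hypothetical.

```python
def validate_rate_strict(value):
    # Mirrors a check that accepts only float instances, so the
    # integer produced by 'rate: 1' is rejected.
    if not isinstance(value, float):
        raise TypeError("expected float")
    return value


def validate_rate_coerced(value):
    # Converts to float, so both 1 and 1.0 are accepted.
    return float(value)


try:
    validate_rate_strict(1)
    strict_accepts_int = True
except TypeError:
    strict_accepts_int = False

coerced = validate_rate_coerced(1)
```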

Change-Id: Id1e95127014ad24807d629d358ae340e5720bb89
2018-04-12 02:59:48 +00:00
Tristan Cacqueray f5dc749f2a Refactor run_handler to be generic
This change refactors the OpenStackNodeRequestHandler.run_handler method
so that it is part of the generic interface. A nodepool driver handler now
only needs to implement the launch() method.

Moreover, this change merges the NodeLaunchManager into the RequestHandler.

This change adds the following methods to the handler interface:

* nodeReused(node): notify handler a node is re-used
* checkReusableNode(node): verify if a node can be re-used
* setNodeMetadata(node): store extra node metadata such as az
* pollLauncher()

Change-Id: I5b18b49f50e2b416b5e5c4e28e1a4a6a1df1c654
2018-04-12 02:50:49 +00:00
Tobias Henkel 687f120b3c Add connection-port to provider diskimage
The connection port should be included in the provider diskimage.
This makes it possible to define images that use other connection
ports, such as winrm for Windows, which runs on a port other than 22.

Change-Id: Ib4b335ffbcc4dc71704c06387377675a4206c663
2018-04-03 17:48:52 +02:00
Paul Belanger 2286f2432c Add host-key-checking option to openstack providers
In some cases nodepool-launcher uses public API to launch nodes, but
doesn't have access to the private networks of nodes it launches.
Rather than failing, expose an option for operators to disable
ssh-keyscan and allow nodes to become ready.

Change-Id: I764398aa21461ef44048e9e6565d2ee3e01aaaf8
Signed-off-by: Paul Belanger <pabelanger@redhat.com>
2018-03-26 22:29:14 +00:00
David Shrewsbury c204313801 Extend config API with getSupportedLabels()
New config API method to get all labels that are supported by a
configured provider. Used in the launcher to not have to know
what driver is used and the configuration of its config data
structure.

Change-Id: Ib792e84c546bc52507141835ed57bd72ad9f09d3
2018-03-07 21:42:19 -05:00
David Shrewsbury f19f89777d Rename get_schema to getSchema
This corrects an inconsistency with a class method name, where
camel case is the norm.

Change-Id: I0e02f2425c89b46a780ec99d8053fad4b04d3f9a
2018-03-07 21:24:24 -05:00
Tobias Henkel 5444b6ab1f Default max pool resources to math.inf
At least for max-servers there is a check outside of the quota
calculation which fails when comparing to None. Instead default the
values to math.inf to make comparisons work. Otherwise running a pool
with no max-servers raises an exception [1].

[1] Trace:
  Traceback (most recent call last):
    File "/opt/nodepool-source/nodepool/driver/__init__.py", line 142, in run
      self.run_handler()
    File "/opt/nodepool-source/nodepool/driver/openstack/handler.py", line 564, in run_handler
      if current_count >= self.pool.max_servers:
  TypeError: unorderable types: int() >= NoneType()
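
The failure in the trace, and the effect of defaulting to math.inf, can be reproduced with a short sketch. The function name `at_capacity` is hypothetical.

```python
import math


def at_capacity(current_count, max_servers):
    """Same comparison as the one that fails in the trace above."""
    return current_count >= max_servers


# With None as the "unset" value, the comparison raises TypeError.
try:
    at_capacity(3, None)
    none_is_comparable = True
except TypeError:
    none_is_comparable = False

# With math.inf as the default, no finite count ever reaches the limit.
unlimited_full = at_capacity(3, math.inf)
limited_full = at_capacity(5, 5)
```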

Change-Id: I7ca422c109c6c4aea55bd2ce14be4f2930601a4b
2018-01-25 08:23:55 +01:00
Tristan Cacqueray d0a67878a3 Add a plugin interface for drivers
This change adds a plugin interface so that drivers can be loaded dynamically.
Instead of importing each driver in the launcher, provider_manager and config,
the Drivers class discovers and loads drivers from the driver directory.

This change also adds a reset() method to the driver Config interface to
reset the os_client_config reference when reloading the OpenStack driver.

Change-Id: Ia347aa2501de0e05b2a7dd014c4daf1b0a4e0fb5
2018-01-19 00:45:56 +00:00
Tobias Henkel 7d79770840 Do pep8 housekeeping according to zuul rules
The pep8 rules used in nodepool are somewhat broken. In preparation to
use the pep8 ruleset from zuul we need to fix the findings upfront.

Change-Id: I9fb2a80db7671c590cdb8effbd1a1102aaa3aff8
2018-01-17 02:17:45 +00:00
Zuul 046eea4df8 Merge "Support ram limit per pool" into feature/zuulv3 2017-12-13 21:33:33 +00:00
Zuul 2a9c608905 Merge "Support cores limit per pool" into feature/zuulv3 2017-12-13 21:33:32 +00:00
Zuul 2bf595088d Merge "Make max-servers optional" into feature/zuulv3 2017-12-13 21:33:31 +00:00
Tobias Henkel 2f183926ca Support ram limit per pool
Nodepool supports pretty generic limits on tenant and pool side. Make
the ram limit configurable.

Change-Id: Ie9f1ece75106fa936a737ed2b289188d9a594fb0
2017-12-13 20:44:13 +01:00
Tobias Henkel 4b51ac6f3e Support cores limit per pool
Nodepool supports pretty generic limits on tenant and pool side. Make
the cores limit configurable.

Change-Id: Ia0e577a710de5dc319e8c51f3353882e3ca186cc
2017-12-13 20:44:13 +01:00
Tobias Henkel 9a4570844a Make max-servers optional
Now that Nodepool respects the quotas of the tenant, it is safe to make
max-servers optional for the pool.

Change-Id: I17731036ad0d8e33f35edb395a0caa2632026c24
2017-12-13 20:44:13 +01:00
Tobias Henkel b707e7218e Add connection-type to provider diskimage
The connection type should be included in the provider diskimage. This
makes it possible to define images using other connection methods than
ssh like winrm for Windows.

Change-Id: Ica0b9afe39d347028eb66c069b8dbd56a8c0ec8c
2017-12-06 21:02:34 +01:00
Tobias Henkel 9065905296 Support username also for unmanaged cloud images
The username should also be configurable for unmanaged cloud images.

Change-Id: Ib4b8878a7fc49ed0016f0e90ff076c057216f740
2017-12-06 20:57:55 +01:00
Tristan Cacqueray 6a716af6a2 Refactor provider config to driver module
This change adds a new ProviderConfig driver interface so that drivers can
load and validate their config.

This change also adds a new provider abstract method 'cleanupLeakedResources'
that the openstack driver implements to clean up floating IPs. This removes the
need for a shared clean-floating-ip provider config.

Change-Id: I20319aa660ebf5fbe8df5d6af1d77028e1b18350
2017-11-29 05:22:12 +00:00