7 Commits

Author SHA1 Message Date
Tobias Henkel 2dde4404e4
Fix missing semaphore release on node failure
Currently, when a node failure occurs on a job that holds a semaphore,
the semaphore is not released. The only way to recover from this is a
scheduler restart.

Change-Id: Ifa463824f4a394e015a6ee11fcd51bee163492f8
2019-01-18 08:34:58 +01:00
Tobias Henkel ae887dab58
Improve resource usage with semaphores
Currently, jobs that use semaphores first request and lock their build
nodes and only then acquire the semaphore. If many jobs are waiting
for the semaphore, this can tie up a substantial part of the available
resources. To avoid this, acquire the semaphore before requesting the
nodes by default.

However, when jobs with a semaphore should run back to back as quickly
as possible, it may be preferable to accept some waste of resources.
To support this use case, a job using a semaphore can override the
default and still acquire the semaphore after getting the nodes.

Change-Id: Id6f582ec29219d280d05319d1b822c7934437b7a
2018-11-20 15:20:59 +01:00
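
The trade-off described in ae887dab58 can be illustrated with a small,
self-contained sketch. This is not Zuul's scheduler code: Semaphore,
NodePool, schedule and the resources_first flag are hypothetical names
invented for this example. It only shows why acquiring the semaphore
first keeps waiting jobs from holding nodes.

```python
from dataclasses import dataclass, field


@dataclass
class Semaphore:
    """Counting semaphore with a non-blocking acquire (hypothetical helper)."""
    max_holders: int = 1
    holders: set = field(default_factory=set)

    def try_acquire(self, job: str) -> bool:
        if job in self.holders:
            return True
        if len(self.holders) < self.max_holders:
            self.holders.add(job)
            return True
        return False


@dataclass
class NodePool:
    """Very small stand-in for a pool of build nodes (hypothetical helper)."""
    free: int
    held: set = field(default_factory=set)

    def try_request(self, job: str) -> bool:
        if job in self.held:
            return True
        if self.free > 0:
            self.free -= 1
            self.held.add(job)
            return True
        return False


def schedule(jobs, semaphore, pool, resources_first=False):
    """Run one scheduling pass and return the jobs that may start."""
    started = []
    for job in jobs:
        if resources_first:
            # Nodes first: a job waiting for the semaphore still holds nodes.
            if pool.try_request(job) and semaphore.try_acquire(job):
                started.append(job)
        else:
            # Semaphore first: a waiting job never reaches the node request.
            if semaphore.try_acquire(job) and pool.try_request(job):
                started.append(job)
    return started
```

With three jobs, three free nodes and a semaphore that allows one
holder, the semaphore-first ordering leaves two nodes free, while the
nodes-first ordering locks all three even though only one job can run.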
Tobias Henkel c5e6f5cefe
Fix missing semaphore release on zk error
During problems with zk connectivity, jobs can fail while locking nodes
[1]. In this case the build doesn't get created and attached to the
queue item. However, the semaphores have already been acquired at this
point and are not released. Fix this by releasing the semaphore when
hitting this exception.

[1] Trace:
2018-04-05 10:56:56,936 ERROR zuul.Pipeline.example.check: Exception while executing job example-test for change <Change 0x7f65e9dd59e8 14,55692b4a936fff57e33036399927332849a53a92>:
Traceback (most recent call last):
  File "/usr/lib/python3.6/site-packages/zuul/manager/__init__.py", line 396, in _executeJobs
    self.sched.nodepool.useNodeSet(nodeset)
  File "/usr/lib/python3.6/site-packages/zuul/nodepool.py", line 117, in useNodeSet
    self.sched.zk.storeNode(node)
  File "/usr/lib/python3.6/site-packages/zuul/zk.py", line 213, in storeNode
    self.client.set(path, self._dictToStr(node.toDict()))
  File "/usr/lib/python3.6/site-packages/kazoo/client.py", line 1242, in set
    return self.set_async(path, value, version).get()
  File "/usr/lib/python3.6/site-packages/kazoo/handlers/utils.py", line 79, in get
    raise self._exception
kazoo.exceptions.NoNodeError

Change-Id: I851876ece318aa047e523c50f4c721417d1af6b7
2018-04-10 18:49:07 +02:00
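
The pattern behind the fix in c5e6f5cefe can be sketched as follows.
This is a simplified illustration, not the code in
zuul/manager/__init__.py: SemaphoreHandler, execute_job and the
use_node_set callback are hypothetical, and NoNodeError here is a local
stand-in for kazoo.exceptions.NoNodeError.

```python
class NoNodeError(Exception):
    """Local stand-in for kazoo.exceptions.NoNodeError."""


class SemaphoreHandler:
    """Tracks which jobs currently hold the semaphore (hypothetical helper)."""

    def __init__(self):
        self.holders = set()

    def acquire(self, job):
        self.holders.add(job)

    def release(self, job):
        self.holders.discard(job)


def execute_job(job, semaphores, use_node_set):
    """Acquire the semaphore, then lock nodes; release on any failure."""
    semaphores.acquire(job)
    try:
        use_node_set(job)
    except Exception:
        # No build was created, so nothing else will ever release the
        # semaphore for this job. Release it here before giving up.
        semaphores.release(job)
        return False
    return True


def failing_use_node_set(job):
    """Simulates the ZooKeeper error from the trace above."""
    raise NoNodeError()
```

Calling execute_job("example-test", handler, failing_use_node_set)
returns False and leaves handler.holders empty, so later jobs can still
acquire the semaphore without a scheduler restart.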
James E. Blair 2f589fec6b Update test fixtures to use explicit run
Change-Id: I3060a2bf57cef10a5a7ec5299e3491f1f6751221
2017-10-26 15:59:41 -07:00
James E. Blair 2bab6e7361 Require a base job
This makes base jobs required and allows for a per-tenant default.

Story: 2001110
Task: 4793
Change-Id: I26ffddad8358c156cfac749ce98af70f3447f671
2017-08-07 14:52:37 -07:00
Tobias Henkel ea98a194cc Case sensitive label matching
After upgrading Gerrit to 2.13 our gate stopped working. The reason
for this is that after a successful gate run zuul does something like
'gerrit review --label verified=2 --submit'. The verified label in
Gerrit by default is configured as 'Verified'. The newer version of
Gerrit behaves differently now. It accepts the +2 vote on verified but
doesn't submit the patch anymore if the casing is not correct. This
forces us to specify the label in the same casing as gerrit
expects. In that case the tolower() in canMerge prevents the patch
from entering the gate.

In order to avoid confusion and be consistent, avoid any case
conversions and use the labels exactly as defined in Gerrit.

Note that this patch requires changes to the pipelines such that the
labels are spelled exactly as defined in Gerrit.

Change-Id: I9713a075e07b268e4f2620c0862c128158283c7c
2017-07-27 07:46:35 +02:00
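
The effect of the case conversion described in ea98a194cc can be shown
with a minimal sketch. It is a hypothetical simplification and not
Zuul's actual canMerge implementation: both functions, their arguments
and the dictionary layout are assumptions made for illustration.

```python
def can_merge_lowercased(approvals, required):
    """Old behaviour (simplified): reported labels are lowercased first."""
    lowered = {label.lower(): value for label, value in approvals.items()}
    return all(lowered.get(label, 0) >= minimum
               for label, minimum in required.items())


def can_merge(approvals, required):
    """New behaviour (simplified): labels are compared exactly as given."""
    return all(approvals.get(label, 0) >= minimum
               for label, minimum in required.items())


# Labels as Gerrit defines them, and a pipeline that spells them the same way.
approvals = {"Verified": 2, "Code-Review": 2}
required = {"Verified": 2}
```

Here can_merge_lowercased(approvals, required) is False because the
lookup for "Verified" misses the lowercased key, while
can_merge(approvals, required) is True. A pipeline that still spells
the label "verified" would fail the exact comparison, which is why the
pipelines have to be updated together with this change.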
James E. Blair 9ea0d0b937 Move semaphore tests to their own class
Create a dedicated config directory for the semaphore tests and
remove them from the single-tenant configuration.

Create a simplified form of commitLayoutUpdate which accepts a
path to a replacement zuul.yaml and commits it to the specified
config repository to aid in reconfiguration tests.  The existing
similar methods rely on an entire shadow git repository which
requires additional git filesystem operations in tests.

Change-Id: I0f8e99b6ad262ece5a5649a480e0393872761903
2017-04-20 10:48:56 -07:00