Commit Graph

93 Commits

Author SHA1 Message Date
James E. Blair 95bd778fcc Split build/buildset list queries into two
The only way to get reasonable performance out of all three databases
(mariadb, mysql, postgres) appears to be to make the following changes:

* Use group-by instead of distinct.
  Some versions of mysql produce the wrong output with distinct;
  this is more descriptive of what we want anyway.
* Split the query into two statements instead of joining on a subquery.
  Mariadb seems to be unable to produce a good query plan for the
  buildset list query.  Neither mariadb nor mysql support using
  "IN" with a subquery (so the idea of using IN instead of JOIN for
  the subquery is out).

These methods now perform a query to get the ids of the builds or
buildsets that match the criteria, then perform a second query to load
the ORM objects that match those ids.  This appears to be quite
fast with the latest versions of all three database
systems.
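
The two-phase pattern described above can be sketched with the stdlib
sqlite3 module (the real code uses the SQLAlchemy ORM, and the schema here
is illustrative, not Zuul's actual tables):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE build (id INTEGER PRIMARY KEY, job_name TEXT, result TEXT)")
conn.executemany("INSERT INTO build (job_name, result) VALUES (?, ?)",
                 [("tox", "SUCCESS"), ("tox", "FAILURE"), ("docs", "SUCCESS")])

# Phase 1: a narrow query returning only the matching ids.
ids = [r[0] for r in conn.execute(
    "SELECT id FROM build WHERE job_name = ? ORDER BY id DESC LIMIT 10",
    ("tox",))]

# Phase 2: load the full rows for exactly those ids; an explicit id list
# sidesteps the IN-with-subquery limitations mentioned above.
ph = ",".join("?" * len(ids))
rows = conn.execute(
    f"SELECT id, job_name, result FROM build WHERE id IN ({ph})", ids).fetchall()
```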

Change-Id: I30bb3214807dfa8b26a848f85bb7a7bc660c6c1d
2024-04-09 06:39:08 -07:00
James E. Blair 014717bde6 Restore mysql hint conditional
This restores the conditional around adding the mysql index hint
for the builds query.  It was thought (incorrectly) that the hint
is now safe in all cases, but it is still the case that it can
slow down the query if some columns appear in the where clause.

Change-Id: Ieafab7e529e305f084ebf8c7844c8fa78da86902
2024-04-07 09:06:48 -07:00
Zuul 4a7a534191 Merge "Zuul-web: Fix literal filter queries being escaped" 2024-04-06 15:52:09 +00:00
James E. Blair eab0be8519 Restore mysql index hint
We previously included this mysql index hint under some circumstances
for both build and buildset queries.  It initially appeared to no
longer be necessary after recent table changes; however, that does not
seem to be the case, and in fact it is necessary for some simple
build table queries.

It no longer appears to be necessary for the buildset table, so
it is not added back here.

When applied to MySQL 5.7, these hints result in the build and buildset
queries returning the first N rows rather than the last N rows as
expected.

We should be able to resolve that issue as well, but it will require
further changes.  We're going to merge this as a short-term fix while
we work on a better solution.  The Zuul community has been notified
that master is currently a work in progress and to temporarily avoid
continuous deployment under mysql/mariadb.

Change-Id: Ia6d962957af3cac87d174145e69571a405ed505b
2024-04-05 09:35:37 -07:00
Benjamin Schanzel 800171a205 Zuul-web: Fix literal filter queries being escaped
https://review.opendev.org/c/908420 introduced a bug where literal
filters on builds and buildsets with sql wildcard characters (_, %) get
escaped, altering the search string. That leads to missing filter
results, e.g. when the user searches for a job_name with an underscore
in the name, because zuul escapes the underscore, searching for
`foo$_bar` instead of `foo_bar` (where `$` is the escape char). Fix this
by only applying the sanitization if we're going to filter with the SQL
`LIKE` operator.
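
A minimal sketch of the fix described above (the function name and escape
character are illustrative, not Zuul's exact implementation): literal
filters keep their `_` and `%` untouched, and escaping only happens when a
user-supplied wildcard (*) means LIKE will be used.

```python
ESCAPE_CHAR = "$"

def to_sql_filter(term):
    if "*" not in term:
        # Literal match: use = and leave _ and % untouched.
        return ("=", term)
    # Wildcard match: escape literal _ and % so LIKE treats them
    # literally, then translate * into the SQL wildcard %.
    escaped = (term.replace(ESCAPE_CHAR, ESCAPE_CHAR * 2)
                   .replace("%", ESCAPE_CHAR + "%")
                   .replace("_", ESCAPE_CHAR + "_")
                   .replace("*", "%"))
    return ("LIKE", escaped)
```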

Change-Id: Iccef71fdf8a5a1150ff957be68358882e16e9da8
2024-04-03 12:12:57 +02:00
Simon Westphahl 8bcfb8bc4a Correctly limit buildsets with multiple refs
The current SQL query will not correctly limit the number of buildsets
when some of the buildsets are related to multiple refs (circular
dependencies).  The problem is that LIMIT works on the number of rows,
but we want to limit only on the number of buildsets.

This corrects the problem by using a subquery to identify the distinct
buildsets, limiting that, and then querying for all of the information
about those buildsets.
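
The problem and the subquery fix described above can be demonstrated with
the stdlib sqlite3 module (illustrative schema, not Zuul's actual tables):
a buildset joined to multiple refs yields one row per ref, so LIMIT on the
joined query limits rows, not buildsets.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE buildset (id INTEGER PRIMARY KEY);
CREATE TABLE ref (id INTEGER PRIMARY KEY, buildset_id INTEGER);
INSERT INTO buildset (id) VALUES (1), (2);
-- Buildset 1 has two refs (a circular dependency); buildset 2 has one.
INSERT INTO ref (buildset_id) VALUES (1), (1), (2);
""")

# Naive: LIMIT 2 on the join returns two rows, both for buildset 1.
naive = conn.execute("""
    SELECT buildset.id FROM buildset
    JOIN ref ON ref.buildset_id = buildset.id
    ORDER BY buildset.id LIMIT 2""").fetchall()

# Fixed: limit a subquery of distinct buildset ids, then join to load
# everything about those buildsets.
fixed = conn.execute("""
    SELECT buildset.id, ref.id FROM buildset
    JOIN ref ON ref.buildset_id = buildset.id
    WHERE buildset.id IN (
        SELECT id FROM (SELECT DISTINCT id FROM buildset ORDER BY id LIMIT 2))
    ORDER BY buildset.id""").fetchall()
```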

This also updates the methods which perform the same function for builds,
even though we are not yet seeing an issue in practice.  It is
theoretically possible to call the getBuilds method with 'provides' and
'limit' arguments, which would produce the same problem as the buildsets
query.  That is not possible in practice, as the REST API doesn't support
provides, and the scheduler which does pass the provides argument doesn't
pass limit.  However, that could easily change in the future.  Additionally,
this future-proofs us if we add more queryable one-to-many relationships
to builds in the future (such as if we linked builds to multiple refs).
Also, it's easier to maintain these methods if they follow the same pattern.

There does not appear to be a performance loss in either mysql or postgres
with local testing on large data sets.  There may actually be an improvement
(but it's within the margin of error, so hard to say).

The index hints previously needed for mysql appear to no longer be
necessary and are removed.

Change-Id: Ib19e4cb8171f5d4d2873fb6b9c0301eb5d4ee43d
Co-Authored-By: James E. Blair <jim@acmegating.com>
2024-03-25 13:31:19 -07:00
Benjamin Schanzel f9ebf6a1c9 Zuul-Web: substring search for builds, buildsets
Allow searching for builds and buildsets using substrings of job_name,
project, branch, and pipeline. This is done by placing wildcard
characters (*) into the filter string which get translated to SQL
wildcards (%), representing zero, one, or multiple characters.

Asterisks are used as the wildcard character (rather than SQL's %)
because they may feel more intuitive, cf. shell file globbing or
regexps.

The SQL LIKE operator is only used if a wildcard is present in the
filter string.  This avoids relying on the underlying SQL implementation
to optimize queries with a LIKE op but no wildcard (i.e. an exact match),
so that we don't introduce unnecessary performance penalties.

Change-Id: I827a27915308f78fc01019bd988b34ea987c90ea
2024-03-12 13:58:01 +01:00
Zuul 1beac435ab Merge "Finish circular dependency refactor" 2024-02-10 21:27:22 +00:00
James E. Blair 1f026bd49c Finish circular dependency refactor
This change completes the circular dependency refactor.

The principal change is that queue items may now include
more than one change simultaneously in the case of circular
dependencies.

In dependent pipelines, the two-phase reporting process is
simplified because it happens during processing of a single
item.

In independent pipelines, non-live items are still used for
linear dependencies, but multi-change items are used for
circular dependencies.

Previously changes were enqueued recursively and then
bundles were made out of the resulting items.  Since we now
need to enqueue entire cycles in one queue item, the
dependency graph generation is performed at the start of
enqueuing the first change in a cycle.

Some tests exercise situations where Zuul is processing
events for old patchsets of changes.  The new change query
sequence mentioned in the previous paragraph necessitates
more accurate information about out-of-date patchsets than
the previous sequence, therefore the Gerrit driver has been
updated to query and return more data about non-current
patchsets.

This change is not backwards compatible with the existing
ZK schema, and will require Zuul systems delete all pipeline
states during the upgrade.  A later change will implement
a helper command for this.

All backwards compatibility handling for the last several
model_api versions which were added to prepare for this
upgrade have been removed.  In general, all model data
structures involving frozen jobs are now indexed by the
frozen job's uuid and no longer include the job name since
a job name no longer uniquely identifies a job in a buildset
(either the uuid or the (job name, change) tuple must be
used to identify it).

Job deduplication is simplified and now only needs to
consider jobs within the same buildset.

The fake github driver had a bug (fakegithub.py line 694) where
it did not correctly increment the check run counter, so our
tests that verified that we closed out obsolete check runs
when re-enqueuing were not valid.  This has been corrected, and
in doing so, has necessitated some changes around quiet dequeuing
when we re-enqueue a change.

The reporting in several drivers has been updated to support
reporting information about multiple changes in a queue item.

Change-Id: I0b9e4d3f9936b1e66a08142fc36866269dc287f1
Depends-On: https://review.opendev.org/907627
2024-02-09 07:39:40 -08:00
Zuul 0e6b023a5f Merge "Optimize db prune query" 2024-01-30 22:39:21 +00:00
James E. Blair 344ad5c3d3 Avoid joining the provides table in the builds query
The builds query includes an outer join on the provides table for
the purpose of allowing the scheduler to find previously completed
builds which provide artifacts for items currently in the queue.
Otherwise it is unused (it is not possible to query builds by
provides in the web api).

This join can sometimes produce very slow queries, especially
under postgres with certain data characteristics (usually when there
are too few rows to prompt the query planner to use indexes).  This
is similar to the problem that prompted the recent change to the
job runtime queries.

To avoid this, only join the provides table if required for the query.

Change-Id: I83810577c230bcd7365504e01e2cd1d1e642fa0e
2024-01-11 10:41:21 -08:00
James E. Blair a0cefe66c1 Optimize db prune query
We've added some more tables and the query used to delete old
buildsets suffered as a result.  The current approach performs
quite a lot of individual queries: one for each table multiplied
by each buildset that we inspect.

A faster approach is to use one query for each table, regardless
of the number of buildsets (note, however, that sqlalchemy will
shard this in order to manage query length; in practice, this means
one query for each table for every group of 500 buildsets -- so
a 1/500 reduction in the number of queries).

In local testing, this runs in 1/3 of the time of the current
code in postgres, and between 1/5 and 2/3 of the time with
mysql (performance appears more variable than with postgres).
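
The chunked deletes described above can be sketched as follows (table and
column names are illustrative, not Zuul's actual schema): one DELETE per
table per group of up to 500 buildset ids, rather than one DELETE per
table per buildset.

```python
import sqlite3

def prune(conn, tables, buildset_ids, chunk_size=500):
    # One statement per table per chunk of ids, mirroring the sharding
    # behavior described in the commit message.
    for i in range(0, len(buildset_ids), chunk_size):
        chunk = buildset_ids[i:i + chunk_size]
        ph = ",".join("?" * len(chunk))
        for table in tables:
            conn.execute(
                f"DELETE FROM {table} WHERE buildset_id IN ({ph})", chunk)

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE build (id INTEGER PRIMARY KEY, buildset_id INTEGER)")
conn.executemany("INSERT INTO build (buildset_id) VALUES (?)",
                 [(i,) for i in range(1200)])
prune(conn, ["build"], list(range(1000)))
remaining = conn.execute("SELECT COUNT(*) FROM build").fetchone()[0]
```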

Change-Id: Ie93318ab12832516640ee9210ec071e18623eac4
2024-01-11 09:48:34 -08:00
James E. Blair f99cee543e Use getBuildTimes for build time estimator
The existing build time estimator uses the normal getBuilds query
method.  We recently added an optimized query for a new build times
API endpoint.  As one would expect, the build time estimator has
limited query needs and should be able to use the new optimized
query as well.

The one thing missing is a search for a specific result (ie, SUCCESS),
so that is added.  The unused sort operator is removed.

From what I can tell, switching to sorting by end_time should not
produce a reduction in performance compared to sorting by id.

Change-Id: I1096d466accad5574b6cfa226e68b070f769128f
2024-01-04 06:30:52 -08:00
James E. Blair 50f068ee6d Add a build-times web endpoint
This endpoint runs an optimized query for returning information
suitable for displaying a graph of build times.

This includes a schema migration to add some indexes to aid
the query.

Change-Id: I56e8422a599c1ee51216f26fcae5a39013066e6b
2024-01-03 13:06:07 -08:00
James E. Blair eaab8d6017 Add more zuul_ref indexes
This drops the (project, change) index in favor of several single-column
indexes that the dbms can combine as needed to quickly seek zuul_refs in queries.

Change-Id: I50ec1552e24c5aa634538ebb181da907898f1284
2023-11-28 07:55:18 -08:00
James E. Blair 4ac07c3a22 Log migration errors
If we encounter an error in sql migrations during startup, send it
to the python logger before we exit.

Change-Id: I574123b5e1eb27b75614eaf0a0b0235a4fd47adb
2023-11-17 09:48:30 -08:00
James E. Blair 0a08299b5f Refactor bundle in sql connection
This refactors the sql connection to accommodate multiple
simultaneous changes in a buildset.

The change information is removed from the buildset table and
placed in a ref table.  Buildsets are associated with refs
many-to-many via the zuul_buildset_ref table.  Builds are also
associated with refs, many-to-one, so that we can support
multiple builds with the same job name in a buildset, but we
still know which change they are for.

In order to maintain a unique index in the new zuul_ref table (so that
we only have one entry for a given ref-like object (change, branch,
tag, ref)) we need to shorten the sha fields to 40 characters (to
accommodate mysql's index size limit) and also avoid nulls (to
accommodate postgres's inability to use null-safe comparison operators
on indexes).  So that we can continue to use change=None,
patchset=None, etc, values in Python, we add a sqlalchemy
TypeDecorator to coerce None to and from null-safe values such as 0
or the empty string.
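
The null-safe coercion described above can be sketched like this.  The
real change uses a SQLAlchemy TypeDecorator; here the same idea is shown
as a pair of plain functions around a NOT NULL sqlite column (names are
illustrative):

```python
import sqlite3

SENTINEL = ""  # stands in for NULL in string columns

def to_db(value):
    # Coerce a Python value to its null-safe database representation.
    return SENTINEL if value is None else value

def from_db(value):
    # Coerce a stored value back to its Python representation.
    return None if value == SENTINEL else value

conn = sqlite3.connect(":memory:")
# NOT NULL plus UNIQUE works because None is stored as the sentinel.
conn.execute("CREATE TABLE zuul_ref (patchset TEXT NOT NULL, UNIQUE (patchset))")
conn.execute("INSERT INTO zuul_ref (patchset) VALUES (?)", (to_db(None),))
stored = conn.execute("SELECT patchset FROM zuul_ref").fetchone()[0]
```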

Some previous schema migration tests inserted data with null projects,
which should never have actually happened, so these tests are updated
to be more realistic since the new data migration requires non-null
project fields.

The migration itself has been tested with a data set consisting of
about 3 million buildsets with 22 million builds.  The runtime on one
ssd-based test system in mysql is about 22 minutes and in postgres
about 8 minutes.

Change-Id: I21f3f3dfc8f93a23744856e5b82b3c948c118dc2
2023-10-19 17:42:09 -07:00
Benjamin Schanzel 5a288e0ca4 Log exceptions on db migration errors
Exceptions on db migration errors are not printed to the log, which
makes it cumbersome to find the cause.

Change-Id: Ie8ce5c1b091f879ff835188a3f402187ce4065b2
2023-09-01 09:55:27 +02:00
James E. Blair a2b7c8ae7c Add explicit foreign key indexes
The artifact, provides, and build_event tables all have foreign key
references to the build table.  In MySQL this causes an automatic
index to be created, but that does not appear to be the case in
Postgres.  Without these indexes, many of the queries we do can be
quite costly.

To address this, explicitly create these indexes.  Under MySQL, this
will simply rename the existing indexes, so this should be safe
and effective for both systems.
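
The migration described above amounts to DDL like the following (shown
here with stdlib sqlite3; the real change is an alembic migration, and the
index and column names are illustrative):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE zuul_build (id INTEGER PRIMARY KEY);
CREATE TABLE zuul_artifact (
    id INTEGER PRIMARY KEY,
    build_id INTEGER REFERENCES zuul_build (id));
-- Explicit index on the foreign key column; MySQL creates one
-- automatically for FK constraints, PostgreSQL does not.
CREATE INDEX zuul_artifact_build_id_idx ON zuul_artifact (build_id);
""")
idx = [r[1] for r in conn.execute("PRAGMA index_list('zuul_artifact')")]
```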

Change-Id: I16223fba75c1295480431d03ac59f72a5281b498
2023-07-10 12:46:46 -07:00
Zuul a914670514 Merge "Add statement timeouts to some web sql queries" 2023-05-03 06:51:46 +00:00
James E. Blair 7153505cd5 Fix prune-database command
This command had two problems:

* It would only delete the first 50 buildsets
* Depending on DB configuration, it may not have deleted anything,
  or it may have left orphan data.

We did not tell sqlalchemy to cascade delete operations, meaning that
when we deleted the buildset, we didn't delete anything else.

If the database enforces foreign keys (innodb, psql) then the command
would have failed.  If it doesn't (myisam) then it would have deleted
the buildset rows but not anything else.

The tests use myisam, so they ran without error and without deleting
the builds.  They check that the builds are deleted, but only through
the ORM via a joined load with the buildsets, and since the buildsets
are gone, the builds weren't returned.

To address this shortcoming, the tests now use distinct ORM methods
which return objects without any joins.  This would have caught
the error had it been in place before.

Additionally, the delete operation retained the default limit of 50
rows (set in place for the web UI), meaning that when it did run,
it would only delete the most recent 50 matching buildsets.

We now explicitly set the limit to a user-configurable batch size
(by default, 10,000 builds) so that we keep transaction sizes
manageable and avoid monopolizing database locks.  We continue deleting
buildsets in batches as long as any matching buildsets remain. This
should allow users to remove very large amounts of data without
affecting ongoing operations too much.
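
The batched delete loop described above can be sketched as follows
(illustrative schema; the real command also deletes artifact and provides
rows and filters on more criteria).  Each pass deletes at most
`batch_size` buildsets, looping until no buildset older than the cutoff
remains:

```python
import sqlite3

def prune_before(conn, cutoff, batch_size=10000):
    while True:
        ids = [r[0] for r in conn.execute(
            "SELECT id FROM buildset WHERE event_time < ? LIMIT ?",
            (cutoff, batch_size))]
        if not ids:
            break
        ph = ",".join("?" * len(ids))
        # Delete child rows first, then the buildsets themselves,
        # mirroring an explicit cascade.
        conn.execute(f"DELETE FROM build WHERE buildset_id IN ({ph})", ids)
        conn.execute(f"DELETE FROM buildset WHERE id IN ({ph})", ids)
        conn.commit()

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE buildset (id INTEGER PRIMARY KEY, event_time INTEGER)")
conn.execute("CREATE TABLE build (id INTEGER PRIMARY KEY, buildset_id INTEGER)")
conn.executemany("INSERT INTO buildset VALUES (?, ?)", [(i, i) for i in range(1, 6)])
conn.executemany("INSERT INTO build (buildset_id) VALUES (?)",
                 [(i,) for i in range(1, 6)])
prune_before(conn, cutoff=4, batch_size=2)
remaining = conn.execute("SELECT COUNT(*) FROM buildset").fetchone()[0]
```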

Change-Id: I4c678b294eeda25589b75ab1ce7c5d0b93a07df3
2023-03-29 17:12:13 -07:00
James E. Blair 84c0420792 Add statement timeouts to some web sql queries
The SQL queries are designed to be highly optimized and should return
in milliseconds even with millions of rows.  However, sometimes
query planners are misled by certain characteristics and can end
up performing suboptimally.

To protect the web server in case that happens, set a statement or
query timeout for the queries which list builds or buildsets.  This
will instruct mysql or postgresql to limit execution of the buildset
or build listing queries to 30 seconds -- but only if these queries
originate in zuul-web.  Other users (such as the admin tools) may
still run these queries without an explicit time limit (though the
server may still have one).

Unfortunately (or perhaps fortunately) the RDBMSs can occasionally
satisfy the queries we use in testing in less than 1ms, making a
functional test of this feature impractical (we are unable to set
the timeout to 0ms).
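
The per-dialect timeouts described above use the standard knobs in each
database; the sketch below is illustrative of the idea, not Zuul's exact
code.  MySQL takes an optimizer hint on the SELECT itself, while
PostgreSQL sets a transaction-local statement_timeout:

```python
def with_timeout(dialect, select_sql, timeout_ms=30000):
    if dialect == "mysql":
        # MAX_EXECUTION_TIME applies only to the hinted SELECT.
        return select_sql.replace(
            "SELECT", f"SELECT /*+ MAX_EXECUTION_TIME({timeout_ms}) */", 1)
    if dialect == "postgresql":
        # SET LOCAL is scoped to the enclosing transaction.
        return f"SET LOCAL statement_timeout = {timeout_ms}; {select_sql}"
    # Other users (e.g. admin tools) run without an explicit limit.
    return select_sql
```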

Change-Id: If2f01b33dc679ab7cf952a4fbf095a1f3b6e4faf
2023-03-13 14:57:29 -07:00
Clark Boylan cfedda0c2f Don't nest alembic transactions
SqlAlchemy 2.0 has gotten a lot more picky about transactions. In
addition to needing to explicitly set up transactions using SqlAlchemy
2.0 it seems alembic's get_current_revision() call cannot be in the same
transaction as the alembic migrate/stamp calls with MySQL 8.0.

In particular the get_current_revision call seems to get a
SHARED_HIGH_PRIO lock on the alembic_version table. This prevents the
migrate/stamp calls from creating the alembic_version table as this
requires an EXCLUSIVE lock. The SHARED_HIGH_PRIO lock appears to be in
place as long as the get_current_revision transaction is active. To fix
this we simplify our migration tooling and put get_current_revision in a
transaction block of its own. The rest of our migrate function calls
into functions that will set up new transactions, so it doesn't need to
be in this block.

Change-Id: Ic71ddf1968610784cef72c4634ccec3a18855a0e
2023-02-01 09:18:56 -08:00
Felix Edel f9786ac2a8 Store pause and resume events on the build and report them
When a build is paused or resumed, we now store this information on the
build together with the event time. Instead of additional attributes for
each timestamp, we add an "event" list attribute to the build which can
also be used for other events in the future.

The events are stored in the SQL database and added to the MQTT payload
so the information can be used by the zuul-web UI (e.g. in the "build
times" gantt chart) or provided to external services.

Change-Id: I789b4f69faf96e3b8fd090a2e389df3bb9efd602
2023-01-02 10:07:26 +01:00
James E. Blair 0738d31b08 Include skipped builds in database and web ui
We have had an on-and-off relationship with skipped builds in the
database.  Generally we have attempted to exclude them from the db,
but we have occasionally (accidentally?) included them.  The status
quo is that builds with a result of SKIPPED (as well as several
other results which don't come from the executor) are not recorded
in the database.

With a greater interest in being able to determine which jobs ran
or did not run for a change after the fact, this change deliberately
adds all builds (whether they touch an executor or not, whether
real or not) to the database.  This means that anything that could
potentially show up on the status page or in a code-review report
will be in the database, and can therefore be seen in the web UI.

It is still the case that we are not actually interested in seeing
a page full of SKIPPED builds when we visit the "Builds" tab in
the web ui (which is the principal reason we have not included them
in the database so far).  To address this, we set the default query
in the builds tab to exclude skipped builds (it is easy to add other
types of builds to exclude in the future if we wish).  If a user
then specifies a query filter to *include* specific results, we drop
the exclusion from the query string.  This allows for the expected
behavior of not showing SKIPPED by default, then as specific results
are added to the filter, we show only those, and if the user selects
that they want to see SKIPPED, they will then be included.

On the buildset page, we add a switch similar to the current "show
retried jobs" switch that selects whether skipped builds in a buildset
should be displayed (again, it hides them by default).

Change-Id: I1835965101299bc7a95c952e99f6b0b095def085
2022-10-06 13:28:02 -07:00
James E. Blair 3ffbf10f25 Add prune-database command
This adds a zuul-admin command which allows operators to delete old
database entries.

Change-Id: I4e277a07394aa4852a563f4c9cdc39b5801ab4ba
2022-05-30 07:31:16 -07:00
James E. Blair e03d8c887c Rename MERGER_FAILURE to MERGE_CONFLICT
This is a prelude to a change which will report a distinct buildset result
to the database if the upstream code review system is unable to merge a change.
Currently it is reported as MERGER_FAILURE which makes it difficult to
distinguish from merge conflicts.

Essentially, the two states we're interested in are when Zuul's merger is
unable to prepare a git checkout of the change (99% of the time, this is
a merge conflict).  This will be known as MERGE_CONFLICT now.

The second state is when Zuul asks Gerrit/Github/etc to submit/merge a change
and the remote system is unable (or refuses) to do so.  In a future change,
that will be reported as MERGE_FAILURE.

To avoid confusion and use names which better reflect the situation, this change
performs the rename to MERGE_CONFLICT.

Because there are pipeline configuration options tied to the MERGER_FAILURE
status (which start with 'merge-failure') they are also renamed to 'merge-conflict'.
The old names are supported for backwards compatibility.

A SQL migration takes care of updating values in the database.

The upgrade procedure is noted as being special because of the db value updates.
If an operator doesn't follow the recommended procedure, however, the consequences
are minimal (builds which won't be easily queried in the web ui; that can be
manually corrected if desired).

A model API change is not needed since the only place where we receive this value
from ZK can be updated to accept both values.

Change-Id: I3050409ed68805c748efe7a176b9755fa281536f
2022-02-24 17:06:03 -08:00
James E. Blair 8f9b99dc7b Add buildset start/end db columns
Add two columns to the buildset table in the database: the timestamp
of the start of the first build and the end of the last build.  These
are calculated from the builds in the webui buildset page, but they
are not available in the buildset listing without performing
a table join on the server side.

To keep the buildset query simple and fast, this adds the columns to
the buildset table (which is a minor data duplication).

Return the new values in the rest api.

Change-Id: Ie162e414ed5cf09e9dc8f5c92e07b80934592fdf
2022-02-23 06:24:04 -08:00
Zuul 61691dcb3f Merge "Display overall duration in buildset page in zuul web" 2022-02-05 19:45:09 +00:00
James E. Blair 29fbee7375 Add a model API version
This is a framework for making upgrades to the ZooKeeper data model
in a manner that can support a rolling Zuul system upgrade.

Change-Id: Iff09c95878420e19234908c2a937e9444832a6ec
2022-01-27 12:19:11 -08:00
Dong Zhang 8a01c61991 Display overall duration in buildset page in zuul web
The overall duration is, from a user's (developer's) point of view, how
much time it takes from the trigger of the build (e.g. a push, a
comment, etc.) until the last build is finished.

It also takes into account the time spent waiting in the queue, launching
nodes, preparing the nodes, etc.

Technically it measures between the event timestamp and the end time of
the last build in the build set.

This duration reflects the user experience of how much time the user needs
to wait.
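
The measurement described above reduces to a one-liner (field names here
are illustrative): the span starts at the triggering event, so queue,
launch, and preparation time are all included implicitly.

```python
def overall_duration(event_timestamp, builds):
    # End of the last build in the buildset minus the event timestamp.
    return max(b["end_time"] for b in builds) - event_timestamp
```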

Change-Id: I253d023146c696d0372197e599e0df3c217ef344
2022-01-27 10:38:29 -08:00
Simon Westphahl cdf7dbcb89 Protect database migration by a lock in Zookeeper
Ensure that during startup of multiple schedulers or web-instances in
parallel only one at a time is performing the migration for a SQL
connection.

Change-Id: I734bd76dde16c52cd76ea93e44a0fc6e7c7855f1
2022-01-18 14:41:30 -08:00
Matthieu Huin bc211ea798 REST API: add idx_min, idx_max params to getBuilds, getBuildsets
Allow filtering searches by primary index; i.e. return only
builds or buildsets whose primary key is greater than idx_min
or lower than idx_max.  This is expected to increase query speed
compared to using the offset argument when it is possible to do
so, since "offset" requires the database to sift through all results
until the offset is reached.
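
A sketch of the idx_min/idx_max filtering described above (an illustrative
query builder, not Zuul's actual code): bounding the primary key lets the
database seek the index directly, whereas "offset" must scan and discard
rows.

```python
def build_query(idx_min=None, idx_max=None):
    clauses, args = [], []
    if idx_min is not None:
        clauses.append("id > ?")
        args.append(idx_min)
    if idx_max is not None:
        clauses.append("id < ?")
        args.append(idx_max)
    where = (" WHERE " + " AND ".join(clauses)) if clauses else ""
    return f"SELECT * FROM zuul_build{where} ORDER BY id DESC", args
```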

Change-Id: I420d71d7c62dad6d118310525e97b4a546f05f99
2021-12-02 21:59:48 +01:00
Felix Edel 2c900c2c4a Split up registerScheduler() and onLoad() methods
This is an early preparation step for removing the RPC calls between
zuul-web and the scheduler.

In order to do so we must initialize the ConfigLoader in zuul-web which
requires all connections to be available. Therefore, this change ensures
that we can load all connections in zuul-web without providing a
scheduler instance.

To avoid unnecessary traffic from a zuul-web instance the onLoad()
method initializes the change cache only if a scheduler instance is
available on the connection.

Change-Id: I3c1d2995e81e17763ae3454076ab2f5ce87ab1fc
2021-11-09 09:17:43 +01:00
James E. Blair 7d7d2f9f2a Remove time database
We can obtain the same information from the SQL database now, so
do that and remove the filesystem-based time database.  This will
help support multiple schedulers (as they will all have access to
the same data).

Nothing in the scheduler uses the state directory anymore, so clean
up the docs around that.  The executor still has a state dir where
it may install ansible-related files.

The SQL query was rather slow in practice because it created a
temporary table since it was filtering mostly by buildset fields
then sorting by build.id.  We can sort by buildset.id and get nearly
the same results (equally valid from our perspective) much faster.

In some configurations under postgres, we may see a performance
variation in the run-time of the query.  In order to keep the time
estimation out of the critical path of job launches, we perform
the SQL query asynchronously.  We may be able to remove this added
bit of complexity once the scale-out-scheduler work is finished
(and/or we further define/restrict our database requirements).

Change-Id: Id3c64be7a05c9edc849e698200411ad436a1334d
2021-09-27 11:54:33 -07:00
James E. Blair 279a9b7ea8 Import alembic.migration
If this works, it's apparently needed by alembic 1.7.x

Change-Id: Icbbffeb3b30410c4af33f0cdf74821eb4f6eb676
2021-08-31 10:20:59 -07:00
Zuul a619c9d8a6 Merge "Escape raw percent in 'dburi' value" 2021-08-24 11:44:16 +00:00
Clark Boylan f52c54ab91 Prepare for SQLAlchemy 2.0
SQLAlchemy 2.0 will introduce a number of changes. Thankfully, current
SQLAlchemy has deprecation warnings and can be run with future flags set
to True to enforce 2.0 behavior. We use these tools to prepare Zuul for
the SQLAlchemy 2.0 release.

In tox.ini configure the environment to always emit DeprecationWarnings
for modules that touch sqlalchemy.

Update sqlconnection to use the new location for declarative_base and
set future=True on our Engine and Session objects.

Finally update the database migration tests to use transaction based
connections for executing raw sql statements. Also we switch to the
exec_driver_sql method for that. SQLAlchemy 2.0 will not do implicit
autocommitting and doesn't support executing strings directly.

https://docs.sqlalchemy.org/en/14/changelog/migration_20.html has tons
of info on these changes. Reviews should probably pay attention to the
transaction behavior changes as well as any alembic code that might also
need updating.

Change-Id: I4e7a56d24d0f52b6d5b00a8c12fed52d6fae92ef
2021-08-12 12:53:40 -07:00
James E. Blair cb613a1da9 Add a 'complete' parameter to build/buildset queries
This adds a tri-state parameter to the build and buildset queries,
both in the internal API and via the web API.  True means return
builds with results, False means only in-progress builds,
None (or omitted) means both.
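
The tri-state parameter described above maps to a filter along these
lines (a sketch with an illustrative column: in-progress builds have no
result yet):

```python
def complete_clause(complete=None):
    if complete is True:
        return "WHERE result IS NOT NULL"  # only builds with results
    if complete is False:
        return "WHERE result IS NULL"      # only in-progress builds
    return ""                              # None/omitted: both
```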

Also render "In Progress" builds as such in the web UI.

Change-Id: Ib021e6a2c7338c08deae1aef4dbb5f0d9154daa0
2021-07-16 16:24:54 -07:00
James E. Blair 29234faf6c Add nodeset to build table
The name of the nodeset used by a job may be of interest to users
as they look at historical build results (did py38 run on focal or
bionic?).  Add a column for that purpose.

Meanwhile, the node_name column has not been used in a very long
time (at least v3).  It is universally null now.  As a singular value,
it doesn't make a lot of sense for a multinode system.  Drop it.

The build object "node_labels" field is also unused; drop it as well.

The MQTT schema is updated to match SQL.

Change-Id: Iae8444dfdd52561928c80448bc3e3158744e08e6
2021-07-08 15:47:47 -07:00
James E. Blair 97811f53d6 Report intermediate buildsets and builds
This moves some functions of the SQL reporter into the pipeline
manager, so that builds and buildsets are always recorded in the
database when started and when completed.  The 'final' flag is
used to indicate whether a build or buildset result is user-visible
or not.

Change-Id: I053e195d120ecbb2fd89cf7e1e9fc7eccc9dcd2f
2021-07-08 14:07:20 -07:00
Pierre-Louis Bonicoli 1e89973736 Escape raw percent in 'dburi' value
The alembic documentation mentions that a raw percent sign not part of
an interpolation symbol in 'value' parameter must be escaped.

Fix this exception which occurs when for example a password contains
a percent:

    2021-06-11 10:22:14,366 ERROR zuul.Scheduler: Error starting Zuul:
    Traceback (most recent call last):
      File "zuul/lib/python3.7/site-packages/zuul/cmd/scheduler.py", line 172, in run
        self.sched.registerConnections(self.connections)
      File "zuul/lib/python3.7/site-packages/zuul/scheduler.py", line 400, in registerConnections
        self.connections.registerScheduler(self, load)
      File "zuul/lib/python3.7/site-packages/zuul/lib/connections.py", line 73, in registerScheduler
        connection.onLoad()
      File "zuul/lib/python3.7/site-packages/zuul/driver/sql/sqlconnection.py", line 247, in onLoad
        self._migrate()
      File "zuul/lib/python3.7/site-packages/zuul/driver/sql/sqlconnection.py", line 238, in _migrate
        self.connection_config.get('dburi'))
      File "zuul/lib/python3.7/site-packages/alembic/config.py", line 242, in set_main_option
        self.set_section_option(self.config_ini_section, name, value)
      File "zuul/lib/python3.7/site-packages/alembic/config.py", line 269, in set_section_option
        self.file_config.set(section, name, value)
      File "/usr/lib/python3.7/configparser.py", line 1198, in set
        super().set(section, option, value)
      File "/usr/lib/python3.7/configparser.py", line 893, in set
        value)
      File "/usr/lib/python3.7/configparser.py", line 402, in before_set
        "position %d" % (value, tmp_value.find('%')))
    ValueError: invalid interpolation syntax in 'postgresql://[...]' at position 18
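
A minimal sketch of the fix: configparser treats '%' as the start of an
interpolation, so any raw percent in the dburi must be doubled before
being handed to alembic's set_main_option (which stores the value in a
configparser section).  The function name and example dburi are
illustrative.

```python
import configparser

def escape_dburi(dburi):
    return dburi.replace("%", "%%")

cp = configparser.ConfigParser()
cp.add_section("alembic")
cp.set("alembic", "sqlalchemy.url", escape_dburi("postgresql://u:pa%ss@db/zuul"))
# Interpolation turns the doubled percent back into a single one on read.
roundtrip = cp.get("alembic", "sqlalchemy.url")
```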

Change-Id: I96d70f68da2ba58455cbc2ae4d54a3c90f461123
2021-06-23 18:08:36 +02:00
James E. Blair abe2a482be Use create_all for empty databases
Rather than running through all of the migrations when starting Zuul
with an empty database, this uses sqlalchemy's create_all method to
create it from the declarative schema.

To make sure that stays in sync with alembic, a test is added to run
DB creation both ways and compare.

The declarative schema had one column with an incorrect type, and
several columns out of order; this adjusts the schema to match the
migrations.

Contrary to expectations, using sqlalchemy to create the schema actually
adds about 0.05 seconds on average to test runtime.
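
The approach described above can be sketched as follows, assuming
SQLAlchemy is available (the table is illustrative, not Zuul's real
schema): on an empty database, build the whole schema in one create_all
call from the declarative metadata instead of replaying every migration.

```python
from sqlalchemy import (Column, Integer, MetaData, String, Table,
                        create_engine, inspect)

metadata = MetaData()
Table("zuul_buildset", metadata,
      Column("id", Integer, primary_key=True),
      Column("pipeline", String(255)))

engine = create_engine("sqlite://")
# Create every table known to the metadata in one step.
metadata.create_all(engine)
tables = inspect(engine).get_table_names()
```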

Change-Id: I594b6980f5efa5fa4b8ca387c5d0ab4373b86394
2021-06-19 14:39:10 -07:00
James E. Blair a951f37280 Wait on startup if SQL not available
Now that the SQL database is required, fail to start if the dburi has
an error (like an incorrect module specification), and wait forever
for a connection to the database before proceeding.

This can be especially helpful in container environments where starting
Zuul may race starting a SQL database.
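
The wait-forever behavior described above amounts to a retry loop along
these lines (an illustrative sketch; the real code retries the configured
SQL connection and fails fast on a malformed dburi):

```python
import time

def wait_for_db(connect, delay=10, _sleep=time.sleep):
    # Retry forever on connection errors rather than starting without a
    # database; _sleep is injectable for testing.
    while True:
        try:
            return connect()
        except ConnectionError:
            _sleep(delay)
```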

A test which verified that Zuul would start despite problems with the
SQL connection is removed since that is no longer the desired behavior.

Change-Id: Iae8ea420297f6264ae1d265b22b96d81f1df9a12
2021-05-17 19:59:02 -07:00
Jan Kubovy 9ab527971f Required SQL reporters
On the way towards a fully scale out scheduler we need to move the
times database from the local filesystem into the SQL
database. Therefore we need to make at least one SQL connection
mandatory.

SQL reporters are required (an implied sql reporter is added to
every pipeline, explicit sql reporters ignored)

Change-Id: I30723f9b320b9f2937cc1d7ff3267519161bc380
Depends-On: https://review.opendev.org/621479
Story: 2007192
Task: 38329
2021-02-03 13:41:55 -08:00
Matthieu Huin ae22f46223 Builds: add "held" attribute
The boolean "held" attribute is set to True if a build triggered
an autohold request, and its nodeset was held.
Allow filtering builds by "held" status.

Change-Id: I6517764c534f3ba8112094177aefbaa144abafae
2020-09-04 18:09:37 +02:00
Benedikt Loeffler 25ccee91af Report retried builds via sql reporter
Since we added those to the MQTT reporter, we should also store them in
the SQL database. They are stored in the zuul_build table and can be
identified via the new "final" column which is set to False for those
builds (and True for all others).

The final flag can also be used in the API to specifically filter for
those builds or remove them from the result. By default, no builds are
filtered out.

The buildset API includes these retried builds under a dedicated
'retry_builds' key so as not to mix them up with the final builds.  Thus,
the JSON format is equal to the one the MQTT reporter uses.

For the migration of the SQL database, all existing builds will be set
to final.

We could also provide a filter mechanism via the web UI, but that should
be part of a separate change (e.g. add a checkbox next to the search
filter saying "Show retried builds").

Change-Id: I5960df8a997c7ab81a07b9bd8631c14dbe22b8ab
2020-04-09 11:57:29 +02:00
James E. Blair 828d5deb1a Store build.error_detail in SQL
This should be stored in the SQL database so that the build page
can present the reason why a particular build failed, instead of
just the result "ERROR".

Change-Id: I4dd25546e27b8d3f3a4e049f9980082a3622079f
2020-02-26 15:11:02 -08:00
Simon Westphahl e58e8224c0 Store event id for buildsets in database
Having the zuul event id available in the database and also in the build
and buildset detail page makes debugging a lot easier.

Change-Id: Ia1e4aaf50fb28bb27cbcfcfc3b5a92bba88fc85e
2019-12-05 15:55:43 +01:00
James E. Blair 267345125f Improve SQL query performance in some cases
The query for builds-joined-with-buildsets is currently optimized
for the case where little additional filtering is performed.  E.g.,
the case where a user browses to the builds tab and does not enter
any search terms.  In that case, mysql needs a hint supplied in
order to choose the right index.

When search terms are entered which can, due to the presense of
other indexes, greatly reduce the working set, it's better to let
the query planner off the leash and it will make the right choices.

This change stops adding the hint in the cases where a user supplies
a search term that matches one of the indexes on the build or
buildset table (notable exception: job_name because it is difficult
to generalize about that one).

It also adds an additional index for build and buildset uuids,
which should provide excellent performance when searching for
only those terms.
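
The conditional hint described above can be sketched as follows.  The set
of indexed terms and the hint text are illustrative, not Zuul's exact
values; the idea is to add the MySQL index hint only for the unfiltered
listing, and let the planner choose freely once a search term matches an
index.

```python
INDEXED_TERMS = {"uuid", "project", "change", "pipeline"}  # illustrative

def builds_from_clause(filters):
    if set(filters) & INDEXED_TERMS:
        # A selective index exists for the search; let the planner off
        # the leash.
        return "zuul_build"
    # Unfiltered listing: steer MySQL toward the primary-key index so the
    # ORDER BY id DESC LIMIT query doesn't pick a bad plan.
    return "zuul_build USE INDEX (PRIMARY)"
```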

Change-Id: I0277be8cc4ba7555c5e6a9a7eb3eed988a24469c
2019-07-25 08:39:23 -07:00