Cross-track upgrade testing for LXD clusters

Adding multi-step cross-track upgrade tests to LXD CI, so a cluster upgrade from an LTS track all the way to the development head is validated before it ships.

canonical/lxd#17174 was a regression where a cluster upgrade from the 5.21 LTS track to latest would fail under certain conditions. The bug was caught late because CI only tested upgrades within a single snap track – the cross-track jump was never exercised. canonical/lxd#17175 tracked the request to close that gap. This post covers the two PRs that close it.

What was missing

The existing tests/cluster script in lxd-ci took an explicit source and destination channel as arguments:

# caller in the LXD workflow
EXTRA_ARGS="3 ${src_track} ${dst_track}"

With SNAP_TRACK=latest, both src_track and dst_track resolved to channels on latest, so the test only ever validated within-track refreshes. An upgrade starting from 5.21/stable – the channel an LTS user is actually on – was never tested.

The upgrade chain

canonical/lxd-ci#758 refactors tests/cluster to infer the full upgrade path from a single target channel. The core is a get_upgrade_chain() function. Given a target like 5.21/edge, it produces an ordered list of channels to upgrade through:

5.21/stable -> 5.21/edge -> 6/edge

The first step keeps the within-track refresh, which is the path most real users follow before crossing tracks: an LTS user on 5.21/stable would first refresh within the LTS before jumping to head. The final step is the cross-track jump to the current head. For a target already on track 6 or latest, only the within-track steps are emitted; an extra hop would be redundant since the two tracks carry the same snaps.
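
A minimal sketch of how such a chain could be derived, assuming the target is always given as track/risk. This is not the code from canonical/lxd-ci#758: HEAD_TRACK and the function body are illustrative, and the behaviour for a */stable target (a single within-track step) is an assumption.

```shell
#!/usr/bin/env bash
# Illustrative sketch only; names and details are assumptions.
HEAD_TRACK="6"   # assumed variable holding the current head track

# Print the upgrade chain for a target channel, one channel per line.
get_upgrade_chain() {
    local target="$1"
    local track="${target%%/*}"   # text before the first "/", e.g. "5.21"
    local risk="${target#*/}"     # text after the first "/", e.g. "edge"

    # latest and the head track carry the same snaps.
    if [ "${track}" = "latest" ]; then
        track="${HEAD_TRACK}"
    fi

    # Within-track refresh: start on stable, then move to the target risk.
    echo "${track}/stable"
    if [ "${risk}" != "stable" ]; then
        echo "${track}/${risk}"
    fi

    # Cross-track jump, only when the target is not already on the head track.
    if [ "${track}" != "${HEAD_TRACK}" ]; then
        echo "${HEAD_TRACK}/edge"
    fi
}
```

Calling get_upgrade_chain 5.21/edge prints 5.21/stable, 5.21/edge and 6/edge on separate lines; get_upgrade_chain latest/edge prints only 6/stable and 6/edge.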

The script validates the input against a KNOWN_TRACKS array and fails the test immediately if an unrecognised track appears, so CI goes red whenever a new LTS track opens and the list needs updating.

KNOWN_TRACKS=("5.0" "5.21" "6")

latest is normalised to 6 internally, so callers can use either alias.
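
The validation and normalisation could look roughly like the following. The helper names normalise_track and validate_track are hypothetical and chosen for illustration; only KNOWN_TRACKS and the latest-to-6 mapping come from the description above.

```shell
#!/usr/bin/env bash
KNOWN_TRACKS=("5.0" "5.21" "6")

# Hypothetical helper: map the "latest" alias onto track 6.
normalise_track() {
    local track="$1"
    if [ "${track}" = "latest" ]; then
        echo "6"
    else
        echo "${track}"
    fi
}

# Hypothetical helper: return 0 for a known track, 1 (with a message on
# stderr) for anything else, so the caller can fail the test early.
validate_track() {
    local track known
    track="$(normalise_track "$1")"
    for known in "${KNOWN_TRACKS[@]}"; do
        if [ "${known}" = "${track}" ]; then
            return 0
        fi
    done
    echo "Unknown track '${track}': update KNOWN_TRACKS" >&2
    return 1
}
```

A caller would typically run something like validate_track "${track}" || exit 1 before building the chain, which is what turns a newly opened track into an immediate failure rather than a silently skipped upgrade path.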

The LXD side

canonical/lxd#18287 drops the track-derivation logic from the workflow and delegates it to the script:

# before
EXTRA_ARGS="${EXTRA_ARGS:-3} ${src_track} ${dst_track}"

# after
EXTRA_ARGS="${EXTRA_ARGS:-3}"

The script also handles a kernel 6.17 workaround: */stable channels have a dqlite/AppArmor regression on that kernel, so get_upgrade_chain() substitutes */candidate as the install channel when it detects 6.17.0. That logic had previously lived in the workflow and is now co-located with the rest of the channel logic.
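
As a rough sketch of that substitution, assuming the kernel release is compared by prefix: pick_install_channel is a hypothetical helper, not a function from lxd-ci, and it accepts the kernel release as an optional second argument purely so it can be exercised without running on 6.17.

```shell
#!/usr/bin/env bash
# Hypothetical helper: choose the channel to install from, swapping
# */stable for */candidate on kernel 6.17.0.
pick_install_channel() {
    local channel="$1"
    local kernel="${2:-$(uname -r)}"   # defaults to the running kernel

    case "${kernel}" in
        6.17.0*)
            # Only */stable channels are swapped; other risks pass through.
            case "${channel}" in
                */stable) channel="${channel%/stable}/candidate" ;;
            esac
            ;;
    esac
    echo "${channel}"
}
```

With a release string such as 6.17.0-5-generic (an example value), pick_install_channel 5.21/stable yields 5.21/candidate, while 5.21/edge is returned unchanged.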