Re: [PATCH 2/2] automation: add a smoke test for xen.efi on X86
On Wed, Oct 02, 2024 at 03:22:59PM -0700, Stefano Stabellini wrote:
> I forgot to reply to one important part below
>
>
> On Wed, 2 Oct 2024, Stefano Stabellini wrote:
> > On Wed, 2 Oct 2024, Marek Marczykowski-Górecki wrote:
> > > Check if xen.efi is bootable with an XTF dom0.
> > >
> > > The TEST_TIMEOUT is set in the script to override project-global value.
> > > Setting it in the gitlab yaml file doesn't work, as it's too low
> > > priority
> > > (https://docs.gitlab.com/ee/ci/variables/#cicd-variable-precedence).
> > >
> > > The multiboot2+EFI path is tested on hardware tests already.
> > >
> > > Signed-off-by: Marek Marczykowski-Górecki <marmarek@xxxxxxxxxxxxxxxxxxxxxx>
> > > ---
> > > This requires rebuilding debian:bookworm container.
> > >
> > > The TEST_TIMEOUT issue mentioned above applies to xilinx-* jobs too. It's
> > > not clear to me why the default TEST_TIMEOUT is set at the group level
> > > instead of in the yaml file, so I'm not adjusting the other places.
> >
> > Let me premise that now that we use "expect" all successful tests will
> > terminate as soon as the success condition is met, without waiting for
> > the test timeout to expire.
> >
> > There is a CI/CD variable called TEST_TIMEOUT set at the
> > gitlab.com/xen-project level. (There is also a check in console.exp in
> > case TEST_TIMEOUT is not set so that we don't run into problems in case
> > the CI/CD variable is removed accidentally.) The global TEST_TIMEOUT is
> > meant to be a high value to account for slow QEMU tests running
> > potentially on our slowest cloud runners.
> >
> > However, for hardware-based tests such as the xilinx-* jobs, we know
> > that the timeout is supposed to be less than that. The test is running
> > on real hardware which is considerably faster than QEMU running on our
> > slowest runners. Basically, the timeout depends on the runner more than
> > the test.
> > So we override the TEST_TIMEOUT variable for the xilinx-* jobs
> > providing a lower timeout value.
> >
> > The global TEST_TIMEOUT is set to 1500.
> > The xilinx-* timeout is set to 120 for ARM and 1000 for x86.
> >
> > You are welcome to override the TEST_TIMEOUT value for the
> > hardware-based QubesOS tests. At the same time, given that on success
> > the timeout is not really used, it is also OK to leave it like this.
> >
> > > ---
> > >  automation/build/debian/bookworm.dockerfile |  1 +
> > >  automation/gitlab-ci/test.yaml              |  7 ++++
> > >  automation/scripts/qemu-smoke-x86-64-efi.sh | 44 +++++++++++++++++++++
> > >  3 files changed, 52 insertions(+)
> > >  create mode 100755 automation/scripts/qemu-smoke-x86-64-efi.sh
> > >
> > > diff --git a/automation/build/debian/bookworm.dockerfile b/automation/build/debian/bookworm.dockerfile
> > > index 3dd70cb6b2e3..061114ba522d 100644
> > > --- a/automation/build/debian/bookworm.dockerfile
> > > +++ b/automation/build/debian/bookworm.dockerfile
> > > @@ -46,6 +46,7 @@ RUN apt-get update && \
> > >          # for test phase, qemu-smoke-* jobs
> > >          qemu-system-x86 \
> > >          expect \
> > > +        ovmf \
> > >          # for test phase, qemu-alpine-* jobs
> > >          cpio \
> > >          busybox-static \
> > > diff --git a/automation/gitlab-ci/test.yaml b/automation/gitlab-ci/test.yaml
> > > index 8675016b6a37..74fd3f3109ae 100644
> > > --- a/automation/gitlab-ci/test.yaml
> > > +++ b/automation/gitlab-ci/test.yaml
> > > @@ -463,6 +463,13 @@ qemu-smoke-x86-64-clang-pvh:
> > >    needs:
> > >      - debian-bookworm-clang-debug
> > >
> > > +qemu-smoke-x86-64-gcc-efi:
> > > +  extends: .qemu-x86-64
> > > +  script:
> > > +    - ./automation/scripts/qemu-smoke-x86-64-efi.sh pv 2>&1 | tee ${LOGFILE}
> > > +  needs:
> > > +    - debian-bookworm-gcc-debug
> >
> > Given that the script you wrote (thank you!) can also handle pvh, can we
> > directly add a pvh job to test.yaml too?

I guess we can, but is xen.efi + PVH dom0 actually different enough to be
worth testing, given we already test MB2+EFI + PVH dom0?

> > >  qemu-smoke-riscv64-gcc:
> > >    extends: .qemu-riscv64
> > >    script:
> > > diff --git a/automation/scripts/qemu-smoke-x86-64-efi.sh b/automation/scripts/qemu-smoke-x86-64-efi.sh
> > > new file mode 100755
> > > index 000000000000..e053cfa995ba
> > > --- /dev/null
> > > +++ b/automation/scripts/qemu-smoke-x86-64-efi.sh
> > > @@ -0,0 +1,44 @@
> > > +#!/bin/bash
> > > +
> > > +set -ex -o pipefail
> > > +
> > > +# variant should be either pv or pvh
> > > +variant=$1
> > > +
> > > +# Clone and build XTF
> > > +git clone https://xenbits.xen.org/git-http/xtf.git
> > > +cd xtf && make -j$(nproc) && cd -
> > > +
> > > +case $variant in
> > > +    pvh) k=test-hvm64-example extra="dom0-iommu=none dom0=pvh" ;;
> > > +    *)   k=test-pv64-example  extra= ;;
> > > +esac
> > > +
> > > +mkdir -p boot-esp/EFI/BOOT
> > > +cp binaries/xen.efi boot-esp/EFI/BOOT/BOOTX64.EFI
> > > +cp xtf/tests/example/$k boot-esp/EFI/BOOT/kernel
> > > +
> > > +cat > boot-esp/EFI/BOOT/BOOTX64.cfg <<EOF
> > > +[global]
> > > +default=test
> > > +
> > > +[test]
> > > +options=loglvl=all console=com1 noreboot console_timestamps=boot $extra
> > > +kernel=kernel
> > > +EOF
> > > +
> > > +cp /usr/share/OVMF/OVMF_CODE.fd OVMF_CODE.fd
> > > +cp /usr/share/OVMF/OVMF_VARS.fd OVMF_VARS.fd
> > > +
> > > +rm -f smoke.serial
> > > +export TEST_CMD="qemu-system-x86_64 -nographic -M q35,kernel-irqchip=split \
> > > +        -drive if=pflash,format=raw,readonly=on,file=OVMF_CODE.fd \
> > > +        -drive if=pflash,format=raw,file=OVMF_VARS.fd \
> > > +        -drive file=fat:rw:boot-esp,media=disk,index=0,format=raw \
> > > +        -m 512 -monitor none -serial stdio"
> > > +
> > > +export TEST_LOG="smoke.serial"
> > > +export PASSED="Test result: SUCCESS"
> > > +export TEST_TIMEOUT=120
>
> Although this works, I would prefer keeping the TEST_TIMEOUT overrides
> in test.yaml for consistency.

The problem is this doesn't work. The group-level variable overrides the
one in yaml. See the commit message and the link there...

> However, it might be better not to
> override it (or to override to a higher timeout value), as successful
> tests will terminate immediately anyway. We need to be cautious about
> setting TEST_TIMEOUT values too low, as using a slow runner (like a
> small, busy cloud instance) can lead to false positive failures. This
> issue occurred frequently with ARM tests when we temporarily moved from
> a fast ARM server to slower ARM cloud instances a couple of months ago.
>
> On the other hand, adjusting TEST_TIMEOUT for non-QEMU hardware-based
> tests is acceptable since those tests rely on real hardware
> availability, which is unlikely to become suddenly slower.

-- 
Best Regards,
Marek Marczykowski-Górecki
Invisible Things Lab

Attachment:
signature.asc
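
For readers following the TEST_TIMEOUT discussion, here is a minimal shell sketch of the resolution order the thread describes. It is an illustration only and not part of the patch: `pick_timeout` is a hypothetical helper, and the fallback is passed in as an argument because the thread does not say which default console.exp uses. The assumption, taken from the commit message, is that a value exported inside the test script is applied after GitLab has populated the environment, so it wins over the group-level variable, whereas a job-level entry in test.yaml does not.

```shell
#!/bin/bash
# Sketch only: pick_timeout is a hypothetical helper, not part of the
# Xen automation scripts.
#
# $1: value inherited from the CI environment (for example the
#     group-level TEST_TIMEOUT); empty if the variable is missing.
# $2: value hard-coded by the test script; empty if the script does
#     not want to override anything.
# $3: last-resort fallback (placeholder; the real default used by
#     console.exp is not stated in the thread).
pick_timeout() {
    local from_ci="$1" from_script="$2" fallback="$3"
    if [ -n "$from_script" ]; then
        # An export inside the script always takes effect.
        echo "$from_script"
    elif [ -n "$from_ci" ]; then
        # Otherwise use whatever the CI environment provided.
        echo "$from_ci"
    else
        # Neither is set: fall back so the test still has a limit.
        echo "$fallback"
    fi
}

# Group-level 1500 in the environment, script asks for 120.
pick_timeout 1500 120 1500
# Script does not override, so the group-level value is used.
pick_timeout 1500 "" 1500
```

With the values mentioned in the thread, the first call prints 120 and the second prints 1500. Since "expect" ends a passing test as soon as the success string appears, the chosen value only bounds how long a failing or hung run can take.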