Xen project Mailing List

Re: [PATCH v2 3/5] automation: Add the expect script with test case for FVP

From: Stefano Stabellini <sstabellini@xxxxxxxxxx>

Date: Fri, 8 Dec 2023 13:30:05 -0800 (PST)

Cc: Michal Orzel <michal.orzel@xxxxxxx>, Henry Wang <Henry.Wang@xxxxxxx>, Xen-devel <xen-devel@xxxxxxxxxxxxxxxxxxxx>, Doug Goldstein <cardoe@xxxxxxxxxx>, Stefano Stabellini <sstabellini@xxxxxxxxxx>, Bertrand Marquis <Bertrand.Marquis@xxxxxxx>, Wei Chen <Wei.Chen@xxxxxxx>

Delivery-date: Fri, 08 Dec 2023 21:30:19 +0000

List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

On Fri, 8 Dec 2023, Julien Grall wrote:
> On 08/12/2023 09:50, Michal Orzel wrote:
> > On 08/12/2023 10:21, Henry Wang wrote:
> > > > On Dec 8, 2023, at 17:11, Michal Orzel <michal.orzel@xxxxxxx> wrote:
> > > > On 08/12/2023 10:05, Henry Wang wrote:
> > > > >
> > > > > Hi Michal,
> > > > >
> > > > > > On Dec 8, 2023, at 16:57, Michal Orzel <michal.orzel@xxxxxxx> wrote:
> > > > > >
> > > > > > Hi Henry,
> > > > > >
> > > > > > On 08/12/2023 06:46, Henry Wang wrote:
> > > > > > > diff --git
> > > > > > > a/automation/scripts/expect/fvp-base-smoke-dom0-arm64.exp
> > > > > > > b/automation/scripts/expect/fvp-base-smoke-dom0-arm64.exp
> > > > > > > new file mode 100755
> > > > > > > index 0000000000..25d9a5f81c
> > > > > > > --- /dev/null
> > > > > > > +++ b/automation/scripts/expect/fvp-base-smoke-dom0-arm64.exp
> > > > > > > @@ -0,0 +1,73 @@
> > > > > > > +#!/usr/bin/expect
> > > > > > > +
> > > > > > > +set timeout 2000
> > > > > > Do we really need such a big timeout (~30 min)?
> > > > > > Looking at your test job, it took 16 mins (quite a lot but I know
> > > > > > FVP is slow
> > > > > > + send_slow slows things down)
> > > > >
> > > > > This is a really good question. I did have the same question while
> > > > > working on
> > > > > the negative test today. The timeout 2000 indeed will fail the job at
> > > > > about 30min,
> > > > > and waiting for it is indeed not really pleasant.
> > > > >
> > > > > But my second thought would be - from my observation, the overall time
> > > > > now
> > > > > would vary between 15min ~ 20min, and having a 10min margin is not
> > > > > that crazy
> > > > > given that we probably will do more testing from the job in the
> > > > > future, and if the
> > > > > GitLab Arm worker is high loaded, FVP will probably become slower. And
> > > > > normally
> > > > > we don’t even trigger the timeout as the job will normally pass. So I
> > > > > decided
> > > > > to keep this.
> > > > >
> > > > > Mind sharing your thoughts about the better value of the timeout?
> > > > > Probably 25min?
> > > > From what you said that the average is 15-20, I think we can leave it
> > > > set to 30.
> > > > But I wonder if we can do something to decrease the average time. ~20
> > > > min is a lot
> > > > even for FVP :) Have you tried setting send_slow to something lower than
> > > > 100ms?
> > > > That said, we don't send too many chars to FVP, so I doubt it would play
> > > > a major role
> > > > in the overall time.
> > >
> > > I agree with the send_slow part. Actually I do have the same concern, here
> > > are my current
> > > understanding and I think you will definitely help with your knowledge:
> > > If you check the full log of Dom0 booting, for example [1], you will find
> > > that we wasted so
> > > much time in starting the services of the OS (modloop, udev-settle, etc).
> > > All of these services
> > > are retried many times but in the end they are still not up, and from my
> > > understanding they
> > > won’t affect the actual test(?) If we can somehow get rid of these
> > > services from rootfs, I think
> > > we can save a lot of time.
> > >
> > > And honestly, I noticed that qemu-alpine-arm64-gcc suffers from the same
> > > problem and it also
> > > takes around 15min to finish. So if we managed to tailor the services from
> > > the filesystem, we
> > > can save a lot of time.
> > That is not true. Qemu runs the tests relatively fast within few minutes.
> > The reason you see e.g. 12 mins
> > for some Qemu jobs comes from the timeout we set in Qemu scripts. We don't
> > have yet the solution (we could
> > do the same as Qubes script) to detect the test success early and exit
> > before timeout. That is why currently
> > the only way for Qemu tests to finish is by reaching the timeout.
> >
> > So the problem is not with the rootfs and services (the improvement would
> > not be significant) but with
> > the simulation being slow.
>
> From my experience with the FVP improvement would be significant. A normal
> boot distribution will start a lot of services. I end up to write my own
> initscript doing the bare minimum for creating a guest. This saves me a lot of
> time everytime I needed to test on FVP.
>
> I think we can do the same for the gitlab. Maybe not to the point of writing
> your initscript but cutting down anything unnecessary.
>
> This will avoid the FVP test to become the bottleneck in the gitlab CI.

Along the same lines, another idea would be to use busybox alone (no
Alpine Linux) as the Dom0 rootfs. That's going to be faster, but you cannot
really use xl to create DomUs due to libraries and other dependencies.
You can, however, create additional guests using Dom0less; see for
instance automation/scripts/qemu-smoke-dom0less-arm64.sh.

So if you have trouble improving the boot times of Dom0 + xl create, an
alternative would be to create two Linux dom0less DomUs, both of them
with only busybox as ramdisk.
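
For illustration only, a busybox-only ramdisk of the kind suggested above
could be laid out roughly as follows. This is a sketch under stated
assumptions, not the contents of any existing automation script: the
ROOTFS and BUSYBOX variables, the "ramdisk ready" message and the output
file name are all hypothetical, and the init script assumes a guest
kernel built with devtmpfs support.

```shell
#!/bin/sh
# Sketch: build a minimal busybox-only ramdisk tree with a bare /init.
# ROOTFS, BUSYBOX and OUT are illustrative names, not taken from the Xen tree.
set -eu

ROOTFS="${ROOTFS:-$(mktemp -d)}"   # staging directory (use an absolute path)
BUSYBOX="${BUSYBOX:-}"             # optional path to a static busybox binary
OUT="${OUT:-$ROOTFS.cpio.gz}"      # resulting ramdisk image

mkdir -p "$ROOTFS/bin" "$ROOTFS/proc" "$ROOTFS/sys" "$ROOTFS/dev"

# Install busybox and the applet links that /init needs, if a binary was given.
if [ -n "$BUSYBOX" ] && [ -x "$BUSYBOX" ]; then
    cp "$BUSYBOX" "$ROOTFS/bin/busybox"
    for applet in sh mount echo; do
        ln -sf busybox "$ROOTFS/bin/$applet"
    done
fi

# Minimal init: mount the pseudo filesystems, print a marker, start a shell.
# No service manager runs, so nothing is retried at boot.
cat > "$ROOTFS/init" <<'EOF'
#!/bin/sh
mount -t proc proc /proc
mount -t sysfs sysfs /sys
mount -t devtmpfs devtmpfs /dev
echo "ramdisk ready"
exec /bin/sh
EOF
chmod +x "$ROOTFS/init"

# Pack the tree as a gzip-compressed newc cpio archive when cpio is available.
if command -v cpio >/dev/null 2>&1; then
    (cd "$ROOTFS" && find . | cpio -o -H newc 2>/dev/null | gzip > "$OUT")
fi

echo "$ROOTFS"
```

The marker line printed by /init gives a test harness a single string to
wait for on the console, and the same image could serve as the ramdisk
for both dom0less DomUs.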
