
Re: [Xen-devel] [Notes for xen summit 2018 design session] Process changes: is the 6 monthly release Cadence too short, Security Process, ...



On 06/07/18 00:47, Sander Eikelenboom wrote:
> On 05/07/18 19:11, Ian Jackson wrote:
>> Sander Eikelenboom writes ("Re: [Xen-devel] [Notes for xen summit 2018 
>> design session] Process changes: is the 6 monthly release Cadence too short, 
>> Security Process, ..."):
>>> Just wondering, are there any timing statistics kept for the OSStest
>>> flights (and separate for building the various components and running
>>> the individual tests ?). Or should they be parse-able from the logs kept ?
>>
>> Yes.  The database has a started and stopped time_t for each test
>> step.  That's where I got the ~15 mins number from.
>>
>> Ian.
>>
> 
> Hi Ian,
> 
> Since the current OSStest emails give a 404 on the link to the logs,
> I dug through the archives and found the right URL:
>     http://logs.test-lab.xenproject.org/osstest/logs/
> 
> I took the liberty of browsing through some of the flights to get a
> grasp of how to interpret the numbers.
> 
> Let's take an example:
> http://logs.test-lab.xenproject.org/osstest/logs/124946/
> Started:      2018-07-03 13:08:06 Z
> Finished:     2018-07-05 06:08:54 Z
> 
> That is quite some time ...
> 
> Now if I take an example job/test, say:
> http://logs.test-lab.xenproject.org/osstest/logs/124940/test-amd64-amd64-xl/info.html
> 
> I see:
> - step 2 hosts-allocate takes 20012 seconds,
>   which, if I interpret it right, indicates a lot of time spent waiting
>   before a slot is actually available to run,
>   so that seems to indicate at least a capacity problem on the
>   infrastructure.
> - step 3 seems to be the elapsed time while syslog recorded all the steps
>   thereafter.
>   It's 2639 seconds, while the remaining steps sum to 2630 seconds, so
>   that seems about right.
> 
>   All the other steps together take 2630 seconds, so the run to wait ratio
>   is about 1/7.6 ....
>   For the remainder let's keep the waiting out of the equation, under the
>   assumption that if we can reduce the rest,
>   we reduce the load on the infrastructure and reduce the waiting time as
>   well.
>  
> - step 4 host-install(4) takes 1005 seconds.
>   It seems step 4 is the step you referred to with the 15 minutes (it's
>   indeed in that range, about 17 minutes) ?
>   That is around 38% of all the steps (excluding the waiting from
>   step 2) !
> 
> - step 10 debian-install, which seems to be the guest install, seems
>   modest at 288 seconds.
> 
> I also browsed some other tests and flights, and at first sight they seem
> to show the same pattern.
> 
> So (sometimes) a lot of time is spent waiting for a slot, followed by
> doing the host install.
> 
> So any improvement in the latter will probably reap a double benefit by
> also reducing the wait time !
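
The arithmetic for test-amd64-amd64-xl can be checked with a small Python
sketch. The durations are the ones quoted above; the constant and function
names are made up for illustration and are not part of OSStest.

```python
# Step durations in seconds, as quoted above for job test-amd64-amd64-xl.
# All names below are hypothetical and exist only for this sketch.
WAIT_SECONDS = 20012          # step 2, hosts-allocate
RUN_SECONDS = 2630            # sum of the steps after syslog-server
HOST_INSTALL_SECONDS = 1005   # step 4, host-install(4)
GUEST_INSTALL_SECONDS = 288   # step 10, debian-install


def share(part, whole):
    """Return part as a percentage of whole, rounded to one decimal."""
    return round(100.0 * part / whole, 1)


def wait_per_run_second(wait, run):
    """Return how many seconds are spent waiting per second of running."""
    return round(wait / run, 1)


if __name__ == "__main__":
    print("host install share of run time (%):",
          share(HOST_INSTALL_SECONDS, RUN_SECONDS))
    print("guest install share of run time (%):",
          share(GUEST_INSTALL_SECONDS, RUN_SECONDS))
    print("seconds waited per second run:",
          wait_per_run_second(WAIT_SECONDS, RUN_SECONDS))
```

With these numbers the host install is about 38.2% of the run time and the
job waits about 7.6 seconds for every second it actually runs.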
> 
> 
> When i look at job/test: 
> http://logs.test-lab.xenproject.org/osstest/logs/124940/test-amd64-amd64-xl-qemuu-win10-i386/info.html
> 
> I see:
> - step 2 hosts-allocate: 47116 seconds.
> - step 3 syslog-server: 8191 seconds.
> - step 4 host-install(4): 789 seconds, somewhat shorter than in the other
>   job/test.
> - step 10 windows-install: 7061 seconds; a failing Windows 10 guest
>   install dwarfs all the other run steps...
> 
> 
> When i look at job/test: 
> http://logs.test-lab.xenproject.org/osstest/logs/124940/test-amd64-amd64-xl-qemuu-win7-amd64/info.html
> 
> I see:
> - step 2 hosts-allocate: 13272 seconds.
> - step 3 syslog-server: 2985 seconds.
> - step 4 host-install(4): 675 seconds, even somewhat shorter than in both
>   the other job/tests.
> - step 10 windows-install: 1029 seconds, that's a lot better than the
>   failing Windows 10 install from the other job.
> 
> So running the Windows install is currently a black box with a timeout of
> 7000 seconds.
> If it fails, the total runtime of the job/test is around 8000 seconds,
> which is a bit over 2 hours !
> 
> Which we do 4 times: 
> - test-amd64-amd64-xl-qemut-win10-i386
> - test-amd64-i386-xl-qemut-win10-i386
> - test-amd64-amd64-xl-qemuu-win10-i386
> - test-amd64-i386-xl-qemuu-win10-i386
> 
> Which all seem to result in a "10. windows-install" -> "fail never pass".
> I sincerely *hope* I'm not interpreting this correctly .. but are we
> wasting roughly 4 * 2 hours = 8 hours in a flight
> on a job/test that has *never ever* passed (and probably never will,
> miracles or a specific bugfix excluded) ?
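
A rough Python sketch of that estimate follows. It assumes, without having
checked the other three jobs, that each of the four win10 jobs takes as long
as the one measured above (8191 seconds elapsed, of which 7061 seconds in
windows-install). The names in the code are hypothetical.

```python
# Durations in seconds from the failing test-amd64-amd64-xl-qemuu-win10-i386
# job quoted above. ASSUMPTION: the other three win10 jobs behave the same.
FAILED_JOB_SECONDS = 8191      # step 3, syslog-server (elapsed run time)
FAILED_INSTALL_SECONDS = 7061  # step 10, windows-install

NEVER_PASSING_JOBS = [
    "test-amd64-amd64-xl-qemut-win10-i386",
    "test-amd64-i386-xl-qemut-win10-i386",
    "test-amd64-amd64-xl-qemuu-win10-i386",
    "test-amd64-i386-xl-qemuu-win10-i386",
]


def flight_cost_hours(per_job_seconds, jobs):
    """Return the summed duration of all jobs in hours, one decimal."""
    return round(per_job_seconds * len(jobs) / 3600.0, 1)


if __name__ == "__main__":
    print("whole jobs (hours):",
          flight_cost_hours(FAILED_JOB_SECONDS, NEVER_PASSING_JOBS))
    print("install step only (hours):",
          flight_cost_hours(FAILED_INSTALL_SECONDS, NEVER_PASSING_JOBS))
```

Under that assumption the four jobs add up to roughly 9.1 hours of host time
per flight, of which about 7.8 hours is the install step running into its
timeout.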

This morning I had another look, and
http://logs.test-lab.xenproject.org/osstest/logs/124940/test-amd64-amd64-xl-qemuu-win10-i386/fiano0_win.guest.osstest-vnc.jpeg
could indicate that Windows 10 has detected no NIC. Perhaps changing the
emulated NIC type from the default Realtek 8139 to an Intel e1000 would be
all it takes to make the test succeed; it seems worth a try. Hopefully a
successful Windows 10 install will take significantly less time than the
2 hours of a failing one.

--
Sander


> 
> Would it be an idea to run tests whose install step is "fail never pass"
> only every once in a while (they can't be blockers anyway ?),
> if at all (only re-enable manually after a fix ?). If my interpretation is
> right, this seems to be quite low-hanging fruit.
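
To make the proposal concrete, here is a minimal Python sketch of such a
policy. It is not how OSStest schedules jobs; the function names, the
"pass"/"fail" status strings and the every_n parameter are all assumptions
made up for this illustration.

```python
def never_passed(history):
    """True if there is at least one recorded result and none is 'pass'.

    history is a hypothetical list of past statuses of the install step,
    for example ["fail", "fail", "fail"].
    """
    return len(history) > 0 and all(status != "pass" for status in history)


def should_run(history, flight_index, every_n=10):
    """Decide whether to schedule a job in the flight with this index.

    Jobs that have passed at least once, or that have no history yet,
    always run. Jobs whose install step never passed run only in every
    every_n-th flight.
    """
    if not never_passed(history):
        return True
    return flight_index % every_n == 0


if __name__ == "__main__":
    win10_history = ["fail", "fail", "fail"]
    for index in (19, 20, 21):
        print(index, should_run(win10_history, index))
```

A manual re-enable after a fix would then amount to clearing the recorded
history for that job, so that it runs again in every flight.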
> 
> --
> Sander
> 
> 



 

