[Wg-test-framework] osstest Massachusetts test lab resource usage investigation
Because I have suspected that we are not getting as much out of the test lab as we could, I have been doing some digging in the database.

I wrote some code to mine the database for historical information about resource usage. (Patch below, which I will push to osstest production at some point.) I ran it on the last 92 days of history (strictly, 92 days starting 7 days ago, because for obscure technical reasons it doesn't want to look right at the most recent data).

The most noteworthy conclusion I have drawn is that about 30% of the capacity is not being used. That is, at any one time, on average, about 30% of the capacity is idle.

Some of this will be inevitable: for example, if we need to shut the system down for maintenance, we will need to ask it to drain its queue so that it is idle by the start of the maintenance window.

It is also possible that some of this is a genuine lack of work to do, in the sense that the programmed test flights are not able to use the capacity. I don't think this is very likely (but I haven't yet established it for sure). Certainly, my personal impression is that the lead time on tests has varied greatly; currently it is quite low because the stable trees haven't had recent updates.

Personal experience and ad-hoc log reading lead me to think that the biggest likely cause of idle test boxes is database contention. That is, there is work which the system in some sense knows it ought to be doing, but it is stuck on the bookkeeping operations involved in setting up a flight, scheduling the jobs, allocating a test box for each such job at the appropriate time, and so on. Depending on locking strategies, these database operations can be blocked by long-running report jobs. It is also possible that the database queries involved in host scheduling need tuning.

In parallel with the work I am reporting in this email, I am already taking steps to investigate the performance of the various database queries, and the situation with locking. I hope that this will produce actionable ideas for improvement.
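For the locking side of that investigation, a query along the following lines is one possible starting point. (This is only a minimal sketch: it assumes the osstest database is PostgreSQL 9.2 or later, and the restriction to relation-level locks and the omission of lock-mode conflict checking are simplifications of mine, not anything already in osstest.)

    -- Show which backends are waiting for a relation-level lock, and
    -- which backend currently holds a lock on the same relation.
    SELECT waiter.pid        AS waiting_pid,
           waiter_act.query  AS waiting_query,
           holder.pid        AS holding_pid,
           holder_act.query  AS holding_query
      FROM pg_locks waiter
      JOIN pg_locks holder
        ON holder.relation = waiter.relation
       AND holder.database = waiter.database
       AND holder.pid     != waiter.pid
       AND holder.granted
      JOIN pg_stat_activity waiter_act ON waiter_act.pid = waiter.pid
      JOIN pg_stat_activity holder_act ON holder_act.pid = holder.pid
     WHERE waiter.locktype = 'relation'
       AND NOT waiter.granted;

Run repeatedly from psql while the lab is busy, something like this would show whether the scheduler's transactions are routinely queueing behind the long-running report queries.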
Additionally, each test box spends about 16% of its time being wiped and reinstalled for each test. This is because the reinstallation time is significant when compared to the duration of each test.

It is necessary to reinstall when a host was previously used by a completely different flight (which might involve a different hypervisor), or if we aren't sure that the previous test left the host in a good state. But most flights contain many similar host installations, and most jobs leave the host in a good state. So there is an opportunity for a worthwhile improvement in throughput by arranging for test host reuse within a flight. osstest already understands sharing for the purpose of builds; making it share hosts for tests, under the right conditions, is a moderately extensive, but straightforward, task.

About 4% of the capacity is used for builds, and a further 5% or so is spent preparing a box to be a build host. We could probably optimise this by providing a few dedicated build hosts, as this would result in fewer reinstalls. We should consider this in our next round of provisioning. (The report also claims that about 3% of the capacity is "unused" when a machine is configured for building and osstest would be willing in principle to run multiple different builds at once, but for scheduling reasons only one build is running.)

Other interesting results: about 3-4% of the time is spent doing bisections.

Looking at the branches: the xen-unstable-smoke flights are about 3%. xen-unstable itself consumes the bulk of the capacity, at about 28%. qemu-mainline (which updates frequently) is the next biggest.

The graphs are generated using python3-matplotlib and shown on my workstation's X display. I haven't arranged for them to be output as PostScript or PNG or anything; instead I have put a screenshot here:
  http://xenbits.xen.org/people/iwj/2016/mg-report-host-usage.2016-08-25.png

I'm afraid that this is likely to be rather opaque unless you look at the source code (in osstest, and in the patch below) to see what the various categories mean. I also haven't bothered prettifying the formatting, beyond what's needed to be able to read the charts.

Thanks all for your attention.

Ian.

From 6895c84d890a2db9f68e7994198783c0e4465181 Mon Sep 17 00:00:00 2001
From: Ian Jackson <ian.jackson@xxxxxxxxxxxxx>
Date: Thu, 25 Aug 2016 15:59:28 +0100
Subject: [OSSTEST PATCH] mg-report-host-usage-*: New admin debugging tools

These two tools can be used to dig into the osstest history and
produce some graphs of what the system is spending its resources on.

Signed-off-by: Ian Jackson <Ian.Jackson@xxxxxxxxxxxxx>
---
 mg-report-host-usage-collect | 443 +++++++++++++++++++++++++++++++++++++++++++
 mg-report-host-usage-tuples  | 124 ++++++++++++
 2 files changed, 567 insertions(+)
 create mode 100755 mg-report-host-usage-collect
 create mode 100755 mg-report-host-usage-tuples

diff --git a/mg-report-host-usage-collect b/mg-report-host-usage-collect
new file mode 100755
index 0000000..ab43c5f
--- /dev/null
+++ b/mg-report-host-usage-collect
@@ -0,0 +1,443 @@
+#!/usr/bin/perl -w
+
+# This is part of "osstest", an automated testing framework for Xen.
+# Copyright (C) 2009-2016 Citrix Inc.
+#
+# This program is free software: you can redistribute it and/or modify
+# it under the terms of the GNU Affero General Public License as published by
+# the Free Software Foundation, either version 3 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU Affero General Public License for more details.
+#
+# You should have received a copy of the GNU Affero General Public License
+# along with this program.  If not, see <http://www.gnu.org/licenses/>.
+
+
+use strict qw(vars);
+
+use DBI;
+use Osstest;
+use IO::Handle;
+use POSIX;
+use JSON;
+use Data::Dumper;
+use Carp;
+
+$SIG{__WARN__} = sub { confess $_[0]; };
+
+my %hh; # host history
+# contains a list of phases
+# a phase is "this host was used in this way during this period"
+# initially phases may overlap in complex ways
+#
+#   $hh{HOST}[]{From} = time_t
+#   $hh{HOST}[]{To}   = time_t
+#   $hh{HOST}[]{KV}{Key} = VALUE
+#
+# Keys are the db results plus (caps are synth ones):
+#
+#   PREP       = boolish: 1 means was host install or h-build-prep
+#   SHAREABLE  = boolish: 1 means was sharable during this period
+#
+# Calculated during consolidate:
+#   SHARES     = 1 or 3 (or 'shared max tasks' host prop)
+#   IDLE       = 1 or undef
+#
+# Combinations of the above
+#   OVERALL    IDL
+#              TEST-PREP   TEST
+#              BUILD-PREP  BUILD   BUILD-IDLE    IDLE-PREPPED
+#                                  ^ idle share  ^ all shares idle
+#
+#   WEIGHT     1.0/SHARES  (or 1.0)
+
+our $duration = 31 * 86400;
+our $max_job_len = 86400 * 7;
+our $progress = 1;
+our $debug = 0;
+
+csreadconfig();
+
+open DEBUG, "> /dev/null" or die $!;
+open PROGR, ">& STDERR" or die $!;
+
+while (@ARGV) {
+    $_ = shift @ARGV;
+    last if m/^--?$/;
+    while (m/^-./) {
+        if (s/^-D/-/) {
+            open DEBUG, ">& STDERR" or die $!;
+            open PROGR, "> /dev/null" or die $!;
+            $debug = 1;
+        } elsif (s/^-q/-/) {
+            open PROGR, "> /dev/null" or die $!;
+        } elsif (s/^-l(\d+)([y|m|w|d])$//) {
+            $duration = $1 * ($2 eq 'y' ? 366 :
+                              $2 eq 'm' ? 31 :
+                              $2 eq 'w' ? 7 :
+                              1) * 86400;
+        } elsif (s/^--max-job-len=(\d+)$//) {
+            $max_job_len = $1;
+        } else {
+            die "$_ ?";
+        }
+    }
+}
+
+PROGR->autoflush(1);
+
+our $cutoff_end = time - $max_job_len;
+our $cutoff_start = $cutoff_end - $duration;
+
+printf STDERR "cutoff start: %s\n", show_abs_time $cutoff_start;
+printf STDERR "cutoff end:   %s\n", show_abs_time $cutoff_end;
+
+our $last_progress = 0;
+
+sub progress1 {
+    print DEBUG "@_\n" if $debug;
+    printf PROGR "%-79.79s\r", "@_";
+    $last_progress = time;
+}
+
+sub progress {
+    return unless time > $last_progress;
+    print DEBUG "@_\n" if $debug;
+    local $debug = 0;
+    progress1 @_;
+}
+
+my $propq = db_prepare(<<END);
+    SELECT val
+      FROM resource_properties
+     WHERE restype='host'
+       AND resname=?
+       AND name=?
+END
+
+sub collectdata () {
+    my $startq = db_prepare(<<END);
+        SELECT *
+          FROM flights
+         WHERE flights.started >= ?
+      ORDER BY flight asc
+         LIMIT 1
+END
+    $startq->execute($cutoff_start - $max_job_len);
+    my ($minflight) = $startq->fetchrow_array();
+    die unless defined $minflight;
+
+    progress1 "minflight $minflight";
+
+    my $jobsq = db_prepare(<<END);
+        SELECT flights.flight flight,
+               flights.blessing blessing,
+               flights.branch branch,
+               flights.intended intended,
+               alloc.finished alloc_finished,
+               jobs.job job
+          FROM flights
+          JOIN jobs USING (flight)
+          JOIN steps alloc USING (flight,job)
+         WHERE flights.flight >= ?
+           AND alloc.step='ts-hosts-allocate'
+           AND alloc.status='pass'
+           AND alloc.finished IS NOT NULL
+END
+
+    my $prepq = db_prepare(<<END);
+        SELECT finished prep_finished,
+               status prep_status
+          FROM steps prep
+         WHERE flight=? and job=?
+           AND prep.finished IS NOT NULL
+           AND (prep.step='ts-host-build-prep'
+                OR prep.step like 'ts-host-install%')
+      ORDER BY stepno DESC
+         LIMIT 1
+END
+
+    my $hostsq = db_prepare(<<END);
+        SELECT val, synth
+          FROM runvars
+         WHERE flight=? and job=?
+           AND (name like '%_host' or name='host')
+END
+
+    my $finishq = db_prepare(<<END);
+        SELECT max(finished) AS finished
+          FROM steps
+         WHERE flight=? and job=?
+END + + progress1 "minflight $minflight executing..."; + $jobsq->execute($minflight); + + progress1 "minflight $minflight fetching..."; + $last_progress = 0; # forces next progress message to be printed + + while (my $jobrow = $jobsq->fetchrow_hashref()) { + my @fj = ($jobrow->{flight}, $jobrow->{job}); + progress "PROCESSING @fj"; + my @kvs = (%$jobrow); + + $finishq->execute(@fj); + my ($finish) = $finishq->fetchrow_array(); + die "@fj ?" unless defined $finish; + + $hostsq->execute(@fj); + my $hosts = $hostsq->fetchall_arrayref({ }); + + my $record_phase = sub { + my ($start,$finish,@xkvs) = @_; + $finish = $start if $finish < $start; # wtf + foreach my $hostrow (@$hosts) { + my $hn = $hostrow->{val}; + print DEBUG " phase $hn $start..$finish @kvs @xkvs\n"; + push @{ $hh{$hn} }, { + From => $start, + To => $finish, + KV => { @kvs, @xkvs, host_fixed => !$hostrow->{synth} }, + }; + } + }; + + $prepq->execute(@fj); + my $preprow = $prepq->fetchrow_hashref(); + if (defined $preprow) { + push @kvs, %$preprow; + if ((defined $preprow->{prep_status}) && + $preprow->{prep_status} eq 'pass') { + $record_phase->($jobrow->{alloc_finished}, + $preprow->{prep_finished}, + 'PREP' => 1, + 'SHAREABLE' => 0); + $record_phase->($preprow->{prep_finished}, + $finish, + 'PREP' => 0, + 'SHAREABLE' => 1); + } else { + # there was a prep attempt but it didn't pass + $record_phase->($jobrow->{alloc_finished}, + $finish, + 'PREP' => 1, + 'SHAREABLE' => 0); + } + } else { + # something strange, well, fine + $record_phase->($jobrow->{alloc_finished}, + $finish, + PREP => 0, + SHAREABLE => 0); + } + } +} + +sub consolidate_host ($) { + my ($hostname) = @_; + + progress1 "host $hostname"; + + $propq->execute($hostname, 'shared max tasks'); + my ($max_shares) = $propq->fetchrow_array(); + $max_shares ||= 3; + + print DEBUG "host $hostname max_shares=$max_shares\n"; + + print DEBUG "host $hostname sorting\n"; + + my @sorted = sort { $a->{From} <=> $b->{From} } @{ $hh{$hostname} }; + my @out; + $hh{$hostname} = \@out; + + print DEBUG "host $hostname chopping\n"; + + while (@sorted) { + my $subphase_start = $sorted[0]{From}; + my $subphase_end = $sorted[0]{To}; + + # We compute the next subphase period, and how many of the + # incoming phases overlap with it. + # + # There are one or more phases which start at $subphase_start. + # It is exactly these phases which occur in the subphase. + # + # The subphase ends when either: one of those subphases ends, + # or a subsequent subphase begins + + # We figure out when the next subphase + + my $eat_phases = 0; + my $next_event = $subphase_end; + + for (my $i=0; $i<@sorted; $i++) { + my $cand_phase = $sorted[$i]; + + last if $cand_phase->{From} > $subphase_end; + # ... we are well beyond the subphase then + + if ($cand_phase->{From} == $subphase_start) { + $eat_phases++; + $next_event = $cand_phase->{To}; + } else { + $cand_phase->{From} > $subphase_start or die; + $next_event = $cand_phase->{From}; + } + $subphase_end = $next_event if $next_event < $subphase_end; + } + + progress sprintf "subphase %-20s %15d %15d %3d", + $hostname, $subphase_start, $subphase_end, $eat_phases; + + my @secondhalves; + my @subphases; + for (my $i=0; $i<$eat_phases; $i++) { + my $eating = shift @sorted; + die unless $eating->{From} == $subphase_start; + die if $eating->{To} < $subphase_end; + if ($eating->{To} > $subphase_end) { + # Each of the incoming phases which start at the + # subphase period and finish after its end is broken + # up into two new phases: one which goes into the + # output. 
+                push @secondhalves, {
+                    From => $subphase_end,
+                    To => $eating->{To},
+                    KV => { %{ $eating->{KV} } },
+                };
+                $eating->{To} = $subphase_end;
+            }
+            push @subphases, $eating;
+        }
+        unshift @sorted, @secondhalves;
+
+        # See if we are sharing, and if so calculate the sharing
+        # fraction and also perhaps invent an idle share.  We do this
+        # here so that we are editing in @out.  Editing in @sorted
+        # would be wrong because it would affect other subphases.
+
+        my $all_shares = 1;
+        my $max_shares_now = 1;
+        foreach my $sp (@subphases) {
+            $all_shares &&= $sp->{KV}{SHAREABLE};
+            if (defined $sp->{KV}{job} &&
+                $sp->{KV}{job} =~ m/^build/) {
+                $max_shares_now = $max_shares;
+            }
+        }
+        my @sp0kvs = %{ $subphases[0]{KV} };
+
+        if ($all_shares) {
+            if (@subphases > $max_shares_now) {
+                progress1 '';
+                print STDERR "warning: $hostname".
+                    " $subphase_start..$subphase_end:".
+                    " msn=$max_shares_now != ".(scalar @subphases)."\n";
+                $max_shares_now = @subphases;
+            }
+
+            foreach my $sp (@subphases) {
+                $sp->{KV}{SHARES} = $max_shares_now;
+            }
+
+            while (@subphases < $max_shares_now) {
+                push @subphases, {
+                    From => $subphase_start,
+                    To => $subphase_end,
+                    KV => { @sp0kvs, SHARES => $max_shares_now, IDLE => 1 },
+                };
+            }
+        }
+
+        my $next_phase = $sorted[0];
+        if ($next_phase &&
+            $next_phase->{From} != $subphase_end) {
+            # Need to add an idle phase, to bridge to next incoming
+            # phase.
+            die unless $next_phase->{From} > $subphase_end;
+            push @subphases, {
+                From => $subphase_end,
+                To => $next_phase->{From},
+                KV => $all_shares ? { @sp0kvs, IDLE => 1 } : { IDLE => 1 },
+            };
+        }
+
+        foreach my $sp (@subphases) {
+            my $kv = $sp->{KV};
+            my $jk = (
+                !defined $kv->{job} ? '' :
+                $kv->{job} =~ m/^build/ ? 'BUILD' :
+                'TEST'
+            );
+            $kv->{OVERALL} =
+                ( $kv->{IDLE} ? ( $kv->{SHARES} ? "IDLE-SHARE-$jk" :
+                                  $kv->{SHAREABLE} ? "IDLE-PREPPED-$jk" :
+                                  "IDLE" ) :
+                  $kv->{PREP} ? "PREP-$jk" :
+                  "$jk"
+                );
+
+            $kv->{WEIGHT} =
+                1.0 / ( $kv->{SHARES} // 1.0 );
+        }
+
+        foreach my $sp (@subphases) {
+            my @kvs = map { "$_=$sp->{KV}{$_}" } sort keys %{ $sp->{KV} };
+            my $t = $sp->{To} - $sp->{From};
+            printf DEBUG "  SP %s %5d %s..%s %skvs\n",
+                $hostname, $t, $sp->{From}, $sp->{To}, "@kvs";
+            $sp->{From} = $cutoff_start if $sp->{From} < $cutoff_start;
+            $sp->{To} = $cutoff_end if $sp->{To} > $cutoff_end;
+            next unless $sp->{From} < $sp->{To};
+            push @out, $sp;
+        }
+    }
+    progress1 '';
+}
+
+sub output_host ($) {
+    my ($hn) = @_;
+
+    my $hostflagsq = db_prepare(<<END);
+        SELECT hostflag f
+          FROM hostflags
+         WHERE hostname = ?
+END
+
+    my %hinfo;
+    $hostflagsq->execute($hn);
+    while (my ($f) = $hostflagsq->fetchrow_array()) {
+        if ($f =~ m/^blessed-real$/) {
+            $hinfo{'HOST_BLESSED_REAL'} = 1;
+        } elsif ($f =~ m/^arch-(?:amd64|i386)$/) {
+            $hinfo{'HOST_ARCH_GROUP'} = 'x86';
+        } elsif ($f =~ m/^arch-arm/) {
+            $hinfo{'HOST_ARCH_GROUP'} = 'arm';
+        }
+    }
+
+    my $hd = $hh{$hn};
+    foreach my $e (@$hd) {
+        my $duration = $e->{To} - $e->{From};
+        my $obj = { %{ $e->{KV} }, %hinfo };
+        delete $obj->{$_} foreach qw(flight alloc_finished prep_finished);
+        $obj->{HOSTNAME} = $hn;
+        $obj->{DURATION} = $duration;
+        $obj->{WDURATION} = $obj->{WEIGHT} * $obj->{DURATION};
+        print encode_json($obj),"\n" or die $!;
+    }
+}
+
+collectdata();
+
+foreach my $h (sort keys %hh) {
+#    print DEBUG "\n-------------------- $h --------------------\n";
+#    print DEBUG "\n",Dumper($hh{$h});
+    consolidate_host($h);
+#    print DEBUG "\n",Dumper($hh{$h});
+    output_host($h);
+}
diff --git a/mg-report-host-usage-tuples b/mg-report-host-usage-tuples
new file mode 100755
index 0000000..375da88
--- /dev/null
+++ b/mg-report-host-usage-tuples
@@ -0,0 +1,124 @@
+#!/usr/bin/python3
+
+import re as regexp
+import sys
+import json
+from collections import defaultdict
+from math import floor, sqrt, ceil
+
+import numpy as np
+import matplotlib.pyplot as plt
+
+kv = defaultdict(lambda: defaultdict(int))
+
+undefstr = '(undef)'
+
+def readin():
+    f = open(sys.argv[1])
+
+    key_bad_re = regexp.compile(r'\W')
+
+    for l in f:
+        obj = json.loads(l)
+        w = obj['DURATION']
+        for k, v in obj.items():
+            assert(not key_bad_re.match(k))
+            assert(len(k))
+            assert(not v==undefstr)
+            if k=='DURATION' or k=='WDURATION': continue
+            kv[k][v] += w
+
+def truncto(totals, maxvlen):
+    ntotals = defaultdict(int)
+    for v, w in totals.items():
+        v = v[0:maxvlen]
+        ntotals[v] += w
+    return ntotals
+
+def makegraph(k, totals):
+    print(k)
+
+    maxvlen = 25
+    maxitems = 30
+
+    while True:
+        totals = truncto(totals, maxvlen)
+
+        if maxvlen <= 1: break
+        if len(totals) <= maxitems: break
+
+        print("  reducing", maxvlen, len(totals))
+        maxvlen -= 1
+
+    vals = totals.keys()
+    vals = sorted(vals)
+    vals.reverse()
+
+    nvals = len(vals)
+    indices = np.arange(nvals)
+    totals = [totals[vals[ix]] for ix in indices]
+
+    total = sum(totals)
+    if total:
+        totals = [x * 100 / total for x in totals]
+
+    #print(str([vals,indices,totals]))
+
+    global subplotnum
+    ax = plt.subplot(subplots_nrows, subplots_ncols, subplotnum)
+    subplotnum += 1
+
+    plt.yticks(indices + 0.35, vals)
+    plt.barh(indices, totals)
+    ax.set_title(k)
+
+akvw = defaultdict(lambda: defaultdict(lambda: defaultdict(float)))
+
+def recordin(obj, w, kvw):
+    for k in kdo:
+        v = obj.get(k,undefstr)
+        kvw[k][str(v)] += w
+
+def readin2():
+    global kdo
+
+    kdo = sorted(kv.keys())
+    #kdo = [ 'branch', 'OVERALL','job' ]
+
+    f = open(sys.argv[1])
+    for l in f:
+        obj = json.loads(l)
+        w = obj['WDURATION']
+        recordin(obj, w, akvw['*'])
+        ag = obj.get('HOST_ARCH_GROUP',undefstr)
+        recordin(obj, w, akvw[ag])
+
+fignum = 0
+
+def basicgraphs(ag, kvw):
+    global fignum
+    global subplotnum
+    global subplots_nrows
+    global subplots_ncols
+
+    fignum = fignum + 1
+    subplotnum = 1
+
+    fig = plt.figure(fignum)
+    titlemsg = ag
+    plt.suptitle(titlemsg)
+    fig.canvas.set_window_title("mg-report-host-usage-tuples " + ag)
+
+    subplots_nrows = floor(sqrt(len(kdo)))
+    subplots_ncols = ceil(len(kdo) / subplots_nrows)
+    for k in kdo:
+        makegraph(k, kvw[k])
+
+def graphs():
+    for ag in sorted(akvw.keys()):
+        basicgraphs(ag, akvw[ag])
+    plt.show()
+
+readin()
+readin2()
+graphs()
-- 
2.1.4

_______________________________________________
Wg-test-framework mailing list
Wg-test-framework@xxxxxxxxxxxxxxxxxxxx
https://lists.xenproject.org/cgi-bin/mailman/listinfo/wg-test-framework