[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [OSSTEST PATCH 14/14] duration_estimator: Move duration query loop into database


  • To: Ian Jackson <Ian.Jackson@xxxxxxxxxx>
  • From: George Dunlap <George.Dunlap@xxxxxxxxxx>
  • Date: Mon, 27 Jul 2020 17:43:54 +0000
  • Accept-language: en-GB, en-US
  • Authentication-results: esa4.hc3370-68.iphmx.com; dkim=none (message not signed) header.i=none
  • Cc: "xen-devel@xxxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxxx>
  • Delivery-date: Mon, 27 Jul 2020 17:44:34 +0000
  • Ironport-sdr: sYkvvr2wlyKV7bIgqh3sIKkYlYj28T/iMdh0Q5O7voTAIyZ/K/YtKKy0ZtwZ6k/C5qDJeoTIBv 9ugxTSuyvIgbgmj1gNLHxnEqGVdWRbc6rbYceqbuZQOfXlYY+4p/0zlunruFSRSeh0IZyznV3c pwwNhNPG4n+JEjJWxZHz5fvIoJ3MaciGuefcylx6wgx4H+L+enFJMug73lDBAXM2jjGNNhZqPO u6H+PGdMVu3emFjvr8MjNkxcpG69mnwwU9jWW6qmXwdo4F57frDAkSC4wwGQ81SOj/+CmlUWpM 7rc=
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>
  • Thread-index: AQHWX5IXwUKXL54pjkKddmBXrxrv3qkbmlIA
  • Thread-topic: [OSSTEST PATCH 14/14] duration_estimator: Move duration query loop into database


> On Jul 21, 2020, at 7:42 PM, Ian Jackson <ian.jackson@xxxxxxxxxxxxx> wrote:
> 
> Stuff the two queries together: we use the firsty query as a WITH
> clause.  This is significantly faster, perhaps because the query
> optimiser does a better job but probably just because it saves on
> round trips.
> 
> No functional change.
> 
> Perf: subjectively this seemed to help when the cache was cold.  Now I
> have a warm cache and it doesn't seem to make much difference.
> 
> Perf: runtime of my test case now ~5-7s.
> 
> Example queries before (from the debugging output):
> 
> Query A part I:
> 
>            SELECT f.flight AS flight,
>                   j.job AS job,
>                   f.started AS started,
>                   j.status AS status
>                     FROM flights f
>                     JOIN jobs j USING (flight)
>                     JOIN runvars r
>                             ON  f.flight=r.flight
>                            AND  r.name=?
>                    WHERE  j.job=r.job

Did these last two get mixed up?  My limited experience w/ JOIN ON and WHERE 
would lead me to expect we’re joining on `f.flight=r.flight and r.job = j.job`, 
and having `r.name = ?` as part of the WHERE clause.  I see it’s the same in 
the combined query as well.

>                      AND  f.blessing=?
>                      AND  f.branch=?
>                      AND  j.job=?
>                      AND  r.val=?
>                      AND  (j.status='pass' OR j.status='fail'
>                           OR j.status='truncated'!)
>                      AND  f.started IS NOT NULL
>                      AND  f.started >= ?
>                 ORDER BY f.started DESC
> 
> With bind variables:
>     "test-amd64-i386-xl-pvshim"
>     "guest-start"
> 
> Query B part I:
> 
>            SELECT f.flight AS flight,
>                   s.job AS job,
>                   NULL as started,
>                   NULL as status,
>                   max(s.finished) AS max_finished
>                      FROM steps s JOIN flights f
>                        ON s.flight=f.flight
>                     WHERE s.job=? AND f.blessing=? AND f.branch=?
>                       AND s.finished IS NOT NULL
>                       AND f.started IS NOT NULL
>                       AND f.started >= ?
>                     GROUP BY f.flight, s.job
>                     ORDER BY max_finished DESC
> 
> With bind variables:
>    "test-armhf-armhf-libvirt"
>    'real'
>    "xen-unstable"
>    1594144469
> 
> Query common part II:
> 
>        WITH tsteps AS
>        (
>            SELECT *
>              FROM steps
>             WHERE flight=? AND job=?
>        )
>        , tsteps2 AS
>        (
>            SELECT *
>              FROM tsteps
>             WHERE finished <=
>                     (SELECT finished
>                        FROM tsteps
>                       WHERE tsteps.testid = ?)
>        )
>        SELECT (
>            SELECT max(finished)-min(started)
>              FROM tsteps2
>          ) - (
>            SELECT sum(finished-started)
>              FROM tsteps2
>             WHERE step = 'ts-hosts-allocate'
>          )
>                AS duration

Er, wait — you were doing a separate `duration` query for each row of the 
previous query?  Yeah, that sounds like it could be a lot of round trips. :-)

> 
> With bind variables from previous query, eg:
>     152045
>     "test-armhf-armhf-libvirt"
>     "guest-start.2"
> 
> After:
> 
> Query A (combined):
> 
>            WITH f AS (
>            SELECT f.flight AS flight,
>                   j.job AS job,
>                   f.started AS started,
>                   j.status AS status
>                     FROM flights f
>                     JOIN jobs j USING (flight)
>                     JOIN runvars r
>                             ON  f.flight=r.flight
>                            AND  r.name=?
>                    WHERE  j.job=r.job
>                      AND  f.blessing=?
>                      AND  f.branch=?
>                      AND  j.job=?
>                      AND  r.val=?
>                      AND  (j.status='pass' OR j.status='fail'
>                           OR j.status='truncated'!)
>                      AND  f.started IS NOT NULL
>                      AND  f.started >= ?
>                 ORDER BY f.started DESC
> 
>            )
>            SELECT flight, max_finished, job, started, status,
>            (
>        WITH tsteps AS
>        (
>            SELECT *
>              FROM steps
>             WHERE flight=f.flight AND job=f.job
>        )
>        , tsteps2 AS
>        (
>            SELECT *
>              FROM tsteps
>             WHERE finished <=
>                     (SELECT finished
>                        FROM tsteps
>                       WHERE tsteps.testid = ?)
>        )
>        SELECT (
>            SELECT max(finished)-min(started)
>              FROM tsteps2
>          ) - (
>            SELECT sum(finished-started)
>              FROM tsteps2
>             WHERE step = 'ts-hosts-allocate'
>          )
>                AS duration
> 
>            ) FROM f

I mean, in both queries (A and B), the transform should basically result in the 
same thing happening, as far as I can tell.

I can try to analyze the duration query and see if I can come up with any 
suggestions, but that would be a different patch anyway.

 -George


 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.