
[xen-unstable bisection] complete test-xtf-amd64-amd64-1



branch xen-unstable
xenbranch xen-unstable
job test-xtf-amd64-amd64-1
testid xtf/test-pv64-selftest

Tree: linux git://xenbits.xen.org/linux-pvops.git
Tree: linuxfirmware git://xenbits.xen.org/osstest/linux-firmware.git
Tree: qemu git://xenbits.xen.org/qemu-xen-traditional.git
Tree: qemuu git://xenbits.xen.org/qemu-xen.git
Tree: xen git://xenbits.xen.org/xen.git
Tree: xtf git://xenbits.xen.org/xtf.git

*** Found and reproduced problem changeset ***

  Bug is in tree:  xen git://xenbits.xen.org/xen.git
  Bug introduced:  ad0fd291c5e79191c2e3c70e43dded569f11a450
  Bug not present: a5eaac9245f4f382a3cd0e9710e9d1cba7db20e4
  Last fail repro: http://logs.test-lab.xenproject.org/osstest/logs/153986/


  commit ad0fd291c5e79191c2e3c70e43dded569f11a450
  Author: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>
  Date:   Tue Aug 11 16:05:06 2020 +0100
  
      x86/pv: Rewrite segment context switching from scratch
      
      There are multiple bugs with the existing implementation.
      
      On AMD CPUs prior to Zen2, loading a NUL segment selector doesn't clear the
      segment base, which is a problem for 64bit code which typically expects to use
      a NUL %fs/%gs selector.
      
      On a context switch from any PV vcpu, to a 64bit PV vcpu with an %fs/%gs
      selector which faults, the fixup logic loads NUL, and the guest is entered at
      the failsafe callback with the stale base.
      
      Alternatively, a PV context switch sequence of 64 (NUL, non-zero base) =>
      32 (NUL) => 64 (NUL, zero base) will similarly cause Xen to enter the guest
      with a stale base.
      
      Both of these corner cases manifest as state corruption in the final vcpu.
      However, damage is limited to 64bit code expecting to use Thread Local
      Storage with a base pointer of 0, which doesn't occur by default.
      
      The context switch logic is extremely complicated, and is attempting to
      optimise away loading a NUL selector (which is fast), or writing a 64bit base
      of 0 (which is rare).  Furthermore, it fails to respect Linux's ABI with
      userspace, which manifests as userspace state corruption as far as Linux is
      concerned.
      
      Always restore all selector and base state, in all cases.
      
      Leave a large comment explaining hardware behaviour, and the new ABI
      expectations.  Update the comments in the public headers.
      
      Drop all "segment preloading" to handle the AMD corner case.  It was never
      anything but a waste of time for %ds/%es, and isn't needed now that %fs/%gs
      bases are unconditionally written for 64bit PV guests.  In load_segments(),
      store the result of is_pv_32bit_vcpu() as it is an expensive predicate now,
      and not used in a way which impacts speculative safety.
      
      Reported-by: Andy Lutomirski <luto@xxxxxxxxxx>
      Reported-by: Sarah Newman <srn@xxxxxxxxx>
      Signed-off-by: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>
      Reviewed-by: Jan Beulich <jbeulich@xxxxxxxx>
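
The stale-base scenario described in the commit message can be illustrated
with a toy model. This is only a sketch under loud assumptions: the classes
and functions below (Cpu, Vcpu, old_style_switch, new_style_switch) are
hypothetical and are not Xen code, only %fs is modelled, and the "old" logic
is reduced to a single assumed shortcut (skip writing a base of 0). Xen's
real previous logic, including its segment preloading, was more involved.

```python
from dataclasses import dataclass

NUL = 0  # NUL segment selector


@dataclass
class Vcpu:
    """Saved guest state for one vcpu (hypothetical, %fs only)."""
    is_64bit: bool
    fs_sel: int = NUL
    fs_base: int = 0


@dataclass
class Cpu:
    """Toy CPU. nul_load_clears_base=False models AMD prior to Zen2,
    as described in the commit message."""
    nul_load_clears_base: bool
    fs_sel: int = NUL
    fs_base: int = 0

    def load_selector(self, sel: int) -> None:
        self.fs_sel = sel
        # Assumption: every non-NUL descriptor in this toy has base 0.
        if sel != NUL or self.nul_load_clears_base:
            self.fs_base = 0


def old_style_switch(cpu: Cpu, nxt: Vcpu) -> None:
    """Assumed shortcut: rely on the selector load, skip a base write of 0."""
    cpu.load_selector(nxt.fs_sel)
    if nxt.is_64bit and nxt.fs_base != 0:
        cpu.fs_base = nxt.fs_base


def new_style_switch(cpu: Cpu, nxt: Vcpu) -> None:
    """Always restore selector and base for 64bit vcpus."""
    cpu.load_selector(nxt.fs_sel)
    if nxt.is_64bit:
        cpu.fs_base = nxt.fs_base


def final_base(switch, cpu: Cpu, sequence) -> int:
    for vcpu in sequence:
        switch(cpu, vcpu)
    return cpu.fs_base


# 64 (NUL, non-zero base) => 32 (NUL) => 64 (NUL, zero base)
SEQUENCE = [Vcpu(True, NUL, 0x1000), Vcpu(False), Vcpu(True, NUL, 0)]

if __name__ == "__main__":
    print(hex(final_base(old_style_switch, Cpu(False), SEQUENCE)))
    print(hex(final_base(new_style_switch, Cpu(False), SEQUENCE)))
```

In this model the last vcpu expects a base of 0; with the assumed shortcut on
a CPU that does not clear the base on a NUL load, it inherits the first
vcpu's base instead, while the unconditional restore does not.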


For bisection revision-tuple graph see:
   http://logs.test-lab.xenproject.org/osstest/results/bisect/xen-unstable/test-xtf-amd64-amd64-1.xtf--test-pv64-selftest.html
Revision IDs in each graph node refer, respectively, to the Trees above.
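
As an aid to reading the revision tuples, here is a minimal sketch that pairs
each revision ID with its tree, in the order of the "Tree:" lines, and shows
which trees differ between two tuples. The function name changed_trees is
hypothetical and is not part of osstest; the two tuples are the "Latest" and
"Basis pass" tuples reported in the log in this message.

```python
# Tree names, in the order of the "Tree:" lines of this report.
TREES = ["linux", "linuxfirmware", "qemu", "qemuu", "xen", "xtf"]

# "Latest" tuple (failing) as reported in this message.
LATEST = [
    "c3038e718a19fc596f7b1baba0f83d5146dc7784",
    "c530a75c1e6a472b0eb9558310b518f0dfcd8860",
    "3d273dd05e51e5a1ffba3d98c7437ee84e8f8764",
    "ea6d3cd1ed79d824e605a70c3626bc437c386260",
    "b11910082d90bb1597f6679524eb726a33306672",
    "17d372b763cb0b2e2e6b5a637c11f3997d2533fa",
]

# "Basis pass" tuple as reported in this message.
BASIS_PASS = [
    "c3038e718a19fc596f7b1baba0f83d5146dc7784",
    "c530a75c1e6a472b0eb9558310b518f0dfcd8860",
    "3d273dd05e51e5a1ffba3d98c7437ee84e8f8764",
    "ea6d3cd1ed79d824e605a70c3626bc437c386260",
    "d400dc5729e4e132d61c2e7df57d81aaed762044",
    "17d372b763cb0b2e2e6b5a637c11f3997d2533fa",
]


def changed_trees(basis, latest, trees=TREES):
    """Return {tree: (basis_rev, latest_rev)} for trees whose revision differs."""
    if not (len(basis) == len(latest) == len(trees)):
        raise ValueError("revision tuples must have one entry per tree")
    return {t: (b, l) for t, b, l in zip(trees, basis, latest) if b != l}


if __name__ == "__main__":
    for tree, (old, new) in changed_trees(BASIS_PASS, LATEST).items():
        print(f"{tree}: {old[:12]}..{new[:12]}")
```

For these two tuples only the xen entry differs, which is consistent with the
bisection being confined to the xen tree.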

----------------------------------------
Running cs-bisection-step
 --graph-out=/home/logs/results/bisect/xen-unstable/test-xtf-amd64-amd64-1.xtf--test-pv64-selftest
 --summary-out=tmp/153986.bisection-summary --basis-template=152877
 --blessings=real,real-bisect xen-unstable test-xtf-amd64-amd64-1
 xtf/test-pv64-selftest
Searching for failure / basis pass:
 153957 fail [host=rimava1] / 153882 [host=godello0] 153845 [host=elbling0] 
153813 [host=chardonnay1] 153788 [host=huxelrebe1] 153770 [host=elbling1] 
153758 [host=fiano1] 153653 [host=albana0] 153619 [host=chardonnay0] 153602 
[host=elbling0] 153591 [host=albana1] 153551 [host=chardonnay1] 153526 
[host=huxelrebe1] 153494 [host=godello0] 153468 [host=huxelrebe0] 153437 
[host=godello1] 153400 [host=fiano1] 153363 [host=pinot1] 153321 [host=albana1] 
153280 [host=albana0] 153109 [host=fiano0] 153065 [host=pinot0] 153028 
[host=elbling1] 153004 ok.
Failure / basis pass flights: 153957 / 153004
(tree with no url: minios)
(tree with no url: ovmf)
(tree with no url: seabios)
Tree: linux git://xenbits.xen.org/linux-pvops.git
Tree: linuxfirmware git://xenbits.xen.org/osstest/linux-firmware.git
Tree: qemu git://xenbits.xen.org/qemu-xen-traditional.git
Tree: qemuu git://xenbits.xen.org/qemu-xen.git
Tree: xen git://xenbits.xen.org/xen.git
Tree: xtf git://xenbits.xen.org/xtf.git
Latest c3038e718a19fc596f7b1baba0f83d5146dc7784 
c530a75c1e6a472b0eb9558310b518f0dfcd8860 
3d273dd05e51e5a1ffba3d98c7437ee84e8f8764 
ea6d3cd1ed79d824e605a70c3626bc437c386260 
b11910082d90bb1597f6679524eb726a33306672 
17d372b763cb0b2e2e6b5a637c11f3997d2533fa
Basis pass c3038e718a19fc596f7b1baba0f83d5146dc7784 
c530a75c1e6a472b0eb9558310b518f0dfcd8860 
3d273dd05e51e5a1ffba3d98c7437ee84e8f8764 
ea6d3cd1ed79d824e605a70c3626bc437c386260 
d400dc5729e4e132d61c2e7df57d81aaed762044 
17d372b763cb0b2e2e6b5a637c11f3997d2533fa
Generating revisions with ./adhoc-revtuple-generator
 git://xenbits.xen.org/linux-pvops.git#c3038e718a19fc596f7b1baba0f83d5146dc7784-c3038e718a19fc596f7b1baba0f83d5146dc7784
 git://xenbits.xen.org/osstest/linux-firmware.git#c530a75c1e6a472b0eb9558310b518f0dfcd8860-c530a75c1e6a472b0eb9558310b518f0dfcd8860
 git://xenbits.xen.org/qemu-xen-traditional.git#3d273dd05e51e5a1ffba3d98c7437ee84e8f8764-3d273dd05e51e5a1ffba3d98c7437ee84e8f8764
 git://xenbits.xen.org/qemu-xen.git#ea6d3cd1ed79d824e605a70c3626bc437c386260-ea6d3cd1ed79d824e605a70c3626bc437c386260
 git://xenbits.xen.org/xen.git#d400dc5729e4e132d61c2e7df57d81aaed762044-b11910082d90bb1597f6679524eb726a33306672
 git://xenbits.xen.org/xtf.git#17d372b763cb0b2e2e6b5a637c11f3997d2533fa-17d372b763cb0b2e2e6b5a637c11f3997d2533fa
Loaded 5001 nodes in revision graph
Searching for test results:
 152985 []
 153004 pass c3038e718a19fc596f7b1baba0f83d5146dc7784 
c530a75c1e6a472b0eb9558310b518f0dfcd8860 
3d273dd05e51e5a1ffba3d98c7437ee84e8f8764 
ea6d3cd1ed79d824e605a70c3626bc437c386260 
d400dc5729e4e132d61c2e7df57d81aaed762044 
17d372b763cb0b2e2e6b5a637c11f3997d2533fa
 153028 [host=elbling1]
 153065 [host=pinot0]
 153109 [host=fiano0]
 153280 [host=albana0]
 153321 [host=albana1]
 153363 [host=pinot1]
 153400 [host=fiano1]
 153437 [host=godello1]
 153468 [host=huxelrebe0]
 153494 [host=godello0]
 153526 [host=huxelrebe1]
 153551 [host=chardonnay1]
 153591 [host=albana1]
 153602 [host=elbling0]
 153619 [host=chardonnay0]
 153653 [host=albana0]
 153758 [host=fiano1]
 153770 [host=elbling1]
 153788 [host=huxelrebe1]
 153813 [host=chardonnay1]
 153845 [host=elbling0]
 153882 [host=godello0]
 153906 fail c3038e718a19fc596f7b1baba0f83d5146dc7784 
c530a75c1e6a472b0eb9558310b518f0dfcd8860 
3d273dd05e51e5a1ffba3d98c7437ee84e8f8764 
ea6d3cd1ed79d824e605a70c3626bc437c386260 
b11910082d90bb1597f6679524eb726a33306672 
17d372b763cb0b2e2e6b5a637c11f3997d2533fa
 153931 fail c3038e718a19fc596f7b1baba0f83d5146dc7784 
c530a75c1e6a472b0eb9558310b518f0dfcd8860 
3d273dd05e51e5a1ffba3d98c7437ee84e8f8764 
ea6d3cd1ed79d824e605a70c3626bc437c386260 
b11910082d90bb1597f6679524eb726a33306672 
17d372b763cb0b2e2e6b5a637c11f3997d2533fa
 153955 pass c3038e718a19fc596f7b1baba0f83d5146dc7784 
c530a75c1e6a472b0eb9558310b518f0dfcd8860 
3d273dd05e51e5a1ffba3d98c7437ee84e8f8764 
ea6d3cd1ed79d824e605a70c3626bc437c386260 
d400dc5729e4e132d61c2e7df57d81aaed762044 
17d372b763cb0b2e2e6b5a637c11f3997d2533fa
 153958 fail c3038e718a19fc596f7b1baba0f83d5146dc7784 
c530a75c1e6a472b0eb9558310b518f0dfcd8860 
3d273dd05e51e5a1ffba3d98c7437ee84e8f8764 
ea6d3cd1ed79d824e605a70c3626bc437c386260 
b11910082d90bb1597f6679524eb726a33306672 
17d372b763cb0b2e2e6b5a637c11f3997d2533fa
 153960 pass c3038e718a19fc596f7b1baba0f83d5146dc7784 
c530a75c1e6a472b0eb9558310b518f0dfcd8860 
3d273dd05e51e5a1ffba3d98c7437ee84e8f8764 
ea6d3cd1ed79d824e605a70c3626bc437c386260 
2c8fabb2232d34d6d20a9ce6989e2e6cbee07d52 
17d372b763cb0b2e2e6b5a637c11f3997d2533fa
 153965 pass c3038e718a19fc596f7b1baba0f83d5146dc7784 
c530a75c1e6a472b0eb9558310b518f0dfcd8860 
3d273dd05e51e5a1ffba3d98c7437ee84e8f8764 
ea6d3cd1ed79d824e605a70c3626bc437c386260 
e52716154da04967f9b9d7cf9a1655ea4bcd9e93 
17d372b763cb0b2e2e6b5a637c11f3997d2533fa
 153967 pass c3038e718a19fc596f7b1baba0f83d5146dc7784 
c530a75c1e6a472b0eb9558310b518f0dfcd8860 
3d273dd05e51e5a1ffba3d98c7437ee84e8f8764 
ea6d3cd1ed79d824e605a70c3626bc437c386260 
a5eaac9245f4f382a3cd0e9710e9d1cba7db20e4 
17d372b763cb0b2e2e6b5a637c11f3997d2533fa
 153969 fail c3038e718a19fc596f7b1baba0f83d5146dc7784 
c530a75c1e6a472b0eb9558310b518f0dfcd8860 
3d273dd05e51e5a1ffba3d98c7437ee84e8f8764 
ea6d3cd1ed79d824e605a70c3626bc437c386260 
1be24cd17741192d1e18f24e6cf92f0ae9324e62 
17d372b763cb0b2e2e6b5a637c11f3997d2533fa
 153972 fail c3038e718a19fc596f7b1baba0f83d5146dc7784 
c530a75c1e6a472b0eb9558310b518f0dfcd8860 
3d273dd05e51e5a1ffba3d98c7437ee84e8f8764 
ea6d3cd1ed79d824e605a70c3626bc437c386260 
ad0fd291c5e79191c2e3c70e43dded569f11a450 
17d372b763cb0b2e2e6b5a637c11f3997d2533fa
 153976 pass c3038e718a19fc596f7b1baba0f83d5146dc7784 
c530a75c1e6a472b0eb9558310b518f0dfcd8860 
3d273dd05e51e5a1ffba3d98c7437ee84e8f8764 
ea6d3cd1ed79d824e605a70c3626bc437c386260 
a5eaac9245f4f382a3cd0e9710e9d1cba7db20e4 
17d372b763cb0b2e2e6b5a637c11f3997d2533fa
 153979 fail c3038e718a19fc596f7b1baba0f83d5146dc7784 
c530a75c1e6a472b0eb9558310b518f0dfcd8860 
3d273dd05e51e5a1ffba3d98c7437ee84e8f8764 
ea6d3cd1ed79d824e605a70c3626bc437c386260 
ad0fd291c5e79191c2e3c70e43dded569f11a450 
17d372b763cb0b2e2e6b5a637c11f3997d2533fa
 153957 fail c3038e718a19fc596f7b1baba0f83d5146dc7784 
c530a75c1e6a472b0eb9558310b518f0dfcd8860 
3d273dd05e51e5a1ffba3d98c7437ee84e8f8764 
ea6d3cd1ed79d824e605a70c3626bc437c386260 
b11910082d90bb1597f6679524eb726a33306672 
17d372b763cb0b2e2e6b5a637c11f3997d2533fa
 153982 pass c3038e718a19fc596f7b1baba0f83d5146dc7784 
c530a75c1e6a472b0eb9558310b518f0dfcd8860 
3d273dd05e51e5a1ffba3d98c7437ee84e8f8764 
ea6d3cd1ed79d824e605a70c3626bc437c386260 
a5eaac9245f4f382a3cd0e9710e9d1cba7db20e4 
17d372b763cb0b2e2e6b5a637c11f3997d2533fa
 153986 fail c3038e718a19fc596f7b1baba0f83d5146dc7784 
c530a75c1e6a472b0eb9558310b518f0dfcd8860 
3d273dd05e51e5a1ffba3d98c7437ee84e8f8764 
ea6d3cd1ed79d824e605a70c3626bc437c386260 
ad0fd291c5e79191c2e3c70e43dded569f11a450 
17d372b763cb0b2e2e6b5a637c11f3997d2533fa
Searching for interesting versions
 Result found: flight 153004 (pass), for basis pass
 For basis failure, parent search stopping at
c3038e718a19fc596f7b1baba0f83d5146dc7784
c530a75c1e6a472b0eb9558310b518f0dfcd8860
3d273dd05e51e5a1ffba3d98c7437ee84e8f8764
ea6d3cd1ed79d824e605a70c3626bc437c386260
a5eaac9245f4f382a3cd0e9710e9d1cba7db20e4
17d372b763cb0b2e2e6b5a637c11f3997d2533fa, results HASH(0x56489ec1a300)
HASH(0x56489ec09f70) HASH(0x56489ec26b68)
 For basis failure, parent search stopping at
c3038e718a19fc596f7b1baba0f83d5146dc7784
c530a75c1e6a472b0eb9558310b518f0dfcd8860
3d273dd05e51e5a1ffba3d98c7437ee84e8f8764
ea6d3cd1ed79d824e605a70c3626bc437c386260
e52716154da04967f9b9d7cf9a1655ea4bcd9e93
17d372b763cb0b2e2e6b5a637c11f3997d2533fa, results HASH(0x56489ec22230)
 For basis failure, parent search stopping at
c3038e718a19fc596f7b1baba0f83d5146dc7784
c530a75c1e6a472b0eb9558310b518f0dfcd8860
3d273dd05e51e5a1ffba3d98c7437ee84e8f8764
ea6d3cd1ed79d824e605a70c3626bc437c386260
2c8fabb2232d34d6d20a9ce6989e2e6cbee07d52
17d372b763cb0b2e2e6b5a637c11f3997d2533fa, results HASH(0x56489ec0f388)
 For basis failure, parent search stopping at
c3038e718a19fc596f7b1baba0f83d5146dc7784
c530a75c1e6a472b0eb9558310b518f0dfcd8860
3d273dd05e51e5a1ffba3d98c7437ee84e8f8764
ea6d3cd1ed79d824e605a70c3626bc437c386260
d400dc5729e4e132d61c2e7df57d81aaed762044
17d372b763cb0b2e2e6b5a637c11f3997d2533fa, results HASH(0x56489ec122b8)
HASH(0x56489ec188f8)
 Result found: flight 153906 (fail), for basis failure (at ancestor ~257)
 Repro found: flight 153955 (pass), for basis pass
 Repro found: flight 153957 (fail), for basis failure
 0 revisions at c3038e718a19fc596f7b1baba0f83d5146dc7784 
c530a75c1e6a472b0eb9558310b518f0dfcd8860 
3d273dd05e51e5a1ffba3d98c7437ee84e8f8764 
ea6d3cd1ed79d824e605a70c3626bc437c386260 
a5eaac9245f4f382a3cd0e9710e9d1cba7db20e4 
17d372b763cb0b2e2e6b5a637c11f3997d2533fa
No revisions left to test, checking graph state.
 Result found: flight 153967 (pass), for last pass
 Result found: flight 153972 (fail), for first failure
 Repro found: flight 153976 (pass), for last pass
 Repro found: flight 153979 (fail), for first failure
 Repro found: flight 153982 (pass), for last pass
 Repro found: flight 153986 (fail), for first failure
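
The search pattern visible in the log above (re-check the basis pass and
basis failure, narrow down to a last pass and a first failure, then reproduce
both) can be sketched as follows. This is a simplification under stated
assumptions: osstest searches a revision graph, while this sketch bisects a
linear, oldest-to-newest list, and the names bisect_first_failure,
reproduces and the passes callback are hypothetical, not osstest APIs.

```python
from typing import Callable, Sequence, Tuple


def bisect_first_failure(
    revs: Sequence[str], passes: Callable[[str], bool]
) -> Tuple[str, str]:
    """Return (last_pass, first_fail) for revs ordered oldest to newest.

    Assumes revs[0] passes, revs[-1] fails, and that there is a single
    pass-to-fail transition somewhere in between.
    """
    if len(revs) < 2:
        raise ValueError("need at least a passing and a failing revision")
    lo, hi = 0, len(revs) - 1
    # Re-check both endpoints first, like the "basis pass" / "basis failure"
    # repro steps in the log.
    if not passes(revs[lo]) or passes(revs[hi]):
        raise ValueError("endpoints did not reproduce as pass / fail")
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if passes(revs[mid]):
            lo = mid
        else:
            hi = mid
    return revs[lo], revs[hi]


def reproduces(
    last_pass: str, first_fail: str, passes: Callable[[str], bool], rounds: int = 2
) -> bool:
    """Re-run both revisions `rounds` times to guard against flaky results."""
    return all(
        passes(last_pass) and not passes(first_fail) for _ in range(rounds)
    )


if __name__ == "__main__":
    revs = [f"r{i}" for i in range(8)]
    passes = lambda rev: revs.index(rev) < 5  # r5 introduces the failure
    good, bad = bisect_first_failure(revs, passes)
    print(good, bad, reproduces(good, bad, passes))
```

Re-running both sides of the boundary before reporting is what separates a
confirmed changeset from a one-off failure on a particular host.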

Revision graph left in /home/logs/results/bisect/xen-unstable/test-xtf-amd64-amd64-1.xtf--test-pv64-selftest.{dot,ps,png,html,svg}.
----------------------------------------
153986: tolerable ALL FAIL

flight 153986 xen-unstable real-bisect [real]
http://logs.test-lab.xenproject.org/osstest/logs/153986/

Failures :-/ but no regressions.

Tests which did not succeed,
including tests which could not be run:
 test-xtf-amd64-amd64-1       18 xtf/test-pv64-selftest  fail baseline untested
 test-xtf-amd64-amd64-1       19 leak-check/check        fail baseline untested


jobs:
 test-xtf-amd64-amd64-1                                       fail    


------------------------------------------------------------
sg-report-flight on osstest.test-lab.xenproject.org
logs: /home/logs/logs
images: /home/logs/images

Logs, config files, etc. are available at
    http://logs.test-lab.xenproject.org/osstest/logs

Explanation of these reports, and of osstest in general, is at
    http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master
    http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master

Test harness code can be found at
    http://xenbits.xen.org/gitweb?p=osstest.git;a=summary




 

