
[Xen-devel] [xen-unstable-smoke test] 117106: trouble: blocked/broken



flight 117106 xen-unstable-smoke real [real]
http://logs.test-lab.xenproject.org/osstest/logs/117106/

Failures and problems with tests :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 build-amd64                     <job status>                 broken
 test-arm64-arm64-xl-xsm         <job status>                 broken
 build-armhf                     <job status>                 broken
 build-armhf                   4 host-install(4)        broken REGR. vs. 116956
 build-amd64                   4 host-install(4)        broken REGR. vs. 116956

Tests which did not succeed, but are not blocking:
 build-amd64-libvirt           1 build-check(1)               blocked  n/a
 test-armhf-armhf-xl           1 build-check(1)               blocked  n/a
 test-amd64-amd64-libvirt      1 build-check(1)               blocked  n/a
 test-amd64-amd64-xl-qemuu-debianhvm-i386  1 build-check(1)         blocked n/a
 test-arm64-arm64-xl-xsm       1 build-check(1)               blocked  n/a

version targeted for testing:
 xen                  b95f7be32d668fa4b09300892ebe19636ecebe36
baseline version:
 xen                  43550972395f9a3a48bb4086a0faf0f8d442e37d

Last test of basis   116956  2017-12-07 23:02:27 Z    6 days
Failing since        117015  2017-12-08 22:03:21 Z    5 days   12 attempts
Testing same since   117106  2017-12-12 18:02:13 Z    1 days    1 attempts

------------------------------------------------------------
People who touched revisions under test:
  Andre Przywara <andre.przywara@xxxxxxx>
  Andre Przywara <andre.przywara@xxxxxxxxxx>
  Andrew Cooper <andrew.cooper3@xxxxxxxxxx>
  Daniel Kiper <daniel.kiper@xxxxxxxxxx>
  Jan Beulich <jbeulich@xxxxxxxx>
  Julien Grall <julien.grall@xxxxxxx>
  Julien Grall <julien.grall@xxxxxxxxxx>
  Stefano Stabellini <sstabellini@xxxxxxxxxx>

jobs:
 build-amd64                                                  broken  
 build-armhf                                                  broken  
 build-amd64-libvirt                                          blocked 
 test-armhf-armhf-xl                                          blocked 
 test-arm64-arm64-xl-xsm                                      broken  
 test-amd64-amd64-xl-qemuu-debianhvm-i386                     blocked 
 test-amd64-amd64-libvirt                                     blocked 


------------------------------------------------------------
sg-report-flight on osstest.test-lab.xenproject.org
logs: /home/logs/logs
images: /home/logs/images

Logs, config files, etc. are available at
    http://logs.test-lab.xenproject.org/osstest/logs

Explanation of these reports, and of osstest in general, is at
    http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master
    http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master

Test harness code can be found at
    http://xenbits.xen.org/gitweb?p=osstest.git;a=summary

broken-job build-amd64 broken
broken-job test-arm64-arm64-xl-xsm broken
broken-job build-armhf broken
broken-step build-armhf host-install(4)
broken-step build-amd64 host-install(4)

Not pushing.

------------------------------------------------------------
commit b95f7be32d668fa4b09300892ebe19636ecebe36
Author: Jan Beulich <jbeulich@xxxxxxxx>
Date:   Tue Dec 12 16:56:15 2017 +0100

    x86/mm: drop bogus paging mode assertion
    
    Olaf has observed this assertion to trigger after an aborted migration
    of a PV guest:
    
    (XEN) Xen call trace:
    (XEN)    [<ffff82d0802a85dc>] do_page_fault+0x39f/0x55c
    (XEN)    [<ffff82d08036b7d8>] x86_64/entry.S#handle_exception_saved+0x66/0xa4
    (XEN)    [<ffff82d0802a9274>] __copy_to_user_ll+0x22/0x30
    (XEN)    [<ffff82d0802772d4>] update_runstate_area+0x19c/0x228
    (XEN)    [<ffff82d080277371>] domain.c#_update_runstate_area+0x11/0x39
    (XEN)    [<ffff82d080277596>] context_switch+0x1fd/0xf25
    (XEN)    [<ffff82d0802395c5>] schedule.c#schedule+0x303/0x6a8
    (XEN)    [<ffff82d08023d067>] softirq.c#__do_softirq+0x6c/0x95
    (XEN)    [<ffff82d08023d0da>] do_softirq+0x13/0x15
    (XEN)    [<ffff82d08036b2f1>] x86_64/entry.S#process_softirqs+0x21/0x30
    
    Release builds work fine, which is a first indication that the assertion
    isn't really needed.
    
    What's worse though - there appears to be a timing window where the
    guest runs in shadow mode, but not in log-dirty mode, and that is what
    triggers the assertion (the same could, afaict, be achieved by test-
    enabling shadow mode on a PV guest). This is because turning off log-
    dirty mode is being performed in two steps: First the log-dirty bit gets
    cleared (paging_log_dirty_disable() [having paused the domain] ->
    sh_disable_log_dirty() -> shadow_one_bit_disable()), followed by
    unpausing the domain and only then clearing shadow mode (via
    shadow_test_disable(), which pauses the domain a second time).
    
    Hence besides removing the ASSERT() here (or optionally replacing it by
    explicit translate and refcounts mode checks, but this seems rather
    pointless now that the three are tied together) I wonder whether either
    shadow_one_bit_disable() should turn off shadow mode if no other bit
    besides PG_SH_enable remains set (just like shadow_one_bit_enable()
    enables it if not already set), or the domain pausing scope should be
    extended so that both steps occur without the domain getting a chance to
    run in between.
    
    Reported-by: Olaf Hering <olaf@xxxxxxxxx>
    Signed-off-by: Jan Beulich <jbeulich@xxxxxxxx>
    Reviewed-by: Tim Deegan <tim@xxxxxxx>
    Acked-by: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>

commit 7f1061938415f0d93d7ee6040e49236d2e050627
Author: Jan Beulich <jbeulich@xxxxxxxx>
Date:   Tue Dec 12 14:31:55 2017 +0100

    x86emul: build SIMD tests with -Os
    
    Specifically in the context of putting together subsequent patches I've
    noticed that together with the touch() macro using -Os further
    increases the chances of the compiler using memory operands for the
    instructions we actually care to test.
    
    Signed-off-by: Jan Beulich <jbeulich@xxxxxxxx>
    Reviewed-by: George Dunlap <george.dunlap@xxxxxxxxxx>
    Acked-by: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>

commit 9589927e5bf9e123ec42b6e0b0809f153bd92732
Author: Daniel Kiper <daniel.kiper@xxxxxxxxxx>
Date:   Tue Dec 12 14:30:53 2017 +0100

    x86/mb2: avoid Xen image when looking for module/crashkernel position
    
    Commit e22e1c4 (x86/EFI: avoid Xen image when looking for module/kexec
    position) added the relevant check for the EFI case. However, since commit
    f75a304 (x86: add multiboot2 protocol support for relocatable images)
    Multiboot2 compatible bootloaders are able to relocate the Xen image too.
    So we also have to avoid the Xen image region in such cases.
    
    Reported-by: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>
    Reported-by: Konrad Rzeszutek Wilk <konrad.wilk@xxxxxxxxxx>
    Signed-off-by: Daniel Kiper <daniel.kiper@xxxxxxxxxx>
    Reviewed-by: Jan Beulich <jbeulich@xxxxxxxx>

commit b4d0218cff66b7eaa9c9b8dc9bd71e7b089b016d
Author: Jan Beulich <jbeulich@xxxxxxxx>
Date:   Tue Dec 12 14:30:17 2017 +0100

    x86/paging: don't unconditionally BUG() on finding SHARED_M2P_ENTRY
    
    PV guests can fully control the values written into the P2M.
    
    This is XSA-251.
    
    Signed-off-by: Jan Beulich <jbeulich@xxxxxxxx>
    Reviewed-by: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>

commit 10be8001de7d87be1f0ccdda75cc70e922e56d03
Author: Jan Beulich <jbeulich@xxxxxxxx>
Date:   Tue Dec 12 14:29:45 2017 +0100

    x86/shadow: fix ref-counting error handling
    
    The old-Linux handling in shadow_set_l4e() mistakenly ORed together the
    results of sh_get_ref() and sh_pin(). As the latter failing is not a
    correctness problem, simply ignore its return value.
    
    In sh_set_toplevel_shadow() a failing sh_get_ref() must not be
    accompanied by installing the entry, despite the domain being crashed.
    
    This is XSA-250.
    
    Signed-off-by: Jan Beulich <jbeulich@xxxxxxxx>
    Reviewed-by: Tim Deegan <tim@xxxxxxx>

commit 54e2292e8df7a1a7b041192be9d6d797b6d00869
Author: Jan Beulich <jbeulich@xxxxxxxx>
Date:   Tue Dec 12 14:29:13 2017 +0100

    x86/shadow: fix refcount overflow check
    
    Commit c385d27079 ("x86 shadow: for multi-page shadows, explicitly track
    the first page") reduced the refcount width to 25, without adjusting the
    overflow check. Eliminate the disconnect by using a manifest constant.
    
    Interestingly, up to commit 047782fa01 ("Out-of-sync L1 shadows: OOS
    snapshot") the refcount was 27 bits wide, yet the check was already
    using 26.
    
    This is XSA-249.
    
    Signed-off-by: Jan Beulich <jbeulich@xxxxxxxx>
    Reviewed-by: George Dunlap <george.dunlap@xxxxxxxxxx>
    Reviewed-by: Tim Deegan <tim@xxxxxxx>
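
As a rough illustration of "using a manifest constant" in the commit above, the sketch below sizes a bit-field and derives its overflow check from one shared constant, so the two cannot drift apart. The names (SKETCH_REFCOUNT_WIDTH, struct sketch_shadow_page, sketch_get_ref) are assumptions made for this example and not Xen's real definitions; only the width of 25 is taken from the commit message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical constant; 25 is the width named in the commit message. */
#define SKETCH_REFCOUNT_WIDTH 25

struct sketch_shadow_page {
    unsigned int count : SKETCH_REFCOUNT_WIDTH;      /* reference count */
    unsigned int other : 32 - SKETCH_REFCOUNT_WIDTH; /* unrelated bits  */
};

/* Take a reference; fail rather than let the bit-field wrap around. */
static bool sketch_get_ref(struct sketch_shadow_page *sp)
{
    unsigned int nx;

    assert(sp != NULL);
    nx = sp->count + 1u;

    /* The limit comes from the same constant that sizes the field. */
    if ( nx >= (1u << SKETCH_REFCOUNT_WIDTH) )
        return false;

    sp->count = nx;
    return true;
}
```

Had the check used a literal such as 26 while the field was 25 bits wide, the count could wrap to zero before the check ever fired.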

commit ff2a793e15bb0b6254bc849ef8e83e1c284c3583
Author: Jan Beulich <jbeulich@xxxxxxxx>
Date:   Tue Dec 12 14:28:36 2017 +0100

    x86/mm: don't wrongly set page ownership
    
    PV domains can obtain mappings of any pages owned by the correct domain,
    including ones that aren't actually assigned as "normal" RAM, but used
    by Xen internally.  At the moment such "internal" pages marked as owned
    by a guest include pages used to track logdirty bits, as well as p2m
    pages and the "unpaged pagetable" for HVM guests. Since the PV memory
    management and shadow code conflict in their use of struct page_info
    fields, and since shadow code is being used for log-dirty handling for
    PV domains, pages coming from the shadow pool must, for PV domains, not
    have the domain set as their owner.
    
    While the change could be done conditionally for just the PV case in
    shadow code, do it unconditionally (and for consistency also for HAP),
    just to be on the safe side.
    
    There's one special case though for shadow code: The page table used for
    running a HVM guest in unpaged mode is subject to get_page() (in
    set_shadow_status()) and hence must have its owner set.
    
    This is XSA-248.
    
    Signed-off-by: Jan Beulich <jbeulich@xxxxxxxx>
    Reviewed-by: Tim Deegan <tim@xxxxxxx>
    Reviewed-by: George Dunlap <george.dunlap@xxxxxxxxxx>

commit e40b0219a8c77741ae48989efb520f4a762a5be3
Author: Jan Beulich <jbeulich@xxxxxxxx>
Date:   Tue Dec 12 14:27:34 2017 +0100

    x86: don't wrongly trigger linear page table assertion (2)
    
    _put_final_page_type(), when free_page_type() has exited early to allow
    for preemption, should not update the time stamp, as the page continues
    to retain the type which is in the process of being unvalidated. I can't
    see why the time stamp update was put on that path in the first place
    (albeit it may well have been me who had put it there years ago).
    
    This is part of XSA-240.
    
    Signed-off-by: Jan Beulich <jbeulich@xxxxxxxx>
    Tested-by: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>
    Reviewed-by: George Dunlap <george.dunlap@xxxxxxxxxx>

commit c6c2fc6e4919a1420096b94a4ba8682f20e92709
Author: Julien Grall <julien.grall@xxxxxxxxxx>
Date:   Wed Nov 1 14:03:14 2017 +0000

    xen/arm32: mm: Rework is_xen_heap_page to avoid nameclash
    
    The arm32 version of the function is_xen_heap_page currently defines a
    variable _mfn. This will lead to a compiler error when using typesafe
    MFN in a follow-up patch:
    
    called object '_mfn' is not a function or function pointer
    
    Fix it by renaming the local variable _mfn to mfn_.
    
    Signed-off-by: Julien Grall <julien.grall@xxxxxxxxxx>
    Reviewed-by: Stefano Stabellini <sstabellini@xxxxxxxxxx>

commit 4da2ec1b1584975012ceda49160bd2f276076d5d
Author: Julien Grall <julien.grall@xxxxxxxxxx>
Date:   Wed Nov 1 14:03:13 2017 +0000

    xen/arm: domain_build: Clean-up insert_11_bank
    
        - Remove spurious ()
        - Add missing spaces
        - Turn 1 << to 1UL <<
        - Rename spfn to smfn and switch to mfn_t
    
    Signed-off-by: Julien Grall <julien.grall@xxxxxxxxxx>
    Reviewed-by: Stefano Stabellini <sstabellini@xxxxxxxxxx>
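
The "Turn 1 << to 1UL <<" item above matters because a plain 1 has type int, so the shift is evaluated in int and is undefined once the shift count reaches the width of int. A minimal sketch, using hypothetical helper names that do not appear in the Xen source:

```c
#include <assert.h>
#include <limits.h>

/* Largest shift count that still fits in an unsigned long. */
static unsigned int sketch_max_order(void)
{
    return (unsigned int)(sizeof(unsigned long) * CHAR_BIT) - 1;
}

/* Hypothetical helper: number of pages covered by an allocation order. */
static unsigned long sketch_order_to_pages(unsigned int order)
{
    assert(order <= sketch_max_order());

    /* 1UL makes the shift happen in unsigned long rather than int. */
    return 1UL << order;
}
```

Note that unsigned long is only 32 bits wide on arm32, so the suffix widens the usable range there only from 31 to 32 bits; on 64-bit builds it doubles the width.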

commit 70f7b6ca0e8208034bdc91d20b2f311bbe63a0a9
Author: Andre Przywara <andre.przywara@xxxxxxxxxx>
Date:   Thu Dec 7 16:14:08 2017 +0000

    ARM: VGIC: move gic_remove_irq_from_queues()
    
    gic_remove_irq_from_queues() was not only misnamed, it also has the wrong
    abstraction, as it should not live in gic.c.
    Move it into vgic.c and vgic.h, where it belongs, and rename it on
    the way.
    
    Signed-off-by: Andre Przywara <andre.przywara@xxxxxxxxxx>
    Reviewed-by: Stefano Stabellini <sstabellini@xxxxxxxxxx>

commit 9630c5ae363b4cbf8eb61366530f40c80680af4d
Author: Julien Grall <julien.grall@xxxxxxx>
Date:   Wed Dec 6 14:51:37 2017 +0000

    xen/arm: gic-v3: Bail out if gicv3_cpu_init fail
    
    When system registers are not enabled, all accesses to them will trap
    to EL2. In Xen, system registers will be enabled by gicv3_cpu_init only
    on success. As the rest of the code (e.g. gicv3_hyp_init) relies on
    system registers, it is better to bail out directly.
    
    This will save time when debugging early boot issues on GICv3 platforms.
    
    Signed-off-by: Julien Grall <julien.grall@xxxxxxxxxx>
    Reviewed-by: Stefano Stabellini <sstabellini@xxxxxxxxxx>

commit ac2d8d402370f6f93f82871f3b34ddb9a9ccae05
Author: Julien Grall <julien.grall@xxxxxxxxxx>
Date:   Wed Nov 29 17:46:35 2017 +0000

    xen/arm: Surround HSR_SYSREG macro value with ()
    
    The value of the macro HSR_SYSREG is not surrounded by (). This means
    the behavior may change depending on how it is used.
    
    Thankfully recent GCC will issue a warning for that.
    
    Signed-off-by: Julien Grall <julien.grall@xxxxxxxxxx>
    Reviewed-by: Stefano Stabellini <sstabellini@xxxxxxxxxx>
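
Why the missing parentheses matter can be seen in a small sketch. The macros below are simplified, hypothetical stand-ins and do not reproduce the real HSR_SYSREG encoding; they only share its shape (shifted fields ORed together). A compiler may warn about the unparenthesized use, which is the warning the commit message refers to.

```c
#include <assert.h>

/* Hypothetical macros: the same value with and without outer (). */
#define SKETCH_BARE(a, b)    (a) << 4 | (b)
#define SKETCH_WRAPPED(a, b) (((a) << 4) | (b))

/* Used on its own, the wrapped form yields the expected value. */
static_assert(SKETCH_WRAPPED(1, 3) == 0x13, "wrapped macro value");

/* Expands to: mask & (a) << 4 | (b), i.e. (mask & (a << 4)) | b */
static unsigned int sketch_mask_bare(unsigned int mask,
                                     unsigned int a, unsigned int b)
{
    return mask & SKETCH_BARE(a, b);
}

/* Expands to: mask & (((a) << 4) | (b)), which is what was intended. */
static unsigned int sketch_mask_wrapped(unsigned int mask,
                                        unsigned int a, unsigned int b)
{
    return mask & SKETCH_WRAPPED(a, b);
}
```

With mask 0xF0, a = 1 and b = 3, the bare form evaluates to 0x13 because the mask no longer applies to b, while the wrapped form evaluates to 0x10.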

commit b819187a15ecea7fe00cffded1bf454b8a6d7dd2
Author: Andre Przywara <andre.przywara@xxxxxxx>
Date:   Thu Oct 19 13:48:37 2017 +0100

    ARM: vGIC: fix nr_irq definition
    
    The global variable "nr_irqs" is used for x86 and some common Xen code.
    To make the latter work easily for ARM, it was #defined to NR_IRQS.
    This not only violated the common habit of capitalizing macros, but
    also caused issues if one wanted to use a rather innocent "nr_irqs" as
    a local variable name or as a function parameter.
    Drop the optimization and make nr_irqs a normal variable for ARM also.
    
    Signed-off-by: Andre Przywara <andre.przywara@xxxxxxx>

commit 2e9b1c655f060b5c4e68bc8499f02253babe1bbc
Author: Andre Przywara <andre.przywara@xxxxxxx>
Date:   Thu Oct 19 13:48:36 2017 +0100

    ARM: remove unneeded gic.h inclusions
    
    gic.h is supposed to hold defines and prototypes for the hardware side
    of the GIC interrupt controller. A lot of parts in Xen should not be
    bothered with that, as they either only care about the VGIC or use
    more generic interfaces.
    Remove unneeded inclusions of gic.h from files where they are actually
    not needed.
    
    Signed-off-by: Andre Przywara <andre.przywara@xxxxxxx>

commit c05aa4afac64ea687c1a2bf9277ba6552809495b
Author: Julien Grall <julien.grall@xxxxxxxxxx>
Date:   Wed Nov 29 17:57:32 2017 +0000

    xen/arm: bootfdt: Use proper default for #address-cells and #size-cells
    
    Per the device-tree specification [1], when the properties #address-cells
    and #size-cells are not present, the default values should be resp. 2
    and 1.
    
    [1] https://www.devicetree.org/downloads/devicetree-specification-v0.1-20160524.pdf
    
    Signed-off-by: Julien Grall <julien.grall@xxxxxxxxxx>
    Acked-by: Stefano Stabellini <sstabellini@xxxxxxxxxx>
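
The fallback behaviour in the bootfdt commit above might look roughly like the sketch below. It assumes the Devicetree Specification defaults of 2 for #address-cells and 1 for #size-cells; the constant and function names are hypothetical and are not the ones used in Xen's bootfdt code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Defaults assumed from the Devicetree Specification. */
#define SKETCH_DEFAULT_ADDRESS_CELLS 2u
#define SKETCH_DEFAULT_SIZE_CELLS    1u

/*
 * Return the value of a one-cell property, or dflt if the property is
 * absent (prop == NULL). Devicetree cells are 32-bit big-endian values.
 */
static uint32_t sketch_cells_or_default(const uint8_t *prop, size_t len,
                                        uint32_t dflt)
{
    if ( prop == NULL )
        return dflt;

    assert(len == 4);
    return ((uint32_t)prop[0] << 24) | ((uint32_t)prop[1] << 16) |
           ((uint32_t)prop[2] << 8)  |  (uint32_t)prop[3];
}
```

A caller would pass SKETCH_DEFAULT_ADDRESS_CELLS or SKETCH_DEFAULT_SIZE_CELLS as dflt depending on which property it is looking up.
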
(qemu changes not included)

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxxx
https://lists.xenproject.org/mailman/listinfo/xen-devel

 

