[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: ASSERT in rangeset_remove_range



On Tue, 9 Nov 2021, Julien Grall wrote:
> (+ Jan, Andrew, Wei for the common code)
> 
> On 08/11/2021 22:45, Stefano Stabellini wrote:
> > Hi Oleksandr, Julien,
> 
> Hi,
> 
> > I discovered a bug caused by the recent changes to introduce extended
> > regions in make_hypervisor_node (more logs appended):
> > 
> > 
> > (XEN) d1 BANK[0] 0x00000040000000-0x0000007e800000 (1000MB)
> > (XEN) d1 BANK[1] 0x00000200000000-0x00000200000000 (0MB)
> > (XEN) DEBUG find_unallocated_memory 994 s=40000000 e=7e7fffff
> > (XEN) DEBUG find_unallocated_memory 994 s=200000000 e=1ffffffff
> > (XEN) Assertion 's <= e' failed at rangeset.c:189
> > 
> > 
> > When a bank of memory is zero in size, then rangeset_remove_range is
> > called with end < start, triggering an ASSERT in rangeset_remove_range.
> > 
> > One solution is to avoid creating 0 size banks. The following change
> > does that:
> > 
> > 
> > diff --git a/xen/arch/arm/domain_build.c b/xen/arch/arm/domain_build.c
> > index 49b4eb2b13..3efe542d0f 100644
> > --- a/xen/arch/arm/domain_build.c
> > +++ b/xen/arch/arm/domain_build.c
> > @@ -459,9 +459,12 @@ static void __init allocate_memory(struct domain *d,
> > struct kernel_info *kinfo)
> >           goto fail;
> >         bank_size = MIN(GUEST_RAM1_SIZE, kinfo->unassigned_mem);
> > -    if ( !allocate_bank_memory(d, kinfo, gaddr_to_gfn(GUEST_RAM1_BASE),
> > +    if ( bank_size > 0 )
> > +    {
> > +        if ( !allocate_bank_memory(d, kinfo, gaddr_to_gfn(GUEST_RAM1_BASE),
> >                                  bank_size) )
> > -        goto fail;
> > +            goto fail;
> > +    }
> 
> I would move the size check in allocate_bank_memory().

Sure, I can do that


> >         if ( kinfo->unassigned_mem )
> >           goto fail;
> > 
> > 
> > 
> > We have a couple of other options too:
> > 
> > - remove the ASSERT in rangeset_remove_range
> > There is an argument that we should simply return error
> > fromrangeset_remove_range, rather than a full assert.
> 
> To be honest, this is a developper mistake to call with end < start. If we
> were going to return an error then we would completely hide (even in
> developper) it because we would fallback to not exposing extended regions.
> 
> So I am not sure switch from ASSERT() to a plain check is a good idea. Jan,
> Andrew, Wei, what do you think?
> 
> That said, this option would not be sufficient to fix your problem as extended
> regions will not work.
> 
> > 
> > - add a if (end > start) check before calling rangeset_remove_range
> > We could check that end > start before calling rangeset_remove_range at
> > all the call sites in domain_build.c. There are 5 call sites at the
> > moment.
> 
> I think we only want to add (end > start) where we expect the region size to
> be 0. AFAICT, the only other potential place where this can happens is
> ``find_memory_holes()`` (I vaguely recall a discussion in the past where some
> of the "reg"  property would have size == 0).
> 
> > 
> > Any other ideas or suggestions?
> 
> My preference goes with your initial sugestion (so long the check is moved to
> allocate_bank_memory()).
> 
> [...]
> 
> > (XEN) Assertion 's <= e' failed at rangeset.c:189
> > (XEN) ----[ Xen-4.16-rc  arm64  debug=y  Not tainted ]----
> > (XEN) Xen call trace:
> > (XEN)    [<0000000000220e6c>] rangeset_remove_range+0xbc/0x2bc (PC)
> > (XEN)    [<00000000002cd508>]
> > domain_build.c#make_hypervisor_node+0x258/0x7f4 (LR)
> > (XEN)    [<00000000002cf2a8>] domain_build.c#construct_domU+0x9cc/0xa8c
> 
> Vanilla staging doesn't call make_hypervisor_node() from construct_domU. So
> what are you using?

Well spotted. This is my WIP branch with PV drivers support for Dom0less
guests (soon to be sent to xen-devel). The underlying bug could affect
vanilla Xen too as it only takes a zero-size bank to trigger it, but it
is certainly harder to reproduce because make_hypervisor_node is only
called for Dom0 and allocate_bank_memory (the function that today always
adds a zero size bank) is called for DomUs.


> > (XEN)    [<00000000002d0440>] create_domUs+0xe8/0x224
> > (XEN)    [<00000000002d4988>] start_xen+0xafc/0xbf0
> > (XEN)    [<00000000002001a0>] arm64/head.o#primary_switched+0xc/0x1c
> > (XEN)
> > (XEN)
> > (XEN) ****************************************
> > (XEN) Panic on CPU 0:
> > (XEN) Assertion 's <= e' failed at rangeset.c:189
> > (XEN) ****************************************



 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.