[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [PATCH v3] x86/mm: Short circuit damage from "fishy" ref/typecount failure
On 29.01.2021 17:17, Andrew Cooper wrote: > On 29/01/2021 11:29, Jan Beulich wrote: >> On 25.01.2021 18:59, Andrew Cooper wrote: >>> On 20/01/2021 08:06, Jan Beulich wrote: >>>> Also, as far as "impossible" here goes - the constructs all >>>> anyway exist only to deal with what we consider impossible. >>>> The question therefore really is of almost exclusively >>>> theoretical nature, and hence something like a counter >>>> possibly overflowing imo needs to be accounted for as >>>> theoretically possible, albeit impossible with today's >>>> computers and realistic timing assumptions. If a counter >>>> overflow occurred, it definitely wouldn't be because of a >>>> bug in Xen, but because of abnormal behavior elsewhere. >>>> Hence I remain unconvinced it is appropriate to deal with >>>> the situation by BUG(). >>> I'm not sure how to be any clearer. >>> >>> I am literally not changing the current behaviour. Xen *will* hit a >>> BUG() if any of these domain_crash() paths are taken. >>> >>> If you do not believe me, then please go and actually check what happens >>> when simulating a ref-acquisition failure. >> So I've now also played the same game on the ioreq path (see >> debugging patch below, and again with some non-"//temp" >> changes actually improving overall behavior in that "impossible" >> case). No BUG()s hit, no leaks (thanks to the extra changes), >> no other anomalies observed. >> >> Hence I'm afraid it is now really up to you to point out the >> specific BUG()s (and additional context as necessary) that you >> either believe could be hit, or that you have observed being hit. > > The refcounting logic was taken verbatim from ioreq, with the only > difference being an order greater than 0. The logic is also identical > to the vlapic logic. > > And the reason *why* it bugs is obvious - the cleanup logic > unconditionally put()'s refs it never took to begin with, and hits > underflow bugchecks. In current staging, neither vmx_alloc_vlapic_mapping() nor hvm_alloc_ioreq_mfn() put any refs they couldn't get. Hence my failed attempt to repro your claimed misbehavior. > For specifics, a simulated regular ref failure: > > diff --git a/xen/common/domain.c b/xen/common/domain.c > index 1051d86a20..314d258e31 100644 > --- a/xen/common/domain.c > +++ b/xen/common/domain.c > @@ -171,9 +171,14 @@ static int vmtrace_alloc_buffer(struct vcpu *v) > v->vmtrace.buf = pg; > > for ( i = 0; i < d->vmtrace_frames; i++ ) > + { > + if ( i == 0 ) > + return -ENOMEM; > + > /* Domain can't know about this page yet - something fishy > going on. */ > if ( !get_page_and_type(&pg[i], d, PGT_writable_page) ) > BUG(); > + } No bad put here. This should have BUG() replaced just like the other two instances mentioned above have it. > and the simulated typeref failure: > > diff --git a/xen/common/domain.c b/xen/common/domain.c > index db845ccc81..bd810157f4 100644 > --- a/xen/common/domain.c > +++ b/xen/common/domain.c > @@ -172,8 +172,16 @@ static int vmtrace_alloc_buffer(struct vcpu *v) > > for ( i = 0; i < d->vmtrace_frames; i++ ) > { > + get_page(&pg[i], d); > + > if ( i == 0 ) > + { > + put_page(&pg[i]); This of course if wrong is the get_page() failed. Assuming it succeeds I don't see what's wrong here. > return -ENOMEM; > + } > + > + get_page_type(&pg[i], PGT_writable_page); > + continue; > > /* Domain can't know about this page yet - something fishy > going on. */ > if ( !get_page_and_type(&pg[i], d, PGT_writable_page) ) > > both yield: > > (XEN) Xen BUG at /local/xen.git/xen/include/xen/mm.h:610 > (XEN) RIP: e008:[<ffff82d04020423e>] > common/domain.c#vmtrace_free_buffer+0x57/0xc1 > (XEN) Xen call trace: > (XEN) [<ffff82d04020423e>] R > common/domain.c#vmtrace_free_buffer+0x57/0xc1 > (XEN) [<ffff82d040205497>] F vcpu_create+0x245/0x32b > (XEN) [<ffff82d04023ae5b>] F do_domctl+0xb48/0x1964 > (XEN) [<ffff82d04030c6b2>] F pv_hypercall+0x2e4/0x53d > (XEN) [<ffff82d04039045d>] F lstar_enter+0x12d/0x140 So maybe vmtrace_free_buffer() is broken? Jan
|
Lists.xenproject.org is hosted with RackSpace, monitoring our |