[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [Xen-devel] [PATCH] xen/blkback: Check for insane amounts of request on the ring.
On 6/11/2013 3:42 AM, Jan Beulich wrote: On 10.06.13 at 18:43, Konrad Rzeszutek Wilk <konrad.wilk@xxxxxxxxxx> wrote:On Mon, Jun 10, 2013 at 04:52:35PM +0100, Jan Beulich wrote:On 07.06.13 at 22:11, Konrad Rzeszutek Wilk <konrad.wilk@xxxxxxxxxx> wrote:On Tue, Jun 04, 2013 at 03:57:06PM -0400, Konrad Rzeszutek Wilk wrote:+ /* N.B. 'rp', not 'rc'. */ + if (RING_REQUEST_CONS_OVERFLOW(&blk_rings->common, rp)) { + pr_warn(DRV_PFX "Frontend provided bogus ring requests (%d - %d = %d).Halting ring processing on dev=%04x\n",+ rp, rc, rp - rc, blkif->vbd.pdevice);Hm, I seem to be able to get: [ 189.398095] xen-blkback:Frontend provided bogus ring requests (125 - 115 = 10). Halting ring processing on dev=800011 or: [ 478.558699] xen-blkback:Frontend provided bogus ring requests (95 - 94 = 1). Halting ring processing on dev=800011 Which is clearly wrong. Piggybacking on the rsp_prod_pvt does not seem to cut it.We see that too, but not very frequently. One thing is that rsp_prod_pvt doesn't get printed along with rc and rp, thus making it not immediately obvious how this can be off in any way. Among the instance there are cases where the printed difference is 32, which makes me wonder whether part of the problem is the >= in the macro (we may want > here). And then we might have been living with some sort of issue in the past, because the existing use of the macro just causes the loop to be exited, with it getting re-entered subsequently (i.e. at worst causing performance issues).My observation was that the rsp_prod_pvt was lagging behind b/c the READ requests weren't completed. In other words, the processing of the ring was stalled b/c 'make_response' hadn't been called yet. Which meant that rsp_prod was not updated to rsp_prod_pvt (backend does not care about that value, only frontend does).I don't buy this: rsp_prod is being updated by the backend just for the frontend's sake, so this value really doesn't need looking at (or else we'd become susceptible to the guest maliciously writing that field). rsp_prod_pvt, otoh, is never behind rsp_prod, and if the guest produces requests that don't have matching space for responses, the guest is doing something bogus (and perhaps malicious). I believe this is what I saw with the rsp_prod_pvt added in the printk. Unfortunately I did not save the logs. Going back to the rc an rp check solves the immediate 'insane ring check'.Consequently, while this check is better than none at all, I think it is still too lax, and we really want to check against the produced responses. Just that other than for the rc check using >=, we'd need > for the rp one. But first of all let me see if I can get the original broken check to trigger wrongly here (so far only our stage testing caught these), and look at by how much rsp_prod_pvt really lags. OK. Jan _______________________________________________ Xen-devel mailing list Xen-devel@xxxxxxxxxxxxx http://lists.xen.org/xen-devel
|
Lists.xenproject.org is hosted with RackSpace, monitoring our |