Re: [Xen-devel] Dom 0 crash
On 05/11/13 13:16, Jan Beulich wrote:
> On 05.11.13 at 12:58, Ian Murray <murrayie@xxxxxxxxxxx> wrote:
>> I have a recurring crash using Xen 4.3.1-RC2 and Ubuntu 12.04 as Dom0
>> (3.2.0-55-generic). I have software RAID 5 with LVMs. The DomU (also
>> Ubuntu 12.04, 3.2.0-55 kernel) has a dedicated logical volume, which is
>> backed up by shutting down the DomU, creating an LVM snapshot,
>> restarting the DomU and then dd'ing the snapshot to another logical
>> volume. The snapshot is then removed and the second LV is dd'ed through
>> gzip onto DAT tape. I currently have this running every hour (unless it
>> is already running) for testing purposes. After 6-12 runs of this, the
>> Dom0 kernel crashes with the output below. When I perform the same
>> procedure after booting into the same kernel standalone, the problem
>> does not occur.
>
> Likely because the action that triggers this doesn't get performed in
> that case?

Thanks for the response.

I am obviously comparing apples and oranges, but I have tried to keep the
two cases as similar as possible: I limited the standalone kernel's memory
to 512M, as I do with Dom0, and used a background task writing
/dev/urandom to the LV that the domU would normally be using. The only
differences are that the standalone kernel isn't running under Xen and
there is no domU running in the background. I will repeat the exercise
with no domU running, but under Xen.

>> Can anyone please suggest what I am doing wrong, or identify whether it
>> is a bug?
>
> Considering that the exception address ...
>
>> RIP: e030:[<ffffffff8142655d>] [<ffffffff8142655d>] scsi_dispatch_cmd+0x6d/0x2e0
>
> ... and call stack ...
>
>> [24149.786311] Call Trace:
>> [24149.786315]  <IRQ>
>> [24149.786323]  [<ffffffff8142da62>] scsi_request_fn+0x3a2/0x470
>> [24149.786333]  [<ffffffff812f1a28>] blk_run_queue+0x38/0x60
>> [24149.786339]  [<ffffffff8142c416>] scsi_run_queue+0xd6/0x1b0
>> [24149.786347]  [<ffffffff8142e822>] scsi_next_command+0x42/0x60
>> [24149.786354]  [<ffffffff8142ea52>] scsi_io_completion+0x1b2/0x630
>> [24149.786363]  [<ffffffff816611fe>] ? _raw_spin_unlock_irqrestore+0x1e/0x30
>> [24149.786371]  [<ffffffff81424b5c>] scsi_finish_command+0xcc/0x130
>> [24149.786378]  [<ffffffff8142e7ae>] scsi_softirq_done+0x13e/0x150
>> [24149.786386]  [<ffffffff812fb6b3>] blk_done_softirq+0x83/0xa0
>> [24149.786394]  [<ffffffff8106fa38>] __do_softirq+0xa8/0x210
>> [24149.786402]  [<ffffffff8166ba6c>] call_softirq+0x1c/0x30
>> [24149.786410]  [<ffffffff810162f5>] do_softirq+0x65/0xa0
>> [24149.786416]  [<ffffffff8106fe1e>] irq_exit+0x8e/0xb0
>> [24149.786428]  [<ffffffff813aecd5>] xen_evtchn_do_upcall+0x35/0x50
>> [24149.786436]  [<ffffffff8166babe>] xen_do_hypervisor_callback+0x1e/0x30
>> [24149.786441]  <EOI>
>> [24149.786449]  [<ffffffff810013aa>] ? hypercall_page+0x3aa/0x1000
>> [24149.786456]  [<ffffffff810013aa>] ? hypercall_page+0x3aa/0x1000
>> [24149.786464]  [<ffffffff8100a500>] ? xen_safe_halt+0x10/0x20
>> [24149.786472]  [<ffffffff8101c913>] ? default_idle+0x53/0x1d0
>> [24149.786478]  [<ffffffff81013236>] ? cpu_idle+0xd6/0x120
>
> ... point into the SCSI subsystem, this is likely the wrong list to ask
> for help on.

... but the right list to confirm that I am on the wrong list? :)

Seriously, the specific evidence may suggest it is a non-Xen issue/bug,
but Xen is the only measurable/visible difference so far. I referred it to
this list because the demarcation between hypervisor, PVOPS and regular
kernel code is probably best understood here.

Thanks again for your response.

> Jan
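
For reference, the hourly cycle described in the report corresponds roughly
to the sequence below. This is only a sketch: the volume group (vg0), the
logical volume names (domu-disk, domu-snap, backup-copy), the domU name
(guest1), the snapshot size and the tape device (/dev/st0) are all
illustrative assumptions, not details taken from the thread.

    #!/bin/sh
    # Sketch of the hourly backup cycle; all names and sizes are assumed.
    set -e

    xl shutdown -w guest1                              # stop the domU so its LV is quiescent
    lvcreate -s -L 4G -n domu-snap /dev/vg0/domu-disk  # snapshot the dedicated LV
    xl create /etc/xen/guest1.cfg                      # restart the domU on the live LV

    # Copy the snapshot to a second LV, then drop the snapshot.
    dd if=/dev/vg0/domu-snap of=/dev/vg0/backup-copy bs=4M
    lvremove -f /dev/vg0/domu-snap

    # Stream the second LV through gzip onto the DAT tape.
    dd if=/dev/vg0/backup-copy bs=4M | gzip -c > /dev/st0

The standalone comparison would drop the domU steps and instead run a
background writer along the lines of
"dd if=/dev/urandom of=/dev/vg0/domu-disk bs=1M &", again with an assumed
LV name.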