
RE: [Xen-devel] IRQ SMP affinity problems in domU with vcpus > 4 on HP ProLiant G6 with dual Xeon 5540 (Nehalem)



Qing,

Your patch worked. It suppressed the extra write that previously overwrote the 
MSI message data with the old vector. The "no handler for irq" message is gone 
and the interrupts were successfully migrated to the new CPU. I still saw a 
hang on both domU and dom0 when I changed the smp_affinity of all 4 PCI devices 
(I have a 4-function PCI device) at the same time (the "echo <new_smp_affinity> 
> /proc/irq/<irq#>/smp_affinity" commands are in a shell script), but I didn't 
get a chance to pursue this today.

Dante

-----Original Message-----
From: Qing He [mailto:qing.he@xxxxxxxxx] 
Sent: Wednesday, October 21, 2009 10:11 PM
To: Zhang, Xiantao
Cc: Cinco, Dante; xen-devel@xxxxxxxxxxxxxxxxxxx; keir.fraser@xxxxxxxxxxxxx
Subject: Re: [Xen-devel] IRQ SMP affinity problems in domU with vcpus > 4 on HP 
ProLiant G6 with dual Xeon 5540 (Nehalem)

On Thu, 2009-10-22 at 09:58 +0800, Zhang, Xiantao wrote:
> > (XEN) traps.c:1626: guest_io_write::pci_conf_write data=0x40ba
> 
> This should be written by dom0(likely to be Qemu).  And if it does 
> exist, we may have to prohibit such unsafe writings about MSI in Qemu.

Yes, that is the case. The problem is in Qemu, whose algorithm looks like 
this:

    pt_pci_write_config(new_value)
    {
        dev_value = pci_read_block();   // snapshot of the physical register

        value = msi_write_handler(dev_value, new_value);

        pci_write_block(value);         // writes the snapshot back
    }

    msi_write_handler(dev_value, new_value)
    {
        HYPERVISOR_bind_pt_irq(); // updates MSI binding

        return dev_value;   // it decides not to change it
    }

The problem lies here: when bind_pt_irq is called, the real physical 
data/address is updated by the hypervisor. No problem was exposed before 
because at that time the hypervisor used a universal vector, so the 
data/address of the MSI remained unchanged. That is no longer the case with 
per-CPU vectors: the pci_write_block in Qemu is now undesirable, because it 
writes the stale value back into the register and invalidates any modifications.

Clearly, if Qemu decides to hand the management of these registers to the 
hypervisor, it shouldn't touch them again. Here is a patch to fix this by 
introducing a no_wb flag. Could you give it a try?
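
To illustrate the idea, here is a rough, self-contained C sketch of the stale 
write-back and of how a no_wb flag avoids it. This is not the actual patch and 
not real Qemu or Xen code: fake_bind_pt_irq, struct write_result, run_demo and 
the two register values are hypothetical stand-ins, and the MSI data register 
is modelled as a plain variable.

```c
#include <stdint.h>

/* Illustrative values only; they are not taken from real hardware. */
#define OLD_MSI_DATA 0x40ba
#define NEW_MSI_DATA 0x40c2

/* Hypothetical model of the device's physical MSI data register. */
static uint16_t msi_data_reg = OLD_MSI_DATA;

/* Hypothetical stand-in for HYPERVISOR_bind_pt_irq(): with per-CPU
 * vectors the hypervisor reprograms the physical MSI data itself. */
static void fake_bind_pt_irq(void)
{
    msi_data_reg = NEW_MSI_DATA;
}

/* Handler result: value to write back, plus a "no write-back" flag. */
struct write_result {
    uint16_t value;
    int no_wb;
};

static struct write_result msi_write_handler(uint16_t dev_value, int use_no_wb)
{
    struct write_result r;

    fake_bind_pt_irq();      /* physical register changes here */
    r.value = dev_value;     /* stale: read before the update above */
    r.no_wb = use_no_wb;     /* tell the caller to leave the register alone */
    return r;
}

static void pt_pci_write_config(int use_no_wb)
{
    uint16_t dev_value = msi_data_reg;   /* models pci_read_block() */
    struct write_result r = msi_write_handler(dev_value, use_no_wb);

    if (!r.no_wb)
        msi_data_reg = r.value;          /* models pci_write_block() */
}

/* Runs one config write and returns the final register content. */
uint16_t run_demo(int use_no_wb)
{
    msi_data_reg = OLD_MSI_DATA;
    pt_pci_write_config(use_no_wb);
    return msi_data_reg;
}
```

In this model, run_demo(0) leaves the register at OLD_MSI_DATA because the 
snapshot is written back over the hypervisor's update, while run_demo(1) 
keeps NEW_MSI_DATA because the write-back is skipped.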

Thanks,
Qing

