[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [XEN PATCH v10 3/5] x86/pvh: Add PHYSDEVOP_setup_gsi for PVH dom0
On 19.06.2024 10:51, Chen, Jiqian wrote: > On 2024/6/19 16:06, Jan Beulich wrote: >> On 19.06.2024 09:53, Chen, Jiqian wrote: >>> On 2024/6/18 16:55, Jan Beulich wrote: >>>> On 18.06.2024 08:57, Chen, Jiqian wrote: >>>>> On 2024/6/17 22:52, Jan Beulich wrote: >>>>>> On 17.06.2024 11:00, Jiqian Chen wrote: >>>>>>> The gsi of a passthrough device must be configured for it to be >>>>>>> able to be mapped into a hvm domU. >>>>>>> But When dom0 is PVH, the gsis don't get registered, it causes >>>>>>> the info of apic, pin and irq not be added into irq_2_pin list, >>>>>>> and the handler of irq_desc is not set, then when passthrough a >>>>>>> device, setting ioapic affinity and vector will fail. >>>>>>> >>>>>>> To fix above problem, on Linux kernel side, a new code will >>>>>>> need to call PHYSDEVOP_setup_gsi for passthrough devices to >>>>>>> register gsi when dom0 is PVH. >>>>>>> >>>>>>> So, add PHYSDEVOP_setup_gsi into hvm_physdev_op for above >>>>>>> purpose. >>>>>>> >>>>>>> Signed-off-by: Jiqian Chen <Jiqian.Chen@xxxxxxx> >>>>>>> Signed-off-by: Huang Rui <ray.huang@xxxxxxx> >>>>>>> Signed-off-by: Jiqian Chen <Jiqian.Chen@xxxxxxx> >>>>>>> --- >>>>>>> The code link that will call this hypercall on linux kernel side is as >>>>>>> follows: >>>>>>> https://lore.kernel.org/xen-devel/20240607075109.126277-3-Jiqian.Chen@xxxxxxx/ >>>>>> >>>>>> One of my v9 comments was addressed, thanks. Repeating the other, >>>>>> unaddressed >>>>>> one here: >>>>>> "As to GSIs not being registered: If that's not a problem for Dom0's own >>>>>> operation, I think it'll also want/need explaining why what is >>>>>> sufficient for >>>>>> Dom0 alone isn't sufficient when pass-through comes into play." >>>>> I have modified the commit message to describe why GSIs are not >>>>> registered can cause passthrough not work, according to this v9 comment. >>>>> " it causes the info of apic, pin and irq not be added into irq_2_pin >>>>> list, and the handler of irq_desc is not set, then when passthrough a >>>>> device, setting ioapic affinity and vector will fail." >>>>> What description do you want me to add? >>>> >>>> What I'd first like to have clarification on (i.e. before putting it in >>>> the description one way or another): How come Dom0 alone gets away fine >>>> without making the call, yet for passthrough-to-DomU it's needed? Is it >>>> perhaps that it just so happened that for Dom0 things have been working >>>> on systems where it was tested, but the call should in principle have been >>>> there in this case, too [1]? That (to me at least) would make quite a >>>> difference for both this patch's description and us accepting it. >>> Oh, I think I know what's your concern now. Thanks. >>> First question, why gsi of device can work on PVH dom0: >>> Because when probe a driver to a normal device, it will call linux kernel >>> side:pci_device_probe-> request_threaded_irq-> irq_startup-> >>> __unmask_ioapic-> io_apic_write, then trap into xen side hvmemul_do_io-> >>> hvm_io_intercept-> hvm_process_io_intercept-> vioapic_write_indirect-> >>> vioapic_hwdom_map_gsi-> mp_register_gsi. So that the gsi can be registered. >>> Second question, why gsi of passthrough can't work on PVH dom0: >>> Because when assign a device to be passthrough, it uses pciback to probe >>> the device, and it calls pcistub_probe, but in all callstack of >>> pcistub_probe, it doesn't unmask the gsi, and we can see on Xen side, the >>> function vioapic_hwdom_map_gsi-> mp_register_gsi will be called only when >>> the gsi is unmasked, so that the gsi can't work for passthrough device. >> >> And why exactly would the fake IRQ handler not be set up by pciback? Its >> setting up ought to lead to those same IO-APIC RTE writes that Xen >> intercepts. > Because isr_on is not set, when xen_pcibk_control_isr is called, it will > return due to " !dev_data->isr_on". So that fake IRQ handler aren't installed. I'm afraid I don't follow you here. Quoting from the function: enable = dev_data->enable_intx; /* Asked to disable, but ISR isn't runnig */ if (!enable && !dev_data->isr_on) return; I.e. we bail if the request was to _disable_ and there is no ISR. I can't exclude though that command_write()'s logic to set ->enable_intx is insufficient. But in the common case one would surely expect at least one of PCI_COMMAND_MEMORY and PCI_COMMAND_IO to be set first by a guest. IOW at some point I'd expect xen_pcibk_control_isr() to be called with the second argument 0 and with ->enable_intx set. > And it seems isr_on is set through driver sysfs " irq_handler_state" for a > level device that is to be shared with guest and the IRQ is shared with the > initial domain. The sysfs interface is, according to my reading of the description of the commit introducing it, merely for debugging / recovery purposes. (It also looks to me as if this was partly broken: If one would use it, thus clearing ->isr_on, a subsequent disable request would take exactly that early bailing path quoted above, with nothing removing the IRQ handler.) That description also talks about both an edge vs level distinction in behavior and one for shared vs non-shared, but neither in that commit nor in present code I can spot any respective checks. Otherwise I could understand that there are cases where the necessary information isn't propagated to Xen. Jan
|
Lists.xenproject.org is hosted with RackSpace, monitoring our |