[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [XEN PATCH v10 3/5] x86/pvh: Add PHYSDEVOP_setup_gsi for PVH dom0


  • To: Jan Beulich <jbeulich@xxxxxxxx>
  • From: "Chen, Jiqian" <Jiqian.Chen@xxxxxxx>
  • Date: Wed, 19 Jun 2024 10:10:41 +0000
  • Accept-language: en-US
  • Arc-authentication-results: i=1; mx.microsoft.com 1; spf=pass smtp.mailfrom=amd.com; dmarc=pass action=none header.from=amd.com; dkim=pass header.d=amd.com; arc=none
  • Arc-message-signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector9901; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=UAFaY3835/5VXsvcwiLyNxPzUKU3mPun3ddUNNa43as=; b=nMqLytwbzGDzG/LyQdh1RtdKPSVe47BlK8oHM2Ickh/rkOnxpQzjJQY9sFMZN8B3/ApI4jqNqhxZwfs2V2Nc5zpP/EzlrglN7Zg2WCymKB0SpOUlunEMMuT3KqKpRzH7LLgHKmcjBqCPmMKwpPEb+DQ+FQZpZ8tnh8gkjbrVFDutixvJvMxSud9b5S/TGlAE7+6uOF04wPTvH6C4wCbKZCoip8HIUTRxbVyVJQ4H1rFTz33/MkMcAWD9BEvC2YgrfURVO4MzFxOjRs994bB+cZ9UcuivdhYrPMq1nkHeqK2pQpRF1+u5ngIUlwY2PwwPTjceDwxAf8grLKshUvSUAg==
  • Arc-seal: i=1; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=none; b=P1GnRDn0aMS+dNaWBkKrUhExfMQPlN0fPjf9zLtonvtg7ZREsY7DjET//nf34ZrB28+YI3i80wdJjOqk8w740OgxjJgVRiaAEp/mrQ3nFhVH834PAlWSbOqlJpnR99B6C6gx1a+fMFfMi9u1/hwBq95XMwRCJySzDSNWmeNVYtCu7b5KZZNpMURyxaQGhJBK+Q/bQcEXDOXX/znepCiO8ZX0GO7dL3R8Tx257GY8/wZUAI5D8Fu6klDQnMGZM9XC+fTTqVehIwbqqXkX/9B2NKUaJgDkRZZGbCc0yvSGo5dIjVhM5TJGAoE2BsvRzvxEkRatEuR7Od6LDAIaqRmBZg==
  • Authentication-results: dkim=none (message not signed) header.d=none;dmarc=none action=none header.from=amd.com;
  • Cc: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>, Roger Pau Monné <roger.pau@xxxxxxxxxx>, Wei Liu <wl@xxxxxxx>, George Dunlap <george.dunlap@xxxxxxxxxx>, Julien Grall <julien@xxxxxxx>, Stefano Stabellini <sstabellini@xxxxxxxxxx>, Anthony PERARD <anthony@xxxxxxxxxxxxxx>, Juergen Gross <jgross@xxxxxxxx>, "Daniel P . Smith" <dpsmith@xxxxxxxxxxxxxxxxxxxx>, "Hildebrand, Stewart" <Stewart.Hildebrand@xxxxxxx>, "Huang, Ray" <Ray.Huang@xxxxxxx>, "xen-devel@xxxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxxx>, "Chen, Jiqian" <Jiqian.Chen@xxxxxxx>
  • Delivery-date: Wed, 19 Jun 2024 10:10:54 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>
  • Thread-index: AQHawJTljFn+TFlFw0Cz/HTewbBwKbHMCqyAgAGSNQD//5yWgIAB+7GA//+I/4CAAJBIgP//jEaAABEfFIA=
  • Thread-topic: [XEN PATCH v10 3/5] x86/pvh: Add PHYSDEVOP_setup_gsi for PVH dom0

On 2024/6/19 17:49, Jan Beulich wrote:
> On 19.06.2024 10:51, Chen, Jiqian wrote:
>> On 2024/6/19 16:06, Jan Beulich wrote:
>>> On 19.06.2024 09:53, Chen, Jiqian wrote:
>>>> On 2024/6/18 16:55, Jan Beulich wrote:
>>>>> On 18.06.2024 08:57, Chen, Jiqian wrote:
>>>>>> On 2024/6/17 22:52, Jan Beulich wrote:
>>>>>>> On 17.06.2024 11:00, Jiqian Chen wrote:
>>>>>>>> The gsi of a passthrough device must be configured for it to be
>>>>>>>> able to be mapped into a hvm domU.
>>>>>>>> But When dom0 is PVH, the gsis don't get registered, it causes
>>>>>>>> the info of apic, pin and irq not be added into irq_2_pin list,
>>>>>>>> and the handler of irq_desc is not set, then when passthrough a
>>>>>>>> device, setting ioapic affinity and vector will fail.
>>>>>>>>
>>>>>>>> To fix above problem, on Linux kernel side, a new code will
>>>>>>>> need to call PHYSDEVOP_setup_gsi for passthrough devices to
>>>>>>>> register gsi when dom0 is PVH.
>>>>>>>>
>>>>>>>> So, add PHYSDEVOP_setup_gsi into hvm_physdev_op for above
>>>>>>>> purpose.
>>>>>>>>
>>>>>>>> Signed-off-by: Jiqian Chen <Jiqian.Chen@xxxxxxx>
>>>>>>>> Signed-off-by: Huang Rui <ray.huang@xxxxxxx>
>>>>>>>> Signed-off-by: Jiqian Chen <Jiqian.Chen@xxxxxxx>
>>>>>>>> ---
>>>>>>>> The code link that will call this hypercall on linux kernel side is as 
>>>>>>>> follows:
>>>>>>>> https://lore.kernel.org/xen-devel/20240607075109.126277-3-Jiqian.Chen@xxxxxxx/
>>>>>>>
>>>>>>> One of my v9 comments was addressed, thanks. Repeating the other, 
>>>>>>> unaddressed
>>>>>>> one here:
>>>>>>> "As to GSIs not being registered: If that's not a problem for Dom0's own
>>>>>>>  operation, I think it'll also want/need explaining why what is 
>>>>>>> sufficient for
>>>>>>>  Dom0 alone isn't sufficient when pass-through comes into play."
>>>>>> I have modified the commit message to describe why GSIs are not 
>>>>>> registered can cause passthrough not work, according to this v9 comment.
>>>>>> " it causes the info of apic, pin and irq not be added into irq_2_pin 
>>>>>> list, and the handler of irq_desc is not set, then when passthrough a 
>>>>>> device, setting ioapic affinity and vector will fail."
>>>>>> What description do you want me to add?
>>>>>
>>>>> What I'd first like to have clarification on (i.e. before putting it in
>>>>> the description one way or another): How come Dom0 alone gets away fine
>>>>> without making the call, yet for passthrough-to-DomU it's needed? Is it
>>>>> perhaps that it just so happened that for Dom0 things have been working
>>>>> on systems where it was tested, but the call should in principle have been
>>>>> there in this case, too [1]? That (to me at least) would make quite a
>>>>> difference for both this patch's description and us accepting it.
>>>> Oh, I think I know what's your concern now. Thanks.
>>>> First question, why gsi of device can work on PVH dom0:
>>>> Because when probe a driver to a normal device, it will call linux kernel 
>>>> side:pci_device_probe-> request_threaded_irq-> irq_startup-> 
>>>> __unmask_ioapic-> io_apic_write, then trap into xen side hvmemul_do_io-> 
>>>> hvm_io_intercept-> hvm_process_io_intercept-> vioapic_write_indirect-> 
>>>> vioapic_hwdom_map_gsi-> mp_register_gsi. So that the gsi can be registered.
>>>> Second question, why gsi of passthrough can't work on PVH dom0:
>>>> Because when assign a device to be passthrough, it uses pciback to probe 
>>>> the device, and it calls pcistub_probe, but in all callstack of 
>>>> pcistub_probe, it doesn't unmask the gsi, and we can see on Xen side, the 
>>>> function vioapic_hwdom_map_gsi-> mp_register_gsi will be called only when 
>>>> the gsi is unmasked, so that the gsi can't work for passthrough device.
>>>
>>> And why exactly would the fake IRQ handler not be set up by pciback? Its
>>> setting up ought to lead to those same IO-APIC RTE writes that Xen
>>> intercepts.
>> Because isr_on is not set, when xen_pcibk_control_isr is called, it will 
>> return due to " !dev_data->isr_on". So that fake IRQ handler aren't 
>> installed.
> 
> I'm afraid I don't follow you here. Quoting from the function:
> 
>       enable =  dev_data->enable_intx;
> 
>       /* Asked to disable, but ISR isn't runnig */
>       if (!enable && !dev_data->isr_on)
>               return;
> 
> I.e. we bail if the request was to _disable_ and there is no ISR.
I mean, after debugging the pcistub_probe callstack:
pcistub_seize-> pcistub_init_device-> xen_pcibk_reset_device-> 
xen_pcibk_control_isr(dev, 1 /*reset device*/)
and in xen_pcibk_control_isr code:
        if (reset) {
                dev_data->enable_intx = 0;
                dev_data->ack_intr = 0;
        }
        enable =  dev_data->enable_intx;

        /* Asked to disable, but ISR isn't runnig */
        if (!enable && !dev_data->isr_on)
                return;
"reset" is 1, so "dev_date->enable_intx" is set 0, then "enable" is 0, and then 
arrive the second "if" check, "enable" is 0 and "dev_date->isr_on" is also 0, 
so it return here.
It can't reach following code to install irq handle.

> 
> I can't exclude though that command_write()'s logic to set ->enable_intx
> is insufficient. But in the common case one would surely expect at least
> one of PCI_COMMAND_MEMORY and PCI_COMMAND_IO to be set first by a guest.
> IOW at some point I'd expect xen_pcibk_control_isr() to be called with
> the second argument 0 and with ->enable_intx set.
I only see xen_pcibk_control_isr(dev, 0) is called in xen_pcibk_do_one_op, but 
xen_pcibk_do_one_op is not called during assigning a device to be passthrough.

> 
>> And it seems isr_on is set through driver sysfs " irq_handler_state" for a 
>> level device that is to be shared with guest and the IRQ is shared with the 
>> initial domain.
> 
> The sysfs interface is, according to my reading of the description
> of the commit introducing it, merely for debugging / recovery purposes.
> (It also looks to me as if this was partly broken: If one would use it,
> thus clearing ->isr_on, a subsequent disable request would take exactly
> that early bailing path quoted above, with nothing removing the IRQ
> handler.)
> 
> That description also talks about both an edge vs level distinction in
> behavior and one for shared vs non-shared, but neither in that commit
> nor in present code I can spot any respective checks. Otherwise I could
> understand that there are cases where the necessary information isn't
> propagated to Xen.
> 
> Jan

-- 
Best regards,
Jiqian Chen.

 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.