[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH] libxl_qmp: wait for completion of device removal



On Tue, Jul 02, 2019 at 02:42:33PM +0100, Anthony PERARD wrote:
>Hi,
>
>Thanks for the patch. I've got some comments.
>
>On Tue, Jul 02, 2019 at 03:46:40PM +0800, Chao Gao wrote:
>> To remove a device from a domain, a qmp command is sent to qemu. But it is
>> handled by qemu asychronously. Even the qmp command is claimed to be done,
>> the actual handling in qemu side may happen later.
>> This behavior brings two questions:
>> 1. Attaching a device back to a domain right after detaching the device from
>> that domain would fail with error:
>> 
>> libxl: error: libxl_qmp.c:341:qmp_handle_error_response: Domain 1:received an
>> error message from QMP server: Duplicate ID 'pci-pt-60_00.0' for device
>> 
>> 2. Accesses to PCI configuration space in Qemu may overlap with later device
>> reset issued by 'xl' or by pciback.
>> 
>> In order to avoid mentioned questions, wait for the completion of device
>> removal by querying all pci devices using qmp command and ensuring the target
>> device isn't listed. Only retry 5 times to avoid 'xl' potentially being 
>> blocked
>> by qemu.
>> 
>> Signed-off-by: Chao Gao <chao.gao@xxxxxxxxx>
>> ---
>>  tools/libxl/libxl_qmp.c | 57 
>> ++++++++++++++++++++++++++++++++++++++++++++++++-
>>  1 file changed, 56 insertions(+), 1 deletion(-)
>> 
>> diff --git a/tools/libxl/libxl_qmp.c b/tools/libxl/libxl_qmp.c
>> index 42c8ab8..18f6126 100644
>> --- a/tools/libxl/libxl_qmp.c
>> +++ b/tools/libxl/libxl_qmp.c
>> @@ -1000,9 +1032,32 @@ int libxl__qmp_pci_add(libxl__gc *gc, int domid, 
>> libxl_device_pci *pcidev)
>>  static int qmp_device_del(libxl__gc *gc, int domid, char *id)
>>  {
>>      libxl__json_object *args = NULL;
>> +    libxl__qmp_handler *qmp = NULL;
>> +    int rc = 0;
>> +
>> +    qmp = libxl__qmp_initialize(gc, domid);
>> +    if (!qmp)
>> +        return ERROR_FAIL;
>>  
>>      qmp_parameters_add_string(gc, &args, "id", id);
>> -    return qmp_run_command(gc, domid, "device_del", args, NULL, NULL);
>> +    rc = qmp_synchronous_send(qmp, "device_del", args,
>> +                              NULL, NULL, qmp->timeout);
>> +    if (rc == 0) {
>> +        unsigned int retry = 0;
>> +
>> +        do {
>> +            if (qmp_synchronous_send(qmp, "query-pci", NULL,
>> +                                     pci_del_callback, id, qmp->timeout) == 
>> 0) {
>
>Could you assign the return value of qmp_synchronous_send into a
>variable, then check for success/error?
>
>qmp_synchronous_send returns several possible values:
>- errors, when rc < 0
>- rc of the callback, which is 0 or 1 here.
>
>If there's an error, we don't want to keep trying we probably want to
>return that error.

Sure. Will do. But I prefer to continue device removal to clean up
related status and mapping set up for the device in Xen (VT-d) and
pciback. It at least makes the device available to other domains.

Thanks
Chao

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxxx
https://lists.xenproject.org/mailman/listinfo/xen-devel

 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.