
Re: [Xen-devel] [PATCH 1/6] xenbus: prepare data structures and parameter for xenwatch multithreading


  • To: Dongli Zhang <dongli.zhang@xxxxxxxxxx>, xen-devel@xxxxxxxxxxxxxxxxxxxx, linux-kernel@xxxxxxxxxxxxxxx
  • From: Boris Ostrovsky <boris.ostrovsky@xxxxxxxxxx>
  • Date: Mon, 17 Sep 2018 15:08:22 -0400
  • Cc: jgross@xxxxxxxx, wei.liu2@xxxxxxxxxx, konrad.wilk@xxxxxxxxxx, srinivas.eeda@xxxxxxxxxx, paul.durrant@xxxxxxxxxx, roger.pau@xxxxxxxxxx
  • Delivery-date: Mon, 17 Sep 2018 19:07:31 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>
  • Openpgp: preference=signencrypt

On 9/16/18 9:20 PM, Dongli Zhang wrote:
> Hi Boris,
>
> On 09/17/2018 04:17 AM, Boris Ostrovsky wrote:
>>
>> On 9/14/18 3:34 AM, Dongli Zhang wrote:
>>
>>> +
>>> +struct mtwatch_info {
>>> +    /*
>>> +     * The mtwatch_domain is put on both a hash table and a list.
>>> +     * domain_list is used to optimize xenbus_watch un-registration.
>>> +     *
>>> +     * The mtwatch_domain is removed from domain_hash (with state set
>>> +     * to MTWATCH_DOMAIN_DOWN) when its refcnt is zero. However, it is
>>> +     * left on domain_list until all events belonging to such
>>> +     * mtwatch_domain are processed in mtwatch_thread().
>>
>> Do we really need to keep mtwatch_domain on both lists? Why is keeping it on,
>> say, only the hash not sufficient?
> In the current xenbus code, when a watch is unregistered (e.g., via
> unregister_xenbus_watch()), we need to traverse the list 'watch_events' to
> remove all in-flight/pending events for that watch.
>
> With this patch set, each domain has its own event list, so we need to
> traverse the list of every domain to remove the pending events for the watch
> being unregistered.
>
> E.g.,
> unregister_xenbus_watch()-->unregister_mtwatch()-->unregister_all_mtwatch() in
> [PATCH 2/6] xenbus: implement the xenwatch multithreading framework.
>
> To traverse a hash table is not as efficient as traversing a list. That's why
> a domain is kept on both the hash table and list.


Keeping the same object on two lists also has costs. More importantly,
IMO, it increases the chance of introducing a bug when people update one
instance but not the other.
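
To make the hash-only alternative concrete, here is a minimal user-space
sketch of unregistering a watch by walking the hash buckets alone. It is not
the patch's code: struct dom, struct event, HASH_SIZE, watch_id and every
function name below are hypothetical stand-ins, and locking is omitted
entirely. It only shows that every domain's pending events can be reached
without a second list, at the cost of also visiting empty buckets.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define HASH_SIZE 8  /* hypothetical; stands in for MTWATCH_HASH_SIZE */

/* One pending watch event queued on its domain (simplified). */
struct event {
    int watch_id;          /* hypothetical stand-in for the xenbus_watch */
    struct event *next;
};

/* Simplified stand-in for mtwatch_domain: hash linkage only, no list. */
struct dom {
    unsigned int domid;
    struct event *events;  /* pending events for this domain */
    struct dom *hash_next; /* next domain in the same bucket */
};

struct dom_table {
    struct dom *buckets[HASH_SIZE];
};

/* Allocate a domain and link it at the head of its bucket. */
static struct dom *dom_add(struct dom_table *t, unsigned int domid)
{
    struct dom *d = calloc(1, sizeof(*d));

    assert(d != NULL);
    d->domid = domid;
    d->hash_next = t->buckets[domid % HASH_SIZE];
    t->buckets[domid % HASH_SIZE] = d;
    return d;
}

/* Queue one pending event for watch_id on domain d. */
static void dom_queue_event(struct dom *d, int watch_id)
{
    struct event *e = calloc(1, sizeof(*e));

    assert(e != NULL);
    e->watch_id = watch_id;
    e->next = d->events;
    d->events = e;
}

/*
 * Drop every pending event that belongs to watch_id, visiting all
 * domains through the hash buckets alone. Returns the number removed.
 */
static int purge_watch_events(struct dom_table *t, int watch_id)
{
    int removed = 0;

    for (size_t b = 0; b < HASH_SIZE; b++) {
        for (struct dom *d = t->buckets[b]; d; d = d->hash_next) {
            struct event **pp = &d->events;

            while (*pp) {
                struct event *e = *pp;

                if (e->watch_id == watch_id) {
                    *pp = e->next;   /* unlink, then free */
                    free(e);
                    removed++;
                } else {
                    pp = &e->next;
                }
            }
        }
    }
    return removed;
}

/* Count the events still pending on domain d. */
static int dom_pending(const struct dom *d)
{
    int n = 0;

    for (const struct event *e = d->events; e; e = e->next)
        n++;
    return n;
}
```

The outer loop costs one step per bucket in addition to one step per domain,
which is the inefficiency the quoted reply refers to. Whether that matters
depends on the table size and on how often watches are unregistered.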


>
>>> +     *
>>> +     * While there may exist two mtwatch_domain with the same domid on
>>> +     * domain_list simultaneously,
>>
>> How is it possible to have two domids on the list at the same time? Don't you
>> want to remove it (which IIUC means moving it to the purge list) when the
>> domain is destroyed?
> Here is one case (suppose the domid/frontend-id is 9):
>
> 1. Suppose the last pv driver device is removed from domid=9, so the
> reference count of the per-domU xenwatch thread for domid=9 (which we call the
> old thread below) drops to zero and the thread should be removed. We remove it
> from the hash table (it is left on the list).
>
> Here we cannot remove the domain from the list immediately because there might
> be pending events being processed by the corresponding per-domU xenwatch
> thread. If we remove it from the list while a related watch is being
> unregistered, as mentioned for the last question, we may hit a page fault when
> processing a watch event.


Don't you need to grab domain->domain_mutex to remove the driver?
Meaning that events for that mtwatch thread cannot be processed?

In any case, I think that having two mtwatch_domains for the same domain
should be avoided. (and if you keep the hash list only then this issue
gets resolved automatically ;-))


-boris


>
> 2. Now the administrator attaches a new pv device to domid=9 immediately, and
> therefore the reference count is initially set to 1. The per-domU xenwatch
> thread for domid=9 (called the new thread) is created again. It is inserted
> into both the hash table and the list.
>
> 3. As the old thread for domid=9 might still be on the list, we would have two
> threads for domid=9 (an old one to be removed and a newly inserted one to be
> used by new pv devices).
>
> Dongli Zhang
>
>>
>> -boris
>>
>>
>>> +     * all mtwatch_domain on domain_hash
>>> +     * should have unique domid.
>>> +     */
>>> +    spinlock_t domain_lock;
>>> +    struct hlist_head domain_hash[MTWATCH_HASH_SIZE];
>>> +    struct list_head domain_list;
>>> +
>>> +    /*
>>> +     * When a per-domU kthread is going to be destroyed, it is put
>>> +     * on the purge_list, and will be flushed by purge_work later.
>>> +     */
>>> +    struct work_struct purge_work;
>>> +    spinlock_t purge_lock;
>>> +    struct list_head purge_list;
>>> +};
>>
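
For readers following the thread, the duplicate-domid scenario from the quoted
steps 1 to 3 can be reproduced with a small user-space sketch. This is an
illustration under assumptions, not the patch's implementation: struct dom,
struct info, dom_get(), dom_put(), dom_reap() and count_on_list() are all
hypothetical names, singly linked chains replace hlist_head/list_head, and
locking and the purge work are left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define HASH_SIZE 8   /* hypothetical; stands in for MTWATCH_HASH_SIZE */

enum dom_state { DOM_UP, DOM_DOWN };  /* DOM_DOWN mirrors MTWATCH_DOMAIN_DOWN */

/* Simplified stand-in for mtwatch_domain, linked on a hash and a list. */
struct dom {
    unsigned int domid;
    int refcnt;
    enum dom_state state;
    struct dom *hash_next;  /* linkage on the hash bucket */
    struct dom *list_next;  /* linkage on the global list */
};

struct info {
    struct dom *hash[HASH_SIZE];
    struct dom *list;
};

/* Look a domain up by domid; only hashed (live) domains are visible. */
static struct dom *hash_find(struct info *in, unsigned int domid)
{
    for (struct dom *d = in->hash[domid % HASH_SIZE]; d; d = d->hash_next)
        if (d->domid == domid)
            return d;
    return NULL;
}

/* Take a reference; create the domain on both hash and list if absent. */
static struct dom *dom_get(struct info *in, unsigned int domid)
{
    struct dom *d = hash_find(in, domid);

    if (d) {
        d->refcnt++;
        return d;
    }
    d = calloc(1, sizeof(*d));
    assert(d != NULL);
    d->domid = domid;
    d->refcnt = 1;
    d->state = DOM_UP;
    d->hash_next = in->hash[domid % HASH_SIZE];
    in->hash[domid % HASH_SIZE] = d;
    d->list_next = in->list;
    in->list = d;
    return d;
}

/* Drop a reference; at zero, unhash the domain but leave it on the list. */
static void dom_put(struct info *in, struct dom *d)
{
    struct dom **pp;

    if (--d->refcnt > 0)
        return;
    for (pp = &in->hash[d->domid % HASH_SIZE]; *pp; pp = &(*pp)->hash_next) {
        if (*pp == d) {
            *pp = d->hash_next;
            break;
        }
    }
    d->state = DOM_DOWN;
}

/* Once the old thread has drained its events, unlink from the list and free. */
static void dom_reap(struct info *in, struct dom *d)
{
    for (struct dom **pp = &in->list; *pp; pp = &(*pp)->list_next) {
        if (*pp == d) {
            *pp = d->list_next;
            free(d);
            return;
        }
    }
}

/* Count list entries with the given domid, including DOM_DOWN ones. */
static int count_on_list(const struct info *in, unsigned int domid)
{
    int n = 0;

    for (const struct dom *d = in->list; d; d = d->list_next)
        n += (d->domid == domid);
    return n;
}
```

Calling dom_get(9), dom_put() on the result, and dom_get(9) again before
dom_reap() runs on the old entry leaves two entries for domid 9 on the list
while the hash holds only the new one. With the hash as the only container,
that intermediate state would have nowhere to live, which is the point made
in the reply above.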


_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxxx
https://lists.xenproject.org/mailman/listinfo/xen-devel
