Re: [Xen-devel] [PATCH RFC 0/6] Slotted channels for sync vm_events
On 2/19/19 11:48 AM, Razvan Cojocaru wrote:
> On 2/12/19 7:01 PM, Tamas K Lengyel wrote:
>> On Thu, Feb 7, 2019 at 9:06 AM Petre Ovidiu PIRCALABU <ppircalabu@xxxxxxxxxxxxxxx> wrote:
>>> On Thu, 2019-02-07 at 11:46 +0000, George Dunlap wrote:
>>>> On 2/6/19 2:26 PM, Petre Ovidiu PIRCALABU wrote:
>>>>> On Wed, 2018-12-19 at 20:52 +0200, Petre Pircalabu wrote:
>>>>>> This patchset is a rework of the "multi-page ring buffer" for vm_events patch, based on Andrew Cooper's comments.
>>>>>> For synchronous vm_events the ring waitqueue logic was unnecessary, as the vcpu sending the request was blocked until a response was received.
>>>>>> To simplify the request/response mechanism, an array of slotted channels was created, one per vcpu. Each vcpu puts the request in the corresponding slot and blocks until the response is received.
>>>>>>
>>>>>> I'm sending this patch as an RFC because, while I'm still working on a way to measure the overall performance improvement, your feedback would be of great assistance.
>>>>>
>>>>> Is anyone still using asynchronous vm_event requests? (The vcpu is not blocked and no response is expected.)
>>>>> If not, I suggest that the feature should be removed, as it (significantly) increases the complexity of the current implementation.
>>>>
>>>> Could you describe in a bit more detail what the situation is? What's the current state of affairs with vm_events, what are you trying to change, and why is async vm_events more difficult?
>>>
>>> The main reason for the vm_events modification was to improve the overall performance in high-throughput introspection scenarios. For domUs with a higher vcpu count, a vcpu could sleep for a certain period of time while waiting for a ring slot to become available (__vm_event_claim_slot).
>>> The first patchset only increased the ring size; the second iteration, based on Andrew Cooper's comments, proposed a separate path to handle synchronous events (a slotted buffer for each vcpu) in order to have the events handled independently of one another. To handle asynchronous events, a dynamically allocated vm_event ring is used.
>>> While the implementation is not exactly an exercise in simplicity, it preserves all the needed functionality and offers a fallback if the Linux domain running the monitor application doesn't support IOCTL_PRIVCMD_MMAP_RESOURCE.
>>> However, the problem got a little more complicated when I tried implementing the vm_events using an IOREQ server (based on Paul Durrant's comments). For synchronous vm_events it simplified things a little, eliminating both the need for a special structure to hold the processing state and the evtchns for each vcpu.
>>> The asynchronous events were a little trickier to handle. The buffered ioreqs were a good candidate, but the only usable part is the corresponding evtchn in conjunction with an existing ring. In order to use them, a mock buffered ioreq would have to be created and transmitted, with the only meaningful field being the ioreq type.
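To make the per-vcpu slotted channel described above a little more concrete, here is a minimal sketch in C. The names, slot layout, payload size and the busy-wait are illustrative assumptions only; they are not taken from the RFC patches, where the actual layout, event-channel notification and vcpu pausing live in the hypervisor.

#include <stdint.h>
#include <string.h>

/*
 * Sketch of a per-vcpu slotted channel -- names and sizes are
 * assumptions for illustration, not the RFC's actual code.
 */
enum slot_state {
    SLOT_FREE,      /* slot empty, vcpu may write a request        */
    SLOT_REQUEST,   /* request written, waiting for the monitor    */
    SLOT_RESPONSE,  /* response written, vcpu may resume           */
};

struct vm_event_slot {
    volatile uint32_t state;        /* one of enum slot_state      */
    uint8_t data[248];              /* request/response payload    */
};

struct vm_event_channels {
    uint32_t nr_vcpus;
    struct vm_event_slot slot[128]; /* exactly one slot per vcpu
                                       (array sized arbitrarily)   */
};

/* Stand-ins for the event-channel kick and the vcpu pause/unpause the
 * hypervisor would do; no-ops here so the sketch stays self-contained. */
static void notify_monitor(uint32_t vcpu_id) { (void)vcpu_id; }

static void wait_for_response(struct vm_event_slot *s)
{
    while (s->state != SLOT_RESPONSE)
        ;   /* in Xen this would be a vcpu pause, not a spin loop */
}

/*
 * vcpu side of a synchronous event: the slot is owned by this vcpu
 * alone, so there is no __vm_event_claim_slot-style wait for ring
 * space and no contention with other vcpus.
 */
static void send_sync_event(struct vm_event_channels *ch, uint32_t vcpu_id,
                            const void *req, size_t len)
{
    struct vm_event_slot *s = &ch->slot[vcpu_id];

    memcpy(s->data, req, len);
    s->state = SLOT_REQUEST;
    notify_monitor(vcpu_id);
    wait_for_response(s);
    s->state = SLOT_FREE;
}

The monitor side would presumably do the mirror image: read the slot when notified, write its response into the same slot, flip the state back, and signal the vcpu so it can resume.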
>>>> I certainly think it would be better if you could write the new vm_event interface without having to spend a lot of effort supporting modes that you think nobody uses.
>>>>
>>>> On the other hand, getting into the habit of breaking stuff, even for people we don't know about, will be a hindrance to community growth; a commitment to keeping it working will be a benefit to growth.
>>>>
>>>> But of course, we haven't declared the vm_event interface 'supported' (it's not even mentioned in the SUPPORT.md document yet).
>>>>
>>>> Just for the sake of discussion, would it be possible / reasonable, for instance, to create a new interface, vm_events2, instead? Then you could write it to share the ioreq interface without having legacy baggage you're not using; we could deprecate and eventually remove vm_events1, and if anyone shouts, we can put it back.
>>>>
>>>> Thoughts?
>>>>
>>>>  -George
>>>
>>> Yes, it's possible and it will GREATLY simplify the implementation. I just have to make sure the interfaces are mutually exclusive.
>>
>> I'm for removing features from the vm_event interface that are no longer in use, especially if they block more advantageous changes like this one. We don't know what the use-case was for async events, nor have we seen anyone even mention them since I've been working with Xen. Creating a new interface, as mentioned above, would make sense if there were a disagreement with retiring this feature. I don't think that's the case. I certainly would prefer not having to maintain two separate interfaces going forward without a clear justification and a documented use-case explaining why we keep the old interface around.
>
> AFAICT, the async model is broken conceptually as well, so it makes no sense. It would make sense, IMHO, if it were lossy (i.e. we just write into the ring buffer; if somebody manages to "catch" an event while it flies by then so be it, and if not it gets overwritten). If it were truly lossy, I'd see a use-case for it, for gathering statistics maybe.
>
> However, as the code is now, the VCPUs are left unpaused only if there's still space in the ring buffer. If there's no more space in the ring buffer, the VCPU trying to put an event in still gets paused by the vm_event logic, which means that these events are only "half-async".
>
> FWIW, I'm with Tamas on this one: if nobody cares about async events / comes forward with valid use cases or applications using them, I see no reason why we should have this extra code to maintain, find bugs in, and trip over other components (migration, in Andrew's example).

The whole idea would be that we *don't* maintain "v1", unless someone shows up and says they want to use it.

To let you know where I'm coming from: it's actually fairly uncommon for people using something to participate in the community that generates it. For example, a few years ago I spent some time with a GSoC student making libxl bindings for Golang. In spite of the fact that they're still at the 'early prototype' stage, apparently there's a community of people that have picked those up and are using them. Such people often:

* Start using without getting involved with the community
* Don't read the mailing list to hear what's going to happen
* Won't complain if something breaks, but will just go find a different platform.

If you have something that already works, switching to something else is a huge effort; if what you used to have is broken and you have to switch everything over anyway, then you're much more likely to try something else.
There are many reasons for the success of both Linux and Linux containers, but one major one is the fairly fanatical commitment they have to backwards compatibility and to avoiding breaking userspace applications. By contrast, it seems to me that the xend -> xl transition, while necessary, was an inflection point that caused a lot of people to consider moving away from Xen (since a lot of their old tooling stopped working).

So, at the moment, we don't know if anyone's using the async functionality or not. We have three choices:

A. Try to implement both sync / async in the existing interface.
B. Re-write the current interface to be sync-only.
C. Create a new interface ('v2') from scratch, deleting 'v1' when 'v2' is ready.

Option A *probably* keeps silent users. But Petre spends a lot of time designing around and debugging something that he doesn't use or care about, and that he's not sure *anyone* uses. Not great.

Option B is more efficient for Petre. But if there are users, any who don't complain we lose immediately. If any users do complain, then again we have the choice: try to retrofit the async functionality, or tell them we're sorry, they'll have to port their thing to v2 or go away.

In the case of option C, we can leave 'v1' there as long as we want. If we delete it and people complain, it won't be terribly difficult to reinstate 'v1' without affecting 'v2'.

I mean, a part of me completely agrees with you: get rid of cruft nobody's using; if you want to depend on something someone else is developing, at least show up and get involved in the community, or don't complain when it goes away. But another part of me is just worried about the long-term effects of this kind of behavior.

Anyway, I won't insist on having a v2 if nobody else says anything; I just wanted to make sure the "invisible" effects got proper consideration.

 -George

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxxx
https://lists.xenproject.org/mailman/listinfo/xen-devel