[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [Xen-devel] [PATCH v4 3/3] x86/ioreq server: Add HVMOP to map guest ram with p2m_ioreq_server to an ioreq server.
On 6/22/2016 7:33 PM, George Dunlap wrote: On 22/06/16 11:07, Yu Zhang wrote:On 6/22/2016 5:47 PM, George Dunlap wrote:On 22/06/16 10:29, Jan Beulich wrote:On 22.06.16 at 11:16, <george.dunlap@xxxxxxxxxx> wrote:On 22/06/16 07:39, Jan Beulich wrote:On 21.06.16 at 16:38, <george.dunlap@xxxxxxxxxx> wrote:On 21/06/16 10:47, Jan Beulich wrote:And then - didn't we mean to disable that part of XenGT during migration, i.e. temporarily accept the higher performance overhead without the p2m_ioreq_server entries? In which case flipping everything back to p2m_ram_rw after (completed or canceled) migration would be exactly what we want. The (new or previous) ioreq server should attach only afterwards, and can then freely re-establish any p2m_ioreq_server entries it deems necessary.Well, I agree this part of XenGT should be disabled during migration. But in such case I think it's device model's job to trigger the p2m type flipping(i.e. by calling HVMOP_set_mem_type).I agree - this would seem to be the simpler model here, despite (as George validly says) the more consistent model would be for the hypervisor to do the cleanup. Such cleanup would imo be reasonable only if there was an easy way for the hypervisor to enumerate all p2m_ioreq_server pages.Well, for me, the "easy way" means we should avoid traversing the whole ept paging structure all at once, right?Yes.Does calling p2m_change_entry_type_global() not satisfy this requirement?Not really - that addresses the "low overhead" aspect, but not the "enumerate all such entries" one.I'm sorry, I think I'm missing something here. What do we need the enumeration for?We'd need that if we were to do the cleanup in the hypervisor (as we can't rely on all p2m entry re-calculation to have happened by the time a new ioreq server registers for the type).So you're afraid of this sequence of events? 1) Server A de-registered, triggering a ioreq_server -> ram_rw type change 2) gfn N is marked as misconfigured 3) Server B registers and marks gfn N as ioreq_server 4) When N is accessed, the misconfiguration is resolved incorrectly to ram_rw But that can't happen, because misconfigured entries are resolved before setting a p2m entry; so at step 3, gfn N will be first set to (non-misconfigured) ram_rw, then changed to (non-misconfigured) ioreq_server. Or is there another sequence of events that I'm missing?Thanks for your reply, George. :) If no log dirty is triggered during this process, your sequence is correct. However, if log dirty is triggered, we'll met problems. I have described this in previous mails : http://lists.xenproject.org/archives/html/xen-devel/2016-06/msg02426.html on Jun 20 and http://lists.xenproject.org/archives/html/xen-devel/2016-06/msg02575.html on Jun 21Right -- sorry, now I see the issue: 1. Server A marks gfn X as ioreq_server 2. Server A deregisters, gfn X misconfigured 3. Server B registers, marks gfn Y as ioreq_server 4. Logdirty mode enabled; gfn Y misconfigured 5. When X or Y are accessed, resolve_misconfigure() has no way of telling whether the entry is from server A (which should be set to logdirty) or from server B (which should be left as ioreq_server). Exactly. :) Another simpler scenario would be 1. Server A marks gfn X as p2m_ioreq_server; 2. Logdirty mode enabled; gfn X misconfigured;3. When X is written, it will not cause ept vioalation, but ept misconfig, and the resolve_misconfig() would set gfn X back to p2m_ram_rw, thereafter we can not track access to X;Note: Not resetting the p2m type for p2m_ioreq_server when p2m->ioreq_server is not NULL is suitable for this simpler scenario, but is not correct if take your scenario into account.The core reason is I could not find a simple solution in resolve_misconfig() to handle handle both the outdated p2m_ioreq_server entries, the in-use ones and to support the logdirty feature at the same time. In a sense this is a deficiency in the change_entry_type_global() interface. A common OS principle is "make the common case fast, and the uncommon case correct". The scenario described above seems to me to be an uncommon case which is handled quickly but incorrectly; ideally we should handle it correctly, even if it's not very quick. Synchronously resolving a previous misconfig is probably the most straightforward thing to do. It could be done at point #3, when an M->N type change is not complete and a new p2m entry of type M is written; it could be at point #4, when an N->O type change is initiated while an M->N type change hasn't completed. Or it could be when an N->O type change happens while there are unfinished M->N transitions *and* post-type-change M entries. Sorry, I did not quite get it. Could you please elaborate more? Thanks! :) But, that's a lot of somewhat complicated work for a scenario that is unlikely to happen in practice, and I can see why Yu Zhang would feel reluctant to do it. Yes, that's part of my reasons. :) For the time being though, this will fail at #4, right? That is, logdirty mode cannot be enabled while server B is registered? That does mean we'd be forced to sort out the situation before we allow logdirty and ioreq_server to be used at the same time, but that doesn't really seem like such a bad idea to me. One solution I thought of is to just return failure in hap_enable_log_dirty() if p2m->ioreq.server is not NULL. But I did not choose such approach, because: 1> I still want to keep the logdirty feature so that XenGT can use it to keep track of the dirty rams when we support live migration in the future;2> I also agree with Paul's argument: it is device model's duty to do the p2m type resetting work. I'm still open to being convinced, but at the moment it really seems to me like improving the situation is the better long-term option. Thanks for all your advices, George. I'm also willing to taking other advices, if we have a more acceptable(for you, Jan and other maintainers) resync approach in hypervisor, I'd like to add this. If the code is too complicated, I can submit it in a separate new patchset. :) B.R. Yu _______________________________________________ Xen-devel mailing list Xen-devel@xxxxxxxxxxxxx http://lists.xen.org/xen-devel
|
Lists.xenproject.org is hosted with RackSpace, monitoring our |