
Re: [Xen-devel] [PATCH 5/8] x86/vm-event/monitor: don't compromise monitor_write_data on domain cleanup



On 7/4/2016 6:07 PM, Jan Beulich wrote:
> On 04.07.16 at 16:21, <rcojocaru@xxxxxxxxxxxxxxx> wrote:
>> On 07/04/16 17:17, Jan Beulich wrote:
>>> On 04.07.16 at 15:50, <rcojocaru@xxxxxxxxxxxxxxx> wrote:
>>>> On 07/04/16 16:11, Jan Beulich wrote:
>>>>> On 04.07.16 at 15:03, <czuzu@xxxxxxxxxxxxxxx> wrote:
>>>>>> On 7/4/2016 3:47 PM, Jan Beulich wrote:
>>>>>>> On 30.06.16 at 20:45, <czuzu@xxxxxxxxxxxxxxx> wrote:
>>>>>>>> The arch_vm_event structure is dynamically allocated and freed @
>>>>>>>> vm_event_cleanup_domain. This cleanup is triggered e.g. when the
>>>>>>>> toolstack user disables domain monitoring (xc_monitor_disable),
>>>>>>>> which in turn effectively discards any information that was in
>>>>>>>> arch_vm_event.write_data.
>>>>>>> Isn't that rather a toolstack user bug, not warranting a relatively
>>>>>>> extensive (even if mostly mechanical) hypervisor change like this
>>>>>>> one? Sane monitor behavior, after all, is required anyway for the
>>>>>>> monitored guest to survive.
>>>>>> Sorry, but could you please rephrase this? I don't quite understand
>>>>>> what you're saying.
>>>>>> The write_data field in arch_vm_event should _not ever_ be
>>>>>> invalidated as a direct result of a toolstack user's action.
>>>>> The monitoring app can cause all kinds of problems to the guest it
>>>>> monitors. Why would this specific one need taking care of in the
>>>>> hypervisor, instead of demanding that the app not disable monitoring
>>>>> at the wrong time?
>>>> I'm not sure there's a right time here. The problem is that, if I
>>>> understand this correctly, a race is possible between the moment the
>>>> userspace application responds to the vm_event _and_ calls
>>>> xc_monitor_disable() and the time hvm_do_resume() gets called.
>>> It's that _and_ in your reply that I put under question, but I admit
>>> that questioning may be due to my limited (or should I say
>>> non-existent) knowledge of the user space parts here: why would the
>>> app be _required_ to "respond to the vm_event _and_ call
>>> xc_monitor_disable()", rather than delaying the latter for long
>>> enough?
>> It's not required to do that, it's just that we don't really know what
>> "long enough" means. I suppose a second would be more than enough,
>> but obviously we'd prefer a fool-proof solution with better guarantees
>> and no lag - I thought that the hypervisor change is trivial enough
>> not to make the tradeoff into much of an issue.
> It's mostly mechanical indeed, but iirc it grows struct vcpu (or was
> it struct domain?), which is never really desirable (but admittedly
> often unavoidable). I'm therefore simply trying to understand what
> alternatives there are.
>
> Jan

The change adds 20 bytes to the structure - that's less than 3
uint64_t's, not that much if you ask me. But given the discussion we're
having over patch 4/8, monitor_write_data would indeed get larger if I
revert that, so what I propose instead is something in between what was
there before and what this patch does, i.e. don't allocate the whole of
arch_vm_event dynamically, but don't embed monitor_write_data statically
either. I propose turning arch_vm_event into:

struct arch_vm_event {
    uint32_t emulate_flags;
    struct vm_event_emul_read_data *emul_read_data;
    struct monitor_write_data *write_data;
};

This way we can still selectively invalidate write_data from arch_vm_event.
Sounds good?
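
To make the "selectively invalidate" part more concrete, here is a rough
sketch of what the cleanup side could look like with the layout above
(just an untested illustration, not taken from the patch; it assumes
arch_vm_event itself stays embedded in struct vcpu with the two members
allocated on demand, and the helper name below is made up):

#include <xen/sched.h>
#include <xen/xmalloc.h>

/*
 * Sketch only. Called from the monitor-disable / vm_event cleanup path.
 * Drops the emulation buffer, which nobody will look at anymore, but
 * deliberately leaves a pending write_data allocation in place so that
 * hvm_do_resume() can still commit the CR/MSR write the monitor asked
 * for.
 */
static void arch_vm_event_selective_cleanup(struct vcpu *v)
{
    struct arch_vm_event *ave = &v->arch.vm_event;

    ave->emulate_flags = 0;
    xfree(ave->emul_read_data);   /* xfree(NULL) is a no-op */
    ave->emul_read_data = NULL;

    /* ave->write_data is intentionally left alone here. */
}

hvm_do_resume() would then xfree() and clear write_data once it has
applied the pending write, and vm_event_cleanup_domain() would call
something like the above for each vCPU (with write_data also freed on
domain destruction so nothing leaks).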

Corneliu.

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
https://lists.xen.org/xen-devel
