[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH v7] x86/emulate: Send vm_event from emulate



On 03.07.2019 12:56, Alexandru Stefan ISAILA wrote:
> A/D bit writes (on page walks) can be considered benign by an introspection
> agent, so receiving vm_events for them is a pessimization. We try here to
> optimize by fitering these events out.

But you add the sending of more events - how does "filter out" match
the actual implementation?

> Currently, we are fully emulating the instruction at RIP when the hardware 
> sees
> an EPT fault with npfec.kind != npfec_kind_with_gla. This is, however,
> incorrect, because the instruction at RIP might legitimately cause an
> EPT fault of its own while accessing a _different_ page from the original one,
> where A/D were set.
> The solution is to perform the whole emulation,

Above you said fully emulating such an insn is incorrect. To me the
two statements contradict one another.

> while ignoring EPT restrictions
> for the walk part, and taking them into account for the "actual" emulating of
> the instruction at RIP.

So the "ignore" part here is because the walk doesn't currently send
any events? That's an omission after all, which ultimately wants to
get fixed. This in turn makes me wonder whether there couldn't be
cases where a monitor actually wants to see these violations, too.
After all one may be able to abuse to page walker to set bits in
places you actually care to protect from undue modification.

> When we send out a vm_event, we don't want the emulation
> to complete, since in that case we won't be able to veto whatever it is doing.
> That would mean that we can't actually prevent any malicious activity, instead
> we'd only be able to report on it.
> When we see a "send-vm_event" case while emulating, we need to first send the
> event out and then stop the emulation (return X86EMUL_RETRY).

Perhaps better "suspend" instead of "stop"?

> After the emulation stops we'll call hvm_vm_event_do_resume() again after the
> introspection agent treats the event and resumes the guest. There, the
> instruction at RIP will be fully emulated (with the EPT ignored) if the
> introspection application allows it, and the guest will continue to run past
> the instruction.
> 
> We use hvmemul_map_linear_addr() to intercept r/w access and
> __hvm_copy() to intercept exec access.

Btw I continue to be unhappy about this asymmetry. Furthermore in
the former case you only handle write and rmw accesses, but not
reads afaics. I assume you don't care about reads, but this should
then be made explicit. Furthermore EPT allows read protection, and
there are p2m_access_w, p2m_access_wx, and p2m_access_x, so I guess
ignoring reads can at best be an option picked by the monitor, not
something to be left out of the interface altogether.

> hvm_emulate_send_vm_event() can return false if there was no violation,
> if there was an error from monitor_traps() or p2m_get_mem_access().

As said before - I don't think errors and lack of a violation can
sensibly be treated the same way. Is the implication perhaps that
emulation then will fail later anyway? If so, is such an
assumption taking into consideration possible races?

> Returning false if p2m_get_mem_access() is of because this will happen
> if it was called with a bad address or if the entry was not found in the
> EPT in which case it is unrestricted.

I'm afraid I'm having trouble understanding this. I'm in particular
heavily confused by the "of" in the middle.

> @@ -530,6 +532,71 @@ static int hvmemul_do_mmio_addr(paddr_t mmio_gpa,
>       return hvmemul_do_io_addr(1, mmio_gpa, reps, size, dir, df, ram_gpa);
>   }
>   
> +/*
> + * Send memory access vm_events based on pfec. Returns true if the event was
> + * sent and false for p2m_get_mem_access() error, no violation and event send
> + * error. Depends on arch.vm_event->send_event.

Instead of "depends", do you perhaps mean "assumes the caller to check"?
In which case you may want to ASSERT() this here to document the
requirement?

> + * NOTE: p2m_get_mem_access() can fail for wrong address or if the entry

What is "wrong address" here? IOW how is this different from "entry not
found"?

> @@ -629,6 +697,14 @@ static void *hvmemul_map_linear_addr(
>   
>               ASSERT(p2mt == p2m_ram_logdirty || !p2m_is_readonly(p2mt));
>           }
> +
> +        if ( curr->arch.vm_event &&
> +            curr->arch.vm_event->send_event &&
> +            hvm_emulate_send_vm_event(addr, gfn, pfec) )

Indentation looks off by one here.

> +        {
> +            err = ERR_PTR(~X86EMUL_RETRY);
> +            goto out;
> +        }

Did you notice that there's an immediate exit from the loop only
in case the linear -> physical translation fails? This is
relevant for page fault delivery correctness for accesses
crossing page boundaries. I think you want to use
update_map_err() and drop the "goto out". I can't really make up
my mind on the correct interaction between your new if() and the
one immediately ahead of it. You will want to think this through.

> --- a/xen/arch/x86/hvm/hvm.c
> +++ b/xen/arch/x86/hvm/hvm.c
> @@ -3224,6 +3224,14 @@ static enum hvm_translation_result __hvm_copy(
>               return HVMTRANS_bad_gfn_to_mfn;
>           }
>   
> +        if ( unlikely(v->arch.vm_event) &&
> +            v->arch.vm_event->send_event &&
> +            hvm_emulate_send_vm_event(addr, gfn, pfec) )

Indentation looks wrong again.

> +        {
> +            put_page(page);
> +            return HVMTRANS_gfn_paged_out;

Why "paged out"? If this is an intentional abuse, then you want
to say so in a comment and justify the abuse here or in the
description.

> --- a/xen/arch/x86/hvm/vm_event.c
> +++ b/xen/arch/x86/hvm/vm_event.c
> @@ -86,6 +86,7 @@ void hvm_vm_event_do_resume(struct vcpu *v)
>                     VM_EVENT_FLAG_SET_EMUL_INSN_DATA )
>               kind = EMUL_KIND_SET_CONTEXT_INSN;
>   
> +        v->arch.vm_event->send_event = false;
>           hvm_emulate_one_vm_event(kind, TRAP_invalid_op,
>                                    X86_EVENT_NO_EC);

Is this insertion meaning to use "true" instead, or is the
revision log entry wrong? Or does "set" there not necessarily
mean "set it to true", but just "set it to a deterministic
value" (in which case "initialize" would have been an
unambiguous alternative wording)?

Jan
_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxxx
https://lists.xenproject.org/mailman/listinfo/xen-devel

 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.