[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH v9 08/13] Add IOREQ_TYPE_VMWARE_PORT



>>> On 25.02.15 at 21:20, <dslutz@xxxxxxxxxxx> wrote:
> On 02/24/15 10:34, Jan Beulich wrote:
>>>>> On 17.02.15 at 00:05, <dslutz@xxxxxxxxxxx> wrote:
>>> @@ -501,22 +542,50 @@ static void hvm_free_ioreq_gmfn(struct domain *d, 
>>> unsigned long gmfn)
>>>      clear_bit(i, &d->arch.hvm_domain.ioreq_gmfn.mask);
>>>  }
>>>  
>>> -static void hvm_unmap_ioreq_page(struct hvm_ioreq_server *s, bool_t buf)
>>> +static void hvm_unmap_ioreq_page(struct hvm_ioreq_server *s, int buf)
>>>  {
>>> -    struct hvm_ioreq_page *iorp = buf ? &s->bufioreq : &s->ioreq;
>>> +    struct hvm_ioreq_page *iorp = NULL;
>>> +
>>> +    switch ( buf )
>>> +    {
>>> +    case 0:
>>> +        iorp = &s->ioreq;
>>> +        break;
>>> +    case 1:
>>> +        iorp = &s->bufioreq;
>>> +        break;
>>> +    case 2:
>>> +        iorp = &s->vmport_ioreq;
>>> +        break;
>>> +    }
>> 
>> These literals should become an enum.
>> 
> 
> Paul Durrant asked for #defined values.  So it is not clear which way to
> go. Will wait for response.

Obviously either would be fine. An enum has the potential advantage
of the compiler being able to check switch statements for completeness
(albeit there are cases where this ends up being a disadvantage).

>>> @@ -2429,9 +2552,6 @@ struct hvm_ioreq_server 
>>> *hvm_select_ioreq_server(struct 
> domain *d,
>>>      if ( list_empty(&d->arch.hvm_domain.ioreq_server.list) )
>>>          return NULL;
>>>  
>>> -    if ( p->type != IOREQ_TYPE_COPY && p->type != IOREQ_TYPE_PIO )
>>> -        return d->arch.hvm_domain.default_ioreq_server;
>> 
>> Shouldn't this rather be amended than deleted?
>> 
> 
> The reason is below:
> 
>>> @@ -2474,7 +2594,12 @@ struct hvm_ioreq_server 
>>> *hvm_select_ioreq_server(struct domain *d,
>>>          BUILD_BUG_ON(IOREQ_TYPE_PIO != HVMOP_IO_RANGE_PORT);
>>>          BUILD_BUG_ON(IOREQ_TYPE_COPY != HVMOP_IO_RANGE_MEMORY);
>>>          BUILD_BUG_ON(IOREQ_TYPE_PCI_CONFIG != HVMOP_IO_RANGE_PCI);
>>> +        BUILD_BUG_ON(IOREQ_TYPE_VMWARE_PORT != HVMOP_IO_RANGE_VMWARE_PORT);
>>> +        BUILD_BUG_ON(IOREQ_TYPE_TIMEOFFSET != HVMOP_IO_RANGE_TIMEOFFSET);
>>> +        BUILD_BUG_ON(IOREQ_TYPE_INVALIDATE != HVMOP_IO_RANGE_INVALIDATE);
>>>          r = s->range[type];
>>> +        if ( !r )
>>> +            continue;
>> 
>> Why, all of the sudden?
>> 
> 
> This is the replacement for the deleted "if" above.  Continue will lead
> to the same return that was remove above (it is at the end).  They are
> currently the same because all ioreq servers have the same set of
> ranges.  But if it would help, I can change "continue" into the "return
> default".

So further down you talk of the "special range 1" (see there for
further remarks in this regard) - how would r be NULL here in the
first place? That said - yes, making this explicitly do what is
intended (perhaps rather using "break" instead of "return") would
seem very desirable. There simply is no point in continuing the
loop.

>>> @@ -2501,6 +2626,13 @@ struct hvm_ioreq_server 
>>> *hvm_select_ioreq_server(struct domain *d,
>>>              }
>>>  
>>>              break;
>>> +        case IOREQ_TYPE_VMWARE_PORT:
>>> +        case IOREQ_TYPE_TIMEOFFSET:
>>> +        case IOREQ_TYPE_INVALIDATE:
>>> +            if ( rangeset_contains_singleton(r, 1) )
>>> +                return s;
>> 
>> This literal 1 at least needs explanation (i.e. a comment).
>> 
> 
> The comment is below (copied here).  Will duplicate it here (with any
> adjustments needed):
> 
>  + * NOTE: The 'special' range of 1 is what is checked for outside
>  + * of the three types of I/O.
> 
> How about /* The 'special' range of 1 is checked for being enabled */?

Along these lines, yes (fixed for coding style). And then "1" is not
a range of any kind. I suppose writing it as a proper range (e.g.
[1,1]) would already help.

>>> --- a/xen/arch/x86/x86_emulate/x86_emulate.h
>>> +++ b/xen/arch/x86/x86_emulate/x86_emulate.h
>>> @@ -112,6 +112,8 @@ struct __packed segment_register {
>>>  #define X86EMUL_RETRY          3
>>>   /* (cmpxchg accessor): CMPXCHG failed. Maps to X86EMUL_RETRY in caller. */
>>>  #define X86EMUL_CMPXCHG_FAILED 3
>>> + /* Send part of registers also to DM. */
>>> +#define X86EMUL_VMPORT_SEND    4
>> 
>> Introducing a new value here seems rather fragile, as various code
>> paths you don't touch would need auditing that they do the right
>> thing upon this value being returned. Plus even conceptually this
>> doesn't belong here - the instruction emulator shouldn't be concerned
>> with things like VMware emulation.
>> 
> 
> The only place I know of where rc is not checked by name is in
> x86_emulate.c.  There are a lot of 0 and != 0 checks.  Also in area of
> code there are places that return X86EMUL_OKAY when it looks to me that
> the return value is checked for 0 and ignored otherwise.

The point aren't the checks against zero, but the ones against the
#define-d values. Code may exist that, after excluding certain
values, assumes that only some specific value can be left. While
we aim at adding ASSERT()s for such cases, I'm nowhere near to
being convinced this is the case everywhere.

> So I will agree that the use of these defines is complex.  However, I
> need a way to pass back X86EMUL_UNHANDLEABLE and send a few registers to
> QEMU.  Now since the code path that I need to do this is:
> 
> ...
>  hvmemul_do_io
>   hvm_portio_intercept
>    hvm_io_intercept
>     process_portio_intercept
>      vmport_ioport
> 
> 
> Since there is only 1 caller to hvm_portio_intercept() --
> hvmemul_do_io, and hvmemul_do_io does not let X86EMUL_VMPORT_SEND be
> returned. I feel that all code paths currently have been checked.
> Adding a return code to me was the simpler code change.  It is possible
> to change process_portio_intercept() by adding special code there to
> adjust the ioreq_t p there, but that looked worse to me.
> 
> I can change to using a bit in the return of portio_action_t that would
> be masked in process_portio_intercept() and make the code in
> hvmemul_do_io() less clear since the change of p->type changes in
> process_portio_intercept(), and change to hvmemul_do_io() is much more
> involved.
> 
> I am happy to use some other number like 65539 if that would help. Also
> any other name like X86EMUL_UNHANDLEABLE_EXTENDED, SPECIAL_DM_HANDLING,
> etc would be fine with me.  I have no issue with defining this in a
> different header file if that would help.

I understand all this is non-trivial and not necessarily obvious. But
as said before - the x86 instruction emulator should please remain
something acting along _only_ architectural specifications. Any
extensions to support things like what you want to do here should
be added using neutral hooks, responsible for keeping state they
need on their own.

>>> --- a/xen/include/public/hvm/hvm_op.h
>>> +++ b/xen/include/public/hvm/hvm_op.h
>>> @@ -314,6 +314,9 @@ 
>>> DEFINE_XEN_GUEST_HANDLE(xen_hvm_get_ioreq_server_info_t);
>>>   *
>>>   * NOTE: unless an emulation request falls entirely within a range mapped
>>>   * by a secondary emulator, it will not be passed to that emulator.
>>> + *
>>> + * NOTE: The 'special' range of 1 is what is checked for outside
>>> + * of the three types of I/O.
>>>   */
>>>  #define HVMOP_map_io_range_to_ioreq_server 19
>>>  #define HVMOP_unmap_io_range_from_ioreq_server 20
>>> @@ -324,6 +327,9 @@ struct xen_hvm_io_range {
>>>  # define HVMOP_IO_RANGE_PORT   0 /* I/O port range */
>>>  # define HVMOP_IO_RANGE_MEMORY 1 /* MMIO range */
>>>  # define HVMOP_IO_RANGE_PCI    2 /* PCI segment/bus/dev/func range */
>>> +# define HVMOP_IO_RANGE_VMWARE_PORT 3 /* VMware port special range */
>>> +# define HVMOP_IO_RANGE_TIMEOFFSET 7 /* TIMEOFFSET special range */
>>> +# define HVMOP_IO_RANGE_INVALIDATE 8 /* INVALIDATE special range */
>> 
>> I can't seem to connect the comment you add above to the three
>> new #define-s.
>> 
> 
> Will work on better wording for the comment here and in the code
> that is checking for it.  Paul Durrant pointed out that
> HVMOP_IO_RANGE_INVALIDATE is not needed and so the plan is to drop it.
> Does:
> 
>  + * NOTE: The 'special' range of 1 is what is checked for on
>  + * VMWARE_PORT and TIMEOFFSET.
> 
> help?

Calling out the affected ones explicitly is certainly better. Beyond
that see my earlier remark.

>>> --- a/xen/include/public/hvm/ioreq.h
>>> +++ b/xen/include/public/hvm/ioreq.h
>>> @@ -35,6 +35,7 @@
>>>  #define IOREQ_TYPE_PIO          0 /* pio */
>>>  #define IOREQ_TYPE_COPY         1 /* mmio ops */
>>>  #define IOREQ_TYPE_PCI_CONFIG   2
>>> +#define IOREQ_TYPE_VMWARE_PORT  3
>>>  #define IOREQ_TYPE_TIMEOFFSET   7
>>>  #define IOREQ_TYPE_INVALIDATE   8 /* mapcache */
>> 
>> Are you sure (re-)using a value in the middle is safe? Did you check
>> where the sparseness originates from?
>> 
> 
> 99% sure.  It looks like a RHEL 5 dom0 still has code for IOREQ_TYPE_OR.
> I have no idea how a Xen 4.6 HVM guest would pass a ioreq to Dom0 kernel
> instead of to QEMU or QEMU Traditional.
> 
> Some background.  When Julien Grall added 2, this was said:
> 
> Keir Fraser
> [...]
> They were almost certainly used for representing R-M-W ALU operations back
> in the days of the old IO emulator, very long ago. Still, there''s no
> harm in
> leaving them unused.
> [...]
> I did find the old defn:
> 
> dcs-xen-54:~/xen>git show 4ff8936 | grep IOREQ_TYPE_
> #define IOREQ_TYPE_PIO          0 /* pio */
> #define IOREQ_TYPE_COPY         1 /* mmio ops */
> #define IOREQ_TYPE_AND          2
> #define IOREQ_TYPE_OR           3
> #define IOREQ_TYPE_XOR          4
> #define IOREQ_TYPE_XCHG         5
> #define IOREQ_TYPE_ADD          6
> [...]
> Which matches what Keir Fraser said.
> 
> I did not find why Paul Durrant did not use 9.  I can only find it as 2
> in all Paul's patch sets.  Which is closely connected to:
> 
>          BUILD_BUG_ON(IOREQ_TYPE_PIO != HVMOP_IO_RANGE_PORT);
>          BUILD_BUG_ON(IOREQ_TYPE_COPY != HVMOP_IO_RANGE_MEMORY);
>          BUILD_BUG_ON(IOREQ_TYPE_PCI_CONFIG != HVMOP_IO_RANGE_PCI);
> 
> (a new hyper call arg).  This also did not add a hole in "range" so
> Paul Durrant did not have to check for a "range" hole.
> 
> So I did just like Paul Durrant did.  He also approved the patch with
> this number in QEMU.  Since this is now in upstream QEMU, changing it at
> this time is a slower process.  Since the only time QEMU uses its
> version is when Xen header files are missing, I do not see how a QEMU
> built with its version would be usable as a QEMU for Xen.  So I am
> happy to change to a new number like 9.

Yes please. I'm not saying we absolutely have to correct the one
Paul added (unless we learn it causes problems), but I think we
should avoid making the same (even if only potential) mistake twice.
Of course it would help to get insight from Paul (now Cc-ed) here.

Jan

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel


 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.