[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH v5 06/14] vpci/header: implement guest BAR register handlers



On 31.01.2022 14:30, Oleksandr Andrushchenko wrote:
> 
> 
> On 31.01.22 13:39, Jan Beulich wrote:
>> On 31.01.2022 12:23, Oleksandr Andrushchenko wrote:
>>> On 31.01.22 13:10, Roger Pau Monné wrote:
>>>> Right (see my previous reply to this comment). I think it would be
>>>> easier (and cleaner) if you switched the default behavior regarding
>>>> unhandled register access for domUs at the start of the series (drop
>>>> writes, reads returns ~0), and then you won't need to add all those
>>>> dummy handler to drop writes and return ~0 for reads.
>>>>
>>>> It's going to be more work initially as you would need to support
>>>> passthrough of more registers, but it's the right approach that we
>>>> need implementation wise.
>>> While I agree in general, this effectively means that I'll need to provide
>>> handling for all PCIe registers and capabilities from the very start.
>>> Otherwise no guest be able to properly initialize a PCI device without that.
>>> Of course, we may want starting from stubs instead of proper emulation,
>>> which will direct the access to real HW and later on we add proper 
>>> emulation.
>>> But, again, this is going to be a rather big piece of code where we need
>>> to explicitly handle every possible capability.
>> Since the two sub-threads are now about exactly the same topic, I'm
>> answering here instead of there.
>>
>> No, you are not going to need to emulate all possible capabilities.
>> We (or really qemu) don't do this on x86 either. Certain capabilities
>> may be a must, but not everything. There are also device specific
>> registers not covered by any capability structures - what to do with
>> those is even more of a question.
>>
>> Furthermore for some of the fields justification why access to the
>> raw hardware value is fine is going to be easy: r/o fields like
>> vendor and device ID, for example. But every bit you allow direct
>> access to needs to come with justification.
>>
>>> At the moment we are not going to claim that vPCI provides all means to
>>> pass through a PCI device safely with this respect and this is why the 
>>> feature
>>> itself won't even be a tech preview yet. For that reason I think we can 
>>> still
>>> have implemented only crucial set of handlers and still allow the rest to
>>> be read/write directly without emulation.
>> I think you need to separate what you need for development from what
>> goes upstream: For dev purposes you can very well invert the policy
>> from white- to black-listing. But if we accepted the latter into the
>> main tree, the risk would be there that something gets missed at the
>> time where the permission model gets changed around.
>>
>> You could even have a non-default mode operating the way you want it
>> (along the lines of pciback's permissive mode), allowing you to get
>> away without needing to carry private patches. Things may also
>> initially only work in that mode. But the default should be a mode
>> which is secure (and which perhaps initially offers only very limited
>> functionality).
> Ok, so to make it clear:
> 1. We do not allow unhandled access for guests: for that I will create a
> dedicated patch which will implement such restrictions. Something like
> the below (for both vPCI read and write):
> 
> diff --git a/xen/drivers/vpci/vpci.c b/xen/drivers/vpci/vpci.c
> index c5e67491c24f..9ef2a1b5af58 100644
> --- a/xen/drivers/vpci/vpci.c
> +++ b/xen/drivers/vpci/vpci.c
> @@ -347,6 +347,7 @@ uint32_t vpci_read(pci_sbdf_t sbdf, unsigned int reg, 
> unsigned int size)
>       const struct vpci_register *r;
>       unsigned int data_offset = 0;
>       uint32_t data = ~(uint32_t)0;
> +    bool handled = false;
> 
>       if ( !size )
>       {
> @@ -405,6 +406,8 @@ uint32_t vpci_read(pci_sbdf_t sbdf, unsigned int reg, 
> unsigned int size)
>           if ( cmp > 0 )
>               continue;
> 
> +        handled = true; /* Found the handler for this access. */
> +
>           if ( emu.offset < r->offset )
>           {
>               /* Heading gap, read partial content from hardware. */
> @@ -432,6 +435,10 @@ uint32_t vpci_read(pci_sbdf_t sbdf, unsigned int reg, 
> unsigned int size)
>       }
>       spin_unlock(&pdev->vpci_lock);
> 
> +    /* All unhandled guest requests return all 1's. */
> +    if ( !is_hardware_domain(d) && !handled )
> +        return ~(uint32_t)0;
> +
>       if ( data_offset < size )
>       {
>           /* Tailing gap, read the remaining. */

Except that like for the "tailing gap" you also need to avoid the
"heading gap" ending up in a read of the underlying hardware
register. Effectively you want to deal properly with all
vpci_read_hw() invocations (including the one when no pdev was
found, which for a DomU may simply mean domain_crash()).

Jan




 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.