Re: [PATCH v2 2/3] x86/hvm: Allow writes to registers on the same page as MSI-X table
On Sat, Mar 25, 2023 at 03:49:23AM +0100, Marek Marczykowski-Górecki wrote:
> Some devices (notably the Intel Wifi 6 AX210 card) keep auxiliary registers
> on the same page as the MSI-X table. The device model (especially one in a
> stubdomain) cannot really handle those, as direct writes to that page are
> refused (the page is on the mmio_ro_ranges list). Instead, add an internal
> ioreq server that handles those writes.
>
> Doing this requires correlating the write location with the guest's view
> of the MSI-X table address. Since QEMU doesn't map the MSI-X table to the
> guest, this relies on msixtbl_entry->gtable, which is HVM-only. A similar
> feature for PV would need to be done separately.
>
> This can also be used to read the Pending Bit Array, if it lives on the same
> page, so QEMU doesn't need /dev/mem access at all (especially helpful
> with lockdown enabled in dom0). If the PBA lives on another page, QEMU will
> map it to the guest directly.
> If the PBA lives on the same page, forbid writes and log a message.
> Technically, writes outside of the PBA could be allowed, but at the moment
> the precise location of the PBA isn't saved.
>
> Signed-off-by: Marek Marczykowski-Górecki <marmarek@xxxxxxxxxxxxxxxxxxxxxx>
> ---
> v2:
> - adjust commit message
> - pass struct domain to msixtbl_page_handler_get_hwaddr()
> - reduce local variables used only once
> - log a warning if a write is forbidden because the MSI-X table and PBA
>   live on the same page
> - do not passthrough unaligned accesses
> - handle accesses both before and after MSI-X table
> ---
> xen/arch/x86/hvm/vmsi.c | 154 +++++++++++++++++++++++++++++++++
> xen/arch/x86/include/asm/msi.h | 5 ++
> xen/arch/x86/msi.c | 38 ++++++++
> 3 files changed, 197 insertions(+)
>
> diff --git a/xen/arch/x86/hvm/vmsi.c b/xen/arch/x86/hvm/vmsi.c
> index 9c82bf9b4ec2..9293009a4075 100644
> --- a/xen/arch/x86/hvm/vmsi.c
> +++ b/xen/arch/x86/hvm/vmsi.c
> @@ -438,6 +438,152 @@ static const struct hvm_io_ops msixtbl_mmio_ops = {
> .write = _msixtbl_write,
> };
>
> +static void __iomem *msixtbl_page_handler_get_hwaddr(
> + const struct domain *d,
> + uint64_t address,
> + bool write)
> +{
> + const struct pci_dev *pdev = NULL;
> + const struct msixtbl_entry *entry;
> + int adj_type;
unsigned AFAICT.
> +
> + rcu_read_lock(&msixtbl_rcu_lock);
> + /*
> + * Check if it's on the same page as the end of the MSI-X table, but
> + * outside of the table itself.
> + */
> + list_for_each_entry( entry, &d->arch.hvm.msixtbl_list, list )
> + {
> + if ( PFN_DOWN(address) == PFN_DOWN(entry->gtable) &&
> + address < entry->gtable )
> + {
> + adj_type = ADJ_IDX_FIRST;
> + pdev = entry->pdev;
> + break;
> + }
> + if ( PFN_DOWN(address) == PFN_DOWN(entry->gtable + entry->table_len) &&
Should be entry->gtable + entry->table_len - 1, or else you are
off-by-one.
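I.e. something along these lines (untested sketch):

    if ( PFN_DOWN(address) == PFN_DOWN(entry->gtable + entry->table_len - 1) &&
         address >= entry->gtable + entry->table_len )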
> + address >= entry->gtable + entry->table_len )
> + {
> + adj_type = ADJ_IDX_LAST;
> + pdev = entry->pdev;
> + break;
> + }
> + }
> + rcu_read_unlock(&msixtbl_rcu_lock);
> +
> + if ( !pdev )
> + return NULL;
> +
> + ASSERT(pdev->msix);
> +
> + if ( !pdev->msix->adj_access_table_idx[adj_type] )
> + {
> + gdprintk(XENLOG_WARNING,
> + "Page for adjacent MSI-X table access not initialized for
> %pp\n",
> + pdev);
> +
> + return NULL;
> + }
> +
> + /* If PBA lives on the same page too, forbid writes. */
> + if ( write && (
> + (adj_type == ADJ_IDX_LAST &&
> + pdev->msix->table.last == pdev->msix->pba.first) ||
> + (adj_type == ADJ_IDX_FIRST &&
> + pdev->msix->table.first == pdev->msix->pba.last)) )
> + {
> + gdprintk(XENLOG_WARNING,
> + "MSI-X table and PBA of %pp live on the same page, "
> + "writing to other registers there is not implemented\n",
> + pdev);
Don't you actually need to check that the passed address falls into the
PBA array? The PBA can share the same page as the MSI-X table, but
that doesn't imply there aren't also holes in that page not used by
either the PBA or the MSI-X table.
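Just as an illustration (the pba_gaddr/pba_size fields below are hypothetical,
since the precise PBA location isn't currently tracked anywhere):

    /* Hypothetical check: refuse only writes that actually hit the PBA. */
    if ( write &&
         address >= pdev->msix->pba_gaddr &&
         address < pdev->msix->pba_gaddr + pdev->msix->pba_size )
        return NULL;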
> + return NULL;
> + }
> +
> + return fix_to_virt(pdev->msix->adj_access_table_idx[adj_type]) +
> + (address & (PAGE_SIZE - 1));
You can use PAGE_OFFSET().
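I.e. something like:

    return fix_to_virt(pdev->msix->adj_access_table_idx[adj_type]) +
           PAGE_OFFSET(address);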
> +
> +}
> +
> +static bool cf_check msixtbl_page_accept(
> + const struct hvm_io_handler *handler, const ioreq_t *r)
> +{
> + ASSERT(r->type == IOREQ_TYPE_COPY);
> +
> + return msixtbl_page_handler_get_hwaddr(
> + current->domain, r->addr, r->dir == IOREQ_WRITE);
I think you want to accept it also if it's a write to the PBA, and
just drop it. You should always pass write=false and then drop it in
msixtbl_page_write() if it falls in the PBA region (but still return
X86EMUL_OKAY).
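Roughly (sketch only):

    static bool cf_check msixtbl_page_accept(
        const struct hvm_io_handler *handler, const ioreq_t *r)
    {
        ASSERT(r->type == IOREQ_TYPE_COPY);

        /* Accept writes to the PBA as well; drop them in the write handler. */
        return msixtbl_page_handler_get_hwaddr(current->domain, r->addr, false);
    }

with msixtbl_page_write() then returning X86EMUL_OKAY without performing the
write when the address falls inside the PBA.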
> +}
> +
> +static int cf_check msixtbl_page_read(
> + const struct hvm_io_handler *handler,
> + uint64_t address, uint32_t len, uint64_t *pval)
Why use hvm_io_ops and not hvm_mmio_ops? You only care about MMIO
accesses here.
> +{
> + void __iomem *hwaddr;
const
I would also initialize *pval = ~0ul for safety.
> +
> + if ( address & (len - 1) )
> + return X86EMUL_UNHANDLEABLE;
You can use IS_ALIGNED for clarity. In my fix for this for vPCI Jan
asked to split unaligned accesses into 1-byte ones, but I think for
guests it's better to just drop them unless there's a specific case
that we need to handle.
Also you should return X86EMUL_OKAY here, as the address is handled
by this handler; the current way of handling it is to drop the access.
Printing a debug message can be helpful in case unaligned accesses
need to be implemented in order to support some device.
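For example (sketch; a gdprintk() could be added as mentioned above):

    if ( !IS_ALIGNED(address, len) )
    {
        /* The address is handled here, but the unaligned access is dropped. */
        *pval = ~0ul;
        return X86EMUL_OKAY;
    }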
> +
> + hwaddr = msixtbl_page_handler_get_hwaddr(
> + current->domain, address, false);
> +
> + if ( !hwaddr )
> + return X86EMUL_UNHANDLEABLE;
Maybe X86EMUL_RETRY, since the page was there in the accept handler.
> +
> + switch ( len )
> + {
> + case 1:
> + *pval = readb(hwaddr);
> + break;
> + case 2:
> + *pval = readw(hwaddr);
> + break;
> + case 4:
> + *pval = readl(hwaddr);
> + break;
> + case 8:
> + *pval = readq(hwaddr);
> + break;
Nit: we usually add a newline after every break;
> + default:
> + return X86EMUL_UNHANDLEABLE;
I would rather use ASSERT_UNREACHABLE() here and fallthrough to the
return below. Seeing such size is likely an indication of something
else gone very wrong, better to just terminate the access at once.
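I.e. something like:

    default:
        ASSERT_UNREACHABLE();
        *pval = ~0ul;
        break;
    }

    return X86EMUL_OKAY;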
> + }
> + return X86EMUL_OKAY;
> +}
> +
> +static int cf_check msixtbl_page_write(
> + const struct hvm_io_handler *handler,
> + uint64_t address, uint32_t len, uint64_t val)
> +{
> + void __iomem *hwaddr = msixtbl_page_handler_get_hwaddr(
> + current->domain, address, true);
> +
You don't seem to check whether the access is aligned here?
Otherwise I have mostly the same comments as in msixtbl_page_read().
> + if ( !hwaddr )
> + return X86EMUL_UNHANDLEABLE;
> +
> + switch ( len ) {
> + case 1:
> + writeb(val, hwaddr);
> + break;
> + case 2:
> + writew(val, hwaddr);
> + break;
> + case 4:
> + writel(val, hwaddr);
> + break;
> + case 8:
> + writeq(val, hwaddr);
> + break;
> + default:
> + return X86EMUL_UNHANDLEABLE;
> + }
> + return X86EMUL_OKAY;
> +
> +}
> +
> +static const struct hvm_io_ops msixtbl_mmio_page_ops = {
> + .accept = msixtbl_page_accept,
> + .read = msixtbl_page_read,
> + .write = msixtbl_page_write,
> +};
> +
> static void add_msixtbl_entry(struct domain *d,
> struct pci_dev *pdev,
> uint64_t gtable,
> @@ -593,6 +739,14 @@ void msixtbl_init(struct domain *d)
> handler->type = IOREQ_TYPE_COPY;
> handler->ops = &msixtbl_mmio_ops;
> }
> +
> + /* passthrough access to other registers on the same page */
> + handler = hvm_next_io_handler(d);
> + if ( handler )
> + {
> + handler->type = IOREQ_TYPE_COPY;
> + handler->ops = &msixtbl_mmio_page_ops;
> + }
> }
>
> void msixtbl_pt_cleanup(struct domain *d)
> diff --git a/xen/arch/x86/include/asm/msi.h b/xen/arch/x86/include/asm/msi.h
> index a53ade95c9ad..d13cf1c1f873 100644
> --- a/xen/arch/x86/include/asm/msi.h
> +++ b/xen/arch/x86/include/asm/msi.h
> @@ -207,6 +207,10 @@ struct msg_address {
> PCI_MSIX_ENTRY_SIZE + \
> (~PCI_MSIX_BIRMASK & (PAGE_SIZE - 1)))
>
> +/* indexes in adj_access_table_idx[] below */
> +#define ADJ_IDX_FIRST 0
> +#define ADJ_IDX_LAST 1
> +
> struct arch_msix {
> unsigned int nr_entries, used_entries;
> struct {
> @@ -214,6 +218,7 @@ struct arch_msix {
> } table, pba;
> int table_refcnt[MAX_MSIX_TABLE_PAGES];
> int table_idx[MAX_MSIX_TABLE_PAGES];
> + int adj_access_table_idx[2];
> spinlock_t table_lock;
> bool host_maskall, guest_maskall;
> domid_t warned;
> diff --git a/xen/arch/x86/msi.c b/xen/arch/x86/msi.c
> index d0bf63df1def..680853f84685 100644
> --- a/xen/arch/x86/msi.c
> +++ b/xen/arch/x86/msi.c
> @@ -961,6 +961,34 @@ static int msix_capability_init(struct pci_dev *dev,
> domain_crash(d);
> /* XXX How to deal with existing mappings? */
> }
> +
> + /*
> + * If the MSI-X table doesn't start at the page boundary, map the first page for
> + * passthrough accesses.
> + */
I think you should initialize
msix->adj_access_table_idx[ADJ_IDX_{FIRST,LAST}] to -1?
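I.e. somewhere before the mapping attempts (sketch):

    msix->adj_access_table_idx[ADJ_IDX_FIRST] = -1;
    msix->adj_access_table_idx[ADJ_IDX_LAST] = -1;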
> + if ( table_paddr & (PAGE_SIZE - 1) )
> + {
> + int idx = msix_get_fixmap(msix, table_paddr, table_paddr);
> +
> + if ( idx >= 0 )
> + msix->adj_access_table_idx[ADJ_IDX_FIRST] = idx;
> + else
> + gprintk(XENLOG_ERR, "Failed to map first MSI-X table page: %d\n", idx);
> + }
> + /*
> + * If the MSI-X table doesn't span full page(s), map the last page for
> + * passthrough accesses.
> + */
> + if ( (table_paddr + msix->nr_entries * PCI_MSIX_ENTRY_SIZE) & (PAGE_SIZE - 1) )
> + {
> + uint64_t entry_paddr = table_paddr + msix->nr_entries * PCI_MSIX_ENTRY_SIZE;
> + int idx = msix_get_fixmap(msix, table_paddr, entry_paddr);
> +
> + if ( idx >= 0 )
> + msix->adj_access_table_idx[ADJ_IDX_LAST] = idx;
> + else
> + gprintk(XENLOG_ERR, "Failed to map last MSI-X table page: %d\n", idx);
> + }
> }
> WARN_ON(msix->table.first != (table_paddr >> PAGE_SHIFT));
> ++msix->used_entries;
> @@ -1090,6 +1118,16 @@ static void _pci_cleanup_msix(struct arch_msix *msix)
> WARN();
> msix->table.first = 0;
> msix->table.last = 0;
> + if ( msix->adj_access_table_idx[ADJ_IDX_FIRST] )
> + {
> + msix_put_fixmap(msix, msix->adj_access_table_idx[ADJ_IDX_FIRST]);
> + msix->adj_access_table_idx[ADJ_IDX_FIRST] = 0;
> + }
> + if ( msix->adj_access_table_idx[ADJ_IDX_LAST] )
> + {
> + msix_put_fixmap(msix, msix->adj_access_table_idx[ADJ_IDX_LAST]);
> + msix->adj_access_table_idx[ADJ_IDX_LAST] = 0;
Isn't 0 a valid idx? You should check for >= 0 and also set
to -1 once the fixmap entry has been put.
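I.e. (sketch):

    if ( msix->adj_access_table_idx[ADJ_IDX_FIRST] >= 0 )
    {
        msix_put_fixmap(msix, msix->adj_access_table_idx[ADJ_IDX_FIRST]);
        msix->adj_access_table_idx[ADJ_IDX_FIRST] = -1;
    }
    if ( msix->adj_access_table_idx[ADJ_IDX_LAST] >= 0 )
    {
        msix_put_fixmap(msix, msix->adj_access_table_idx[ADJ_IDX_LAST]);
        msix->adj_access_table_idx[ADJ_IDX_LAST] = -1;
    }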
Thanks, Roger.
|