[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH] dump_p2m_table: For IOMMU



>>> On 08.08.12 at 19:17, Santosh Jodh <santosh.jodh@xxxxxxxxxx> wrote:
> New key handler 'o' to dump the IOMMU p2m table for each domain.
> Skips dumping table for domain0.
> Intel and AMD specific iommu_ops handler for dumping p2m table.

I'm sorry Santosh, but this is still not quite right.

> @@ -512,6 +513,69 @@ static int amd_iommu_group_id(u16 seg, u
>  
>  #include <asm/io_apic.h>
>  
> +static void amd_dump_p2m_table_level(struct page_info* pg, int level, u64 
> gpa)

static void amd_dump_p2m_table_level(struct page_info *pg, int level, paddr_t 
gpa)

> +{
> +    u64 address;

paddr_t

> +    void *table_vaddr, *pde;
> +    u64 next_table_maddr;

This ought to also be paddr_t, but I realize that the whole AMD
IOMMU code is using u64 instead. So here it's probably okay to
remain u64.

> +    int index, next_level, present;
> +    u32 *entry;
> +
> +    if ( level <= 1 )
> +        return;
> +
> +    table_vaddr = __map_domain_page(pg);
> +    if ( table_vaddr == NULL )
> +    {
> +        printk("Failed to map IOMMU domain page %016"PRIx64"\n", 
> page_to_maddr(pg));

We specifically have PRIpaddr for this purpose.

> +        return;
> +    }
> +
> +    for ( index = 0; index < PTE_PER_TABLE_SIZE; index++ )
> +    {
> +        if ( !(index % 2) )
> +            process_pending_softirqs();
> +
> +        pde = table_vaddr + (index * IOMMU_PAGE_TABLE_ENTRY_SIZE);
> +        next_table_maddr = amd_iommu_get_next_table_from_pte(pde);
> +        entry = (u32*)pde;
> +
> +        next_level = get_field_from_reg_u32(entry[0],
> +                                            IOMMU_PDE_NEXT_LEVEL_MASK,
> +                                            IOMMU_PDE_NEXT_LEVEL_SHIFT);
> +
> +        present = get_field_from_reg_u32(entry[0],
> +                                         IOMMU_PDE_PRESENT_MASK,
> +                                         IOMMU_PDE_PRESENT_SHIFT);
> +
> +        address = gpa + amd_offset_level_address(index, level);
> +        if ( (next_table_maddr != 0) && (next_level != 0)
> +            && present )
> +        {
> +            amd_dump_p2m_table_level(
> +                maddr_to_page(next_table_maddr), level - 1, address);

I guess I see now that you started by cloning
deallocate_next_page_table(), which has almost all the same
issues I have been complaining about here.

Wei - here I'm particularly worried about the use of "level - 1"
instead of "next_level", which would similarly apply to the
original function. If the way this is currently done is okay, then
why is next_level being computed in the first place? (And similar
to the issue Santosh has already fixed here - the original
function pointlessly maps/unmaps the page when "level <= 1".
Furthermore, iommu_map.c has nice helper functions
iommu_next_level() and amd_iommu_is_pte_present() - why
aren't they in a header instead, so they could be used here,
avoiding the open coding of them?)

> +        }
> +
> +        if ( present )
> +        {
> +            printk("gfn: %016"PRIx64"  mfn: %016"PRIx64"\n",
> +                   address >> PAGE_SHIFT, next_table_maddr >> PAGE_SHIFT);

I'd prefer you to use PFN_DOWN() here.

Also, depth first, as requested by Tim, to me doesn't mean
recursing before printing. I think you really want to print first,
then recurse. Otherwise how would be output be made sense
of?

> +        }
> +    }
> +
> +    unmap_domain_page(table_vaddr);
> +}
>...
> --- a/xen/drivers/passthrough/iommu.c Tue Aug 07 18:37:31 2012 +0100
> +++ b/xen/drivers/passthrough/iommu.c Wed Aug 08 09:56:50 2012 -0700
> @@ -54,6 +55,8 @@ bool_t __read_mostly amd_iommu_perdev_in
>  
>  DEFINE_PER_CPU(bool_t, iommu_dont_flush_iotlb);
>  
> +void setup_iommu_dump(void);
> +

This is completely bogus. If the function was called from another
source file, the declaration would belong into a header file. Since
it's only used here, it ought to be static.

>  static void __init parse_iommu_param(char *s)
>  {
>      char *ss;
> @@ -119,6 +122,7 @@ void __init iommu_dom0_init(struct domai
>      if ( !iommu_enabled )
>          return;
>  
> +    setup_iommu_dump();
>      d->need_iommu = !!iommu_dom0_strict;
>      if ( need_iommu(d) )
>      {
>...
> +void __init setup_iommu_dump(void)
> +{
> +    register_keyhandler('o', &iommu_p2m_table);
> +}

Furthermore, there's no real need for a separate function here
anyway. Just call register_key_handler() directly. Or
alternatively this ought to match other code doing the same -
using an initcall.

> --- a/xen/drivers/passthrough/vtd/iommu.c     Tue Aug 07 18:37:31 2012 +0100
> +++ b/xen/drivers/passthrough/vtd/iommu.c     Wed Aug 08 09:56:50 2012 -0700
> +static void vtd_dump_p2m_table_level(u64 pt_maddr, int level, u64 gpa)
> +{
> +    u64 address;

Again, both gpa and address ought to be paddr_t, and the format
specifiers should match.

> +    int i;
> +    struct dma_pte *pt_vaddr, *pte;
> +    int next_level;
> +
> +    if ( pt_maddr == 0 )
> +        return;
> +
> +    pt_vaddr = (struct dma_pte *)map_vtd_domain_page(pt_maddr);

Pointless cast.

> +    if ( pt_vaddr == NULL )
> +    {
> +        printk("Failed to map VT-D domain page %016"PRIx64"\n", pt_maddr);
> +        return;
> +    }
> +
> +    next_level = level - 1;
> +    for ( i = 0; i < PTE_NUM; i++ )
> +    {
> +        if ( !(i % 2) )
> +            process_pending_softirqs();
> +
> +        pte = &pt_vaddr[i];
> +        if ( !dma_pte_present(*pte) )
> +            continue;
> +        
> +        address = gpa + offset_level_address(i, level);
> +        if ( next_level >= 1 )
> +            vtd_dump_p2m_table_level(dma_pte_addr(*pte), next_level, 
> address);
> +
> +        if ( level == 1 )
> +            printk("gfn: %016"PRIx64" mfn: %016"PRIx64" superpage=%d\n", 
> +                    address >> PAGE_SHIFT_4K, pte->val >> PAGE_SHIFT_4K, 
> dma_pte_superpage(*pte)? 1 : 0);

Why do you print leaf (level 1) tables here only?

And the last line certainly is above 80 chars, so needs breaking up.

(Also, just to avoid you needing to do another iteration: Don't
switch to PFN_DOWN() here.)

I further wonder whether "superpage" alone is enough - don't we
have both 2M and 1G pages? Of course, that would become mute
if higher levels got also dumped (as then this knowledge is implicit).

Which reminds me to ask that both here and in the AMD code the
recursion level should probably be reflected by indenting the
printed strings.

Jan

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel


 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.