
Re: [Xen-devel] [PATCH v3 1/2] IOMMU/spinlock: Fix a bug found in AMD IOMMU initialization



On Fri, Mar 11, 2016 at 10:55 AM, Dario Faggioli
<dario.faggioli@xxxxxxxxxx> wrote:
> On Fri, 2016-03-11 at 09:49 -0500, Meng Xu wrote:
>> > Yes.
>> > Consistency may be helpful to avoid some easy-to-avoid lock errors.
>> > Moreover, without my fix, I think it would not lead to a deadlock, as
>> > the pcidevs_lock is not being taken in IRQ context. Right?
>> I think without your fix, the deadlock may still happen due to the
>> rendezvous condition.
>>        | CPU A                        | CPU B                | CPU C
>> Step 1 | spin_lock                    |                      |
>> Step 2 |                              | spin_lock_irq        |
>> Step 3 |                              | wait for A to unlock |
>> Step 4 |                              |                      | send rendezvous IPI to A and B
>> Step 5 | receive IPI                  | wait for A to unlock |
>> Step 6 | wait for B to handle the IPI | wait for A to unlock |
>> Step 7 | spin_unlock                  |                      |
>>
>>
>> Deadlock occurs at Step 6, IMO.
>>
>> Unless we can prove that rendezvous won't happen while
>> spin_lock_irqsave is taken, we have the deadlock hazard.
>>
> Yes. But, in the case of Quan's patch (without it, I mean), have you
> seen where in the code it is that we use spin_lock_irqsave()?
>
> It's inside a function that is called during Xen boot, whose callchain
> starts with iommu_setup(), from __start_xen(). Here's a (big, sorry)
> code snippet of what is around iommu_setup():
>
>     ...
>     init_idle_domain();
>
>     this_cpu(stubs.addr) = alloc_stub_page(smp_processor_id(),
>                                            &this_cpu(stubs).mfn);
>     BUG_ON(!this_cpu(stubs.addr));
>
>     trap_init();
>     rcu_init();
>
>     early_time_init();
>
>     arch_init_memory();
>
>     alternative_instructions();
>
>     local_irq_enable();
>
>     pt_pci_init();
>
>     vesa_mtrr_init();
>
>     acpi_mmcfg_init();
>
>     early_msi_init();
>
>     iommu_setup();    /* setup iommu if available */
>
>     smp_prepare_cpus(max_cpus);
>
>     spin_debug_enable();
>
>     /*
>      * Initialise higher-level timer functions. We do this fairly late
>      * (after interrupts got enabled) because the time bases and scale
>      * factors need to be updated regularly.
>      */
>     init_xen_time();
>     initialize_keytable();
>     console_init_postirq();
>
>     system_state = SYS_STATE_smp_boot;
>     do_presmp_initcalls();
>     for_each_present_cpu ( i )
>     {
>         /* Set up cpu_to_node[]. */
>         srat_detect_node(i);
>         /* Set up node_to_cpumask based on cpu_to_node[]. */
>         numa_add_cpu(i);
>
>         if ( (num_online_cpus() < max_cpus) && !cpu_online(i) )
>         {
>             int ret = cpu_up(i);
>             if ( ret != 0 )
>                 printk("Failed to bring up CPU %u (error %d)\n", i, ret);
>         }
>     }
>     printk("Brought up %ld CPUs\n", (long)num_online_cpus());
>     ...
>
> As you can see, it is only *after* iommu_setup() that we call functions
> like smp_prepare_cpus(), do_presmp_initcalls(), and then the loop that
> waits for all the present CPUs to come online.
>
> What that means is that, at iommu_setup() time, there is still only one
> CPU online, and there is not much chance that one single CPU deadlocks
> in a rendezvous!
>
> Honestly, the biggest issue that I think Quan's patch solves is that,
> if we ever want/manage to move spin_debug_enable() up above it, the
> BUG_ON in check_lock() would trigger the first time that pcidevs_lock
> is taken with interrupts enabled.

Right! I got it now... :-)
The lock is initialized as SPIN_LOCK_UNLOCKED, so no IRQ state has been
recorded for it yet. check_lock() will trigger the BUG_ON when the lock
is taken with an interrupt state (enabled vs. disabled) that is
inconsistent with an earlier acquisition.
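
If it helps, here is a tiny user-space model of that mechanism as I
understand it. This is definitely not the verbatim Xen implementation;
the globals, the BUG_ON macro and the irqs_enabled flag are all
stand-ins for the real thing:

/* check_lock_model.c -- simplified model of the idea behind Xen's
 * check_lock(), NOT the real code.  The first acquisition (after spin
 * debugging is enabled) records whether IRQs were disabled; any later
 * acquisition with the opposite IRQ state hits the BUG_ON. */
#include <stdio.h>
#include <stdlib.h>

#define BUG_ON(x) do { if (x) { fprintf(stderr, "BUG_ON!\n"); abort(); } } while (0)

static int spin_debug;         /* flipped by "spin_debug_enable()"      */
static int irqs_enabled = 1;   /* stand-in for local_irq_is_enabled()   */

struct lock_debug {
    int irq_safe;              /* -1 = not taken yet, else 0/1          */
};

static void check_lock(struct lock_debug *d)
{
    int irq_safe = !irqs_enabled;

    if (!spin_debug)           /* early boot: nothing is checked        */
        return;
    if (d->irq_safe == -1)
        d->irq_safe = irq_safe;            /* remember the first use    */
    BUG_ON(d->irq_safe != irq_safe);       /* inconsistent IRQ usage    */
}

int main(void)
{
    struct lock_debug pcidevs = { .irq_safe = -1 };

    /* Before spin_debug_enable() nothing is checked, which is why the
     * mixed usage around iommu_setup() goes unnoticed today.           */
    irqs_enabled = 0; check_lock(&pcidevs);   /* the irqsave-style path */
    irqs_enabled = 1; check_lock(&pcidevs);   /* a plain spin_lock path */

    spin_debug = 1;                           /* spin_debug_enable()    */
    irqs_enabled = 0; check_lock(&pcidevs);   /* records irq_safe = 1   */
    irqs_enabled = 1; check_lock(&pcidevs);   /* trips the BUG_ON       */
    printf("not reached\n");
    return 0;
}

Run it and the last check_lock() call aborts, which is the behavior
Dario describes if spin_debug_enable() were moved above iommu_setup().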

>
> Until then, the code is technically fine and, as a matter of fact, I think
> that removing the locking from that particular instance would be an
> equally effective fix!
>
> All that being said, consistency is indeed important, for its own sake
> and for other reasons too. Even if, strictly speaking, there isn't any
> actual buggy behavior to be fixed here, it is worthwhile conforming to a
> locking pattern that is consistent with the rules that we set ourselves,
> unless there are specific reasons not to.

I see. Even though no deadlock may happen, consistency alone can be the
reason to make the change. :-)
But it's good to know the real reason for making the change, which could
be avoiding the deadlock, consistency, or both.
It seems to me that this one is mostly for consistency, and it also
avoids the potential deadlock (although I'm not sure it would ever
actually happen, it is a hazard).
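
To make that hazard concrete, here is a small user-space simulation of
the table quoted above. It is obviously not Xen code: "interrupts
enabled" is modelled as "polls for the rendezvous request while spinning
or holding the lock", and every name in it is made up for the
illustration.

/* rendezvous_deadlock.c -- user-space model of the scenario above.
 * Build with: gcc -pthread -o rendezvous_deadlock rendezvous_deadlock.c */
#include <pthread.h>
#include <stdatomic.h>
#include <stdio.h>
#include <unistd.h>

#define NR_CPUS 3

static atomic_flag lock = ATOMIC_FLAG_INIT;   /* the contended spinlock */
static atomic_int rendezvous_requested;       /* the pending "IPI"      */
static atomic_int cpus_in_rendezvous;         /* CPUs that answered it  */

/* "Take the IPI": join the rendezvous and wait until every CPU joins. */
static void poll_for_ipi(const char *who)
{
    if (!atomic_load(&rendezvous_requested))
        return;
    printf("%s: entering rendezvous\n", who);
    atomic_fetch_add(&cpus_in_rendezvous, 1);
    while (atomic_load(&cpus_in_rendezvous) < NR_CPUS)
        ;                                     /* never completes        */
}

static void *cpu_a(void *arg)
{
    while (atomic_flag_test_and_set(&lock))   /* Step 1: spin_lock()    */
        poll_for_ipi("CPU A");
    /* Critical section with interrupts enabled: A keeps polling, so the
     * IPI sent in Step 4 is taken here (Step 5), and A then waits for B
     * inside poll_for_ipi() (Step 6) without ever reaching the unlock. */
    for (int i = 0; i < 4000; i++) {
        poll_for_ipi("CPU A");
        usleep(1000);
    }
    atomic_flag_clear(&lock);                 /* Step 7: never reached  */
    return NULL;
}

static void *cpu_b(void *arg)
{
    sleep(1);                                 /* let A take the lock    */
    /* Step 2: spin_lock_irq() -- interrupts are off, so B spins without
     * ever polling for the IPI and can never join the rendezvous.      */
    while (atomic_flag_test_and_set(&lock))
        ;                                     /* Step 3: waits for A    */
    atomic_flag_clear(&lock);
    return NULL;
}

static void *cpu_c(void *arg)
{
    sleep(2);
    atomic_store(&rendezvous_requested, 1);   /* Step 4: "send the IPI" */
    poll_for_ipi("CPU C");                    /* C joins its own call   */
    return NULL;
}

int main(void)
{
    pthread_t a, b, c;
    pthread_create(&a, NULL, cpu_a, NULL);
    pthread_create(&b, NULL, cpu_b, NULL);
    pthread_create(&c, NULL, cpu_c, NULL);

    sleep(6);                                 /* watchdog               */
    printf("stuck: %d of %d CPUs reached the rendezvous\n",
           atomic_load(&cpus_in_rendezvous), NR_CPUS);
    return 0;                                 /* exiting kills the rest */
}

Run it, and two of the three "CPUs" end up stuck in the rendezvous while
the third spins on the lock forever, which is exactly Step 6.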

Thank you, Dario and Quan, very much for your explanations! :-) They
are really helpful! :-D

BTW, I'm sorry for the messed-up format of the table. Let me reformat it here:

Condition 1 (C1): Can run in irq context?

   | spin_lock | spin_lock_irq | spin_lock_irqsave
C1 | No        | Yes           | Yes
C2 | No        | No            | Yes

It may be too hasty (and perhaps not fully correct) to draw a general
conclusion like this, but it should give some overview of the differences
between these three locking primitives.
Just for later reference. ;-)
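
As a usage sketch of the three variants (Xen/Linux-style API; the lock
and do_work() are hypothetical, and this is not a complete compilation
unit, just the calling patterns behind the table):

static DEFINE_SPINLOCK(example_lock);

void only_from_non_irq_context(void)
{
    spin_lock(&example_lock);        /* fine as long as no IRQ handler   */
    do_work();                       /* ever takes the same lock         */
    spin_unlock(&example_lock);
}

void caller_has_irqs_enabled(void)
{
    spin_lock_irq(&example_lock);    /* disables IRQs around the section */
    do_work();
    spin_unlock_irq(&example_lock);  /* unconditionally re-enables IRQs,
                                      * so the caller must not have had
                                      * them disabled already            */
}

void safe_in_any_context(void)
{
    unsigned long flags;

    spin_lock_irqsave(&example_lock, flags);      /* saves the IRQ state */
    do_work();
    spin_unlock_irqrestore(&example_lock, flags); /* and restores it     */
}

The practical difference shows up at unlock time: spin_unlock_irq()
unconditionally re-enables interrupts, while spin_unlock_irqrestore()
puts back whatever state the caller had.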

Thanks,

Meng

-----------
Meng Xu
PhD Student in Computer and Information Science
University of Pennsylvania
http://www.cis.upenn.edu/~mengxu/

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel

 

