[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH 4/8] x86/SVM: Add vcpu scheduling support for AVIC



On 20/04/18 00:04, Boris Ostrovsky wrote:
> On 04/19/2018 02:18 PM, Andrew Cooper wrote:
>> On 19/04/18 16:54, Natarajan, Janakarajan wrote:
>>> On 4/13/2018 12:57 PM, Andrew Cooper wrote:
>>>> On 04/04/18 00:01, Janakarajan Natarajan wrote:
>>>>> @@ -63,6 +64,54 @@ avic_get_physical_id_entry(struct svm_domain *d,
>>>>> unsigned int index)
>>>>>       return &d->avic_physical_id_table[index];
>>>>>   }
>>>>>   +static void avic_vcpu_load(struct vcpu *v)
>>>>> +{
>>>>> +    unsigned long tmp;
>>>>> +    struct arch_svm_struct *s = &v->arch.hvm_svm;
>>>>> +    int h_phy_apic_id;
>>>>> +    struct avic_physical_id_entry *entry = (struct
>>>>> avic_physical_id_entry *)&tmp;
>>>>> +
>>>>> +    ASSERT(!test_bit(_VPF_blocked, &v->pause_flags));
>>>>> +
>>>>> +    /*
>>>>> +     * Note: APIC ID = 0xff is used for broadcast.
>>>>> +     *       APIC ID > 0xff is reserved.
>>>>> +     */
>>>>> +    h_phy_apic_id = cpu_data[v->processor].apicid;
>>>>> +    ASSERT(h_phy_apic_id < AVIC_PHY_APIC_ID_MAX);
>>>>> +
>>>>> +    tmp = read_atomic((u64*)(s->avic_last_phy_id));
>>>>> +    entry->host_phy_apic_id = h_phy_apic_id;
>>>>> +    entry->is_running = 1;
>>>>> +    write_atomic((u64*)(s->avic_last_phy_id), tmp);
>>>> What is the purpose of s->avic_last_phy_id ?
>>>>
>>>> As far as I can tell, it is always an unchanging pointer into the
>>>> physical ID table, which is only ever updated synchronously in current
>>>> context.
>>>>
>>>> If so, I don't see why it needs any of these hoops to be jumped though.
>>> s->avic_last_phy_id is used to quickly access the entry in the table.
>>>
>>> When the code was pushed for Linux, memory barriers were used and it
>>> was suggested that atomic operations
>>> be used instead to ensure compiler ordering. The same is done here.
>> Ok - summing up a conversation on IRC, and some digging around the manual.
>>
>> Per VM, there is a single Physical APIC Table, which lives in a 4k
>> page.  This table is referenced by the VMCB, and read by hardware when
>> processing guest actions.
>>
>> The contents of this table a list of 64bit entries,
>>
>> struct __packed avic_physical_id_entry {
>>     u64 host_phy_apic_id  : 8;
>>     u64 res1              : 4;
>>     u64 bk_pg_ptr_mfn     : 40;
>>     u64 res2              : 10;
>>     u64 is_running        : 1;
>>     u64 valid             : 1;
>> };
>>
>> which are indexed by guest APIC_ID.
>>
>> AMD hardware allows writes to the APIC_ID register, but OSes don't do
>> this in practice (the register is read/discard on some hardware, and
>> strictly read-only in x2apic).  The implementation in Xen is to crash
>> the domain if we see a write here, and that is reasonable behaviour
>> which I don't expect to change going forwards.
>>
>> As a result, the layout of the Physical APIC Table is fixed based on the
>> APIC assignment during domain creation.  Also, the bk_pg_ptr_mfn and its
>> valid bit (valid) are set up during construction, and remain unchanged
>> for the lifetime of the domain.
>>
>> The only fields which change during runtime are the host_phys_apic_id,
>> and its valid bit (is_running), and these change on vcpu context switch.
>>
>> Therefore, on ctxt_switch_from(), we want a straight __clear_bit() on
>> e->is_running to signify that the vcpu isn't allocated to a pcpu.
>>
>> On ctxt_switch_to(), we want a simple
>>
>> e->host_phy_apic_id = this_pcpu_apic_id;
>> smp_wmb();
>> __set_bit(e->is_running);
>>
>> which guarantees that the host physical apic id field is valid and up to
>> date, before hardware sees it being reported as valid.  As these changes
>> are only made in current context, there are no other ordering or
>> atomicity concerns.
>>
>> This table is expected to live in regular WB RAM, and the manual has no
>> comment/reference to requiring special accesses.  Therefore, I'm
>> moderately confident that the above ordering is sufficient for correct
>> behaviour, and no explicitly atomic actions are required.
>>
>> Thoughts/comments/suggestions?
>
> The entry can also be written as a single raw 64-bit value (I think you
> suggested in one of the reviews to make it a union with a uint64_t).

Yes it can, but the handling for that is more complicated and (AFAICT)
unnecessary.

You'd need something like:

e = ACCESS_ONCE(*ptr);
e->host_phy_apic_id = ...;
e->is_running = 1;
ACCESS_ONCE(*ptr) = e;

which would be the most flexible way of expressing the result.

~Andrew

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxxx
https://lists.xenproject.org/mailman/listinfo/xen-devel

 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.