
Re: [Xen-devel] [PATCH 4/8] x86/SVM: Add vcpu scheduling support for AVIC



On 19/04/18 16:54, Natarajan, Janakarajan wrote:
> On 4/13/2018 12:57 PM, Andrew Cooper wrote:
>> On 04/04/18 00:01, Janakarajan Natarajan wrote:
>>> @@ -63,6 +64,54 @@ avic_get_physical_id_entry(struct svm_domain *d, unsigned int index)
>>>       return &d->avic_physical_id_table[index];
>>>   }
>>> +
>>> +static void avic_vcpu_load(struct vcpu *v)
>>> +{
>>> +    unsigned long tmp;
>>> +    struct arch_svm_struct *s = &v->arch.hvm_svm;
>>> +    int h_phy_apic_id;
>>> +    struct avic_physical_id_entry *entry = (struct avic_physical_id_entry *)&tmp;
>>> +
>>> +    ASSERT(!test_bit(_VPF_blocked, &v->pause_flags));
>>> +
>>> +    /*
>>> +     * Note: APIC ID = 0xff is used for broadcast.
>>> +     *       APIC ID > 0xff is reserved.
>>> +     */
>>> +    h_phy_apic_id = cpu_data[v->processor].apicid;
>>> +    ASSERT(h_phy_apic_id < AVIC_PHY_APIC_ID_MAX);
>>> +
>>> +    tmp = read_atomic((u64*)(s->avic_last_phy_id));
>>> +    entry->host_phy_apic_id = h_phy_apic_id;
>>> +    entry->is_running = 1;
>>> +    write_atomic((u64*)(s->avic_last_phy_id), tmp);
>> What is the purpose of s->avic_last_phy_id ?
>>
>> As far as I can tell, it is always an unchanging pointer into the
>> physical ID table, which is only ever updated synchronously in current
>> context.
>>
>> If so, I don't see why it needs any of these hoops to be jumped through.
>
> s->avic_last_phy_id is used to quickly access the entry in the table.
>
> When the code was pushed for Linux, memory barriers were used and it
> was suggested that atomic operations be used instead to ensure
> compiler ordering. The same is done here.

OK - summing up a conversation on IRC and some digging around in the manual.

Per VM, there is a single Physical APIC Table, which lives in a 4k
page.  This table is referenced by the VMCB, and read by hardware when
processing guest actions.

The contents of this table are a list of 64-bit entries:

struct __packed avic_physical_id_entry {
    u64 host_phy_apic_id  : 8;  /* APIC ID of the hosting pcpu. */
    u64 res1              : 4;
    u64 bk_pg_ptr_mfn     : 40; /* MFN of the vAPIC backing page. */
    u64 res2              : 10;
    u64 is_running        : 1;  /* host_phy_apic_id is valid. */
    u64 valid             : 1;  /* bk_pg_ptr_mfn is valid. */
};

which are indexed by guest APIC_ID.
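
To make the indexing concrete, here is a minimal sketch of locating a
vcpu's entry (avic_entry_for() is a made-up name for illustration;
avic_physical_id_table and AVIC_PHY_APIC_ID_MAX are reused from the
quoted hunk above):

static struct avic_physical_id_entry *
avic_entry_for(struct svm_domain *d, unsigned int guest_apic_id)
{
    /*
     * One 4k page of 8-byte entries, indexed by guest APIC ID.
     * 0xff is broadcast and anything above it is reserved, so valid
     * indices stop short of that.
     */
    ASSERT(guest_apic_id < AVIC_PHY_APIC_ID_MAX);

    return &d->avic_physical_id_table[guest_apic_id];
}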

AMD hardware allows writes to the APIC_ID register, but OSes don't do
this in practice (writes are discarded on some hardware, and the
register is strictly read-only in x2apic mode).  The implementation in
Xen is to crash the domain if we see a write here, which is reasonable
behaviour that I don't expect to change going forward.

As a result, the layout of the Physical APIC Table is fixed by the
APIC ID assignment made at domain creation.  The bk_pg_ptr_mfn field
and its valid bit (valid) are also set up during construction, and
remain unchanged for the lifetime of the domain.

The only fields which change at runtime are host_phy_apic_id and its
qualifying bit (is_running), and these only change on vcpu context
switch.

Therefore, on ctxt_switch_from(), we want a straight, non-atomic clear
of e->is_running to signify that the vcpu isn't allocated to a pcpu.

On ctxt_switch_to(), we want a simple

e->host_phy_apic_id = this_pcpu_apic_id;
smp_wmb();
e->is_running = 1;

which guarantees that the host physical apic id field is valid and up
to date before hardware sees the entry reported as running.  As these
changes are only made in current context, there are no other ordering
or atomicity concerns.
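
Put together as code, a sketch of the two context switch hooks (the
_sketch names are mine; 'e' is the vcpu's Physical APIC Table entry,
however it was looked up - e.g. the pointer the patch caches in
s->avic_last_phy_id - and cpu_data[v->processor].apicid is reused from
the quoted hunk):

/* ctxt_switch_to(): publish the new pcpu, then mark the entry running. */
static void avic_vcpu_load_sketch(struct vcpu *v,
                                  struct avic_physical_id_entry *e)
{
    e->host_phy_apic_id = cpu_data[v->processor].apicid;
    smp_wmb();          /* APIC ID must be visible before is_running. */
    e->is_running = 1;
}

/* ctxt_switch_from(): the vcpu is no longer allocated to a pcpu. */
static void avic_vcpu_unload_sketch(struct avic_physical_id_entry *e)
{
    e->is_running = 0;
}

Neither a read-modify-write of the whole 64bit entry nor any atomics
should be needed, because both paths only run in current context.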

This table is expected to live in regular WB RAM, and the manual has no
comment/reference to requiring special accesses.  Therefore, I'm
moderately confident that the above ordering is sufficient for correct
behaviour, and no explicitly atomic actions are required.

Thoughts/comments/suggestions?

~Andrew

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxxx
https://lists.xenproject.org/mailman/listinfo/xen-devel

 

