
Re: [Xen-devel] [PATCH 1/7] introduce time management in xtf



On 08/04/2018 15:00, Paul Semel wrote:
> On 04/07/2018 10:58 PM, Paul Semel wrote:
>> On 04/07/2018 10:39 PM, Andrew Cooper wrote:
>>>>>>> However, both of your patches have (different) barrier issues, and
>>>>>>> different (mis)uses of the shared memory clocks, which will need to
>>>>>>> be addressed.  One general comment for the full series is to not
>>>>>>> bother trying to make time 32 bits in a 32-bit build.  XTF is an
>>>>>>> entirely self-contained binary, and I don't much care for legacy
>>>>>>> behaviours for legacy's sake.
>>>>>>>
>>>>>>
>>>>
>>>> Alright, so I took a look at all of this.  If I understood correctly,
>>>> you want me to drop all the `uint32_t` parts and only use 64 bits.
>>>> But are you sure that it works the same on 32-bit architectures?
>>>
>>> At the end of the day, all you are passing is units of nanoseconds or
>>> larger.  The parameter passing won't be identical between 32-bit and
>>> 64-bit builds, but everything else will be fine.
>>>
>>>>
>>>> Also, I have another question.  I don't see any place where I have
>>>> memory barrier issues, but I am certainly missing something.  Can you
>>>> elaborate on this one?
>>>
>>> Your code does:
>>>
>>>> +    do
>>>> +    {
>>>> +        do
>>>> +        {
>>>> +            wc_version = shared_info.wc_version ;
>>>> +            version = shared_info.vcpu_info[0].time.version;
>>>> +        } while ( (version & 1) == 1 || (wc_version & 1) == 1);
>>>> +
>>>> +        system_time = shared_info.vcpu_info[0].time.system_time;
>>>> +        old_tsc = shared_info.vcpu_info[0].time.tsc_timestamp;
>>>> +    } while (
>>>> +            version != shared_info.vcpu_info[0].time.version ||
>>>> +            wc_version != shared_info.wc_version
>>>> +            );
>>>
>>> First of all, I'm not sure why you are looking at wc_version.  You are
>>> only reading data out of shared_info.vcpu_info[0].time, so you should
>>> only need to check time.version.
>>>
>>> That said, it is critical to force two things:
>>> 1) The first read of version is strictly before
>>> system_time/tsc_timestamp, and the second read is afterwards.
>>> 2) That the compiler can't optimise the code by making repeated
>>> reads of
>>> shared memory.
>>>
>>> 1 is a correctness issue.  C's memory model assumes there are no
>>> unknown updates to memory, so the compiler sees that the code hasn't
>>> written to time.version, and can therefore optimise the entire outer
>>> do loop down to a single iteration.
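>>>
>>> In other words, without barriers, the compiler would be within its
>>> rights to transform your loop into something like this (an illustrative
>>> sketch, not the output of any particular compiler):
>>>
>>>>     /* Every shared read hoisted and performed exactly once: the
>>>>      * outer "retry" loop disappears entirely, because nothing in
>>>>      * the loop body can change version as far as C is concerned. */
>>>>     uint32_t version = shared_info.vcpu_info[0].time.version;
>>>>
>>>>     while ( version & 1 )
>>>>         ;   /* never terminates if the first read was odd */
>>>>
>>>>     system_time = shared_info.vcpu_info[0].time.system_time;
>>>>     old_tsc = shared_info.vcpu_info[0].time.tsc_timestamp;
>>>>     /* version != version is always false, so no second iteration. */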
>>>
>>> 2 is a security issue and we've had many XSAs in the past.  The worst
>>> example IIRC was a compiler which managed to use a jump table, using a
>>> 32bit integer index read from shared memory.  When using shared memory,
>>> it is critical to force the compiler to read a value and stash it
>>> locally (either in a live register, or on the stack), then audit its
>>> value, then act upon the value.
>>>
>>> That way, no optimisation the compiler makes can result in a repeated
>>> read of memory, where a malicious adversary could change the value
>>> between reads.
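>>>
>>> As a hypothetical illustration of that double-fetch pattern (the
>>> shared pointer and handler table here are made up for the example):
>>>
>>>>     /* BAD: the compiler is free to read shared->idx twice, once
>>>>      * for the bounds check and once for the table index.  The
>>>>      * other end can change it between the two reads. */
>>>>     if ( shared->idx < ARRAY_SIZE(handlers) )
>>>>         handlers[shared->idx]();
>>>>
>>>>     /* GOOD: read once into a local, audit it, then act on it. */
>>>>     uint32_t idx = ACCESS_ONCE(shared->idx);
>>>>
>>>>     if ( idx < ARRAY_SIZE(handlers) )
>>>>         handlers[idx]();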
>>>
>>> 1 strictly needs memory barriers to work, and in this case smp_rmb().
>>> Things tend to work on x86 because of its fairly restricted memory
>>> model, but architectures such as ARM, with weaker memory models,
>>> typically explode in funny ways.
>>>
>>> 2 could be fixed with ACCESS_ONCE() to explicitly tell the compiler
>>> that
>>> this is a volatile access and the value in memory may no longer be
>>> the same.
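>>>
>>> For reference, the usual definition (the classic Linux/Xen form) is
>>> simply an access through a volatile-qualified lvalue:
>>>
>>>>     /* Force exactly one access; the compiler may neither cache nor
>>>>      * repeat it. */
>>>>     #define ACCESS_ONCE(x) (*(volatile typeof(x) *)&(x))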
>>>
>>> A sample (which I think is correct, but you never quite know for sure
>>> with barriers!) is:
>>>
>>>>      do {
>>>>          uint32_t ver1, ver2;
>>>>
>>>>          do {
>>>>              ver1 = shared_info.vcpu_info[0].time.version;
>>>>              smp_rmb();
>>>>          } while ( (ver1 & 1) == 1 );
>>>>
>>>>          system_time = shared_info.vcpu_info[0].time.system_time;
>>>>          old_tsc = shared_info.vcpu_info[0].time.tsc_timestamp;
>>>>          smp_rmb();
>>>>          ver2 = shared_info.vcpu_info[0].time.version;
>>>>          smp_rmb();
>>>>
>>>>      } while ( ver1 != ver2 );
>>>
>>> The smp_rmb() calls are full CPU memory barriers (enforcing the ordering
>>> of reads even from remote cores), and compiler barriers (preventing the
>>> compiler from reordering memory accesses across the barrier).
>>>
>>> Therefore, ver1 and ver2 are forced to be kept in registers, or spilled
>>> onto the stack.
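>>>
>>> On x86 specifically, reads are already ordered against other reads, so
>>> smp_rmb() usually reduces to a pure compiler barrier; a minimal sketch
>>> of the common definitions (weaker architectures need a real fence):
>>>
>>>>     /* Compiler barrier: no memory access may be moved across it. */
>>>>     #define barrier()   asm volatile ( "" ::: "memory" )
>>>>
>>>>     /* x86 never reorders reads against reads, so the compiler
>>>>      * barrier suffices.  ARM64, for example, needs "dmb ishld". */
>>>>     #define smp_rmb()   barrier()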
>>>
>>> ~Andrew
>>>
>>
>> Thank you very much for replying!  I completely missed this function;
>> I really do apologize about it!
>>
>> Indeed, there are serious issues here.  As far as I can remember, I
>> took inspiration from another OS's code when writing this function (as
>> I didn't really know much about how timekeeping works in Xen).
>>
>> Thanks for writing this piece of code.  I think I will use it and add
>> ACCESS_ONCE() on my reads of the different variables in shared memory
>> (version, system_time, etc.).
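>>
>> I guess the result would look something like this (untested sketch):
>>
>>     uint32_t ver1, ver2;
>>
>>     do {
>>         do {
>>             ver1 = ACCESS_ONCE(shared_info.vcpu_info[0].time.version);
>>             smp_rmb();
>>         } while ( ver1 & 1 );
>>
>>         system_time = ACCESS_ONCE(shared_info.vcpu_info[0].time.system_time);
>>         old_tsc = ACCESS_ONCE(shared_info.vcpu_info[0].time.tsc_timestamp);
>>         smp_rmb();
>>
>>         ver2 = ACCESS_ONCE(shared_info.vcpu_info[0].time.version);
>>         smp_rmb();
>>     } while ( ver1 != ver2 );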
>>
>> Anyway, I already got rid of the 32-bit parts, so that even 32-bit
>> architectures use 64-bit integers for the divisions and modulos (making
>> sure to use the `divmod64` function 🙂).
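>>
>> For reference, here is a naive shift-and-subtract sketch of what such a
>> helper could look like (the actual signature in my patch may differ):
>>
>>     /* Hypothetical 64-by-64 restoring division, avoiding the libgcc
>>      * helpers (__udivdi3 and friends) that a plain '/' or '%' on
>>      * uint64_t would pull in on a 32-bit build.  Assumes divisor != 0. */
>>     static uint64_t divmod64(uint64_t dividend, uint64_t divisor,
>>                              uint64_t *remainder)
>>     {
>>         uint64_t quotient = 0, rem = 0;
>>         int i;
>>
>>         for ( i = 63; i >= 0; i-- )
>>         {
>>             rem = (rem << 1) | ((dividend >> i) & 1);
>>             if ( rem >= divisor )
>>             {
>>                 rem -= divisor;
>>                 quotient |= 1ULL << i;
>>             }
>>         }
>>
>>         *remainder = rem;
>>         return quotient;
>>     }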
>>
>> I will change this part of the code tonight, or maybe tomorrow, and
>> send you another version of the patch!
>>
> So, to be completely secure, I guess we would need some locking
> mechanism for 64-bit shared memory accesses on 32-bit arches.  Am I
> right?

That depends.

If you have two mutually untrusted entities, then yes, you have to be
very careful. 

In this case, one half is the hypervisor, and you've got to trust that. 
For example, Xen could leave the version field with the bottom bit set,
and cause us to livelock waiting for it to drop, but that defeats the
purpose of Xen running virtual machines.

Here, the check of the two samples of the version field is sufficient to
determine that we read consistent data, irrespective of whether the
64-bit fields were broken down into two 32-bit accesses or not.
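
As an illustration (a sketch with made-up values, and t pointing at the
vcpu_time_info record): on a 32-bit build, system_time may be read as two
32-bit halves, but any update which tears them also changes the version:

    uint32_t ver1 = ACCESS_ONCE(t->version);   /* even, e.g. 4 */
    smp_rmb();

    /* Low half first on little-endian x86. */
    uint32_t lo = ((volatile uint32_t *)&t->system_time)[0];
    /* hypervisor updates the record here: version goes 4 -> 5 -> 6 */
    uint32_t hi = ((volatile uint32_t *)&t->system_time)[1];
    uint64_t system_time = lo | ((uint64_t)hi << 32);  /* torn value */

    smp_rmb();
    uint32_t ver2 = ACCESS_ONCE(t->version);   /* 6 != 4, so we retry */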

~Andrew


 

