Xen project Mailing List

Re: [Xen-devel] [PATCH SpectreV1+L1TF v6 8/9] common/grant_table: block speculative out-of-bound accesses

From: Norbert Manthey <nmanthey@xxxxxxxxx>

Date: Mon, 18 Feb 2019 14:49:45 +0100

Autocrypt: addr=nmanthey@xxxxxxxxx; prefer-encrypt=mutual; keydata= mQINBFoJQc0BEADM8Z7hB7AnW6ErbSMsYkKh4HLAPfoM+wt7Fd7axHurcOgFJEBOY2gz0isR /EDiGxYyTgxt5PZHJIfra0OqXRbWuLltbjhJACbu35eaAo8UM4/awgtYx3O1UCbIlvHGsYDg kXjF8bBrVjPu0+g55XizX6ot/YPAgmWTdH8qXoLYVZVWJilKlTqpYEVvarSn/BVgCbIsQIps K93sOTN9eJKDSqHvbkgKl9XG3WsZ703431egIpIZpfN0zZtzumdZONb7LiodcFHJ717vvd89 3Hv2bYv8QLSfYsZcSnyU0NVzbPhb1WtaduwXwNmnX1qHJuExzr8EnRT1pyhVSqouxt+xkKbV QD9r+cWLChumg3g9bDLzyrOTlEfAUNxIqbzSt03CRR43dWgfgGiLDcrqC2b1QR886WDpz4ok xX3fdLaqN492s/3c59qCGNG30ebAj8AbV+v551rsfEba+IWTvvoQnbstc6vKJCc2uG8rom5o eHG/bP1Ug2ht6m/0uWRyFq9C27fpU9+FDhb0ZsT4UwOCbthe35/wBZUg72zDpT/h5lm64G6C 0TRqYRgYcltlP705BJafsymmAXOZ1nTCuXnYAB9G9LzZcKKq5q0rP0kp7KRDbniirCUfp7jK VpPCOUEc3tS1RdCCSeWNuVgzLnJdR8W2h9StuEbb7hW4aFhwRQARAQABtCROb3JiZXJ0IE1h bnRoZXkgPG5tYW50aGV5QGFtYXpvbi5kZT6JAj0EEwEIACcFAloJQc0CGyMFCQlmAYAFCwkI BwIGFQgJCgsCBBYCAwECHgECF4AACgkQZ+8yS8zN62ajmQ/6AlChoY5UlnUaH/jgcabyAfUC XayHgCcpL1SoMKvc2rCA8PF0fza3Ep2Sw0idLqC/LyAYbI6gMYavSZsLcsvY6KYAZKeaEriG 7R6cSdrbmRcKpPjwvv4iY6G0DBTeaqfNjGe1ECY8u522LprDQVquysJIf3YaEyxoK/cLSb0c kjzpqI1P9Vh+8BQb5H9gWpakbhFIwbRGHdAF1roT7tezmEshFS0IURJ2ZFEI+ZgWgtl1MBwN sBt65im7x5gDo25h8A5xC9gLXTc4j3tk+3huaZjUJ9mCbtI12djVtspjNvDyUPQ5Mxw2Jwar C3/ZC+Nkb+VlymmErpnEUZNltcq8gsdYND4TlNbZ2JhD0ibiYFQPkyuCVUiVtimXfh6po9Yt OkE0DIgEngxMYfTTx01Zf6iwrbi49eHd/eQQw3zG5nn+yZsEG8UcP1SCrUma8p93KiKOedoL n43kTg4RscdZMjj4v6JkISBcGTR4uotMYP4M0zwjklnFXPmrZ6/E5huzUpH9B7ZIe/SUu8Ur xww/4dN6rfqbNzMxmya8VGlEQZgUMWcck+cPrRLB09ZOk4zq9i/yaHDEZA1HNOfQ9UCevXV5 7seXSX7PCY6WDAdsT3+FuaoQ7UoWN3rdpb+064QKZ0FsHeGzUd7MZtlgU4EKrh25mTSNZYRs nTz2zT/J33e5Ag0EWglBzQEQAKioD1gSELj3Y47NE11oPkzWWdxKZdVr8B8VMu6nVAAGFRSf Dms4ZmwGY27skMmOH2srnZyTfm9FaTKr8RI+71Fh9nfB9PMmwzA7OIY9nD73/HqPywzTTleG MlALmnuY6xFRSDmqmvxDHgWyzB4TgPWt8+hW3+TJKCx2RgLAdSuULZla4lia+NlS8WNRUDGK sFJCCB3BW5I/cocfpBEUqLbbmnPuD9UKpEnFcYWD9YaDNcBTjSc7iDsvtpdrBXg5VETOz/TQ /CmVs9h/5zug8O4bXxHEEJpCAxs4cGKxowBqx/XJfkwdWeo/LdaeR+LRbXvq4A32HSkyj9sV vygwt2OFEk493JGik8qtAA/oPvuqVPJGacxmZ7zKR12c0mnKCHiexFJzFbC7MSiUhhe8nNiM p6Sl6EZmsTUXhV2bd2M12Bqcss3TTJ1AcW04T4HYHVCSxwl0dVfcf3TIaH0BSPiwFxz0FjMk 10umoRvUhYYoYpPFCz8dujXBlfB8q2tnHltEfoi/EIptt1BMNzTYkHKArj8Fwjf6K+nQ3a8p 1cWfkYpA5bRqbhbplzpa0u1Ex0hZk6pka0qcVgqmH31O2OcSsqeKfUfHkzj3Q6dmuwm1je/f HWH9N1gDPEp1RB5bIxPnOG1Z4SNl9oVQJhc4qoJiqbvkciivYcH7u2CBkboFABEBAAGJAiUE GAEIAA8FAloJQc0CGwwFCQlmAYAACgkQZ+8yS8zN62YU9Q//WTnN28aBX1EhDidVho80Ql2b tV1cDRh/vWTcM4qoM8vzW4+F/Ive6wDVAJ7zwAv8F8WPzy+acxtHLkyYk14M6VZ1eSy0kV0+ RZQdQ+nPtlb1MoDKw2N5zhvs8A+WD8xjDIA9i21hQ/BNILUBINuYKyR19448/41szmYIEhuJ R2fHoLzNdXNKWQnN3/NPTuvpjcrkXKJm2k32qfiys9KBcZX8/GpuMCc9hMuymzOr+YlXo4z4 1xarEJoPOQOXnrmxN4Y30/qmf70KHLZ0GQccIm/o/XSOvNGluaYv0ZVJXHoCnYvTbi0eYvz5 OfOcndqLOfboq9kVHC6Yye1DLNGjIVoShJGSsphxOx2ryGjHwhzqDrLiRkV82gh6dUHKxBWd DXfirT8a4Gz/tY9PMxan67aSxQ5ONpXe7g7FrfrAMe91XRTf50G3rHb8+AqZfxZJFrBn+06i p1cthq7rJSlYCqna2FedTUT+tK1hU9O0aK4ZYYcRzuTRxjd4gKAWDzJ1F/MQ12ftrfCAvs7U sVbXv2TndGIleMnheYv1pIrXEm0+sdz5v91l2/TmvkyyWT8s2ksuZis9luh+OubeLxHq090C hfavI9WxhitfYVsfo2kr3EotGG1MnW+cOkCIX68w+3ZS4nixZyJ/TBa7RcTDNr+gjbiGMtd9 pEddsOqYwOs=

Cc: Juergen Gross <jgross@xxxxxxxx>, Tim Deegan <tim@xxxxxxx>, Stefano Stabellini <sstabellini@xxxxxxxxxx>, Wei Liu <wei.liu2@xxxxxxxxxx>, Konrad Rzeszutek Wilk <konrad.wilk@xxxxxxxxxx>, George Dunlap <George.Dunlap@xxxxxxxxxxxxx>, Andrew Cooper <andrew.cooper3@xxxxxxxxxx>, Ian Jackson <Ian.Jackson@xxxxxxxxxxxxx>, Dario Faggioli <dfaggioli@xxxxxxxx>, Martin Pohlack <mpohlack@xxxxxxxxx>, wipawel@xxxxxxxxx, Julien Grall <julien.grall@xxxxxxx>, David Woodhouse <dwmw@xxxxxxxxxxxx>, "Martin Mazein\(amazein\)" <amazein@xxxxxxxxx>, xen-devel <xen-devel@xxxxxxxxxxxxxxxxxxxx>, Julian Stecklina <jsteckli@xxxxxxxxx>, Bjoern Doebel <doebel@xxxxxxxxx>

Delivery-date: Mon, 18 Feb 2019 13:50:05 +0000

List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

Openpgp: preference=signencrypt

On 2/15/19 11:34, Jan Beulich wrote: >>>> On 15.02.19 at 10:55, <nmanthey@xxxxxxxxx> wrote: >> On 2/13/19 12:50, Jan Beulich wrote: >>>>>> On 08.02.19 at 14:44, <nmanthey@xxxxxxxxx> wrote: >>>> Guests can issue grant table operations and provide guest controlled >>>> data to them. This data is also used for memory loads. To avoid >>>> speculative out-of-bound accesses, we use the array_index_nospec macro >>>> where applicable. However, there are also memory accesses that cannot >>>> be protected by a single array protection, or multiple accesses in a >>>> row. To protect these, a nospec barrier is placed between the actual >>>> range check and the access via the block_speculation macro. >>>> >>>> As different versions of grant tables use structures of different size, >>>> and the status is encoded in an array for version 2, speculative >>>> execution might touch zero-initialized structures of version 2 while >>>> the table is actually using version 1. >>> Why zero-initialized? Did I still not succeed demonstrating to you >>> that speculation along a v2 path can actually overrun v1 arrays, >>> not just access parts with may still be zero-initialized? >> I believe a speculative v2 access can touch data that has been written >> by valid v1 accesses before, zero initialized data, or touch the NULL >> page. Given the macros for the access I do not believe that a v2 access >> can touch a page that is located behind a page holding valid v1 data. > I've given examples before of how I see this to be possible. Would > you mind going back to one of the instances, and explaining to me > how you do _not_ see any room for an overrun there? Having > given examples, I simply don't know how else I can explain this to > you without knowing at what specific part of the explanation we > diverge. (And no, I'm not excluding that I'm making up an issue > where there is none.) What we want to real out is that the actually use version1, while speculation might use version2, right? I hope you refer to this example of your earlier email. On 1/29/19 16:11, Jan Beulich wrote: > Let's look at an example: gref 256 points into the middle of > the first page when using v1 calculations, but at the start > of the second page when using v2 calculations. Hence, if the > maximum number of grant frames was 1, we'd overrun the > array, consisting of just a single element (256 is valid as a > v1 gref in that case, but just out of bounds as a v2 one). From how I read your example and my explanation, the key difference is in the size of the shared_raw array. In case gref 256 is a valid v1 handle, then the shared_raw array has space for at least 256 entries, as shared_raw was allocated for the number of requested entries. The access to shared_raw is controlled with the macro shared_entry_v2: 222 #define SHGNT_PER_PAGE_V2 (PAGE_SIZE / sizeof(grant_entry_v2_t)) 223 #define shared_entry_v2(t, e) \ 224 ((t)->shared_v2[(e)/SHGNT_PER_PAGE_V2][(e)%SHGNT_PER_PAGE_V2]) Since the direct access to the shared_v2 array depends on the SHGNT_PER_PAGE_V2 value, this has to be less than the size of that array. Hence, shared_raw will not be overrun (neither for version 1 nor version 2). However, this division might result in accessing an element of shared_raw that has not been initialized by version1 before. However, right after allocation, shared_raw is zero initialized. Hence, this might result in an access of the NULL page. The second access in the macro allows to access only a single page, as the value e is bound to the elements per page of the correct version (the version 1 macro uses the corresponding value for the modulo operation). Either, this refers to the NULL page, or it refers to a page that has been initialized by version1 (partially). I do not see how an out-of-bound access would be possible there. >>>> @@ -963,9 +965,13 @@ map_grant_ref( >>>> PIN_FAIL(unlock_out, GNTST_bad_gntref, "Bad ref %#x for d%d\n", >>>> op->ref, rgt->domain->domain_id); >>>> >>>> + /* Make sure the above check is not bypassed speculatively */ >>>> + block_speculation(); >>>> + >>>> act = active_entry_acquire(rgt, op->ref); >>>> shah = shared_entry_header(rgt, op->ref); >>>> - status = rgt->gt_version == 1 ? &shah->flags : &status_entry(rgt, >>>> op->ref); >>>> + status = evaluate_nospec(rgt->gt_version == 1) ? &shah->flags >>>> + : &status_entry(rgt, >>>> op->ref); >>> Did you consider folding the two pairs of fences you emit into >>> one? Moving up the assignment to status ought to achieve this, >>> as then the block_speculation() could be dropped afaict. >>> >>> Then again you don't alter shared_entry_header(). If there's >>> a reason for you not having done so, then a second fence >>> here is needed in any event. >> Instead of the block_speculation() macro, I can also protect the op->ref >> usage before evaluate_nospec via the array_index_nospec function. > That's an option (as before), but doesn't help with shared_entry_header()'s > evaluation of gt_version. That would have to be protected separately locally in that function, similarly to the nr_grant_entries function. > >>> What about the version check in nr_grant_entries()? It appears >>> to me as if at least its use in grant_map_exists() (which simply is >>> the first one I've found) is problematic without an adjustment. >>> Even worse, ... >>> >>>> @@ -1321,7 +1327,8 @@ unmap_common( >>>> goto unlock_out; >>>> } >>>> >>>> - act = active_entry_acquire(rgt, op->ref); >>>> + act = active_entry_acquire(rgt, array_index_nospec(op->ref, >>>> + >> nr_grant_entries(rgt))); >>> ... you add a use e.g. here to _guard_ against speculation. >> The adjustment you propose is to exchange the switch statement in >> nr_grant_entries with a if( evaluate_nospec( gt->gt_version == 1 ), so >> that the returned values are not speculated? > At this point I'm not proposing a particular solution. I'm just > putting on the table an issue left un-addressed. I certainly > wouldn't welcome converting the switch() to an if(), even if > right now there's no v3 on the horizon. (It's actually quite > the inverse: If someone came and submitted a patch to change > the various if()-s on gt_version to switch()-es, I'd welcome this.) I am happy to add block_speculation() macros into each case of the switch statement. >> Already before this >> modification the function is called and not inlined. > How does this matter when considering speculation? Likely not at all. > >> Do you want me to >> cache the value in functions that call this method regularly to avoid >> the penalty of the introduced lfence for each call? > That would go back to the question of what good it does to > latch value into a local variable when you don't know whether > the compiler will put that variable in a register or in memory. > IOW I'm afraid that to be on the safe side there's no way > around the repeated LFENCEs. The difference here would be that in case the value is stored into a local variable (independently of memory or register) and an lfence was executed, this value can be trusted and does not have to be checked again, as it's no longer guest controlled. > >>> And what about _set_status(), unmap_common_complete(), >>> gnttab_grow_table(), gnttab_setup_table(), >>> release_grant_for_copy(), the 2nd one in acquire_grant_for_copy(), >>> several ones in gnttab_set_version(), gnttab_release_mappings(), >>> the 3rd one in mem_sharing_gref_to_gfn(), gnttab_map_frame(), >>> and gnttab_get_status_frame()? >> Protecting the function itself should allow to not modify the >> speculation guards in these functions. I would have to check each of >> them, whether the guest actually has control, and whether it makes sense >> to introduce a _nospec variant of the nr_grant_entries function to not >> penalize everywhere. Do you have an opinion on this? > As per above, yes, I think the only sufficiently reliable way is > to modify nr_grant_entries() itself. The question is how to > correctly do this without replacing the switch() there, the more > that the other change of yours has deliberately forced the > compiler into using rows of compares instead of jump tables (not > that I'd expect the compiler to have used a jump table here, i.e. > the remark is just wrt the general case). As explained above, using the block_speculation() macro in each case statement should result in similar code than converting the switch into a chain of if statements that are protected by evaluate_nospec macros. Best, Norbert Amazon Development Center Germany GmbH Krausenstr. 38 10117 Berlin Geschaeftsfuehrer: Christian Schlaeger, Ralf Herbrich Ust-ID: DE 289 237 879 Eingetragen am Amtsgericht Charlottenburg HRB 149173 B _______________________________________________ Xen-devel mailing list Xen-devel@xxxxxxxxxxxxxxxxxxxx https://lists.xenproject.org/mailman/listinfo/xen-devel

©2013 Xen Project, A Linux Foundation Collaborative Project. All Rights Reserved.
Linux Foundation is a registered trademark of The Linux Foundation.
Xen Project is a trademark of The Linux Foundation.