Xen project Mailing List

Re: [Xen-devel] [PATCH] x86emul/fuzz: adjust canonicalization in sanitize_input()

From: George Dunlap <George.Dunlap@xxxxxxxxxx>

Date: Fri, 29 Mar 2019 19:20:45 +0000

Accept-language: en-GB, en-US

Cc: Andrew Cooper <Andrew.Cooper3@xxxxxxxxxx>, Wei Liu <wei.liu2@xxxxxxxxxx>, xen-devel <xen-devel@xxxxxxxxxxxxxxxxxxxx>, Roger Pau Monne <roger.pau@xxxxxxxxxx>

Delivery-date: Fri, 29 Mar 2019 19:20:52 +0000

List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

Thread-index: AQHU5j8gEhzS6m3hh0GrS2hly9rEx6YiptKAgAACkwCAAAVIAIAACSSAgAAz8IA=

Thread-topic: [PATCH] x86emul/fuzz: adjust canonicalization in sanitize_input()

> On Mar 29, 2019, at 4:14 PM, Jan Beulich <JBeulich@xxxxxxxx> wrote: > >>>> On 29.03.19 at 16:42, <George.Dunlap@xxxxxxxxxx> wrote: >>> On Mar 29, 2019, at 3:23 PM, Jan Beulich <JBeulich@xxxxxxxx> wrote: >>>>>> On 29.03.19 at 16:14, <George.Dunlap@xxxxxxxxxx> wrote: >>>> FAOD: >>>> 1. I don’t oppose this, but >>>> 2. I don’t support it either; however, >>>> 3. I don’t think my Ack is necessary. >>> >>> Well, preferably I would address your concerns despite 3. So could >>> you clarify what you would suggest instead? Keep things as they >>> are? Drop all canonicalization? I've basically tried to find a middle >>> ground between the two extremes. >> >> I appreciate that. :-) But the main reason I wrote this was #3: I didn’t >> want my silence interpreted as a nack. >> >> I don’t think it will help fuzzing to remove canonicalization of ebp; it may >> help to have it in. In fact I’d prefer to CANONICALIZE_MAYBE() more >> registers. > > But canonicalization removes potentially interesting bits from fuzzed > input, which is liable to be relevant if a register is used for other than > a base address in an effective address calculation. As an example, > take BSR: You'd remove the possibility to get results in the range > [48,62]. Or take the XSA-195 case: The memory range covered by > BT{,C,R,S} is dramatically much smaller when the register holding the > bit offset first got canonicalized. > > Granted the canonicalization is conditional, so it wouldn't be making it > entirely impossible to get into such a state, but since fuzzing is all > about likelihood, we'd like to avoid reducing our chances of hitting > interesting cases. OK, since you want to discuss it, let’s discuss it. :-) Just to summarize everything: “Canonical” addresses must have bits 48-63 identical to bit 47. This means the chance of any “random” 64-bit number being canonical is 1 in 65536 (2^16). A large number of registers (rax, rbp, &c) *can* be used as memory addresses; a handful of registers (esp, rip) are designated to be used primarily for that purpose. It is asserted that the canonicity* of any registers has no effect on the codepaths the emulator takes; it simply does calculations and passes the results to various callbacks (e.g., insn_fetch, read, write, &c). It is these callbacks which, in Xen, check the calculated linear address for canonicity and return an error for non-canonical addresses. [* This is the form of this word my spell-checker doesn’t red-underline] However, the whole point of testing is to find places where your assumptions are violated. If the emulator ever *did* behave differently for canonical and non-canonical addresses, or near the boundary of canonicity, we’d want those behaviors to be tested. In fact, it’s *possible* that the emulator already does have this kind of behavior in it, and we just haven’t realized it. Fuzzing is probabilistic; theoretically, if we run the fuzzer long enough, we’ll hit all “interesting” inputs. But practically speaking, “focusing” the fuzzing such that “interesting” inputs are found more quickly seems like a good idea. AFL’s fuzzing takes two modes: * “Deterministic” fuzzing, where individual bits are flipped, bytes / words / &c are incremented and decremented, bytes / words / &c are replaced with “interesting” numbers like 0, 1, -1, and so on * “Random” fuzzing, where completely random changes are made: random values added, random bits of the file removed, random bits of the file duplicated, &c At the moment, all registers can be modified at start to any random 64-bit value. But there’s also a bitfield near the beginning that controls whether some of those registers (namely rip, rsp, and rbp) are “canonicalized”: If the register’s bitfield is 1, bits 48-63 will be set to the value of bit 47; otherwise the register will be left alone. This allows AFL to choose whether the register will be canonicalized or not. Effectively, a “maybe canonicalized” register has a 50% chance (plus change) of being “canonical”, and a 50% chance (minus a bit) of being non-canonical. This canonicalization has several effects. First of all, it means that any interesting “canonical” addresses (if they exist) has approximately a 65536x higher chance of being selected; or to put it a different way, AFL will find any interesting “canonical” addresses 65536x faster than otherwise. On the other hand, any interesting non-canonical addresses (or non-address values) have a 50% lower chance of being selected; AFL will take twice as long to find an interesting “non-canonical” / non-address values address as it would otherwise. The other effect, at the moment, is to add another branch to fuzz-emulate.c. This means that the very act of adding a select-able bit to canonicalize (or not) a register adds another path that AFL will consider “interesting”; given two otherwise-identical inputs, A and A’, where one is canonicalized and the other not, AFL will consider them two different test cases, *even if they go through the same paths in x86_emulate()*. It will keep them both in its “queue", and will spend time mutating them both. For random fuzzing, even if x86_emulate() is the same, I don’t think this will be wasted effort: 5 seconds fuzzing A and then 5 seconds fuzzing A’ will be the same as 10 seconds of fuzzing A. It may skew AFL’s priorities a bit by fuzzing something for 10 seconds rather than 5, but it still has a chance of finding new data. For deterministic fuzzing, if x86_emulate() is the same, then there is duplicated work: the 5 seconds spent fuzzing A’ will go through exactly the same sets of mutations as the 5 seconds spent fuzzing A. (It may be possible to exclude fuzz-emul.c from instrumentation, in which case this effect would go away.) Now, we have two possibilities: Either canonicity does not and never will affect x86_emulate(); or it may, either now or later. And we have a few choices: We maybe-canonicalize no registers; we maybe-canonicalize only registers that are almost always used as an address; or we maybe-canonicalize more (maybe all) registers. Let us call these C0, C1, and C2 (for “canonicity matters, and we canonicalize none, some, or most, respectively"), and N0, N1, and N2 (for “canonicity doesn’t mater, and we canonicalize none, some, or most respectively”). (And remember, N vs C is a fact about the world that we don’t know; and 0-2 are options we can choose.) N0: Everything is the fastest: “interesting” non-address values are found the fastest, AFL isn’t fooled into double-random-fuzzing some entries, AFL doesn’t waste time double-deterministic-fuzzing entries. N1: “Interesting” non-address values take twice as long to find; but since the registers in question are used almost exclusively for addresses, this is less important. Canonicity has no effect, so it doesn’t matter whether we choose canonical or non-canonical values for the registers. We do spend twice as long random-fuzzing entries (for some benefit), and we twice as much time deterministically fuzzing some entries to no benefit; but only when fuzzing these address-mostly registers. N2: “Interesting” non-address values take twice as long to find. We spend twice as long random-fuzzing entries (to some benefit), and twice as long deterministically fuzzing some entries (to no benefit). C0: Finding “interesting” non-address values is as fast as N0. However, finding “interesting” address values take 65536 times as long. Extra time spent fuzzing finds useful differences. C1: “Interesting” non-address values for address-mostly registers take twice as long to find; but “interesting” address-values for mostly-address registers take about the same amount of time to find (i.e., 65536x faster than C0). “Interesting” address values for other registers take 65536x times as long to find. Again, extra fuzzing time is spent usefully. C2: “Interesting” non-address values for all registers take twice as long to find; but “interesting” address values for all registers take that same amount of time. Again, extra fuzzing time is spent usefully. If N is true, and we go with option 2, it will take on average 2 hours to find an “interesting" value that would have taken 1 hour with option 0. If C is true, and we go with 0, it will take on average 65536 hours (7.48 years) to find an “interesting” value that would have taken 1 hour in option 2. I think the chances of C are less than 50%; but I think it's certainly higher than 1 in 65536. So given that we don’t know whether we’re in case N or C, from a cost/benefits analysis, I think option 1 is a minimum; and that 2 is an obvious choice — particularly if we can get rid of the “wasted” fuzzing, perhaps by not instrumenting fuzz-emul.c. Obviously there’s a lot of hand-waving and guessing here. My understanding is that in practice, ebp is *mostly* used as an address and not a general-purpose register, which is why I maybe-canonicalized it in the first place, and would rather it remain so (option 1). But I can see the arguments for classifying it as a general-purpose register; so as long as rip and rsp are maybe-canonicalized, I won’t spend a lot of time arguing against this patch. This was very long — I hope it makes sense. :-) Peace, -George _______________________________________________ Xen-devel mailing list Xen-devel@xxxxxxxxxxxxxxxxxxxx https://lists.xenproject.org/mailman/listinfo/xen-devel

©2013 Xen Project, A Linux Foundation Collaborative Project. All Rights Reserved.
Linux Foundation is a registered trademark of The Linux Foundation.
Xen Project is a trademark of The Linux Foundation.