[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [Xen-devel] [PATCH] x86emul/fuzz: adjust canonicalization in sanitize_input()
On 29/03/2019 19:20, George Dunlap wrote: > >> On Mar 29, 2019, at 4:14 PM, Jan Beulich <JBeulich@xxxxxxxx> wrote: >> >>>>> On 29.03.19 at 16:42, <George.Dunlap@xxxxxxxxxx> wrote: >>>> On Mar 29, 2019, at 3:23 PM, Jan Beulich <JBeulich@xxxxxxxx> wrote: >>>>>>> On 29.03.19 at 16:14, <George.Dunlap@xxxxxxxxxx> wrote: >>>>> FAOD: >>>>> 1. I don’t oppose this, but >>>>> 2. I don’t support it either; however, >>>>> 3. I don’t think my Ack is necessary. >>>> Well, preferably I would address your concerns despite 3. So could >>>> you clarify what you would suggest instead? Keep things as they >>>> are? Drop all canonicalization? I've basically tried to find a middle >>>> ground between the two extremes. >>> I appreciate that. :-) But the main reason I wrote this was #3: I didn’t >>> want my silence interpreted as a nack. >>> >>> I don’t think it will help fuzzing to remove canonicalization of ebp; it >>> may >>> help to have it in. In fact I’d prefer to CANONICALIZE_MAYBE() more >>> registers. >> But canonicalization removes potentially interesting bits from fuzzed >> input, which is liable to be relevant if a register is used for other than >> a base address in an effective address calculation. As an example, >> take BSR: You'd remove the possibility to get results in the range >> [48,62]. Or take the XSA-195 case: The memory range covered by >> BT{,C,R,S} is dramatically much smaller when the register holding the >> bit offset first got canonicalized. >> >> Granted the canonicalization is conditional, so it wouldn't be making it >> entirely impossible to get into such a state, but since fuzzing is all >> about likelihood, we'd like to avoid reducing our chances of hitting >> interesting cases. > OK, since you want to discuss it, let’s discuss it. :-) > > Just to summarize everything: > > “Canonical” addresses must have bits 48-63 identical to bit 47. This means > the chance of any “random” 64-bit number being canonical is 1 in 65536 (2^16). As with everything, this is more complicated. With 5-level paging support in IceLake hardware, we now get a different type of canonical address, where the sign extension is from bit 57, rather than 48. In practice, that now means that, depending on CR4.LA57 (which is the control bit), we may have a 1 in 2^7 chance of being canonical. > > A large number of registers (rax, rbp, &c) *can* be used as memory addresses; > a handful of registers (esp, rip) are designated to be used primarily for > that purpose. > > It is asserted that the canonicity* of any registers has no effect on the > codepaths the emulator takes; it simply does calculations and passes the > results to various callbacks (e.g., insn_fetch, read, write, &c). It is > these callbacks which, in Xen, check the calculated linear address for > canonicity and return an error for non-canonical addresses. Correct. Registers are just numbers without any specific meaning. It is only when considering a resulting memory operand (which can be derived from multiple registers and embedded constants) that canonicity* is relevant. As said before, we'll shortly gain a second type of canonical address, but at the end of the day, all it amounts to is "is this a valid linear address or not". > [* This is the form of this word my spell-checker doesn’t red-underline] [* My spell-checker doesn't like it, but meh - the meaning is clear] > However, the whole point of testing is to find places where your assumptions > are violated. If the emulator ever *did* behave differently for canonical > and non-canonical addresses, or near the boundary of canonicity, we’d want > those behaviors to be tested. In fact, it’s *possible* that the emulator > already does have this kind of behavior in it, and we just haven’t realized > it. Accepting this point, we do have a couple of places in x86_emulate.c which check canonicity. * validate_far_branch() * protmode_load_seg() * LIDT/LGDT * SYSEXIT * WR{FS,GS}BASE validate_far_branch() is strictly on %rip. protmode_load_seg() and LIDT/LGDT are both to do with the content about to be loaded into a system segment. WR{FS,GS}BASE is for loading into a user segment. These should arguably be deferred to the ->write_segment() hook. The use in SYSEXIT is explicitly documented in the manual, but they shouldn't be necessary. The check for %rdx will be done as part of validate_far_branch(), whereas SYSEXIT shouldn't care about %rcx, and be able to load a non-canonical %rsp. In some copious free time, I'll try seeing how hardware actually behaves. All of these uses result in a difference in behaviour which ALF is liable to pick up on. I'm afraid that I'm not sufficiently caffeinated to work out if/how this affects the rest of your reasoning. > Fuzzing is probabilistic; theoretically, if we run the fuzzer long enough, > we’ll hit all “interesting” inputs. But practically speaking, “focusing” > the fuzzing such that “interesting” inputs are found more quickly seems like > a good idea. > > AFL’s fuzzing takes two modes: > * “Deterministic” fuzzing, where individual bits are flipped, bytes / words / > &c are incremented and decremented, bytes / words / &c are replaced with > “interesting” numbers like 0, 1, -1, and so on > * “Random” fuzzing, where completely random changes are made: random values > added, random bits of the file removed, random bits of the file duplicated, &c > > At the moment, all registers can be modified at start to any random 64-bit > value. But there’s also a bitfield near the beginning that controls whether > some of those registers (namely rip, rsp, and rbp) are “canonicalized”: If > the register’s bitfield is 1, bits 48-63 will be set to the value of bit 47; > otherwise the register will be left alone. This allows AFL to choose whether > the register will be canonicalized or not. > > Effectively, a “maybe canonicalized” register has a 50% chance (plus change) > of being “canonical”, and a 50% chance (minus a bit) of being non-canonical. > > This canonicalization has several effects. > > First of all, it means that any interesting “canonical” addresses (if they > exist) has approximately a 65536x higher chance of being selected; or to put > it a different way, AFL will find any interesting “canonical” addresses > 65536x faster than otherwise. > > On the other hand, any interesting non-canonical addresses (or non-address > values) have a 50% lower chance of being selected; AFL will take twice as > long to find an interesting “non-canonical” / non-address values address as > it would otherwise. > > The other effect, at the moment, is to add another branch to fuzz-emulate.c. > This means that the very act of adding a select-able bit to canonicalize (or > not) a register adds another path that AFL will consider “interesting”; given > two otherwise-identical inputs, A and A’, where one is canonicalized and the > other not, AFL will consider them two different test cases, *even if they go > through the same paths in x86_emulate()*. It will keep them both in its > “queue", and will spend time mutating them both. > > For random fuzzing, even if x86_emulate() is the same, I don’t think this > will be wasted effort: 5 seconds fuzzing A and then 5 seconds fuzzing A’ will > be the same as 10 seconds of fuzzing A. It may skew AFL’s priorities a bit > by fuzzing something for 10 seconds rather than 5, but it still has a chance > of finding new data. > > For deterministic fuzzing, if x86_emulate() is the same, then there is > duplicated work: the 5 seconds spent fuzzing A’ will go through exactly the > same sets of mutations as the 5 seconds spent fuzzing A. > > (It may be possible to exclude fuzz-emul.c from instrumentation, in which > case this effect would go away.) > > Now, we have two possibilities: Either canonicity does not and never will > affect x86_emulate(); or it may, either now or later. > > And we have a few choices: We maybe-canonicalize no registers; we > maybe-canonicalize only registers that are almost always used as an address; > or we maybe-canonicalize more (maybe all) registers. > > Let us call these C0, C1, and C2 (for “canonicity matters, and we > canonicalize none, some, or most, respectively"), and N0, N1, and N2 (for > “canonicity doesn’t mater, and we canonicalize none, some, or most > respectively”). (And remember, N vs C is a fact about the world that we > don’t know; and 0-2 are options we can choose.) > > N0: Everything is the fastest: “interesting” non-address values are found the > fastest, AFL isn’t fooled into double-random-fuzzing some entries, AFL > doesn’t waste time double-deterministic-fuzzing entries. > > N1: “Interesting” non-address values take twice as long to find; but since > the registers in question are used almost exclusively for addresses, this is > less important. Canonicity has no effect, so it doesn’t matter whether we > choose canonical or non-canonical values for the registers. We do spend > twice as long random-fuzzing entries (for some benefit), and we twice as much > time deterministically fuzzing some entries to no benefit; but only when > fuzzing these address-mostly registers. > > N2: “Interesting” non-address values take twice as long to find. We spend > twice as long random-fuzzing entries (to some benefit), and twice as long > deterministically fuzzing some entries (to no benefit). > > C0: Finding “interesting” non-address values is as fast as N0. However, > finding “interesting” address values take 65536 times as long. Extra time > spent fuzzing finds useful differences. > > C1: “Interesting” non-address values for address-mostly registers take twice > as long to find; but “interesting” address-values for mostly-address > registers take about the same amount of time to find (i.e., 65536x faster > than C0). “Interesting” address values for other registers take 65536x times > as long to find. Again, extra fuzzing time is spent usefully. > > C2: “Interesting” non-address values for all registers take twice as long to > find; but “interesting” address values for all registers take that same > amount of time. Again, extra fuzzing time is spent usefully. > > If N is true, and we go with option 2, it will take on average 2 hours to > find an “interesting" value that would have taken 1 hour with option 0. > > If C is true, and we go with 0, it will take on average 65536 hours (7.48 > years) to find an “interesting” value that would have taken 1 hour in option > 2. > > I think the chances of C are less than 50%; but I think it's certainly higher > than 1 in 65536. > > So given that we don’t know whether we’re in case N or C, from a > cost/benefits analysis, I think option 1 is a minimum; and that 2 is an > obvious choice — particularly if we can get rid of the “wasted” fuzzing, > perhaps by not instrumenting fuzz-emul.c. > > Obviously there’s a lot of hand-waving and guessing here. My understanding > is that in practice, ebp is *mostly* used as an address and not a > general-purpose register, which is why I maybe-canonicalized it in the first > place, and would rather it remain so (option 1). But I can see the arguments > for classifying it as a general-purpose register; so as long as rip and rsp > are maybe-canonicalized, I won’t spend a lot of time arguing against this > patch. There is a selection bias here. It is true that, with -fno-omit-frame-pointer, %rbp is always an address, but we aren't considering compiler-crafted code here. To a first approximation, AFL will generate instruction fragments with a uniform probability of which of the 8 registers are encoded (or 16 for 64bit, but lets take the simple path here). Therefore, %rbp will be encoded with a 1/8th probability, as will %rsp, and %rax, etc. At this point, what matters is how many instructions use %rbp/%rsp as an implicit memory operand. %rsp is used by all stack operations, which is a large quantity of the encoding space. I can't think of any instruction which uses %rbp in this way. ENTER/LEAVE/PUSHA/POPA use/modify it, but only in its integer form - not as a memory address. ~Andrew _______________________________________________ Xen-devel mailing list Xen-devel@xxxxxxxxxxxxxxxxxxxx https://lists.xenproject.org/mailman/listinfo/xen-devel
|
Lists.xenproject.org is hosted with RackSpace, monitoring our |