[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH v3 09/12] fuzz/x86_emulate: Make input more compact



> On Oct 10, 2017, at 6:31 PM, Andrew Cooper <Andrew.Cooper3@xxxxxxxxxx> wrote:
> 
> On 10/10/17 18:13, George Dunlap wrote:
>> On 10/10/2017 06:11 PM, Andrew Cooper wrote:
>>> On 10/10/17 18:01, George Dunlap wrote:
>>>> On 10/10/2017 05:59 PM, Andrew Cooper wrote:
>>>>> On 10/10/17 17:20, George Dunlap wrote:
>>>>>> At the moment, AFL reckons that for any given input, 87% of it is
>>>>>> completely irrelevant: that is, it can change it as much as it wants
>>>>>> but have no impact on the result of the test; and yet it can't remove
>>>>>> it.
>>>>>> 
>>>>>> This is largely because we interpret the blob handed to us as a large
>>>>>> struct, including CR values, MSR values, segment registers, and a full
>>>>>> cpu_user_regs.
>>>>>> 
>>>>>> Instead, modify our interpretation to have a "set state" stanza at the
>>>>>> front.  Begin by reading a 16-bit value; if it is lower than a certain
>>>>>> threshold, set some state according to what byte it is, and repeat.
>>>>>> Continue until the byte is above a certain threshold.
>>>>>> 
>>>>>> This allows AFL to compact any given test case much smaller; to the
>>>>>> point where now it reckons there is not a single byte of the test file
>>>>>> which becomes irrelevant.  Testing have shown that this option both
>>>>>> allows AFL to reach coverage much faster, and to have a total coverage
>>>>>> higher than with the old format.
>>>>>> 
>>>>>> Make this an option (rather than a unilateral change) to enable
>>>>>> side-by-side performance comparison of the old and new formats.
>>>>>> 
>>>>>> Signed-off-by: George Dunlap <george.dunlap@xxxxxxxxxx>
>>>>> I am still of the opinion that this is a waste of effort, which would be
>>>>> better spent actually removing the irrelevant state in the first place;
>>>>> not building an obfuscation algorithm.
>>>>> 
>>>>> I'm not going to nack the patch because that is probably over the top,
>>>>> but I'm not in favour if this change going in.
>>>> Did you look at the evidence I presented, demonstrating that this
>>>> significantly increases the effectiveness of AFL?
>>> I can easily believe that you've found an obfucation algorithm which
>>> does better than the current state layout.
>>> 
>>> I do not believe that any amount of obfuscation will be better than
>>> actually fixing the root cause of the problem; that the current state
>>> really is mostly irrelevant, and can easily be shrunk.
>> Right; well I've already explained why I don't think "obfuscation" is
>> the right term.
> 
> 

> How else would describe it?  You are purposefully hiding the structure
> of the data by doing conditional initialisation based on earlier values.

Consider the following progression:

1. Have a big structure, which includes RAX, and have the corpus be loaded into 
the whole thing (meaning we always fuzz the whole thing).

2. Have <state, value> tuples which will set specific state to specific values; 
e.g., <FUZZ_rax, 0xfefefefefefefefe>.  

3. Have <offset, value> tuples which write $value into a specific offset of the 
processor state (including RAX).

I would consider #3 a generalized form of #2.


>> For the time being, we have something which improves
>> efficiency;
> 
> How much of this measured efficiently is actually ALF finding paths
> around setup_state() rather than finding new paths in the emulator
> itself?  I can't think of a test which would isolate this properly.

Come now; we’re talking about thousands of unique “tuples” difference between 
“all paths found after 24 hours” for the compact and non-compact versions.  
(Like, 12k vs 8k.)  Do you really think there are 4000 unique tuples inside 
that one function?

We now have a way to generate gcov data from AFL test cases, so that’s a theory 
that can be tested now, if you care to do so.

> 
>> let's check it in now, and in the future if you or someone
>> else finds a way to fix it "properly" we can do that.
> 
> I wouldn't really mind so much if this change didn't make it harder to
> fix the root cause.  As it is, your prerequisite patch prohibits any
> ability to overlay a minimal structure over the fuzzing corpus.

I’d be interested to know what you mean by “a minimal structure” (and 
“architectural values in fuzz_state derived from bitfields in fuzz_corpus” in 
another email).

Consider the following three input formats.

1. We bits to fuzz_state->options for whether we should set the GPRs to fuzzed 
data or not.  If, for instance, the FUZZ_rax bit is set, then we read 8 bytes 
out of the fuzz state into RAX as part of the setup.

2. A format similar to what I have, but we have an explicit enumeration.  Read 
“fuzz_target” byte.  If fuzz_target == FUZZ_rax, then you read 64 bytes and 
write it into regs->rax.

3. The format I have: read an offset, write bytes into that offset.

Clearly there will be similarly ‘compact’ corpus files in all three that 
specify: “Set RAX to 0xfefefefefefefefe, then execute mov RAX, RBX”.  In all 
three formats AFL can efficiently mutate the corpus to setting RAX to 
0xffffffffffffffff, or RBX to 0xfefefefefefefefe, or running “mov RAX, RDX” 
instead.  

From a fuzzing perspective, I think #2 and #3 are almost identical.

One advantage of #2 is that it could in theory align more closely with AFL’s 
“deterministic” fuzzing steps: If the value that ends up being written into RAX 
always stored as an aligned 64-bit number inside the file, for example, then 
maybe AFL’s “arithmetic” and “special value” heuristics might be more effective 
at finding interesting testcases, rather than relying mainly on “havoc” finding 
those same testcases by random bit-mashing.

One *disadvantage* of #2 is that it will only fuzz the parts of the data 
structure you expect to be set.  #3 will set *all* parts of your cpu structure, 
even the parts you don’t think are relevant.  From a “finding bugs in the 
emulator” perspective, I think #3 is clearly superior.

See the response I gave to Jan about clearing the upper bits of GPRs in 16- or 
32-bit mode. Yes, in theory the upper bits should have no effect on the outcome 
of the emulation.  But if they *did* have an effect, we’d want to know, 
wouldn’t we?  Similarly, yes, in theory bits of the cpu structure that aren’t 
used shouldn’t have any effect on the emulation.  But if they *did*, we’d want 
to know: it would definitely be a bug, and we’d want to make sure that nobody 
could trigger that state 

I’m not sure if #2 is anything like you had in mind, but it wouldn’t actually 
be very difficult to morph what I’ve done here into it: we already have a 
“range” of “offset” values for MSRs, segment registers, and so on; all it would 
take is adding specifically-enumerated state into that list.

If that’s not your idea, all I can say is this: the primary way the more 
“compact” state increases effectiveness is by reducing file size.  Anything 
that doesn’t reduce file size will, I predict, be ineffective.

 -George

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
https://lists.xen.org/xen-devel

 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.