[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH v3 07/12] fuzz/x86_emulate: Move all state into fuzz_state



On 10/10/2017 07:20 PM, Andrew Cooper wrote:
> On 10/10/17 17:20, George Dunlap wrote:
>> This is in preparation for adding the option for a more "compact"
>> interpretation of the fuzzing data, in which we only change select
>> bits of the state.
>>
>> Signed-off-by: George Dunlap <george.dunlap@xxxxxxxxxx>
>> Acked-by: Jan Beulich <jbeulich@xxxxxxxx>
>> ---
>> v3:
>>  - Move DATA_OFFSET inside the structure
>>  - Remove a stray blank line
>> v2: Port over previous changes
>>
>> CC: Ian Jackson <ian.jackson@xxxxxxxxxx>
>> CC: Wei Liu <wei.liu2@xxxxxxxxxx>
>> CC: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>
>> CC: Jan Beulich <jbeulich@xxxxxxxx>
>> ---
>>  tools/fuzz/x86_instruction_emulator/fuzz-emul.c | 89 
>> +++++++++++++------------
>>  1 file changed, 45 insertions(+), 44 deletions(-)
>>
>> diff --git a/tools/fuzz/x86_instruction_emulator/fuzz-emul.c 
>> b/tools/fuzz/x86_instruction_emulator/fuzz-emul.c
>> index 8998f21fe1..20d52b33f8 100644
>> --- a/tools/fuzz/x86_instruction_emulator/fuzz-emul.c
>> +++ b/tools/fuzz/x86_instruction_emulator/fuzz-emul.c
>> @@ -24,14 +24,8 @@
>>  /* Layout of data expected as fuzzing input. */
>>  struct fuzz_corpus
>>  {
>> -    unsigned long cr[5];
>> -    uint64_t msr[MSR_INDEX_MAX];
>> -    struct cpu_user_regs regs;
>> -    struct segment_register segments[SEG_NUM];
>> -    unsigned long options;
>>      unsigned char data[4096];
>>  } input;
>> -#define DATA_OFFSET offsetof(struct fuzz_corpus, data)
>>  
>>  /*
>>   * Internal state of the fuzzing harness.  Calculated initially from the 
>> input
>> @@ -39,7 +33,14 @@ struct fuzz_corpus
>>   */
> 
> You've invalidated a number of the comments describing behaviours,
> including the description of the difference between fuzz_state and
> fuzz_corpus.

Well completely apart from the 'compact' format, I think this move makes
sense.  The state moved is actually the state of the "emulated cpu" --
the emulator actually modifies this state as instructions are executed.
I think it makes sense to keep the "current state of the virtual
processor" separate from "input we get from a file".

In fact, the comment above this says:  "Internal state of the fuzzing
harness.  Calculated initially from the input corpus, and later
[mutated] by the emulation callbacks."

This actually makes that comment *more* true, since before this patch
almost the only state modified by the emulation callbacks was actually
in fuzz_corpus, not fuzz_state.

> Why do you need to move any of this in the first place?  If you insist
> on keeping the compact mode, what's wrong with conditionally reading the
> AFL input into either fuzz_copus entirely, or into fuzz_corpus.data[]
> and then conditionally deriving this state?

I don't see any advantage of that.  In fact it would mean that the name
"input" and "fuzz_corpus" even more misleading than they are before this
patch.

> That way, you don't block work to fix the root cause, which needs to end
> up with architectural values in fuzz_state, derived from a bitfields in
> fuzz_corpus.

x86_emulate() needs a cpu_user_regs structure; so if you want the data
in fuzz_corpus not to look exactly like cpu_user_regs, then you'll need
to have another place to put cpu_user_regs, and "populate" it based on
the data (whatever that looks like).  The most obvious thing to do is to
is to do exactly what I've done -- place cpu_user_regs inside fuzz_state.

The same thing is true for the other bits of data -- the read_* and
write* callbacks need "unpacked" state which they can read and modify.
The most obvious thing to do is to have arrays in fuzz_state which they
can simply read and write, and to populate them based on whatever
structure "compact" structure you end up with.

If you want to re-introduce a more compact structure format for the
input file, there are lots of options.  We can make fuzz_corpus into a
union.  We can have 'input' be a pure char array, and cast it into
several different structures depending on how we want to interpret it.
Or, we can define a local structure on a stack and "read" from the main
input file via input_read() and friends.

This patch makes all of those options easier, not harder.

 -George

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
https://lists.xen.org/xen-devel

 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.