
Re: [Xen-devel] Notes from Design Summit Hypervisor Fuzzing Session



2017-07-21 15:15 GMT+02:00 Lars Kurth <lars.kurth.xen@gmail.com>:
Hi all,
Please find my notes attached. A lot of it went over my head, so I may have gotten things wrong and some points are missing.
Feel free to modify, chip in, or clarify as needed.
Lars

Session URL: http://sched.co/AjHN

OPTION 1: Userspace Approach
============================

 Dom0  DomU
[AFL] [VM nested with Xen and XTF]
[Xen                             ]

Would need
1. nested HVM support
2. VM forking

As in the classic AFL scenario, one could feed test cases to the forked VMs and always have a completely identical environment for each test case. The VMs would run on top of a hypervisor and each would in turn contain a hypervisor (thus nested) against which the test cases are executed.


Not an option, because both features are either not implemented or broken (and are difficult to implement).

OPTION 2:
=========

The alternative is to reuse the same hypervisor for the test cases. This is the approach currently taken in my GSoC project.

A drawback in comparison to option 1 is that the environment will be "dirty" from previous test cases, as the same hypervisor is reused. That might make it harder to isolate bugs.
 

 Dom0            DomU
[AFL   ]        [VM XTF   ]
[      ] <----> [  [e]    ] e = executor
   /\              ||
   ||              \/
[Xen                      ]

This approach would need

1. Tracing (instrument binary and write to shared memory for AFL)

Almost done, but not completely deterministic yet: 

The decision whether to compile code with tracing is made at file level, and certain files (e.g. locks) still have to be excluded.
 

2. Implement a special hypercall whose return value can be converted into the branching info AFL expects

Submitted

This is the same as 1.
 

3. Communication channel between AFL and XTF

Needed to transmit test cases from AFL to XTF and for synchronisation.
 

Almost done

4. Using XTF because it should be the fastest option and allows us to restrict the scope of what to fuzz

Key challenge: not making unnecessary non-deterministic hypercalls in the background
Use of XTF constrains the degrees of freedom and focusses the fuzzing

5. Need some way to feed info back into AFL

The original plan was to make the tracing hypercall in XTF and to use the communication channel from 3. to transmit the info to AFL. Instead, the tracing hypercall is now made in AFL. This is faster and more convenient, as the hypercall already allows tracing a different domain. The only drawback is that one might also trace parts of the synchronisation between the domains, but this was considered negligible, at least for the initial version.
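
For readers unfamiliar with what the traced branching info is folded into: below is a minimal sketch of the edge-coverage map that AFL documents for its own instrumentation. It is not the code from the Xen patches. The function name and the idea of passing in a plain list of basic-block IDs are assumptions made for illustration; in the real setup the IDs would come from the tracing hypercall.

```python
MAP_SIZE = 1 << 16  # AFL's default coverage map size (64 KiB)


def record_trace(block_ids, trace_map=None):
    """Fold a sequence of basic-block IDs into an AFL-style edge map.

    block_ids is a hypothetical input format: one integer per executed
    basic block, in execution order.
    """
    if trace_map is None:
        trace_map = bytearray(MAP_SIZE)
    prev = 0
    for cur in block_ids:
        cur %= MAP_SIZE
        idx = cur ^ prev
        # Each slot is an 8-bit hit counter that wraps around.
        trace_map[idx] = (trace_map[idx] + 1) & 0xFF
        # Shift so that the edges A->B and B->A land in different slots.
        prev = cur >> 1
    return trace_map
```

Because the map records edges rather than blocks, any extra code traced during synchronisation shows up as additional slots, which is why the drawback mentioned above matters for determinism.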


I believe there was some discussion around this, which I did not get

Discussion
==========

Dismissed Option 1. All agreed that Option 2 is best.

I missed quite a bit of this, because the discussion was quite fast at times

George:
recommends testing one thing at a time to reduce the problem space
Such as iteration, feedback, ...
Then iterate based on the outcome

There was a little bit of discussion around determinism:

Andy: blacklist schedop_??? with ??? = shutdown, suspend, watchdog, ...
(schedop = the sched_op hypercall)

Also desched
Possibly there are some more functions that need to be blacklisted
This should help with determinism

Certain hypercalls will cause Xen to destroy the XTF domain, which would end the fuzzing, but not crash the hypervisor, so these hypercalls shouldn't be allowed.
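
A minimal sketch of such a blacklist, applied on the AFL side before a test case is sent to the executor. The (hypercall number, argument list) encoding of a call is a hypothetical format invented for this example, and the numeric constants are the values as I recall them from Xen's public headers (xen.h and sched.h); please check them against your tree. Suspend is covered here via SCHEDOP_shutdown, since it is a shutdown reason rather than a separate sub-op.

```python
# Assumed values from Xen's public headers; verify before relying on them.
HYPERVISOR_SCHED_OP = 29   # __HYPERVISOR_sched_op
SCHEDOP_SHUTDOWN = 2       # also used for suspend (a shutdown reason)
SCHEDOP_WATCHDOG = 6

BLACKLISTED_SCHEDOPS = {SCHEDOP_SHUTDOWN, SCHEDOP_WATCHDOG}


def is_allowed(call):
    """Return False for calls that would tear down the XTF domain.

    call is a hypothetical (hypercall_nr, args) tuple, where args[0]
    is the sub-op for multiplexed hypercalls such as sched_op.
    """
    nr, args = call
    if nr == HYPERVISOR_SCHED_OP and args and args[0] in BLACKLISTED_SCHEDOPS:
        return False
    return True


def filter_test_case(calls):
    """Drop blacklisted calls from a test case, keeping the rest in order."""
    return [c for c in calls if is_allowed(c)]
```

Further entries (for example whatever "desched" refers to) would be added to the set once they are identified.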
 
Ian: To back up test cases one could use a network connection, as that should be faster than writing to disk.

This approach was dropped in favour of just flushing the disk (easier to implement).
 
Andy: Going to have problems such as dealing with partial hypercall operations
Wei: Already included this - only 1 thread in XTF => deterministic
Andy: What happens if the HV gets interrupted?
Juergen: put XTF into null scheduler pool to minimise risk of interrupts and increase determinism
Wei: That would exclude IRQs in such a scenario

There was a little bit of discussion around the feedback loop and the protocol between AFL and XTF

Andy: easiest way to get a feedback loop started: XTF boots, then waits on an event channel (schedop call with 0 timeout)
AFL does the hypercall with edge tracing, ...

Juergen: measurement can be started from AFL (Dom0) and disabled from XTF (DomU)
Wei: follow the same pattern as xl already does (I don't know the sample code though)
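
To make the ordering of that loop concrete, here is a small simulation in which two threads and two queues stand in for the domains and the event channel. Nothing in it talks to Xen: the Tracer class, demo_case, and the queue handshake are all hypothetical stand-ins. It only shows the sequence discussed above: the executor blocks until a test case arrives, tracing is started on the AFL side and stopped on the executor side, and AFL reads the result afterwards.

```python
import queue
import threading


class Tracer:
    """Hypothetical stand-in for the tracing hypercall."""

    def __init__(self):
        self.enabled = False
        self.events = []

    def start(self):
        self.events = []
        self.enabled = True

    def stop(self):
        self.enabled = False

    def hit(self, edge):
        if self.enabled:
            self.events.append(edge)


def demo_case(case, tracer):
    """Toy 'hypercall': every input byte counts as one traced edge."""
    for b in case:
        tracer.hit(b)


def executor(inbox, done, run_case, tracer):
    """Stand-in for the XTF executor (DomU side)."""
    while True:
        case = inbox.get()      # analogous to blocking on an event channel
        if case is None:        # sentinel: stop fuzzing
            break
        run_case(case, tracer)
        tracer.stop()           # measurement disabled from the executor side
        done.put(case)          # analogous to notifying Dom0


def fuzz_loop(cases, run_case=demo_case):
    """Stand-in for the AFL side (Dom0)."""
    tracer = Tracer()
    inbox, done = queue.Queue(), queue.Queue()
    worker = threading.Thread(target=executor,
                              args=(inbox, done, run_case, tracer))
    worker.start()
    results = []
    for case in cases:
        tracer.start()          # measurement started from the AFL side
        inbox.put(case)
        done.get()              # wait until the executor has finished
        results.append((case, list(tracer.events)))
    inbox.put(None)
    worker.join()
    return results
```

Calling fuzz_loop([b"ab", b"c"]) runs the two toy cases one after the other and returns each case together with the edges recorded while it ran.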

There was a bit of discussion on the impact of QEMU

Wei: can't use QEMU to emulate a machine with vhdx (following on from a question by Ian)

Ian: this will be fast, not quite so reliable. But a good first step

The point here was stability. AFL expects the same hypercall to return the same tracing result every time it is executed. There is some indeterminism introduced due to the synchronisation and the fact that the same hypervisor is reused (see option 1 for a different approach), but that should initially be neglected.
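
One way to put a number on this is to run the same test case several times and compare the resulting coverage maps. The sketch below is an illustration under my own assumptions, not AFL's exact calculation (AFL reports a comparable stability figure of its own). The function name and the input format, a list of equally sized byte maps, are hypothetical.

```python
def stability(traces):
    """Fraction of touched map entries with identical hit counts in every run.

    traces is a hypothetical input: one coverage map (bytes or bytearray)
    per repeated run of the same test case, all of the same length.
    """
    size = len(traces[0])
    # Only entries hit in at least one run are of interest.
    touched = [i for i in range(size) if any(t[i] for t in traces)]
    if not touched:
        return 1.0
    stable = sum(1 for i in touched if len({t[i] for t in traces}) == 1)
    return stable / len(touched)
```

A value of 1.0 means every run produced the same map. Lower values point at non-determinism, for example from the synchronisation code or from state left behind by earlier test cases.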
 

And some other topics

Andy: there is also syzkaller, with fuzzing entity being some userspace calls
Wei: used as a reference material as Oracle did something similar

For further info one can also check out the mails between me and Wei in the mailing list archive.

 

