Xen project Mailing List

Hi,

I'm Felix Schmoll, one of the GSoC students this year. Go Xen!

To get started, I am posting an implementation proposal for the first part of the project for comments.

==================================

1. Motivation and Description

==================================

Fuzzing is a recent trend in systematic testing of interfaces: more or less random inputs are tried and then iterated on. A subset of fuzzers uses code coverage as feedback when mutating and choosing inputs, among them the popular user-space fuzzer American Fuzzy Lop (AFL). Recently there have been attempts to port fuzzers to the kernel, and the hypercall interface of Xen should now be tested in a similar manner.

While this is overall a very broad problem, this project will help to develop a better understanding of the problem space and take at least the first steps in the source tree in the necessary direction. A generic mechanism will be implemented that allows fuzzers to obtain feedback on code coverage. In the next step, this output will be processed further in order to actually run a particular fuzzer (such as AFL), although there might not be sufficient time to commit this to the source tree.

To sum up, the overall steps to getting a fuzzer running are the following:

1. Extracting the execution path from the hypervisor via a hypercall

2. Parsing the execution path into a format consumable by a user-space fuzzer

3. Driving a domU to execute the test cases of the fuzzer

This proposal is only concerned with how to extract the execution path.
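
Although step 2 is outside the scope of this proposal, a rough idea of what it involves may help when judging the trace format. The following is only a sketch, loosely modelled on the edge-hashing scheme described in AFL's documentation: it folds a sequence of PCs into a fixed-size hit-count map. The function name trace_to_bitmap, the 64-bit PC type and the way a PC is reduced to a map index are assumptions for illustration and not part of the proposal.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumed map size: 64 KiB, the default used by AFL. */
#define MAP_SIZE 65536u

/*
 * Hypothetical helper: fold a trace of program counters into an
 * AFL-style map of hit counts, indexed by (current ^ previous) location.
 */
static void trace_to_bitmap(const uint64_t *pcs, size_t n,
                            uint8_t map[MAP_SIZE])
{
    uint32_t prev = 0;

    memset(map, 0, MAP_SIZE);

    for (size_t i = 0; i < n; i++) {
        /* Reduce the PC to a location ID within the map (assumed mixing). */
        uint32_t cur = (uint32_t)((pcs[i] >> 4) ^ (pcs[i] << 8)) & (MAP_SIZE - 1);

        /* Count the transition prev -> cur; 8-bit counters wrap around. */
        map[cur ^ prev]++;

        /* Shift so that A -> B and B -> A land in different entries. */
        prev = cur >> 1;
    }
}
```

Because each entry is indexed by the current location XORed with the shifted previous one, the map records transitions between locations rather than just the locations themselves.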

==================================

2. Implementation Plan

==================================

2.1 Tracing

==================================

The gcc-6 -fsanitize-coverage=trace-pc feature will be the foundation for the tracing needed by the hypercall. It makes the compiler insert a call to a user-defined function, __sanitizer_cov_trace_pc(), into every basic block. By writing the current program counter to a buffer passed in from user space, this function allows very detailed tracing in the form of a sequence of program counters (PCs).

Care must also be taken that the returned execution path contains only executions related to the domain being traced and the hypercalls executed by it. Thus, only appropriate files will be compiled with the option and, for example, interrupts will be excluded.

==================================

2.1.1 Function content

==================================

The "struct domain" as defined in xen/include/xen/sched.h should be extended to include:

* a pointer to the trace buffer (NULL if domain is not traced)

* the next position to write to in the trace buffer

* size of the trace buffer

An alternative considered here was a global array holding the data relevant for tracing, but this would limit the number of domains.

Pseudo code:

/* Check if the current domain is being traced and, if so, write the program counter to the buffer. */
if (domain is traced && buffer not full) {
    current_domain->trace_buffer[current_domain->trace_buffer_pos++] =
        __builtin_return_address(0);
}
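
To make the pseudo code more concrete, here is a minimal user-space sketch of the three proposed fields and the callback. It is not Xen code: the struct is a stand-in for the real struct domain, current_domain stands in for however the hypervisor looks up the current domain, the field names and the uint64_t entry type are assumptions, and it ignores how guest memory would actually be accessed. The helper trace_record_pc is hypothetical and exists only so that the logic can be exercised without instrumentation.

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-in for the tracing-related part of struct domain (names assumed). */
struct domain {
    uint64_t *trace_buffer;      /* NULL if the domain is not traced */
    size_t    trace_buffer_pos;  /* next position to write to */
    size_t    trace_buffer_size; /* capacity of the buffer, in entries */
};

/* Stand-in for the hypervisor's notion of the current domain. */
static struct domain *current_domain;

/* Hypothetical helper: record one PC if the domain is traced and has room. */
static void trace_record_pc(struct domain *d, uint64_t pc)
{
    if (d == NULL || d->trace_buffer == NULL)
        return;                 /* domain is not being traced */

    if (d->trace_buffer_pos >= d->trace_buffer_size)
        return;                 /* buffer full: drop further PCs */

    d->trace_buffer[d->trace_buffer_pos++] = pc;
}

/* Called by gcc-instrumented code built with -fsanitize-coverage=trace-pc. */
void __sanitizer_cov_trace_pc(void)
{
    trace_record_pc(current_domain,
                    (uint64_t)(uintptr_t)__builtin_return_address(0));
}
```

Once the buffer is full, further PCs are silently dropped, which matches the "buffer not full" condition in the pseudo code.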

==================================

2.2 Hypercall-Interface

==================================

As stated in the preceding sections, a hypercall is needed to extract the execution path. The proposed interface is the following:

/**
 * @brief Traces the execution path of hypercalls executed by a domain.
 * @param domain_id Domain whose execution path is to be traced
 * @param buffer Buffer to write program counters to
 * @param size Size of the buffer
 * @param mode Whether to start or to stop tracing
 * @return Success or error in some form (e.g. number of PCs written on success)
 */
int trace_execution(int domain_id, int *buffer, int size, int mode);

This interface, together with the pseudo code above, implies that some program counters of this hypercall itself might be included in the buffer (if a domain traces itself, there will be edges between setting the buffer and returning to the kernel). For the purpose of fuzzing this doesn't matter as long as it is the same for all runs.
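
How the start/stop semantics could behave is sketched below as a user-space mock, not as hypervisor code. Several things are assumptions made for illustration only: the TRACE_START/TRACE_STOP constants, the error codes, returning the number of PCs on stop, and the small fixed domain table (which exists just to keep the mock self-contained). The sketch also uses uint64_t entries and a size_t length rather than the int types of the proposed prototype, since a 64-bit PC does not fit into an int.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_DOMAINS 4  /* mock only; not a limit suggested by the proposal */

/* Hypothetical values for the mode argument. */
enum trace_mode { TRACE_STOP = 0, TRACE_START = 1 };

/* Mock per-domain tracing state. */
struct mock_domain {
    uint64_t *trace_buffer; /* NULL if the domain is not traced */
    size_t    pos;          /* next position to write to */
    size_t    size;         /* capacity in entries */
};

static struct mock_domain domains[MAX_DOMAINS];

/* Mock handler: returns 0 on start, number of PCs on stop, -errno on error. */
static long trace_execution(int domain_id, uint64_t *buffer, size_t size,
                            int mode)
{
    if (domain_id < 0 || domain_id >= MAX_DOMAINS)
        return -EINVAL;

    struct mock_domain *d = &domains[domain_id];

    switch (mode) {
    case TRACE_START:
        if (buffer == NULL || size == 0)
            return -EINVAL;
        if (d->trace_buffer != NULL)
            return -EBUSY;          /* already being traced */
        d->trace_buffer = buffer;
        d->pos = 0;
        d->size = size;
        return 0;

    case TRACE_STOP: {
        if (d->trace_buffer == NULL)
            return -EINVAL;         /* nothing to stop */
        long written = (long)d->pos;
        d->trace_buffer = NULL;     /* callback stops recording from here on */
        d->pos = 0;
        d->size = 0;
        return written;
    }

    default:
        return -EINVAL;
    }
}
```

A fuzzer would call this once with TRACE_START before issuing the hypercall under test and once with TRACE_STOP afterwards, then read that many entries from its buffer.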

==================================

2.3 Adjustments to libxc

==================================

With this interface the only modification to libxc would be to add the new hypercall.

An alternative considered was an event notification system that informs the trace hypercall when a hypercall starts and ends. One could then change the interface to trace just the next hypercall instead of all hypercalls. This, however, involves changing the xencall functions and raises some questions about multiple hypercalls running at the same time. As long as the hypercall is only used for fuzzing a single hypercall at a time, the difference should be irrelevant.

==================================

2.4 Build

==================================

Inserting even a single instruction at every edge is rather costly if the feature is never going to be used. The tracing should thus be an optional build feature that has to be explicitly enabled.

As mentioned before, further adjustments to the build system are needed in order to compile only specific files with the option.

==================================

3. Expected Outcomes/Goals

==================================

This proposal outlines the steps for implementing coverage feedback. Overall the project aims to enable automated fuzzing of the hypervisor, which requires further steps as outlined in Section 1.

==================================

4. References

==================================

[1] Link to GSoC page of project: https://summerofcode.withgoogle.com/projects/#5585891117498368

[2] Link to originally suggested topic: https://wiki.xenproject.org/wiki/Outreach_Program_Projects#Fuzzing_Xen_hypercall_interface

Any comments appreciated,

Felix

[Xen-devel] [RFC] Proposal: Fuzzing the Hypervisor - Tracing