[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Minios-devel] [UNIKRAFT PATCH 5/8] support/scripts: fetch trace data from running unikraft



Hi Yuri,

As a general comment, I don't get this split: putting scripts in
support/scripts/ and support/scripts/unikraft/. What's the rationale
behind it? Why aren't they all in support/scripts/unikraft/ ? Also,
having two scripts with the same name, trace.py, is misleading.

Moreover, I see that in another series (patch 'support/scripts:
reorganise tracing scripts') you rename these scripts, why didn't it
happen in the current patch series?

Please see my other comments inline.

On 5/10/19 9:29 PM, Yuri Volchkov wrote:
> Fetch tracebuffer, key-values and tracepoint definitions using objcopy
> and gdb.
> 
> This patch introduces uk-gdb.py script, which is a gdb-extension
> providing handy helpers. Currently it is only tracepoints related
> staff, but possibilities are going beyond that.

s/staff/stuff

> 
> The trace.py is a tool for analyzing traces. Currently it does only
> the basic staff: fetches tracing data from a running instance (can run

s/staff/stuff

> gdb for you) and prints out parsed information (in the next
> patch). But in potentially it will be able to do more. For example

"But it will potentially/eventually be able..."

> statistics of most frequently occurred events.
> 
> Signed-off-by: Yuri Volchkov <yuri.volchkov@xxxxxxxxx>
> ---
>  support/scripts/trace.py             | 104 +++++++++++++++++++++++
>  support/scripts/uk-gdb.py            | 122 +++++++++++++++++++++++++++
>  support/scripts/unikraft/__init__.py |   0
>  support/scripts/unikraft/trace.py    |  57 +++++++++++++
>  4 files changed, 283 insertions(+)
>  create mode 100755 support/scripts/trace.py
>  create mode 100644 support/scripts/uk-gdb.py
>  create mode 100644 support/scripts/unikraft/__init__.py
>  create mode 100644 support/scripts/unikraft/trace.py
> 
> diff --git a/support/scripts/trace.py b/support/scripts/trace.py
> new file mode 100755
> index 00000000..49b469a5
> --- /dev/null
> +++ b/support/scripts/trace.py
> @@ -0,0 +1,104 @@
> +#!/usr/bin/env python3
> +# SPDX-License-Identifier: BSD-3-Clause */
> +#
> +# Authors: Yuri Volchkov <yuri.volchkov@xxxxxxxxx>
> +#
> +# Copyright (c) 2019, NEC Laboratories Europe GmbH, NEC Corporation.
> +#
> +# Redistribution and use in source and binary forms, with or without
> +# modification, are permitted provided that the following conditions
> +# are met:
> +#
> +# 1. Redistributions of source code must retain the above copyright
> +#    notice, this list of conditions and the following disclaimer.
> +# 2. Redistributions in binary form must reproduce the above copyright
> +#    notice, this list of conditions and the following disclaimer in the
> +#    documentation and/or other materials provided with the distribution.
> +# 3. Neither the name of the copyright holder nor the names of its
> +#    contributors may be used to endorse or promote products derived from
> +#    this software without specific prior written permission.
> +#
> +# THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
> +# AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
> +# IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
> +# ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE
> +# LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
> +# CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
> +# SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
> +# INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
> +# CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
> +# ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
> +# POSSIBILITY OF SUCH DAMAGE.
> +#
> +# THIS HEADER MAY NOT BE EXTRACTED OR MODIFIED IN ANY WAY.
> +
> +import click

For the record, I'd prefer using argparse instead, it avoids adding
extra dependencies. The series introducing installation support make
things even worse, I prefer simplicity. But I know you really want this
so I'm not going to insist on this one.

> +import os, sys
> +import pickle
> +import subprocess
> +scripts_dir = os.path.dirname(os.path.realpath(__file__))
> +sys.path.append(scripts_dir)
> +
> +import unikraft.trace as trace
> +
> +@click.group()
> +def cli():
> +    pass
> +
> +def parse_tf(trace_file):
> +    try:
> +        with open(trace_file, 'rb') as tf:
> +            unpickler = pickle.Unpickler(tf)
> +
> +            keyvals = unpickler.load()
> +            elf = unpickler.load()
> +            ptr_size = unpickler.load()
> +            tp_defs = unpickler.load()
> +            trace_buff = unpickler.load()
> +    except EOFError:
> +        print("Unexpected end of trace file", file=sys.stderr)
> +        quit(-1)
> +    except Exception as inst:
> +        print("Problem occurred during reading the tracefile: %s" % 
> str(inst))
> +        quit(-1)
> +
> +    return trace.sample_parser(keyvals, tp_defs, trace_buff, ptr_size)
> +
> +@cli.command()
> +@click.argument('uk_img', type=click.Path(exists=True))
> +@click.option('--out', '-o', type=click.Path(),
> +              default='tracefile', show_default=True,
> +              help='Output binary file')
> +@click.option('--remote', '-r', type=click.STRING,
> +              default=':1234', show_default=True,
> +              help='How to connect to the gdb session '+
> +              '(parameters for "target remote" command)')
> +@click.option('--verbose', is_flag=True, default=False)
> +def fetch(uk_img, out, remote, verbose):
> +    """Fetch binary trace file from Unikraft (using gdb)"""
> +
> +    if os.path.exists(out):
> +        os.remove(out)
> +
> +    gdb_cmd = ['gdb', '-nh', '-batch',
> +               click.format_filename(uk_img),
> +               '-ex', 'target remote ' + remote,
> +               '-ex', 'source %s/uk-gdb.py' % scripts_dir,
> +               '-ex', 'uk trace save ' + out
> +    ]
> +
> +    proc = subprocess.Popen(gdb_cmd,
> +                            stdout=subprocess.PIPE,
> +                            stderr=subprocess.STDOUT
> +    )
> +    _stdout, _ = proc.communicate()
> +    _stdout = _stdout.decode()
> +    if proc.returncode or not os.path.exists(out):
> +        print(_stdout)
> +        sys.exit(1)
> +    if verbose:
> +        print(_stdout)
> +
> +
> +if __name__ == '__main__':
> +    cli()
> diff --git a/support/scripts/uk-gdb.py b/support/scripts/uk-gdb.py
> new file mode 100644
> index 00000000..6e9b3930
> --- /dev/null
> +++ b/support/scripts/uk-gdb.py
> @@ -0,0 +1,122 @@
> +#!/user/bin/env python3
> +# SPDX-License-Identifier: BSD-3-Clause */
> +#
> +# Authors: Yuri Volchkov <yuri.volchkov@xxxxxxxxx>
> +#
> +# Copyright (c) 2019, NEC Laboratories Europe GmbH, NEC Corporation.
> +#
> +# Redistribution and use in source and binary forms, with or without
> +# modification, are permitted provided that the following conditions
> +# are met:
> +#
> +# 1. Redistributions of source code must retain the above copyright
> +#    notice, this list of conditions and the following disclaimer.
> +# 2. Redistributions in binary form must reproduce the above copyright
> +#    notice, this list of conditions and the following disclaimer in the
> +#    documentation and/or other materials provided with the distribution.
> +# 3. Neither the name of the copyright holder nor the names of its
> +#    contributors may be used to endorse or promote products derived from
> +#    this software without specific prior written permission.
> +#
> +# THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
> +# AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
> +# IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
> +# ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE
> +# LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
> +# CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
> +# SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
> +# INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
> +# CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
> +# ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
> +# POSSIBILITY OF SUCH DAMAGE.
> +#
> +# THIS HEADER MAY NOT BE EXTRACTED OR MODIFIED IN ANY WAY.
> +
> +import gdb
> +import pickle
> +import os, sys
> +import tempfile, shutil
> +
> +scripts_dir = os.path.dirname(os.path.realpath(__file__))
> +sys.path.append(scripts_dir)
> +
> +import unikraft.trace as trace
> +
> +type_char = gdb.lookup_type('char')
> +type_void = gdb.lookup_type('void')
> +
> +PTR_SIZE = type_void.pointer().sizeof
> +
> +def get_trace_buffer():
> +    inf = gdb.selected_inferior()
> +
> +    try:
> +        trace_buff = gdb.parse_and_eval('uk_trace_buffer')
> +        trace_buff_size = trace_buff.type.sizeof
> +        trace_buff_addr = int(trace_buff.address)
> +        trace_buff_writep = int(gdb.parse_and_eval('uk_trace_buffer_writep'))
> +    except gdb.error:
> +        gdb.write("Error getting the trace buffer. Is tracing enabled?\n")
> +        raise gdb.error
> +
> +    if (trace_buff_writep == 0):
> +        # This can happen as effect of compile optimization if none of
> +        # tracepoints were called
> +        used = 0
> +    else:
> +        used = trace_buff_writep - trace_buff_addr
> +
> +    return bytes(inf.read_memory(trace_buff_addr, used))
> +
> +def save_traces(out):
> +    elf = gdb.current_progspace().filename
> +
> +    pickler = pickle.Pickler(out)
> +
> +    # keyvals need to go first, because they have format_version
> +    # key. Even if the format is changed we must guarantee that at
> +    # least keyvals are always stored first. However, ideally, next
> +    # versions should just have modifications at the very end to keep
> +    # compatibility with previously collected data.
> +    pickler.dump(trace.get_keyvals(elf))
> +    pickler.dump(elf)
> +    pickler.dump(PTR_SIZE)
> +    # We are saving raw trace buffer here. Another option is to pickle
> +    # already parsed samples. But in the chosen case it is a lot
> +    # easier to debug the parser, because python in gdb is not very
> +    # convenient for development.
> +    pickler.dump(trace.get_tp_sections(elf))
> +    pickler.dump(get_trace_buffer())
> +

I think this is not the best design. The serialization for traces is
here, while the deserialization is in support/scripts/trace.py . It's
very easy to make a mistake when adding more tracing information. I'd
prefer the serialization and deserialization to stay together.

> +class uk(gdb.Command):
> +    def __init__(self):
> +        gdb.Command.__init__(self, 'uk',
> +                             gdb.COMMAND_USER, gdb.COMPLETE_COMMAND, True)
> +
> +class uk_trace(gdb.Command):
> +    def __init__(self):
> +        gdb.Command.__init__(self, 'uk trace', gdb.COMMAND_USER,
> +                             gdb.COMPLETE_COMMAND, True)
> +    def invoke(self, arg, from_tty):
> +        # TODO
> +        pass
> +
> +class uk_trace_save(gdb.Command):
> +    def __init__(self):
> +        gdb.Command.__init__(self, 'uk trace save', gdb.COMMAND_USER,
> +                             gdb.COMPLETE_COMMAND)

If there will be a new version, I prefer these initializers (I mean
__init__ calls) to have the same style (e.g. like the first one). It
improves the readability and it eases my OCD.

> +    def invoke(self, arg, from_tty):
> +        if not arg:
> +            gdb.write('Missing argument. Usage: uk trace save <filename>\n')
> +            return
> +
> +        gdb.write('Saving traces to %s ...\n' % arg)
> +
> +        with tempfile.NamedTemporaryFile() as out:
> +            save_traces(out)
> +            out.flush()
> +            shutil.copyfile(out.name, arg)
> +
> +uk()
> +uk_trace()
> +uk_trace_save()
> diff --git a/support/scripts/unikraft/__init__.py 
> b/support/scripts/unikraft/__init__.py
> new file mode 100644
> index 00000000..e69de29b
> diff --git a/support/scripts/unikraft/trace.py 
> b/support/scripts/unikraft/trace.py
> new file mode 100644
> index 00000000..407f47e3
> --- /dev/null
> +++ b/support/scripts/unikraft/trace.py
> @@ -0,0 +1,57 @@
> +# SPDX-License-Identifier: BSD-3-Clause */
> +#
> +# Authors: Yuri Volchkov <yuri.volchkov@xxxxxxxxx>
> +#
> +# Copyright (c) 2019, NEC Laboratories Europe GmbH, NEC Corporation.
> +#
> +# Redistribution and use in source and binary forms, with or without
> +# modification, are permitted provided that the following conditions
> +# are met:
> +#
> +# 1. Redistributions of source code must retain the above copyright
> +#    notice, this list of conditions and the following disclaimer.
> +# 2. Redistributions in binary form must reproduce the above copyright
> +#    notice, this list of conditions and the following disclaimer in the
> +#    documentation and/or other materials provided with the distribution.
> +# 3. Neither the name of the copyright holder nor the names of its
> +#    contributors may be used to endorse or promote products derived from
> +#    this software without specific prior written permission.
> +#
> +# THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
> +# AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
> +# IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
> +# ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE
> +# LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
> +# CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
> +# SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
> +# INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
> +# CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
> +# ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
> +# POSSIBILITY OF SUCH DAMAGE.
> +#
> +# THIS HEADER MAY NOT BE EXTRACTED OR MODIFIED IN ANY WAY.
> +
> +import subprocess
> +import re
> +import tempfile
> +
> +def get_tp_sections(elf):
> +    f = tempfile.NamedTemporaryFile()
> +    objcopy_cmd  = 'objcopy -O binary %s ' % elf
> +    objcopy_cmd += '--only-section=.uk_tracepoints_list ' + f.name
> +    objcopy_cmd = objcopy_cmd.split()
> +    subprocess.check_call(objcopy_cmd)
> +    return f.read()
> +
> +def get_keyvals(elf):
> +    readelf_cmd = 'readelf -p .uk_trace_keyvals %s' % elf
> +    readelf_cmd = readelf_cmd.split()
> +    raw_data = subprocess.check_output(readelf_cmd).decode()
> +    filtered = re.findall(r'^\s*\[ *\d+\]\s+(.+) = (.+)$', raw_data,
> +                          re.MULTILINE)
> +
> +    ret = dict()
> +    for key,val in filtered:
> +        ret[key] = val
> +
> +    return ret
> 

_______________________________________________
Minios-devel mailing list
Minios-devel@xxxxxxxxxxxxxxxxxxxx
https://lists.xenproject.org/mailman/listinfo/minios-devel

 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.