[MirageOS-devel] Systematic crash on create_bounce_frame when hitting specific data allocation threshold
Hi,
I'm running a unikernel on Xen that basically accesses a remote DB,
fetches and computes some data, and sends out the result. Apparently, if
I try to fetch and parse a JSON response larger than an empirically
found threshold (details at the bottom of the email), the PV Xen
unikernel just crashes, and this is what I see when running sudo
xl dmesg:
(XEN) Pagetable walk from 00000000002c9ff8:
(XEN) L4[0x000] = 00000010b5f67067 0000000000000567
(XEN) L3[0x000] = 00000010b5f68067 0000000000000568
(XEN) L2[0x001] = 00000010b5f6a067 000000000000056a
(XEN) L1[0x0c9] = 00100010b1ac9025 00000000000002c9
(XEN) domain_crash_sync called from entry.S: fault at ffff82d0802261be create_bounce_frame+0x66/0x13a
(XEN) Domain 23 (vcpu#0) crashed on cpu#17:
(XEN) ----[ Xen-4.6.0 x86_64 debug=n Not tainted ]----
(XEN) CPU: 17
(XEN) RIP: e033:[<0000000000258cf4>]
(XEN) RFLAGS: 0000000000010206 EM: 1 CONTEXT: pv guest (d23v0)
(XEN) rax: 0000000000258cf0 rbx: 0000000000000000 rcx: 0000000000000073
(XEN) rdx: 0000000000442528 rsi: 0000000000000000 rdi: 00000000002ca018
(XEN) rbp: 00000000002ca1e8 rsp: 00000000002ca000 r8: 0000000000000002
(XEN) r9: 0000000000000007 r10: 0000000000000007 r11: 0000000000000000
(XEN) r12: 00000000002ca118 r13: 0000000000000000 r14: 00000011238fa000
(XEN) r15: 0000000000000074 cr0: 0000000080050033 cr4: 00000000001526e0
(XEN) cr3: 00000010b5f66000 cr2: 00000000002c9ff8
(XEN) ds: 0000 es: 0000 fs: 0000 gs: 0000 ss: e02b cs: e033
(XEN) Guest stack trace from rsp=00000000002ca000:
(XEN) 00000000002ca118 0000000000000000 000000000025933f 0000000000000074
(XEN) 00000011238fa000 0000000000000000 00000000002ca118 00000000002ca1e8
(XEN) 0000000000000000 0000000000000000 0000000000000007 0000000000000007
(XEN) 0000000000000002 ffff800000000000 0000000000000073 0000000000442528
(XEN) 00000000002ca118 0000000000000000 ffffffffffffffff 0000000000256708
(XEN) 000000010000e030 0000000000010006 00000000002ca0c8 000000000000e02b
(XEN) 0000000000000ffc 3736353433323130 4645444342413938 4e4d4c4b4a494847
(XEN) 00000000002ca18b 00000000002ca1e8 00000000002ca18a 0000000000000074
(XEN) 00000000002566a0 00000000002ca118 00000000002561bc 7561662065676150
(XEN) 696c20746120746c 646461207261656e 3062642073736572 706972202c306433
(XEN) 2c38303736353220 3030207367657220 3030303030303030 202c383333616332
(XEN) 6533616332207073 735f72756f202c38 3030303030302070 3261633230303030
(XEN) 65646f63202c3866 ffffffff0a0d3020 0000000000000bfc 61665f686374614d
(XEN) 0200006572756c69 0000000000000073 0000000000000000 ffffffffffffffef
(XEN) 0000000000000000 00000000002ca2e8 0000000000000000 00000011238fa000
(XEN) 0000000000000074 00000000002ca338 000000000025630a 636f6c625f737953
(XEN) 0000003000000030 00000000002ca2e0 00000000002ca218 ffffffffffffffeb
(XEN) 0000000000db03d0 0000000000256708 00000000002ca338 00000000002ca3e8
(XEN) 00000000002ca2f8 ffffffffffffffe9 00000000000013fc 656e696665646e55
(XEN) 7372756365725f64 75646f6d5f657669 050000000000656c 00000000003df368
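If it helps with the diagnosis: several of those stack words appear to be little-endian ASCII. Decoded, they contain the guest's own console output ("Page fault at linear address db0...") as well as OCaml exception names such as Match_failure and Undefined_recursive_module. A throwaway snippet to decode them (decode_word is just a helper I wrote for this, not part of the unikernel):

```ocaml
(* Decode one 64-bit little-endian hex word from the dump above
   into its 8 ASCII bytes. *)
let decode_word hex =
  let v = Int64.of_string ("0x" ^ hex) in
  String.init 8 (fun i ->
      Char.chr
        (Int64.to_int
           (Int64.logand (Int64.shift_right_logical v (8 * i)) 0xffL)))

let () =
  (* Four consecutive words from the guest stack trace. *)
  [ "7561662065676150"; "696c20746120746c";
    "646461207261656e"; "3062642073736572" ]
  |> List.map decode_word
  |> String.concat ""
  |> print_endline
(* prints "Page fault at linear address db0" *)
```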
I've tried destroying and recreating the same unikernel multiple times,
and I always get the same error. When running on Unix I don't run
into this issue, even when fetching and parsing multiple MB of data.
By adding logs throughout my code, I figured out exactly where the
unikernel stops. Specifically, during the JSON response parsing (I'm
using the Yojson library):
let directExtraction rawJson =
  Log.info (fun f -> f "Initializing direct extraction");
  let json = Yojson.Basic.from_string rawJson in
  let result =
    [json] |> filter_member "results" |> flatten
           |> filter_member "series"  |> flatten
           |> filter_member "values"  |> flatten
  in
  List.map
    (fun item ->
       match item |> index 1 with
       | `String a -> a
       | `Float f -> string_of_float f
       | `Int i -> string_of_float (float_of_int i)
       | `Bool b -> string_of_bool b
       | _ -> "" (* nulls/nested values are not expected here *))
    result
  |> computeAverage >>= fun aver ->
  log_lwt ~inject:(fun f -> f "Result %f" aver)
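For what it's worth, here is how I plan to bisect the threshold more precisely: generate synthetic payloads with the same results/series/values nesting as the real response and feed them straight into Yojson.Basic.from_string on both the Unix and Xen backends (make_payload is a hypothetical generator, not my real DB output):

```ocaml
(* Hypothetical repro helper: build a synthetic JSON response with the
   same results/series/values nesting as the real one, n datapoints. *)
let make_payload n =
  let values =
    List.init n (fun i -> Printf.sprintf "[\"t%d\", %d]" i i)
    |> String.concat ","
  in
  Printf.sprintf "{\"results\":[{\"series\":[{\"values\":[%s]}]}]}" values

let () =
  (* Check the byte size around the reported crash threshold, then pass
     the string to Yojson.Basic.from_string on each backend. *)
  Printf.printf "3500 datapoints = %d bytes\n"
    (String.length (make_payload 3500))
```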
I know my code is probably not very optimized or clean, but I'm quite
shocked to see that the unikernel crashes when it has to extract
roughly 3,500 datapoints (that's more or less the threshold at which it
crashes). The function computeAverage is never even called. If I run
the same code on Unix, I can parse and process up to 1M datapoints in
less than a second. I've also tried increasing the number of vcpus and
the amount of memory (up to 16 vcpus and 4 GB of memory), but nothing
changed.
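One thing I still want to try, on the assumption that this is a guest stack overflow (in the dump, cr2 = 2c9ff8 sits just below rsp = 2ca000, and the unikernel's stack is presumably much smaller than a Unix process's): replace the non-tail-recursive List.map with the tail-recursive List.rev_map, which uses constant stack regardless of list length. A sketch of just that step (to_strings is a hypothetical extraction of the mapping above, taking the items already selected with index 1):

```ocaml
(* Hypothetical variant of the List.map step. List.map's stack usage
   grows with the list length; List.rev_map is tail-recursive and runs
   in constant stack. The reversed order should not matter when the
   result is only averaged. *)
let to_strings items =
  List.rev_map
    (fun item ->
       match item with
       | `String a -> a
       | `Float f -> string_of_float f
       | `Int i -> string_of_float (float_of_int i)
       | `Bool b -> string_of_bool b
       | _ -> "")
    items
```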
I would like to add that this threshold changes depending on the host
machine:
- Machine A (Ubuntu 14.04, Xen 4.6.0, 32 cores, 128 GB RAM, 10 Gb
network interface): threshold is around 107 KB
- Machine B (Debian 8.5, Xen 4.4.1, 4 cores, 8 GB RAM, 1 Gb network
interface): threshold is around 33 KB
_______________________________________________
MirageOS-devel mailing list
MirageOS-devel@xxxxxxxxxxxxxxxxxxxx
https://lists.xenproject.org/cgi-bin/mailman/listinfo/mirageos-devel