Xen project Mailing List

Re: [PATCH] xen/console: do not drop serial output from the hardware domain

From: Roger Pau Monné <roger.pau@xxxxxxxxxx>

Date: Tue, 14 Jun 2022 11:38:28 +0200

Cc: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>, George Dunlap <george.dunlap@xxxxxxxxxx>, Julien Grall <julien@xxxxxxx>, Stefano Stabellini <sstabellini@xxxxxxxxxx>, Wei Liu <wl@xxxxxxx>, xen-devel@xxxxxxxxxxxxxxxxxxxx

Delivery-date: Tue, 14 Jun 2022 09:38:47 +0000

List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

On Tue, Jun 14, 2022 at 11:13:07AM +0200, Jan Beulich wrote:
> On 14.06.2022 10:32, Roger Pau Monné wrote:
> > On Tue, Jun 14, 2022 at 10:10:03AM +0200, Jan Beulich wrote:
> >> On 14.06.2022 08:52, Roger Pau Monné wrote:
> >>> On Mon, Jun 13, 2022 at 03:56:54PM +0200, Jan Beulich wrote:
> >>>> On 13.06.2022 14:32, Roger Pau Monné wrote:
> >>>>> On Mon, Jun 13, 2022 at 11:18:49AM +0200, Jan Beulich wrote:
> >>>>>> On 13.06.2022 11:04, Roger Pau Monné wrote:
> >>>>>>> On Mon, Jun 13, 2022 at 10:29:43AM +0200, Jan Beulich wrote:
> >>>>>>>> On 13.06.2022 10:21, Roger Pau Monné wrote:
> >>>>>>>>> On Mon, Jun 13, 2022 at 09:30:06AM +0200, Jan Beulich wrote:
> >>>>>>>>>> On 10.06.2022 17:06, Roger Pau Monne wrote:
> >>>>>>>>>>> Prevent dropping console output from the hardware domain, since it's
> >>>>>>>>>>> likely important to have all the output if the boot fails without
> >>>>>>>>>>> having to resort to sync_console (which also affects the output from
> >>>>>>>>>>> other guests).
> >>>>>>>>>>>
> >>>>>>>>>>> Do so by pairing the console_serial_puts() with
> >>>>>>>>>>> serial_{start,end}_log_everything(), so that no output is dropped.
> >>>>>>>>>>
> >>>>>>>>>> While I can see the goal, why would Dom0 output be (effectively) more
> >>>>>>>>>> important than Xen's own one (which isn't "forced")? And with this
> >>>>>>>>>> aiming at boot output only, wouldn't you want to stop the overriding
> >>>>>>>>>> once boot has completed (of which, if I'm not mistaken, we don't
> >>>>>>>>>> really have any signal coming from Dom0)? And even during boot I'm
> >>>>>>>>>> not convinced we'd want to let through everything, but perhaps just
> >>>>>>>>>> Dom0's kernel messages?
> >>>>>>>>>
> >>>>>>>>> I normally use sync_console on all the boxes I'm doing dev work, so
> >>>>>>>>> this request is something that came up internally.
> >>>>>>>>>
> >>>>>>>>> Didn't realize Xen output wasn't forced, since we already have rate
> >>>>>>>>> limiting based on log levels I was assuming that non-ratelimited
> >>>>>>>>> messages wouldn't be dropped. But yes, I agree that Xen (non-guest
> >>>>>>>>> triggered) output shouldn't be rate limited either.
> >>>>>>>>
> >>>>>>>> Which would raise the question of why we have log levels for non-guest
> >>>>>>>> messages.
> >>>>>>>
> >>>>>>> Hm, maybe I'm confused, but I don't see a direct relation between log
> >>>>>>> levels and rate limiting. If I set log level to WARNING I would
> >>>>>>> expect to not lose _any_ non-guest log messages with level WARNING or
> >>>>>>> above. It's still useful to have log levels for non-guest messages,
> >>>>>>> since user might want to filter out DEBUG non-guest messages for
> >>>>>>> example.
> >>>>>>
> >>>>>> It was me who was confused, because of the two log-everything variants
> >>>>>> we have (console and serial). You're right that your change is unrelated
> >>>>>> to log levels. However, when there are e.g. many warnings or when an
> >>>>>> admin has lowered the log level, what you (would) do is effectively
> >>>>>> force sync_console mode transiently (for a subset of messages, but
> >>>>>> that's secondary, especially because the "forced" output would still
> >>>>>> be waiting for earlier output to make it out).
> >>>>>
> >>>>> Right, it would have to wait for any previous output on the buffer to
> >>>>> go out first. In any case we can guarantee that no more output will
> >>>>> be added to the buffer while Xen waits for it to be flushed.
> >>>>>
> >>>>> So for the hardware domain it might make sense to wait for the TX
> >>>>> buffers to be half empty (the current tx_quench logic) by preempting
> >>>>> the hypercall. That however could cause issues if guests manage to
> >>>>> keep filling the buffer while the hardware domain is being preempted.
> >>>>>
> >>>>> Alternatively we could always reserve half of the buffer for the
> >>>>> hardware domain, and allow it to be preempted while waiting for space
> >>>>> (since it's guaranteed non hardware domains won't be able to steal the
> >>>>> allocation from the hardware domain).
> >>>>
> >>>> Getting complicated it seems. I have to admit that I wonder whether we
> >>>> wouldn't be better off leaving the current logic as is.
> >>>
> >>> Another possible solution (more like a band aid) is to increase the
> >>> buffer size from 4 pages to 8 or 16. That would likely allow to cope
> >>> fine with the high throughput of boot messages.
> >>
> >> You mean the buffer whose size is controlled by serial_tx_buffer?
> >
> > Yes.
> >
> >> On
> >> large systems one may want to simply make use of the command line
> >> option then; I don't think the built-in default needs changing. Or
> >> if so, then perhaps not statically at build time, but taking into
> >> account system properties (like CPU count).
> >
> > So how about we use:
> >
> > min(16384, ROUNDUP(1024 * num_possible_cpus(), 4096))
>
> That would _reduce_ size on small systems, wouldn't it? Originally
> you were after increasing the default size. But if you had meant
> max(), then I'd fear on very large systems this may grow a little
> too large.

See previous followup about my mistake of using min() instead of
max().

On a system with 512 CPUs that would be 512KB, I don't think that's a
lot of memory, especially taking into account that a system with 512
CPUs should have a matching amount of memory I would expect.

It's true however that I very much doubt we would fill a 512K buffer,
so limiting to 64K might be a sensible starting point?

> > Maybe we should also take CPU frequency into account, but that seems
> > too complex for the purpose.
>
> Why would frequency matter? Other aspects I could see mattering is
> node count and maybe memory size.

Higher frequency likely means faster boot, and faster buffer fill,
because the baudrate of the console is constant.

Thanks, Roger.
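
For illustration only, the sizing rule discussed above (the max() form, with a 64K cap) could be sketched as below. This is a standalone sketch, not code from the patch or from Xen: tx_buf_size(), drain_ms(), TX_PAGE_SIZE and the local ROUNDUP macro are hypothetical names, the 16384-byte floor is the current 4-page default mentioned in the thread, and the 10-bits-per-byte framing (8N1) in drain_ms() is an assumption about the serial line.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins; Xen has its own PAGE_SIZE and ROUNDUP. */
#define TX_PAGE_SIZE 4096u
#define ROUNDUP(x, a) ((((x) + (a) - 1) / (a)) * (a))

/*
 * Scale the TX buffer with the CPU count (1KiB per CPU, rounded up to a
 * page), never going below 'floor' (the current default) nor above 'cap'.
 */
static size_t tx_buf_size(unsigned int nr_cpus, size_t floor, size_t cap)
{
    size_t scaled;

    assert(nr_cpus > 0 && floor <= cap);

    scaled = ROUNDUP((size_t)1024 * nr_cpus, (size_t)TX_PAGE_SIZE);
    if (scaled < floor)
        scaled = floor;   /* the max() part: small systems keep the default */
    if (scaled > cap)
        scaled = cap;     /* the proposed upper limit */

    return scaled;
}

/*
 * Rough time, in milliseconds, to drain 'bytes' at a fixed baud rate,
 * assuming 10 bits on the wire per byte (8N1 framing, an assumption).
 */
static unsigned long long drain_ms(size_t bytes, unsigned int baud)
{
    assert(baud > 0);
    return (unsigned long long)bytes * 10ULL * 1000ULL / baud;
}
```

With a floor of 16384 and a cap of 65536, a 1-CPU system keeps the 16K default and a 512-CPU system is limited to 64K instead of 512K. drain_ms() only makes the constant-baudrate point concrete: the drain time depends on the buffer size and line speed alone, however fast the CPUs fill the buffer.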
