
Re: Xen/Debian Bookworm: smartmontools/smartd silently stopped working after update





On Sat, Mar 16, 2024 at 7:16 PM zithro <slack@xxxxxxxxx> wrote:
On 31 Jan 2024 08:20, Paul Leiber wrote:
> Hi Xen users,
>
> just a short info, perhaps this is relevant for you as well: I noticed
> that smartmontools/smartd silently stopped working in my Debian Bookworm
> setup a couple of months ago, presumably after an update (I couldn't
> pinpoint the specific one; right now I'm on version 7.3-1+b1). Contrary to
> my expectation, I wouldn't have been notified via e-mail if there had
> been errors in the SMART logs. The cause is a new check in
> the smartd.service file, "ConditionVirtualization=no", which checks whether
> smartd is running in a VM. If it detects a VM, smartd doesn't start.
> There is a message in the logs if you look for it, but to look for it,
> you first have to have a reason to do so.
>
> Commenting out the line solves this issue.
>
> There is a discussion on GitHub on this:
>
> https://github.com/smartmontools/smartmontools/issues/62
>
> Best regards,
>
> Paul
>

Hi,

for completeness, this bug/feature affects dom0s but also any domU with
a passed-through disk controller, on any virtualization platform.
So if you use any systemd-based distro as a file server VM and expect to
use SMART, take care.
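
To check whether a given installation is affected, one option is to scan
the effective unit file for the condition. The following is only a minimal
sketch: the function name is hypothetical, and it assumes you feed it the
text printed by "systemctl cat smartd.service" (the unit name is taken from
the report above and may differ on your distro).

```python
def virtualization_conditions(unit_text):
    """Return the values of all active ConditionVirtualization= lines.

    unit_text is the plain text of a unit file, for example the output
    of `systemctl cat smartd.service` (assumed input, not read here).
    """
    values = []
    for raw in unit_text.splitlines():
        line = raw.strip()
        # Skip blank lines and commented-out settings.
        if not line or line.startswith(("#", ";")):
            continue
        key, sep, value = line.partition("=")
        if sep and key.strip() == "ConditionVirtualization":
            values.append(value.strip())
    return values


# Illustrative sample only, not the real Debian unit file.
sample = """[Unit]
Description=example unit
ConditionVirtualization=no
"""
print(virtualization_conditions(sample))
```

A non-empty result means the service may be skipped when systemd thinks it
is running virtualized. Rather than editing the packaged file, a drop-in
override is probably the more durable fix; as far as I know, systemd
documents that assigning an empty value to a condition setting resets the
list of conditions, but verify that against your systemd version.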

smartmontools has no business knowing whether it's running in a VM;
it should only care whether it's handling "real" disk controllers.
To me, the systemd "ConditionVirtualization" is misused here; the real
condition for such a service should be "ConditionHasDiskController".

There is also a related systemd bug: the executable
"systemd-detect-virt" (sdv) wrongly detects dom0 as a VM.
It exposes the same detection that "ConditionVirtualization" evaluates,
so the condition triggers whenever sdv reports a VM.
Sure, dom0 -is- a VM in Xen, but for the (supposed) purpose of sdv, it
simply shouldn't be detected as such.

Bug report on systemd's GitHub:
https://github.com/systemd/systemd/issues/28113

Debian BTS bug report:
https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1038901

I'm not aware of such problems in other init systems or services.

Thinking about this a bit more: if you'd passed a physical disk controller
through to a VM via PCI passthrough, you'd presumably also want the SMART
tools to be monitoring things. What they actually need is some kind of
"is-real-device <dev>" function; but "has-real-hardware" might be a
workable proxy, covering both dom0 systems and systems with passthrough,
if we could get a reliable test.
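
To make the "is-real-device <dev>" idea concrete, here is a rough sketch
based on where a block device sits in sysfs. Everything in it is an
assumption: the function names are hypothetical, and the path markers
("vbd-" for Xen PV disks, "virtio" for virtio disks, "virtual" for loop and
similar devices) reflect typical Linux sysfs layouts rather than any
guaranteed interface. It also cannot tell an emulated controller from a
physical one, so it is not the reliable test asked for above.

```python
import os

# Path components that usually indicate a paravirtual or purely
# virtual block device (assumed markers, not an official list).
VIRTUAL_MARKERS = ("virtio", "vbd-", "virtual")


def looks_like_real_device(sysfs_path):
    """Heuristic: True if the resolved sysfs path hangs off a PCI bus
    and contains no known virtual-device component."""
    parts = [p for p in sysfs_path.split("/") if p]
    if any(p.startswith(VIRTUAL_MARKERS) for p in parts):
        return False
    return any(p.startswith("pci") for p in parts)


def sysfs_path_for(dev_name):
    """Resolve /sys/block/<dev_name> to its real path under /sys/devices."""
    return os.path.realpath(os.path.join("/sys/block", dev_name))


# Example paths as they might appear on typical systems.
examples = [
    "/sys/devices/pci0000:00/0000:00:1f.2/ata1/host0/target0:0:0/0:0:0:0/block/sda",
    "/sys/devices/vbd-51712/block/xvda",
    "/sys/devices/pci0000:00/0000:00:04.0/virtio1/block/vda",
    "/sys/devices/virtual/block/loop0",
]
for path in examples:
    print(path, looks_like_real_device(path))
```

On a live system, looks_like_real_device(sysfs_path_for("sda")) would apply
the same check to an actual device. A passed-through controller in a domU
and a native controller in dom0 should both show up under a PCI path, which
is why this kind of test could serve both cases.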

 -George

 

