
Re: v5.4.289 failed to boot with error megasas_build_io_fusion 3219 sge_count (-12) is out of range


  • To: Harshvardhan Jha <harshvardhan.j.jha@xxxxxxxxxx>, Greg KH <gregkh@xxxxxxxxxxxxxxxxxxx>
  • From: Jürgen Groß <jgross@xxxxxxxx>
  • Date: Thu, 30 Jan 2025 13:35:12 +0100
  • Cc: Konrad Wilk <konrad.wilk@xxxxxxxxxx>, Boris Ostrovsky <boris.ostrovsky@xxxxxxxxxx>, "sstabellini@xxxxxxxxxx" <sstabellini@xxxxxxxxxx>, "xen-devel@xxxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxxx>, "linux-kernel@xxxxxxxxxxxxxxx" <linux-kernel@xxxxxxxxxxxxxxx>, Harshit Mogalapalli <harshit.m.mogalapalli@xxxxxxxxxx>, stable@xxxxxxxxxxxxxxx
  • Delivery-date: Thu, 30 Jan 2025 12:35:33 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

On 29.01.25 19:46, Harshvardhan Jha wrote:

On 30/01/25 12:13 AM, Jürgen Groß wrote:
On 29.01.25 19:35, Harshvardhan Jha wrote:

On 29/01/25 4:52 PM, Juergen Gross wrote:
On 29.01.25 10:15, Harshvardhan Jha wrote:

On 29/01/25 2:34 PM, Greg KH wrote:
On Wed, Jan 29, 2025 at 02:29:48PM +0530, Harshvardhan Jha wrote:
Hi Greg,

On 29/01/25 2:18 PM, Greg KH wrote:
On Wed, Jan 29, 2025 at 02:13:34PM +0530, Harshvardhan Jha wrote:
Hi there,

On 29/01/25 2:05 PM, Greg KH wrote:
On Wed, Jan 29, 2025 at 02:03:51PM +0530, Harshvardhan Jha wrote:
Hi All,

+stable

There seem to be some formatting issues in my log output. I have
attached it as a file.
Confused, what are you wanting us to do here in the stable tree?

thanks,

greg k-h
Since this is reproducible on 5.4.y, I have added stable. The culprit
commit, which fixes this issue when reverted, is also present in
5.4.y stable.
What culprit commit?  I see no information here :(

Remember, top-posting is evil...
My apologies,

The stable tag v5.4.289 seems to fail to boot with the following
prompt in an infinite loop:
[   24.427217] megaraid_sas 0000:65:00.0: megasas_build_io_fusion 3273 sge_count (-12) is out of range. Range is:  0-256

Reverting the following patch seems to fix the issue:

stable-5.4      : v5.4.285             - 5df29a445f3a xen/swiotlb: add alignment check for dma buffers

I tried changing the swiotlb grub command line arguments, but
unfortunately that didn't help and the error was seen again.

Ok, can you submit this revert with the information about why it
should not be included in the 5.4.y tree and cc: everyone involved and
then we will be glad to queue it up.

thanks,

greg k-h

This might be reproducible on other stable trees and mainline as well,
so we will get it fixed there and I will submit the necessary fix to
stable when everything is sorted out on mainline.

Right. Just reverting my patch will trade one error for another one
(the one which triggered me to write the patch).

There are two possible ways to fix the issue:

- allow larger DMA buffers in xen/swiotlb (today 2MB is the max.
    supported size, the megaraid_sas driver seems to effectively
    request 4MB)

This seems relatively simpler to implement but I'm not sure whether it's
the most optimal approach

Just enlarging the static array that holds the frame numbers for the
buffer seems to be a waste of memory for most configurations.
Yep definitely not required in most cases.

I'm thinking of an allocated array using the max needed size (replace a
former buffer with a larger one if needed).

This seems like the right way to go.
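
To make the idea above concrete, here is a minimal user-space sketch of
a frame-number buffer that is only replaced by a larger one when a
bigger request shows up. It is not the attached patch: the names
(frames, frames_order, ensure_frames, order_for_bytes) are hypothetical,
malloc()/free() stand in for whatever kernel allocator would be used,
and 4 KiB pages are assumed. Under that assumption a 2MB buffer needs
512 frame numbers (order 9) and a 4MB buffer needs 1024 (order 10).

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define PAGE_SHIFT 12               /* assumption: 4 KiB pages */

/* Hypothetical state: one frame-number buffer, grown only on demand. */
static unsigned long *frames;       /* NULL until first use */
static unsigned int frames_order;   /* capacity is 1 << frames_order entries */

/* Smallest page order whose size covers 'bytes'. */
static unsigned int order_for_bytes(size_t bytes)
{
    unsigned int order = 0;

    assert(bytes > 0);
    while (((size_t)1 << (order + PAGE_SHIFT)) < bytes)
        order++;
    return order;
}

/*
 * Make sure the buffer can hold 1 << order frame numbers.
 * An existing buffer that is already large enough is kept; otherwise it
 * is replaced by a larger one. Returns 0 on success, -1 on allocation
 * failure (the old buffer is left untouched in that case).
 */
static int ensure_frames(unsigned int order)
{
    unsigned long *new_buf;

    if (frames && order <= frames_order)
        return 0;

    new_buf = malloc(sizeof(*new_buf) << order);
    if (!new_buf)
        return -1;

    free(frames);                   /* old contents are not preserved */
    frames = new_buf;
    frames_order = order;
    return 0;
}
```

With this shape, configurations that never ask for more than 2MB keep
the small buffer, and only a system issuing a 4MB request pays for the
larger one.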

Can you try the attached patch, please? I don't have a system at hand
showing the problem.


Juergen

Attachment: 0001-x86-xen-allow-larger-contiguous-memory-regions-in-PV.patch
Description: Text Data



 

