
Re: [Xen-devel] Re: [Patch RFC] ttm: nouveau accelerated on Xen pv-ops kernel



On Tue, Mar 23, 2010 at 2:44 AM, Michael D Labriola <mlabriol@xxxxxxxx> wrote:
> xen-devel-bounces@xxxxxxxxxxxxxxxxxxx wrote on 03/20/2010 02:01:54 AM:
>
>> On Fri, Mar 19, 2010 at 8:59 PM, Michael D Labriola <mlabriol@xxxxxxxx>
> wrote:
>> > xen-devel-bounces@xxxxxxxxxxxxxxxxxxx wrote on 03/18/2010 02:09:08 AM:
>> >
>> >> On Wed, Mar 17, 2010 at 1:09 AM, Michael D Labriola
> <mlabriol@xxxxxxxx>
>> > wrote:
>> >> > Konrad Rzeszutek Wilk <konrad.wilk@xxxxxxxxxx> wrote on 03/16/2010
>> >> > 01:21:35 PM:
>> >> >
>> >> >> > > > And my X log ends abruptly after this line:
>> >> >> > > > (II) NOUVEAU(0): Opened GPU Channel 1
>> >> >> > > >
>> >> >> > > > Any ideas?
>> >> >> > > >
>> >> >> > >
>> >> >> > > Well, this is generally the symptom that someone is confusing
>> > mfns
>> >> > and
>> >> >> > > pfns, and therefore ends up incorrectly setting the _PAGE_IO
> flag
>> > in
>> >> >
>> >> >> > > some pte.  If you run it under strace, can you identify which
>> >> > mapping
>> >> >> > > the fault is happening in?
>> >> >> >
>> >> >> > I've attached the output of 'strace -o strace-Xorg Xorg'.
>  Figuring
>> >> > out
>> >> >> > which mapping the fault is happening in is a little over my
> head,
>> > I'm
>> >> >> > afraid.  If you need different arguments to strace, let me know
> and
>> >> > I'll
>> >> >> > do it again.
>> >> >>
>> >> >> So just to be sure, you took the 2.6.32 (xen/next or
>> >> >> xen/stable-2.6.32.x), copied the include and nouveau directory from
>> >> >> ..? 2.6.33? and then ran this.
>> >> >
>> >> > I actually took a slightly more sadistic route than Arvind... ;-)
>  A
>> > while
>> >> > back, I backported the important stuff from the Nouveau kernel git
>> > tree
>> >> > back to v2.6.31.  Basically guessed at which commits were
> important,
>> > wrote
>> >> > a script to cherry pick each and every one, and spent an entire day
>> >> > reading commit logs, resolving conflicts, and figuring out which
> other
>> >> > non-drm commits I needed.  Sounds retarded, I know, but it was a
>> > pretty
>> >> > interesting way to get myself up to speed with the code base.  The
>> >> > resulting 2.6.31-nouveau kernel runs like a champ on all my
> hardware.
>> >> >
>> >> > Then I merged that into my clone of Jeremy's xen/master which I use
>> > with
>> >> > Xen 3.4.2.
>> >> >
>> >> > Since then, I've been periodically cherry picking all new commits
> off
>> > the
>> >> > nouveau tree.  Also had to rebuild Xorg 7.5 to use xorg-server
> 1.7.5,
>> > new
>> >> > libdrm, mesa, and xf86-video-nouveau all from their respective git
>> > trees
>> >> > as of yesterday.  (drm and xf86-video-nouveau are on their master
>> >> > branches, mesa is on the 7.8 branch)
>> >> >
>> >> > This all works great using xen/master bare metal.  It used to work
>> > fine on
>> >> > my old GeForce2 MX based systems in Xen.  Arvind's patch made it
> work
>> > on
>> >> > my nice new systems in Xen, but broke it on the old ones.
>  Everything
>> >> > still works fine bare metal.
>> >> >
>> >> >> Did you have to edit your xorg.conf file or
>> >> >> it ran just fine?
>> >> >
>> >> > Well, I had to create an xorg.conf that looks like this:
>> >> >
>> >> > Section "Device"
>> >> >  Identifier "foo"
>> >> >  Driver "nouveau"
>> >> > EndSection
>> >> >
>> >> > Otherwise it uses the 'nv' driver...  and I haven't stumbled onto
> how
>> > to
>> >> > get nouveau to automatically get used (aside from uninstalling the
> nv
>> >> > driver).
>> >> >
>> >> >
>> >> >> Was this Fedora 13 or Fedora 12?
>> >> >
>> >> > This is all being done on a custom 32bit Linux distro that we use
> for
>> > our
>> >> > tightly configuration controlled system deliveries.  It was fedora
>> > based a
>> >> > looooooooong time ago (FC5), but is completely unrecognizable now.
>> >> >
>> >> >
>> >> >> Arvind's explanation about the Nvidia driver pointed out that the
>> >> >> NVidia driver (drm/nouveau) can operate on different channels, where
>> >> >> channel 1 is the framebuffer, and the others are for, well, KMS and
>> >> >> other applications.
>> >> >>
>> >> >> I believe I was looking at the wrong section of the drivers (not the
>> >> >> drivers/video/gpu ones) - this certainly looks to be the issue that
>> >> >> Jeremy mentioned.
>> >> >>
>> >> >> Also I would suggest you load drm with the debug variable set to
>> >> >> 255 to get most of what is happening.
>> >> >
>> >> > I'll try that.
>> >> >
>> >> >
>> >> >> Based on your strace, the last call is:
>> >> >> 4000)                          = 0x9324000
>> >> >> write(0, "(II) NOUVEAU(0): Opened GPU chan"..., 38) = 38
>> >> >> ioctl(11, 0xc0106445, 0x930a908)        = 0
>> >> >> ioctl(11, 0x400c6444, 0xbfd2a210)       = 0
>> >> >> +++ killed by SIGKILL +++
>> >> >>
>> >> >> I cannot find what 0x45 is in the upstream Linux, so you must be
>> > using a
>> >> >> different nouv* driver than that. The 0x44 is:
>> >> >>
>> >> >>   DRM_IOCTL_DEF(DRM_NOUVEAU_GEM_INFO, nouveau_gem_ioctl_info,
>> > DRM_AUTH),
>> >> >>
>> >> >> Which looks to be pretty harmless. I presume it is the next thing
>> > (using
>> >> >> the address returned) that the X driver tries to do that makes it
> go
>> >> > boom.
>> >> >
>> >> I suspect that the ioctl is prior to a modeset operation. And the
>> >> mode-setting is 'booming'.
>> >> My kernel config has VGA console built-in fbcon as a module and I do
>> >> a switch to
>> >> nouveaufb at runlevel 2. Also note that the default modeset
>> >> parameter is -1 and
>> >> if VGA-CONSOLE is enabled, then modeset is set to 0 in the driver
>> >> initialisation
>> >> - which maybe the problem. Do you have modeset=1 as module parameter?
>> >
>> > I wasn't setting any module params for nouveau.  Adding 'options
> nouveau
>> > modeset=1' to modprobe.conf didn't seem to make any difference.
>> >
>> > I've got the following in my .config:
>> >
>> > CONFIG_VGA_CONSOLE=y
>> > CONFIG_FB=y
>> > CONFIG_FB_VGA16=m
>> > CONFIG_FB_VESA=y
>> > CONFIG_FB_EFI=y
>> > CONFIG_FB_NVIDIA=m
>> > CONFIG_FB_NVIDIA_I2C=y
>> > CONFIG_FB_NVIDIA_BACKLIGHT=y
>> >
>>  - EMBEDDED  - this will enable VGA_CONSOLE selection. Set sub-menu
>> choices as needed
>>  - VGA_CONSOLE builtin
>>  - FB as module
>>  - FRAMEBUFFER_CONSOLE as a module. Enables late loading of nouveau
>>  * The following is required to avoid cfb_copyarea, cfb_fillrectangle,
>>    cfb_imageblit linking problems with out-of-tree nouveau builds
>>  - FB_VGA16 as module - supported by all nVidia cards.
>>    or
>>  - FB_NVIDIA as module - only works for older cards.
>>
>> For out-of-tree nouveau builds, DO NOT select ANY accelerated drivers
>> - that would enable the old in-tree DRM. The new TTM / DRM modules are
>> in the new drivers/gpu tree.
>>
>> For in-tree builds, if nouveau is NOT in the initrd-image, system will
>> boot on vga console
>> >
>> > How do you force the nouveaufb switch at runlevel 2?  My screen
> obviously
>> > switches into KMS mode while udev is starting up.
>> You can switch to the accelerated framebuffer console by
>> modprobe nouveau
>> modprobe fbcon
>> fbcon will take-over console from the built-in VGA. See
>> Documentation/fb/fbcon.txt
>
> Ok, thanks.  Now I've got everything compiled as modules and can load them
> post-boot to switch to the nouveau framebuffer console.  That actually
> didn't change the X behavior at all, though.  I still get the exact same
> "X: Corrupted page table" messages in dmesg and my Xorg.log is just ending
> with "NOUVEAU(0): Opened GPU channel 1".
This is strange: channel 1 is the console channel. It shows up in dmesg
during nouveaufb initialisation, before the EDID probe that finds the
connected outputs. Start X manually so the logs don't get mixed up.

I have attached ttm_xen.patch, which updates vm_page_prot after changing the
flags. The mainline drm tree does not do this, but the (old) Xen drm tree
does it in BOTH ttm_bo_mmap AND ttm_fbdev_mmap. The attached patch does both
as well, along with the conditional VM_IO in bo_mmap. The second
vm_page_prot update is the one in fbdev_mmap, which corresponds to channel 1.
Cross fingers and try!

> If the old nvidiafb is loaded, nouveau cannot install (and vice-versa)
>
> Well, everything seems to load just fine.  I get a nice teeny font and
> dmesg messages saying I'm using nouveaufb.
You should have got it earlier too - didn't you?

>> does NOT affect unaccelerated X on the older cards?
>
> Which accelerated modes are you referring to?  My understanding was that
> the old GeForce2 cards should work for nouveaufb, the 2d xf86-nouveau
> driver, and gallium's swrast_dri stuff (via AIGLX), but not gallium's new
> dri_nouveau stuff.
Right. But gallium's swrast_dri AND dri_nouveau are still 'unsupported',
to be tried at your own risk. nouveau_dri was working well enough to run
fgfs with mesa-7.7, but now with mesa-7.9 glxgears works while fgfs does
not: it segfaults in libdrm_nouveau.

>> Xorg used to hang saying 'Opened Channel 2' and not 1.
>
> Now that's strange.  Every single one of my boxes says Opened Channel 1,
> with no reference to channel 2 at all.
Channel 1 is what I see in dmesg/syslog; my Xorg.log shows channel 2. Snippet:
(II) LoadModule: "shadowfb"
(II) Loading /usr/lib/xorg/modules/libshadowfb.so
(II) Module shadowfb: vendor="X.Org Foundation"
    compiled for 1.7.5, module version = 1.0.0
    ABI class: X.Org ANSI C Emulation, version 0.4
(--) Depth 24 pixmap format is 32 bpp
(II) NOUVEAU(0): Opened GPU channel 2  <initial hang point>
(II) NOUVEAU(0): [DRI2] Setup complete    <after patch>
(II) NOUVEAU(0): GART: 512MiB available
(II) NOUVEAU(0): GART: Allocated 16MiB as a scratch buffer
(II) EXA(0): Driver allocated offscreen pixmaps
(II) EXA(0): Driver registered support for the following operations:
(II)         Solid
(II)         Copy
(II)         Composite (RENDER acceleration)
(II)         UploadToScreen
(II)         DownloadFromScreen
(==) NOUVEAU(0): Backing store disabled
(==) NOUVEAU(0): Silken mouse enabled
(II) NOUVEAU(0): [XvMC] Associated with Nouveau GeForce 8/9 Textured Video.
(II) NOUVEAU(0): [XvMC] Extension initialized.


To check, try with
Option "ShadowFB"  "true"
in the Device section of xorg.conf. This turns off acceleration: the option
also sets NoAccel, so X should use the FB device.
So the cards that don't work are AGP cards?
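
P.S. On the two ioctl request numbers in the strace quoted earlier
(0xc0106445 and 0x400c6444): rather than guessing from the low byte, you can
split them into their fields. Below is a minimal sketch, assuming the generic
Linux ioctl encoding (8-bit nr, 8-bit type, 14-bit size, 2-bit direction, as
on x86) and assuming DRM's driver-private ioctls start at DRM_COMMAND_BASE
(0x40). The function name decode_ioctl is made up for illustration. Which
nouveau ioctl a given index maps to depends on the nouveau_drm.h you built
against, so I am not naming them here.

```python
# Sketch only: field widths follow the generic Linux ioctl encoding
# (assumed here; some architectures use a different layout).
DIR_NAMES = {0: "none", 1: "write", 2: "read", 3: "read|write"}

# Assumption: DRM driver-private ioctl numbers start at this base.
DRM_COMMAND_BASE = 0x40


def decode_ioctl(request):
    """Split an ioctl request number into its encoded fields."""
    nr = request & 0xFF                   # bits 0-7: command number
    ioctl_type = (request >> 8) & 0xFF    # bits 8-15: 'magic' type byte
    size = (request >> 16) & 0x3FFF       # bits 16-29: argument size in bytes
    direction = (request >> 30) & 0x3     # bits 30-31: transfer direction
    # Index relative to the driver-private range, if the number falls in it.
    driver_nr = nr - DRM_COMMAND_BASE if nr >= DRM_COMMAND_BASE else None
    return {
        "dir": DIR_NAMES[direction],
        "size": size,
        "type": chr(ioctl_type),
        "nr": nr,
        "driver_nr": driver_nr,
    }


if __name__ == "__main__":
    # The two request numbers taken from the strace output.
    for req in (0xC0106445, 0x400C6444):
        print(hex(req), decode_ioctl(req))
```

Both requests carry type 'd' and fall in the driver-private range, so compare
the driver_nr and size values against the ioctl table and argument structs
of the nouveau tree you actually built.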

Attachment: ttm_xen.patch
Description: Text Data

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel

 

