radeonkms panic on 12.2-RELEASE: Fatal trap 12: page fault while in kernel mode #256

Open
serpent7776 opened this issue Jan 4, 2021 · 7 comments

@serpent7776

After upgrading FreeBSD to 12.2-RELEASE I got a random panic related to radeon:

It seems similar to #130

The current process was qutebrowser.
freebsd-version reports 12.2-RELEASE-p2.
My video card is a Radeon HD 5450.

drm-fbsd12.0-kmod-4.16.g20201016
drm-kmod-g20190710
libdrm-2.4.103,1
panic: page fault

GNU gdb 6.1.1 [FreeBSD]
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "amd64-marcel-freebsd"...

Unread portion of the kernel message buffer:


Fatal trap 12: page fault while in kernel mode
cpuid = 0; apic id = 00
fault virtual address	= 0x11d8
fault code		= supervisor read data, page not present
instruction pointer	= 0x20:0xffffffff827c00f8
stack pointer	        = 0x28:0xfffffe006cffd2e0
frame pointer	        = 0x28:0xfffffe006cffd310
code segment		= base 0x0, limit 0xfffff, type 0x1b
			= DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags	= interrupt enabled, resume, IOPL = 0
current process		= 32561 (python3.7:rcs0)
trap number		= 12
panic: page fault
cpuid = 0
time = 1609764141
KDB: stack backtrace:
#0 0xffffffff80c0a8e5 at kdb_backtrace+0x65
#1 0xffffffff80bbeb9b at vpanic+0x17b
#2 0xffffffff80bbea13 at panic+0x43
#3 0xffffffff8108f911 at trap_fatal+0x391
#4 0xffffffff8108f96f at trap_pfault+0x4f
#5 0xffffffff8108efb6 at trap+0x286
#6 0xffffffff81066f68 at calltrap+0x8
#7 0xffffffff827d7fdd at radeon_sa_bo_new+0x26d
#8 0xffffffff827c72ff at radeon_ib_get+0x2f
#9 0xffffffff827b313d at radeon_cs_ioctl+0x25d
#10 0xffffffff828af2e1 at drm_ioctl_kernel+0xf1
#11 0xffffffff828af589 at drm_ioctl+0x289
#12 0xffffffff828f0e58 at linux_file_ioctl+0x318
#13 0xffffffff80c28697 at kern_ioctl+0x2b7
#14 0xffffffff80c2833a at sys_ioctl+0xfa
#15 0xffffffff810904c7 at amd64_syscall+0x387
#16 0xffffffff8106788e at fast_syscall_common+0xf8
Uptime: 21h54m31s
Dumping 1383 out of 12248 MB: (CTRL-C to abort) ..2%..11%..21%..31%..41%..51%..61%..71%..81%..91%
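
The fault virtual address in the trap message is a useful first clue. Below is a minimal Python sketch that sorts such an address into rough buckets. The 16-page "near NULL" window is an arbitrary heuristic chosen for illustration, and the boundary constants are the usual amd64 canonical-address limits rather than anything taken from this report.

```python
# Rough classification of a faulting address on amd64.
# Assumptions (not from the panic output): the kernel lives in the upper
# canonical half, and "near NULL" means within an arbitrary 16-page window.
PAGE_SIZE = 4096
NEAR_NULL_LIMIT = 16 * PAGE_SIZE
USER_HALF_END = 0x00007FFFFFFFFFFF
KERNEL_HALF_START = 0xFFFF800000000000


def classify_fault_address(addr):
    """Return a short label describing where addr falls."""
    if addr < NEAR_NULL_LIMIT:
        return "near NULL"
    if addr <= USER_HALF_END:
        return "lower (user) half"
    if addr >= KERNEL_HALF_START:
        return "upper (kernel) half"
    return "non-canonical"


# Fault address reported in the trap message above.
print(hex(0x11D8), classify_fault_address(0x11D8))
```

A supervisor read at 0x11d8 is consistent with the kernel dereferencing a NULL structure pointer plus a field offset of 0x11d8 bytes, although the sketch cannot say which pointer that was.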

I'm using the official packages; maybe that's why kgdb reports the source files as newer than the executable.

warning: Source file is more recent than executable.

55		__asm("movq %%gs:%P1,%0" : "=r" (td) : "n" (offsetof(struct pcpu,
(kgdb) #0  doadump () at src/sys/amd64/include/pcpu_aux.h:55
#1  0xffffffff80bbe7b5 in kern_reboot (howto=260)
    at /usr/src/sys/kern/kern_shutdown.c:451
#2  0xffffffff80bbebf3 in vpanic (fmt=<value optimized out>, 
    ap=<value optimized out>) at /usr/src/sys/kern/kern_shutdown.c:880
#3  0xffffffff80bbea13 in panic (fmt=<value optimized out>)
    at /usr/src/sys/kern/kern_shutdown.c:807
#4  0xffffffff8108f911 in trap_fatal (frame=<value optimized out>, 
    eva=<value optimized out>) at /usr/src/sys/amd64/amd64/trap.c:921
#5  0xffffffff8108f96f in trap_pfault (frame=0xfffffe006cffd220, 
    usermode=<value optimized out>, signo=<value optimized out>, 
    ucode=<value optimized out>) at src/sys/amd64/include/pcpu_aux.h:55
#6  0xffffffff8108efb6 in trap (frame=0xfffffe006cffd220)
    at /usr/src/sys/amd64/amd64/trap.c:405
#7  0xffffffff81066f68 in calltrap ()
    at /usr/src/sys/amd64/amd64/exception.S:289
#8  0xffffffff827c00f8 in evergreen_default_size ()
   from /boot/kernel/radeonkms.ko
#9  0x0000000000000000 in ?? ()
Current language:  auto; currently minimal

kldstat

Id Refs Address                Size Name
 1   99 0xffffffff80200000  227ad00 kernel
 2    1 0xffffffff8247b000     3a28 cpuctl.ko
 3    1 0xffffffff8247f000    2ed30 ext2fs.ko
 4    1 0xffffffff824af000     41e0 amdtemp.ko
 5    2 0xffffffff824b4000     2550 amdsmn.ko
 6    1 0xffffffff82721000   157460 radeonkms.ko
 7    2 0xffffffff82879000    75e10 drm.ko
 8    5 0xffffffff828ef000    12d30 linuxkpi.ko
 9    4 0xffffffff82902000    13f30 linuxkpi_gplv2.ko
10    2 0xffffffff82916000      6d0 debugfs.ko
11    1 0xffffffff82917000     f0c1 ttm.ko
12    1 0xffffffff82927000     1305 radeon_CEDAR_pfp_bin.ko
13    1 0xffffffff82929000     1703 radeon_CEDAR_me_bin.ko
14    1 0xffffffff8292b000      d85 radeon_CEDAR_rlc_bin.ko
15    1 0xffffffff8292c000     5ed5 radeon_CEDAR_smc_bin.ko
16    1 0xffffffff82932000    1c5a9 radeon_CYPRESS_uvd_bin.ko
17    1 0xffffffff8294f000     2698 intpm.ko
18    1 0xffffffff82952000      b40 smbus.ko
19    1 0xffffffff82953000     1860 uhid.ko
20    1 0xffffffff82955000     2908 ums.ko
21    1 0xffffffff82958000    25248 ipfw.ko
22    1 0xffffffff8297e000   537420 vmm.ko
23    1 0xffffffff82eb6000      afc nmdm.ko
24    1 0xffffffff82eb7000     7000 if_bridge.ko
25    1 0xffffffff82ebe000     4038 bridgestp.ko
26    1 0xffffffff82ec3000      958 fire_saver.ko
27    1 0xffffffff82ec4000     2940 nullfs.ko
28    1 0xffffffff82ec7000     54f8 linprocfs.ko
29    3 0xffffffff82ecd000     4b80 linux_common.ko
30    1 0xffffffff82ed2000     87d0 tmpfs.ko
31    1 0xffffffff82edb000     1a20 fdescfs.ko
32    1 0xffffffff82edd000    3c4c0 linux.ko
33    1 0xffffffff82f1a000    35ce0 linux64.ko
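
The KDB trace and kgdb name different functions for the radeon frames, and this kgdb session read symbols from /boot/kernel/radeonkms.ko. Module-relative offsets make such addresses easier to compare against whichever .ko file is actually loaded. The Python sketch below derives them from the kldstat listing; the table is copied by hand from the output above, and only the modules that appear in the backtrace are included.

```python
# Map a kernel address to "module + offset" using base/size pairs
# copied from the kldstat output above (subset of modules only).
MODULES = {
    "kernel": (0xFFFFFFFF80200000, 0x227AD00),
    "radeonkms.ko": (0xFFFFFFFF82721000, 0x157460),
    "drm.ko": (0xFFFFFFFF82879000, 0x75E10),
    "linuxkpi.ko": (0xFFFFFFFF828EF000, 0x12D30),
    "ttm.ko": (0xFFFFFFFF82917000, 0xF0C1),
}


def locate(addr):
    """Return (module name, offset from module base), or None if unmapped."""
    for name, (base, size) in MODULES.items():
        if base <= addr < base + size:
            return name, addr - base
    return None


# Faulting instruction pointer and the radeon return address from the trace.
for addr in (0xFFFFFFFF827C00F8, 0xFFFFFFFF827D7FDD):
    name, offset = locate(addr)
    print(f"{addr:#x} -> {name}+{offset:#x}")
```

Both addresses fall inside radeonkms.ko, at offsets 0x9f0f8 and 0xb6fdd from its load address. These offsets can then be looked up in the module file that was really loaded, which may not be the file kgdb picked.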
@valpackett
Contributor

I'm using official packages

On 12.x, you have to build from ports

@serpent7776
Author

OK, I rebuilt drm-fbsd12.0-kmod and drm-kmod from ports.
The package versions are still the same, so maybe I was already using the versions from ports; I don't remember now.
I also disabled hardware acceleration in browsers.

drm-fbsd12.0-kmod-4.16.g20201016
drm-kmod-g20190710
libdrm-2.4.103,1

@serpent7776
Author

I had another panic, but this time it was a failed BUG_ON assertion.

Fri Jan  8 22:39:27 CET 2021

FreeBSD DaemONX 12.2-RELEASE-p1 FreeBSD 12.2-RELEASE-p1 GENERIC  amd64

panic: BUG ON !list_empty(&fence->cb_list) failed at /wrkdirs/usr/ports/graphics/drm-fbsd12.0-kmod/work/kms-drm-fa1387d/linuxkpi/gplv2/include/linux/dma-fence.h:91

GNU gdb (GDB) 10.1 [GDB v10.1 for FreeBSD]
Copyright (C) 2020 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "x86_64-portbld-freebsd12.2".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<https://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
    <http://www.gnu.org/software/gdb/documentation/>.

For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from /boot/kernel/kernel...
Reading symbols from /usr/lib/debug//boot/kernel/kernel.debug...

Unread portion of the kernel message buffer:
panic: BUG ON !list_empty(&fence->cb_list) failed at /wrkdirs/usr/ports/graphics/drm-fbsd12.0-kmod/work/kms-drm-fa1387d/linuxkpi/gplv2/include/linux/dma-fence.h:91
cpuid = 1
time = 1610140341
KDB: stack backtrace:
#0 0xffffffff80c0a8e5 at kdb_backtrace+0x65
#1 0xffffffff80bbeb9b at vpanic+0x17b
#2 0xffffffff80bbea13 at panic+0x43
#3 0xffffffff8291e612 at ttm_bo_release_list+0x2b2
#4 0xffffffff827d0312 at radeon_bo_unref+0x22
#5 0xffffffff827c2d7e at radeon_gem_object_free+0x1e
#6 0xffffffff828abc8b at drm_gem_object_release_handle+0xcb
#7 0xffffffff828abb5e at drm_gem_handle_delete+0x8e
#8 0xffffffff828af2e1 at drm_ioctl_kernel+0xf1
#9 0xffffffff828af589 at drm_ioctl+0x289
#10 0xffffffff828f0e58 at linux_file_ioctl+0x318
#11 0xffffffff80c28697 at kern_ioctl+0x2b7
#12 0xffffffff80c2833a at sys_ioctl+0xfa
#13 0xffffffff810904c7 at amd64_syscall+0x387
#14 0xffffffff8106788e at fast_syscall_common+0xf8
Uptime: 14h21m1s
Dumping 1105 out of 12248 MB: (CTRL-C to abort) ..2%..11%..21%..31%..41%..51%..61%..71%..82%..92%

__curthread () at /usr/src/sys/amd64/include/pcpu_aux.h:55
warning: Source file is more recent than executable.
55		__asm("movq %%gs:%P1,%0" : "=r" (td) : "n" (offsetof(struct pcpu,
(kgdb) #0  __curthread () at /usr/src/sys/amd64/include/pcpu_aux.h:55
#1  doadump (textdump=<optimized out>)
    at /usr/src/sys/kern/kern_shutdown.c:371
#2  0xffffffff80bbe7b5 in kern_reboot (howto=260)
    at /usr/src/sys/kern/kern_shutdown.c:451
#3  0xffffffff80bbebf3 in vpanic (fmt=<optimized out>, ap=<optimized out>)
    at /usr/src/sys/kern/kern_shutdown.c:880
#4  0xffffffff80bbea13 in panic (fmt=<unavailable>)
    at /usr/src/sys/kern/kern_shutdown.c:807
#5  0xffffffff8291e612 in ttm_bo_release_list () from /boot/modules/ttm.ko
#6  0xffffffff827d0312 in radeon_bo_unref () from /boot/modules/radeonkms.ko
#7  0xffffffff827c2d7e in radeon_gem_object_free ()
   from /boot/modules/radeonkms.ko
#8  0xffffffff828abc8b in drm_gem_object_release_handle ()
   from /boot/modules/drm.ko
#9  0xffffffff828abb5e in drm_gem_handle_delete () from /boot/modules/drm.ko
#10 0xffffffff828af2e1 in drm_ioctl_kernel () from /boot/modules/drm.ko
#11 0xffffffff828af589 in drm_ioctl () from /boot/modules/drm.ko
#12 0xffffffff828f0e58 in linux_file_ioctl_sub (fp=<optimized out>, 
    filp=0xfffffe006bc71888, fop=<optimized out>, cmd=<optimized out>, 
    data=<unavailable>, td=0xfffff80012cc6740)
    at /usr/src/sys/compat/linuxkpi/common/src/linux_compat.c:978
#13 linux_file_ioctl (fp=<optimized out>, cmd=<optimized out>, 
    data=<optimized out>, cred=<optimized out>, td=0xfffffe00005bf8a8)
    at /usr/src/sys/compat/linuxkpi/common/src/linux_compat.c:1588
#14 0xffffffff80c28697 in fo_ioctl (fp=0xfffff80012cd1410, com=2148033545, 
    data=<unavailable>, active_cred=<unavailable>, td=0xfffff80012cc6740)
    at /usr/src/sys/sys/file.h:337
#15 kern_ioctl (td=<unavailable>, fd=<optimized out>, com=2148033545, 
    data=<unavailable>) at /usr/src/sys/kern/sys_generic.c:805
#16 0xffffffff80c2833a in sys_ioctl (td=0xfffff80012cc6740, 
    uap=0xfffff80012cc6b00) at /usr/src/sys/kern/sys_generic.c:713
#17 0xffffffff810904c7 in syscallenter (td=0xfffff80012cc6740)
    at /usr/src/sys/amd64/amd64/../../kern/subr_syscall.c:144
#18 amd64_syscall (td=0xfffff80012cc6740, traced=0)
    at /usr/src/sys/amd64/amd64/trap.c:1163
#19 <signal handler called>
#20 0x0000000800af799a in ?? ()
Backtrace stopped: Cannot access memory at address 0x7fffffffe3e8
(kgdb) 

I'm not sure why this warning appears:
warning: Source file is more recent than executable.

Parts of dmesg that might be relevant:

Loading kernel modules:
[drm] radeon kernel modesetting enabled.
drmn0: <drmn> on vgapci0
vgapci0: child drmn0 requested pci_enable_io
vgapci0: child drmn0 requested pci_enable_io
[drm] initializing kernel modesetting (CEDAR 0x1002:0x68F9 0x1458:0x21D8 0x00).
[drm:radeon_device_init] Unable to find PCI I/O BAR
[drm:radeon_atombios_init] Unable to find PCI I/O BAR; using MMIO for ATOM IIO
ATOM BIOS: GV
drmn0: VRAM: 1024M 0x0000000000000000 - 0x000000003FFFFFFF (1024M used)
drmn0: GTT: 1024M 0x0000000040000000 - 0x000000007FFFFFFF
Failed to add WC MTRR for [0xd0000000-0xdfffffff]: -28; performance may suffer
[drm] Detected VRAM RAM=1024M, BAR=256M
[drm] RAM width 64bits DDR
[TTM] Zone  kernel: Available graphics memory: 6270988 kiB
[TTM] Zone   dma32: Available graphics memory: 2097152 kiB
[TTM] Initializing pool allocator
[drm] radeon: 1024M of VRAM memory ready
[drm] radeon: 1024M of GTT memory ready.
[drm] Loading CEDAR Microcode
radeon/CEDAR_pfp.bin: could not load firmware image, error 2
drmn0: fail (0) to get firmware image with name: radeon/CEDAR_pfp.bin
drmn0: successfully loaded firmware image with mapped name: radeon_CEDAR_pfp_bin
radeon/CEDAR_me.bin: could not load firmware image, error 2
drmn0: fail (0) to get firmware image with name: radeon/CEDAR_me.bin
drmn0: successfully loaded firmware image with mapped name: radeon_CEDAR_me_bin
radeon/CEDAR_rlc.bin: could not load firmware image, error 2
drmn0: fail (0) to get firmware image with name: radeon/CEDAR_rlc.bin
drmn0: successfully loaded firmware image with mapped name: radeon_CEDAR_rlc_bin
radeon/CEDAR_smc.bin: could not load firmware image, error 2
drmn0: fail (0) to get firmware image with name: radeon/CEDAR_smc.bin
drmn0: successfully loaded firmware image with mapped name: radeon_CEDAR_smc_bin
[drm] Internal thermal controller with fan control
[drm] radeon: dpm initialized
radeon/CYPRESS_uvd.bin: could not load firmware image, error 2
drmn0: fail (0) to get firmware image with name: radeon/CYPRESS_uvd.bin
drmn0: successfully loaded firmware image with mapped name: radeon_CYPRESS_uvd_bin
[drm] GART: num cpu pages 262144, num gpu pages 262144
[drm] enabling PCIE gen 2 link speeds, disable with radeon.pcie_gen2=0
[drm] PCIE GART of 1024M enabled (table at 0x000000000014C000).
drmn0: WB enabled
drmn0: fence driver on ring 0 use gpu addr 0x0000000040000c00 and cpu addr 0x0xfffff80012e12c00
drmn0: fence driver on ring 3 use gpu addr 0x0000000040000c0c and cpu addr 0x0xfffff80012e12c0c
drmn0: fence driver on ring 5 use gpu addr 0x000000000005c418 and cpu addr 0x0xfffff800d005c418
[drm] Supports vblank timestamp caching Rev 2 (21.10.2013).
[drm] Driver supports precise vblank timestamp query.
drmn0: radeon: MSI limited to 32-bit
drmn0: radeon: using MSI.
[drm] radeon: irq initialized.
[drm] ring test on 0 succeeded in 1 usecs
[drm] ring test on 3 succeeded in 2 usecs
[drm] ring test on 5 succeeded in 1 usecs
[drm] UVD initialized successfully.
[drm] ib test on ring 0 succeeded in 0 usecs
[drm] ib test on ring 3 succeeded in 0 usecs
[drm] ib test on ring 5 succeeded
[drm] Connector HDMI-A-1: get mode from tunables:
[drm]   - kern.vt.fb.modes.HDMI-A-1
[drm]   - kern.vt.fb.default_mode
[drm] Connector DVI-I-1: get mode from tunables:
[drm]   - kern.vt.fb.modes.DVI-I-1
[drm]   - kern.vt.fb.default_mode
[drm] Connector VGA-1: get mode from tunables:
[drm]   - kern.vt.fb.modes.VGA-1
[drm]   - kern.vt.fb.default_mode
[drm] Radeon Display Connectors
[drm] Connector 0:
[drm]   HDMI-A-1
[drm]   HPD1
[drm]   DDC: 0x6460 0x6460 0x6464 0x6464 0x6468 0x6468 0x646c 0x646c
[drm]   Encoders:
[drm]     DFP1: INTERNAL_UNIPHY1
[drm] Connector 1:
[drm]   DVI-I-1
[drm]   HPD4
[drm]   DDC: 0x6450 0x6450 0x6454 0x6454 0x6458 0x6458 0x645c 0x645c
[drm]   Encoders:
[drm]     DFP2: INTERNAL_UNIPHY
[drm]     CRT1: INTERNAL_KLDSCP_DAC1
[drm] Connector 2:
[drm]   VGA-1
[drm]   DDC: 0x6430 0x6430 0x6434 0x6434 0x6438 0x6438 0x643c 0x643c
[drm]   Encoders:
[drm]     CRT2: INTERNAL_KLDSCP_DAC2
[drm] fb mappable at 0xD034D000
[drm] vram apper at 0xD0000000
[drm] size 5324800
[drm] fb depth is 24
[drm]    pitch is 5888
VT: Replacing driver "vga" with new "fb".
start FB_INFO:
type=11 height=900 width=1440 depth=32
cmsize=16 size=5324800
pbase=0xd034d000 vbase=0xfffff800d034d000
name=drmn0 flags=0x0 stride=5888 bpp=32
cmap[0]=0 cmap[1]=7f0000 cmap[2]=7f00 cmap[3]=c4a000
end FB_INFO
drmn0: fb0: radeondrmfb frame buffer device
[drm] Initialized radeon 2.50.0 20080528 for drmn0 on minor 0

@serpent7776
Author

I recompiled drm-fbsd12.0-kmod-4.16.g20201016 with the DEBUG option and got another crash.
The radeonkms and drm modules still seem to lack debug symbols for these addresses.

Fatal trap 12: page fault while in kernel mode
cpuid = 1; apic id = 01
fault virtual address	= 0x1300000024
fault code		= supervisor read data, page not present
instruction pointer	= 0x20:0xffffffff827d8afc
stack pointer	        = 0x28:0xfffffe006bfff480
frame pointer	        = 0x28:0xfffffe006bfff4b0
code segment		= base 0x0, limit 0xfffff, type 0x1b
			= DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags	= interrupt enabled, resume, IOPL = 0
current process		= 98365 (Xorg:rcs0)
trap number		= 12
panic: page fault
cpuid = 1
time = 1610374345
KDB: stack backtrace:
#0 0xffffffff80c0a8e5 at kdb_backtrace+0x65
#1 0xffffffff80bbeb9b at vpanic+0x17b
#2 0xffffffff80bbea13 at panic+0x43
#3 0xffffffff8108f911 at trap_fatal+0x391
#4 0xffffffff8108f96f at trap_pfault+0x4f
#5 0xffffffff8108efb6 at trap+0x286
#6 0xffffffff81066f68 at calltrap+0x8
#7 0xffffffff827b3921 at radeon_cs_ioctl+0xa41
#8 0xffffffff828af2e1 at drm_ioctl_kernel+0xf1
#9 0xffffffff828af589 at drm_ioctl+0x289
#10 0xffffffff828f0e58 at linux_file_ioctl+0x318
#11 0xffffffff80c28697 at kern_ioctl+0x2b7
#12 0xffffffff80c2833a at sys_ioctl+0xfa
#13 0xffffffff810904c7 at amd64_syscall+0x387
#14 0xffffffff8106788e at fast_syscall_common+0xf8
Uptime: 1d14h37m38s
Dumping 1428 out of 12248 MB:..2%..11%..21%..31%..41%..51%..61%..71%..81%..91%

__curthread () at /usr/src/sys/amd64/include/pcpu_aux.h:55
warning: Source file is more recent than executable.
55		__asm("movq %%gs:%P1,%0" : "=r" (td) : "n" (offsetof(struct pcpu,
(kgdb) #0  __curthread () at /usr/src/sys/amd64/include/pcpu_aux.h:55
#1  doadump (textdump=<optimized out>)
    at /usr/src/sys/kern/kern_shutdown.c:371
#2  0xffffffff80bbe7b5 in kern_reboot (howto=260)
    at /usr/src/sys/kern/kern_shutdown.c:451
#3  0xffffffff80bbebf3 in vpanic (fmt=<optimized out>, ap=<optimized out>)
    at /usr/src/sys/kern/kern_shutdown.c:880
#4  0xffffffff80bbea13 in panic (fmt=<unavailable>)
    at /usr/src/sys/kern/kern_shutdown.c:807
#5  0xffffffff8108f911 in trap_fatal (frame=0xfffffe006bfff3c0, 
    eva=81604378660) at /usr/src/sys/amd64/amd64/trap.c:921
#6  0xffffffff8108f96f in trap_pfault (frame=0xfffffe006bfff3c0, 
    usermode=<optimized out>, signo=<optimized out>, ucode=<optimized out>)
    at /usr/src/sys/amd64/amd64/trap.c:739
#7  0xffffffff8108efb6 in trap (frame=0xfffffe006bfff3c0)
    at /usr/src/sys/amd64/amd64/trap.c:405
#8  <signal handler called>
#9  0xffffffff827d8afc in radeon_sync_resv () from /boot/modules/radeonkms.ko
#10 0xffffffff827b3921 in radeon_cs_ioctl () from /boot/modules/radeonkms.ko
#11 0xffffffff828af2e1 in drm_ioctl_kernel () from /boot/modules/drm.ko
#12 0xffffffff828af589 in drm_ioctl () from /boot/modules/drm.ko
#13 0xffffffff828f0e58 in linux_file_ioctl_sub (fp=<optimized out>, 
    filp=0xfffffe006bfff790, fop=<optimized out>, cmd=<optimized out>, 
    data=0x7fffffff <error: Cannot access memory at address 0x7fffffff>, 
    td=0xfffff80052ffa000)
    at /usr/src/sys/compat/linuxkpi/common/src/linux_compat.c:978
#14 linux_file_ioctl (fp=<optimized out>, cmd=<optimized out>, 
    data=<optimized out>, cred=<optimized out>, td=0xfffff8025db27250)
    at /usr/src/sys/compat/linuxkpi/common/src/linux_compat.c:1588
#15 0xffffffff80c28697 in fo_ioctl (fp=0xfffff801112db5a0, com=3223348326, 
    data=0x7fffffff, active_cred=0x1300000004, td=0xfffff80052ffa000)
    at /usr/src/sys/sys/file.h:337
#16 kern_ioctl (td=0x2, fd=<optimized out>, com=3223348326, 
    data=0x7fffffff <error: Cannot access memory at address 0x7fffffff>)
    at /usr/src/sys/kern/sys_generic.c:805
#17 0xffffffff80c2833a in sys_ioctl (td=0xfffff80052ffa000, 
    uap=0xfffff80052ffa3c0) at /usr/src/sys/kern/sys_generic.c:713
#18 0xffffffff810904c7 in syscallenter (td=0xfffff80052ffa000)
    at /usr/src/sys/amd64/amd64/../../kern/subr_syscall.c:144
#19 amd64_syscall (td=0xfffff80052ffa000, traced=0)
    at /usr/src/sys/amd64/amd64/trap.c:1163
#20 <signal handler called>
#21 0x0000000800af799a in ?? ()
Backtrace stopped: Cannot access memory at address 0x7fffdfffdec8
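
The `com` argument that kgdb prints for `fo_ioctl` identifies the ioctl in flight. Here is a small Python sketch that splits such a value into its fields. The bit layout follows FreeBSD's `<sys/ioccom.h>` as I understand it, and the DRM numbers mentioned afterwards (`DRM_COMMAND_BASE` = 0x40, GEM close = 0x09, `DRM_RADEON_CS` = 0x26) are recalled from the Linux DRM headers, not taken from this thread, so treat them as assumptions.

```python
# Decode a FreeBSD ioctl command word into direction, length, group, number.
# Constants mirror <sys/ioccom.h> as I understand it (assumption).
IOCPARM_SHIFT = 13
IOCPARM_MASK = (1 << IOCPARM_SHIFT) - 1
IOC_VOID = 0x20000000
IOC_OUT = 0x40000000
IOC_IN = 0x80000000


def decode_ioctl(com):
    """Split an ioctl command word into its encoded fields."""
    flags = [
        name
        for bit, name in ((IOC_VOID, "void"), (IOC_IN, "in"), (IOC_OUT, "out"))
        if com & bit
    ]
    return {
        "hex": hex(com),
        "dir": "+".join(flags),
        "len": (com >> 16) & IOCPARM_MASK,
        "group": chr((com >> 8) & 0xFF),
        "nr": com & 0xFF,
    }


# com values from the BUG_ON backtrace and from the backtrace above.
for com in (2148033545, 3223348326):
    print(decode_ioctl(com))
```

The first value decodes to group 'd', number 0x09 with an 8-byte argument, which fits the `drm_gem_handle_delete` frame. The second decodes to group 'd', number 0x66 with a 32-byte in/out argument. If the driver command base is 0x40, that is driver command 0x26, which fits the `radeon_cs_ioctl` frame.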

@serpent7776
Author

(kgdb) f 13
#13 0xffffffff828f0e58 in linux_file_ioctl_sub (fp=<optimized out>, filp=0xfffffe006bfff790, fop=<optimized out>, cmd=<optimized out>,
    data=0x7fffffff <error: Cannot access memory at address 0x7fffffff>, td=0xfffff80052ffa000) at /usr/src/sys/compat/linuxkpi/common/src/linux_compat.c:978
warning: Source file is more recent than executable.
978                             error = -OPW(fp, td, fop->unlocked_ioctl(filp,
(kgdb) p filp
$1 = (struct linux_file *) 0xfffffe006bfff790
(kgdb) p *filp
$2 = {_file = 0x0, f_op = 0x0, private_data = 0x0, f_flags = 0, f_mode = 0, f_dentry = 0x0, f_dentry_store = {d_inode = 0x0, d_pfs_node = 0x0}, f_selinfo = {si_tdlist = {
      tqh_first = 0x0, tqh_last = 0x0}, si_note = {kl_list = {slh_first = 0x0}, kl_lock = 0x0, kl_unlock = 0x0, kl_assert_locked = 0x0, kl_assert_unlocked = 0x0,
      kl_lockarg = 0x0, kl_autodestroy = 0}, si_mtx = 0x0}, f_sigio = 0x0, f_vnode = 0x0, f_count = 0, f_shmem = 0x0, f_kqflags = 0, f_kqlock = {m = {lock_object = {
        lo_name = 0x0, lo_flags = 0, lo_data = 0, lo_witness = 0x0}, mtx_lock = 0}}, f_wait_queue = {wq = {flags = 0, private = 0x0, func = 0x0, {task_list = {next = 0x0,
          prev = 0x0}, entry = {next = 0x0, prev = 0x0}}}, wqh = 0x0, state = {counter = 0}}, f_cdev = 0x0}

Are these contents of filp OK?

@serpent7776
Author

How do I build drm-fbsd12.0-kmod with debugging symbols?
Setting WITH_DEBUG_PORTS+=graphics/drm-fbsd12.0-kmod in make.conf doesn't seem to work. The resulting .ko files are slightly bigger, but still lack debugging symbols.
I got two more crashes, this time in Xorg, and disabled hardware acceleration there as well.

@Bill-Paul

Just so everyone knows, this bug still occurs even with FreeBSD 12.3-RELEASE. I'm not sure exactly how to provoke it, but in my case it tends to happen once I start KDE5 and then use Firefox to do a little browsing for a while. Usually within a few minutes, it dies.

I have the following hardware:

vgapci0@pci0:0:1:0: class=0x030000 card=0x168b103c chip=0x96481002 rev=0x00 hdr=0x00
    vendor = 'Advanced Micro Devices, Inc. [AMD/ATI]'
    device = 'Sumo [Radeon HD 6480G]'
    class = display
    subclass = VGA

[drm] initializing kernel modesetting (SUMO 0x1002:0x9648 0x103C:0x168B 0x00).

However, I've also observed the same problem with other Radeon cards such as this one:

vgapci1@pci0:131:0:0: class=0x030000 card=0x90b8103c chip=0x67711002 rev=0x00 hdr=0x00
    vendor = 'Advanced Micro Devices, Inc. [AMD/ATI]'
    device = 'Caicos XTX [Radeon HD 8490 / R5 235X OEM]'
    class = display
    subclass = VGA

The crashes are not consistent but always seem to happen somewhere in the drm_ioctl() path.

Furthermore, I have another system with Intel HD Graphics 4000 which is also running FreeBSD 12.3 and using the same drm-fbsd12.0-kmod package, and it does not exhibit this problem.

Based on a hint from here:

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=237544

I created a new drm-fbsd11.2-kmod port from the ashes of the old one (which has been removed from the ports repo as obsolete), massaged it a little to make it compile on FreeBSD 12.3, and the driver from that package works without any panics. (I'm using it right now.)

Given that the i915kms.ko driver does not crash, as an experiment I took the drm-fbsd12.0-kmod port and replaced the radeon driver directory with the older driver code from the drm-fbsd11.2-kmod directory (that is, I replaced the 4.16 radeon code with the 4.11 radeon code). However, this did not fix the problem.

Unfortunately there are extensive changes in both the drm and ttm modules between 4.11 and 4.16. I suspect that there is a bug in the memory management code and for some reason only the Radeon driver triggers it. (I noticed that the i915 driver does not use the ttm APIs.) But because there are so many changes it's hard to say just what's broken. My guess is that there is a locking problem somewhere.

If anyone cares, they can download my hacked drm-fbsd11.2-kmod port here:

https://people.freebsd.org/~wpaul/radeon/

Unpack the drm-fbsd11.2-kmod.tar.gz tarball under /usr/ports/graphics and copy the 4.11 source to /usr/ports/distfiles, then go to the drm-fbsd11.2-kmod directory and type make. Once the build is done, remove the drm-fbsd12.0-kmod package, then do a make install to install the fbsd11.2 one instead.

I hope somebody tracks this bug down eventually. It's apparently been there for a very long time.
