drm/i915: Resetting chip after gpu hang
MICHAEL A POPE
I'm running ubilinux 4.0 with kernel 4.12.0-0.bpo.2-amd64. This morning, while I was playing a video from the Twitch website in google-chrome, the whole computer froze. The computer still had plenty of free disk space and memory and was not overheating. I was still able to ssh in, though, and captured this from the dmesg log:
[32344.796818] drm/i915: Resetting chip after gpu hang
[32344.800711] BUG: unable to handle kernel NULL pointer dereference at 0000000000000070
[32344.809593] IP: reset_common_ring+0x8d/0x110 [i915]
[32344.815084] PGD 0
[32344.815086] P4D 0
[32344.821281] Oops: 0000 [#1] SMP
[32344.824805] Modules linked in: bnep 8021q garp mrp stp llc snd_hda_codec_hdmi xt_connmark cpufreq_conservative iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 cpufreq_userspace nf_nat_ipv4 nf_nat nf_conntrack libcrc32c cpufreq_powersave iptable_mangle iptable_filter bluetooth joydev ecdh_generic rfkill hid_logitech_hidpp usb_f_acm usb_f_fs usb_f_serial u_serial libcomposite udc_core configfs hid_generic hid_logitech_dj binfmt_misc nls_ascii nls_cp437 vfat fat intel_rapl x86_pkg_temp_thermal coretemp i2c_designware_platform kvm_intel i2c_designware_core kvm evdev irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel snd_soc_skl intel_rapl_perf efi_pstore snd_soc_skl_ipc snd_soc_sst_ipc efivars snd_soc_sst_dsp snd_hda_ext_core pcspkr snd_soc_sst_match lpc_ich snd_soc_core snd_compress snd_hda_intel
[32344.904437] snd_hda_codec i915 snd_hda_core snd_hwdep idma64 snd_pcm ftdi_sio snd_timer usbserial drm_kms_helper snd intel_lpss_pci intel_lpss soundcore drm mei_me shpchp mfd_core mei i2c_algo_bit video button usbhid hid i2c_dev parport_pc ppdev lp parport efivarfs ip_tables x_tables autofs4 ext4 crc16 jbd2 crc32c_generic fscrypto ecb mbcache mmc_block crc32c_intel aesni_intel aes_x86_64 crypto_simd cryptd glue_helper ahci libahci libata xhci_pci sdhci_pci i2c_i801 xhci_hcd sdhci mmc_core usbcore usb_common r8169 mii scsi_mod
[32344.956903] CPU: 0 PID: 13569 Comm: kworker/0:0 Not tainted 4.12.0-0.bpo.2-amd64 #1 Debian 4.12.13-1~bpo9+1
[32344.967847] Hardware name: AAEON UP-APL01/UP-APL01, BIOS UPA1AM18 06/23/2017
[32344.975807] Workqueue: events_long i915_hangcheck_elapsed [i915]
[32344.982575] task: ffff96e549126180 task.stack: ffffa6a58c320000
[32344.989268] RIP: 0010:reset_common_ring+0x8d/0x110 [i915]
[32344.995334] RSP: 0018:ffffa6a58c323b98 EFLAGS: 00010206
[32345.001201] RAX: 0000000000003f50 RBX: ffff96e590e87900 RCX: ffff96e5b3ebdc38
[32345.009214] RDX: 0000000000003f90 RSI: ffff96e5b3e0c000 RDI: ffff96e5b3ebdc00
[32345.017229] RBP: ffffa6a58c323bb8 R08: 0000000000000e44 R09: ffffa6a5b0014540
[32345.025242] R10: 00000000ffffffff R11: ffff96e5b1186040 R12: ffff96e5b1850000
[32345.033252] R13: 0000000000000000 R14: ffff96e5b51ba900 R15: ffff96e5b51b8000
[32345.041283] FS: 0000000000000000(0000) GS:ffff96e5bfc00000(0000) knlGS:0000000000000000
[32345.050368] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[32345.056817] CR2: 0000000000000070 CR3: 000000027194a000 CR4: 00000000003406f0
[32345.064829] Call Trace:
[32345.067577] ? bit_wait_io_timeout+0x90/0x90
[32345.072406] ? i915_gem_reset+0xbe/0x370 [i915]
[32345.077537] ? intel_uncore_forcewake_put+0x36/0x50 [i915]
[32345.083702] ? bit_wait_io_timeout+0x90/0x90
[32345.088526] ? i915_reset+0xd9/0x160 [i915]
[32345.093245] ? i915_reset_and_wakeup+0x17d/0x190 [i915]
[32345.099152] ? i915_handle_error+0x1df/0x220 [i915]
[32345.104629] ? scnprintf+0x49/0x80
[32345.108488] ? hangcheck_declare_hang+0xce/0xf0 [i915]
[32345.114306] ? fwtable_read32+0x83/0x1b0 [i915]
[32345.119441] ? i915_hangcheck_elapsed+0x2b1/0x2e0 [i915]
[32345.125404] ? process_one_work+0x181/0x370
[32345.130105] ? worker_thread+0x4d/0x3a0
[32345.134411] ? kthread+0xfc/0x130
[32345.138129] ? process_one_work+0x370/0x370
[32345.142822] ? kthread_create_on_node+0x70/0x70
[32345.147907] ? do_group_exit+0x3a/0xa0
[32345.152114] ? ret_from_fork+0x25/0x30
[32345.156317] Code: c8 01 00 00 89 50 14 48 8b 83 80 00 00 00 8b 93 c8 01 00 00 89 50 28 48 8b bb 80 00 00 00 e8 2b 29 00 00 4d 8b ac 24 60 02 00 00 <49> 8b 45 70 48 39 43 70 74 51 4d 85 ed 74 14 4c 89 ef e8 cc c1
[32345.177577] RIP: reset_common_ring+0x8d/0x110 [i915] RSP: ffffa6a58c323b98
[32345.185298] CR2: 0000000000000070
[32345.204182] ---[ end trace e28431d538f56e18 ]---
[32345.466388] BUG: unable to handle kernel paging request at fffffffeecabe52a
[32345.474229] IP: 0xfffffffeecabe52a
[32345.478046] PGD 1e6c0c067
[32345.478048] P4D 1e6c0c067
[32345.481081] PUD 0
[32345.488029] Oops: 0010 [#2] SMP
[32345.491541] Modules linked in: bnep 8021q garp mrp stp llc snd_hda_codec_hdmi xt_connmark cpufreq_conservative iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 cpufreq_userspace nf_nat_ipv4 nf_nat nf_conntrack libcrc32c cpufreq_powersave iptable_mangle iptable_filter bluetooth joydev ecdh_generic rfkill hid_logitech_hidpp usb_f_acm usb_f_fs usb_f_serial u_serial libcomposite udc_core configfs hid_generic hid_logitech_dj binfmt_misc nls_ascii nls_cp437 vfat fat intel_rapl x86_pkg_temp_thermal coretemp i2c_designware_platform kvm_intel i2c_designware_core kvm evdev irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel snd_soc_skl intel_rapl_perf efi_pstore snd_soc_skl_ipc snd_soc_sst_ipc efivars snd_soc_sst_dsp snd_hda_ext_core pcspkr snd_soc_sst_match lpc_ich snd_soc_core snd_compress snd_hda_intel
[32345.571112] snd_hda_codec i915 snd_hda_core snd_hwdep idma64 snd_pcm ftdi_sio snd_timer usbserial drm_kms_helper snd intel_lpss_pci intel_lpss soundcore drm mei_me shpchp mfd_core mei i2c_algo_bit video button usbhid hid i2c_dev parport_pc ppdev lp parport efivarfs ip_tables x_tables autofs4 ext4 crc16 jbd2 crc32c_generic fscrypto ecb mbcache mmc_block crc32c_intel aesni_intel aes_x86_64 crypto_simd cryptd glue_helper ahci libahci libata xhci_pci sdhci_pci i2c_i801 xhci_hcd sdhci mmc_core usbcore usb_common r8169 mii scsi_mod
[32345.623555] CPU: 0 PID: 31039 Comm: kworker/0:1 Tainted: G D 4.12.0-0.bpo.2-amd64 #1 Debian 4.12.13-1~bpo9+1
[32345.635855] Hardware name: AAEON UP-APL01/UP-APL01, BIOS UPA1AM18 06/23/2017
[32345.643768] Workqueue: events efivar_update_sysfs_entries [efivars]
[32345.650804] task: ffff96e56d541180 task.stack: ffffa6a58c280000
[32345.660156] RIP: 0010:0xfffffffeecabe52a
[32345.667245] RSP: 0018:ffffa6a58c283bd8 EFLAGS: 00010206
[32345.675766] RAX: 00000000000000ff RBX: ffffa6a58c283c58 RCX: 0000000000000440
[32345.686489] RDX: 00000000000000b2 RSI: ffff96e589e91000 RDI: 0000000000000400
[32345.697194] RBP: ffffa6a58c283dc8 R08: 00000000000000ff R09: 0000000000000000
[32345.707927] R10: 000000007a066018 R11: 0000000000000021 R12: ffffa6a58c283dd0
[32345.718585] R13: 8000000000000015 R14: 0000000000000286 R15: ffffffffb5cd1e68
[32345.729169] FS: 0000000000000000(0000) GS:ffff96e5bfc00000(0000) knlGS:0000000000000000
[32345.740765] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[32345.749639] CR2: fffffffeecabe52a CR3: 00000001e6c09000 CR4: 00000000003406f0
[32345.759999] Call Trace:
[32345.765036] ? set_next_entity+0xcd/0x200
[32345.771736] ? efi_call+0x58/0x90
[32345.777624] ? virt_efi_get_next_variable+0x76/0x120
[32345.785361] ? efivar_init+0x13e/0x3d0
[32345.791746] ? efivar_release+0x20/0x20 [efivars]
[32345.799273] ? up+0x12/0x60
[32345.804570] ? efivar_entry_add+0x53/0x80
[32345.811211] ? efivar_update_sysfs_entries+0x24/0x60 [efivars]
[32345.819924] ? efivar_update_sysfs_entries+0x24/0x60 [efivars]
[32345.828620] ? process_one_work+0x181/0x370
[32345.835442] ? worker_thread+0x4d/0x3a0
[32345.841945] ? kthread+0xfc/0x130
[32345.847803] ? process_one_work+0x370/0x370
[32345.854643] ? kthread_create_on_node+0x70/0x70
[32345.861826] ? do_group_exit+0x3a/0xa0
[32345.868101] ? ret_from_fork+0x25/0x30
[32345.874332] Code: Bad RIP value.
[32345.880152] RIP: 0xfffffffeecabe52a RSP: ffffa6a58c283bd8
[32345.888252] CR2: fffffffeecabe52a
[32345.894016] ---[ end trace e28431d538f56e19 ]---
[32347.511656] ------------[ cut here ]------------
[32347.516838] WARNING: CPU: 0 PID: 11608 at /build/linux-RdeW6Z/linux-4.12.13/arch/x86/kernel/fpu/core.c:46 __kernel_fpu_begin+0x9c/0xb0
[32347.530411] Modules linked in: bnep 8021q garp mrp stp llc snd_hda_codec_hdmi xt_connmark cpufreq_conservative iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 cpufreq_userspace nf_nat_ipv4 nf_nat nf_conntrack libcrc32c cpufreq_powersave iptable_mangle iptable_filter bluetooth joydev ecdh_generic rfkill hid_logitech_hidpp usb_f_acm usb_f_fs usb_f_serial u_serial libcomposite udc_core configfs hid_generic hid_logitech_dj binfmt_misc nls_ascii nls_cp437 vfat fat intel_rapl x86_pkg_temp_thermal coretemp i2c_designware_platform kvm_intel i2c_designware_core kvm evdev irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel snd_soc_skl intel_rapl_perf efi_pstore snd_soc_skl_ipc snd_soc_sst_ipc efivars snd_soc_sst_dsp snd_hda_ext_core pcspkr snd_soc_sst_match lpc_ich snd_soc_core snd_compress snd_hda_intel
[32347.609679] snd_hda_codec i915 snd_hda_core snd_hwdep idma64 snd_pcm ftdi_sio snd_timer usbserial drm_kms_helper snd intel_lpss_pci intel_lpss soundcore drm mei_me shpchp mfd_core mei i2c_algo_bit video button usbhid hid i2c_dev parport_pc ppdev lp parport efivarfs ip_tables x_tables autofs4 ext4 crc16 jbd2 crc32c_generic fscrypto ecb mbcache mmc_block crc32c_intel aesni_intel aes_x86_64 crypto_simd cryptd glue_helper ahci libahci libata xhci_pci sdhci_pci i2c_i801 xhci_hcd sdhci mmc_core usbcore usb_common r8169 mii scsi_mod
[32347.661922] CPU: 0 PID: 11608 Comm: kworker/u8:0 Tainted: G D 4.12.0-0.bpo.2-amd64 #1 Debian 4.12.13-1~bpo9+1
[32347.674288] Hardware name: AAEON UP-APL01/UP-APL01, BIOS UPA1AM18 06/23/2017
[32347.682190] Workqueue: writeback wb_workfn (flush-179:0)
[32347.688146] task: ffff96e5b2912080 task.stack: ffffa6a58bda0000
[32347.694782] RIP: 0010:__kernel_fpu_begin+0x9c/0xb0
[32347.700155] RSP: 0018:ffffa6a58bda3760 EFLAGS: 00010202
[32347.706011] RAX: 0000000080000001 RBX: 0000000000001000 RCX: ffff96e5b2912080
[32347.714017] RDX: 0000000000000000 RSI: ffff96e59ad61000 RDI: ffffa6a58bda37e8
[32347.722006] RBP: ffffa6a58bda37e8 R08: 0000000000000000 R09: 0000000000001000
[32347.730016] R10: ffff96e5b5ebe800 R11: fffff0b0496b5840 R12: ffff96e59ad61000
[32347.738007] R13: 0000000000000000 R14: ffff96e59ad61000 R15: ffffffffc03dc050
[32347.746014] FS: 0000000000000000(0000) GS:ffff96e5bfc00000(0000) knlGS:0000000000000000
[32347.755081] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[32347.761512] CR2: 0000559a9b118588 CR3: 00000001e6c09000 CR4: 00000000003406f0
[32347.769503] Call Trace:
[32347.772253] ? crc32c_pcl_intel_update+0x79/0xa0 [crc32c_intel]
[32347.778899] ? crypto_shash_update+0x3f/0x110
[32347.783804] ? ext4_block_bitmap_csum_set+0x6a/0xb0 [ext4]
[32347.789968] ? ext4_mb_mark_diskspace_used+0x1f4/0x490 [ext4]
[32347.796438] ? ext4_mb_new_blocks+0x30a/0xac0 [ext4]
[32347.802004] ? blk_attempt_req_merge+0x3c/0x60
[32347.807023] ? ext4_find_extent+0x28e/0x2d0 [ext4]
[32347.812409] ? ext4_ext_map_blocks+0xb1b/0x1290 [ext4]
[32347.818191] ? __alloc_pages_slowpath+0x9a1/0xd10
[32347.823501] ? ext4_map_blocks+0x164/0x590 [ext4]
[32347.828789] ? ext4_writepages+0x81b/0xe50 [ext4]
[32347.834065] ? update_group_capacity+0x23/0x1e0
[32347.839154] ? cpumask_next_and+0x26/0x40
[32347.843645] ? strlcpy+0x31/0x40
[32347.847257] ? do_writepages+0x17/0x60
[32347.851466] ? do_writepages+0x17/0x60
[32347.855671] ? __writeback_single_inode+0x3d/0x300
[32347.861035] ? writeback_sb_inodes+0x221/0x4f0
[32347.866030] ? __writeback_inodes_wb+0x87/0xb0
[32347.871021] ? wb_writeback+0x282/0x310
[32347.875320] ? set_next_entity+0xcd/0x200
[32347.879820] ? wb_workfn+0x2ce/0x3a0
[32347.883823] ? wb_workfn+0x2ce/0x3a0
[32347.887838] ? process_one_work+0x181/0x370
[32347.892521] ? worker_thread+0x4d/0x3a0
[32347.896831] ? kthread+0xfc/0x130
[32347.900547] ? process_one_work+0x370/0x370
[32347.905236] ? kthread_create_on_node+0x70/0x70
[32347.910319] ? do_group_exit+0x3a/0xa0
[32347.914517] ? ret_from_fork+0x25/0x30
[32347.918715] Code: 48 8d b9 40 0b 00 00 85 c0 74 24 b8 ff ff ff ff 89 c2 48 0f c7 2f 31 c0 85 c0 75 17 f3 c3 0f ff eb aa 48 0f ae 81 40 0b 00 00 c3 <0f> ff eb a8 0f ff eb d8 0f ff c3 66 0f 1f 84 00 00 00 00 00 0f
[32347.939850] ---[ end trace e28431d538f56e1a ]---
[32349.815621] asynchronous wait on fence i915:Xorg[585]/0:e45 timed out
[32349.875629] pipe A vblank wait timed out
[32349.880108] ------------[ cut here ]------------
[32349.885366] WARNING: CPU: 1 PID: 11608 at /build/linux-RdeW6Z/linux-4.12.13/drivers/gpu/drm/i915/intel_display.c:12636 intel_atomic_commit_tail+0xf21/0xf50 [i915]
[32349.901694] Modules linked in: bnep 8021q garp mrp stp llc snd_hda_codec_hdmi xt_connmark cpufreq_conservative iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 cpufreq_userspace nf_nat_ipv4 nf_nat nf_conntrack libcrc32c cpufreq_powersave iptable_mangle iptable_filter bluetooth joydev ecdh_generic rfkill hid_logitech_hidpp usb_f_acm usb_f_fs usb_f_serial u_serial libcomposite udc_core configfs hid_generic hid_logitech_dj binfmt_misc nls_ascii nls_cp437 vfat fat intel_rapl x86_pkg_temp_thermal coretemp i2c_designware_platform kvm_intel i2c_designware_core kvm evdev irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel snd_soc_skl intel_rapl_perf efi_pstore snd_soc_skl_ipc snd_soc_sst_ipc efivars snd_soc_sst_dsp snd_hda_ext_core pcspkr snd_soc_sst_match lpc_ich snd_soc_core snd_compress snd_hda_intel
[32349.980956] snd_hda_codec i915 snd_hda_core snd_hwdep idma64 snd_pcm ftdi_sio snd_timer usbserial drm_kms_helper snd intel_lpss_pci intel_lpss soundcore drm mei_me shpchp mfd_core mei i2c_algo_bit video button usbhid hid i2c_dev parport_pc ppdev lp parport efivarfs ip_tables x_tables autofs4 ext4 crc16 jbd2 crc32c_generic fscrypto ecb mbcache mmc_block crc32c_intel aesni_intel aes_x86_64 crypto_simd cryptd glue_helper ahci libahci libata xhci_pci sdhci_pci i2c_i801 xhci_hcd sdhci mmc_core usbcore usb_common r8169 mii scsi_mod
[32350.033243] CPU: 1 PID: 11608 Comm: kworker/u8:0 Tainted: G D W 4.12.0-0.bpo.2-amd64 #1 Debian 4.12.13-1~bpo9+1
[32350.045593] Hardware name: AAEON UP-APL01/UP-APL01, BIOS UPA1AM18 06/23/2017
[32350.053533] Workqueue: events_unbound intel_atomic_commit_work [i915]
[32350.060742] task: ffff96e5b2912080 task.stack: ffffa6a58bda0000
[32350.067422] RIP: 0010:intel_atomic_commit_tail+0xf21/0xf50 [i915]
[32350.074247] RSP: 0018:ffffa6a58bda3da8 EFLAGS: 00010286
[32350.080113] RAX: 000000000000001c RBX: ffff96e5b51b8000 RCX: 0000000000000000
[32350.088096] RDX: 0000000000000000 RSI: ffff96e5bfc8dee8 RDI: ffff96e5bfc8dee8
[32350.096094] RBP: 0000000000000000 R08: ffff96e5b099ac18 R09: 00000000000003d5
[32350.104076] R10: ffffa6a58bda3da8 R11: ffffffffb5ecddcd R12: 0000000000000000
[32350.112074] R13: 0000000000000000 R14: ffff96e5b20bb000 R15: 0000000000000001
[32350.120057] FS: 0000000000000000(0000) GS:ffff96e5bfc80000(0000) knlGS:0000000000000000
[32350.129140] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[32350.135568] CR2: 000003dd16f14800 CR3: 000000022d622000 CR4: 00000000003406e0
[32350.143566] Call Trace:
[32350.146319] ? remove_wait_queue+0x60/0x60
[32350.150901] ? process_one_work+0x181/0x370
[32350.155583] ? worker_thread+0x4d/0x3a0
[32350.159900] ? kthread+0xfc/0x130
[32350.163600] ? process_one_work+0x370/0x370
[32350.168279] ? kthread_create_on_node+0x70/0x70
[32350.173350] ? do_group_exit+0x3a/0xa0
[32350.177563] ? ret_from_fork+0x25/0x30
[32350.181760] Code: 4c 89 44 24 08 48 83 c7 08 e8 5c 4b 7a f4 4c 8b 44 24 08 4d 85 c0 0f 85 36 fe ff ff 8d 75 41 48 c7 c7 b0 9e 98 c0 e8 05 2b 87 f4 <0f> ff e9 20 fe ff ff 8d 70 41 48 c7 c7 80 9e 98 c0 e8 ef 2a 87
[32350.202948] ---[ end trace e28431d538f56e1b ]---
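In case it helps anyone skim the log above, here is a minimal Python sketch that pulls out only the fault headlines (the `BUG:`, `Oops:` and `WARNING:` entries). It assumes every entry starts with a `[seconds.microseconds]` timestamp, as in this capture; the function names `split_entries` and `fault_headlines` are my own and not part of any kernel tooling.

```python
import re

# Assumption: each dmesg entry begins with a "[12345.678901]" style timestamp.
# Bracketed module tags such as "[i915]" or "[#1]" do not match this pattern.
TIMESTAMP = re.compile(r"\[\s*(\d+\.\d+)\]\s*")

# Message prefixes treated as fault headlines (chosen for illustration).
FAULT_PREFIXES = ("BUG:", "Oops:", "WARNING:")


def split_entries(text):
    """Split dmesg text into (timestamp, message) pairs.

    Works even when the original line breaks were lost, because it
    splits on the timestamps rather than on newlines.
    """
    parts = TIMESTAMP.split(text)
    # parts looks like [prefix, ts1, msg1, ts2, msg2, ...]
    return [(float(ts), msg.strip()) for ts, msg in zip(parts[1::2], parts[2::2])]


def fault_headlines(text):
    """Return only the entries whose message starts with a fault prefix."""
    return [(ts, msg) for ts, msg in split_entries(text)
            if msg.startswith(FAULT_PREFIXES)]
```

With the capture saved to a file (say a hypothetical `dmesg.txt`), `fault_headlines(open("dmesg.txt").read())` would list the separate faults in order, which makes it easier to see that the i915 reset oops comes first and the later ones follow it.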
To get around video tearing, I have added this to my /etc/X11/xorg.conf file:
Section "Device"
    Identifier "Intel Graphics"
    Driver "intel"
    Option "TearFree" "true"
EndSection
What can I do so this problem doesn't happen again?