Upboard crashes on OpenCL and USB device use

Jesse Kaukonen New Member Posts: 42 ✭✭

We're experiencing crashes on ubilinux 4 when the GPU is doing OpenCL work and a USB device is plugged in. Sometimes this also happens when a USB device is already open / in use and the GPU starts doing work. These crashes manifest as the device rebooting or freezing. We're using the standard UpBoard. We've observed this on both passively and actively cooled versions.

To reproduce:

  • Install ubilinux 4
  • Connect 5V / 6A power supply + UART cable
  • Use minicom to operate the Up
  • Install build-essential git beignet-opencl-icd ocl-icd-opencl-dev
  • git clone https://github.com/ihaque/memtestCL.git
  • cd memtestCL
  • make -f Makefiles/Makefile.linux64
  • ./memtestCL
  • Type 0 and press Enter
  • Connect a USB device to a USB2 port. We've tested this with a 4G Huawei modem and an Orbbec Astra Pro
  • Sometimes this causes a system crash / freeze
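
The software part of the steps above can be sketched as a small Python helper. This is only an illustration, not part of the original test setup: the function names `repro_commands` and `run_all` are hypothetical, and using `sudo apt-get` is an assumption based on ubilinux being Debian-based. Package names, the repository URL, the make invocation and the "0 then Enter" input are taken from the list above. Plugging the USB device in and out still has to be done by hand.

```python
import subprocess

# Package names and the repository URL are copied from the reproduction steps.
PACKAGES = ["build-essential", "git", "beignet-opencl-icd", "ocl-icd-opencl-dev"]
REPO_URL = "https://github.com/ihaque/memtestCL.git"


def repro_commands():
    """Return (argv, working_dir, stdin_text) tuples, one per step."""
    return [
        # Assumption: apt-get is the package manager (ubilinux is Debian-based).
        (["sudo", "apt-get", "install", "-y"] + PACKAGES, None, None),
        (["git", "clone", REPO_URL], None, None),
        (["make", "-f", "Makefiles/Makefile.linux64"], "memtestCL", None),
        # The "0 then Enter" step is fed to memtestCL on stdin.
        (["./memtestCL"], "memtestCL", "0\n"),
    ]


def run_all(commands, dry_run=True):
    """Print each step; execute it only when dry_run is False."""
    for argv, cwd, stdin_text in commands:
        print("$ " + " ".join(argv) + (f"   (in {cwd})" if cwd else ""))
        if not dry_run:
            subprocess.run(argv, cwd=cwd, input=stdin_text, text=True, check=True)


run_all(repro_commands())  # dry run: only prints the steps
```

Calling `run_all(repro_commands(), dry_run=False)` on the Up would actually install the packages, build memtestCL and start it; the default dry run only prints the commands so they can be checked first.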

Further notes

  • Running 'stress -c 4' to stress-test the CPU while connecting a USB device does not cause any problems
  • Non-OpenCL GPU work is untested - it's unclear whether any GPU stress + USB device is the problem, or whether this is OpenCL specific
  • Happens with both 5V / 4A and 5V / 6A power supplies
  • So far we've verified this happens with the USB2 ports - the USB3 port has not been verified yet
  • On some Ups the device hangs indefinitely; on others it reboots
  • We have reproduced this on 3 Ups so far (all devices we've tried). Two of them have a custom kernel for the librealsense changes, one runs default ubilinux 4.
  • Is this a driver bug with OpenCL or USB, a problem with ubilinux, or a hardware bug?
  • Sometimes this issue happens instantly as the USB device is plugged in, sometimes it takes many reconnects. Sometimes it simply requires the USB device to be in use for a while. In our test video it took multiple replugs. However, before filming the video we got the freeze on the first attempt.
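
Because the time to failure varies so much, a timestamped heartbeat log could help narrow down when a board stopped responding and which USB devices were visible just before. The following is a minimal sketch under stated assumptions, not something used in the tests above: the function names are hypothetical, and it assumes the standard Linux sysfs directory `/sys/bus/usb/devices` is available. Each line is flushed and fsync'ed so that it is more likely to reach storage before a hard freeze.

```python
import os
import time


def usb_devices(sysfs_dir="/sys/bus/usb/devices"):
    """Return the sorted USB entries listed in sysfs, or [] if unavailable."""
    try:
        return sorted(os.listdir(sysfs_dir))
    except OSError:
        return []


def write_heartbeat(path, timestamp, devices):
    """Append one line and push it to disk before returning."""
    line = f"{timestamp:.3f} usb={','.join(devices)}\n"
    with open(path, "a") as log:
        log.write(line)
        log.flush()
        os.fsync(log.fileno())
    return line


def monitor(path, interval_s=0.5, max_beats=None):
    """Write heartbeats until interrupted, or until max_beats lines are written."""
    beats = 0
    while max_beats is None or beats < max_beats:
        write_heartbeat(path, time.time(), usb_devices())
        beats += 1
        time.sleep(interval_s)
    return beats
```

Running `monitor("heartbeat.log")` in a second terminal during a test run would leave the last timestamp before the freeze as the final line of the file.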

Here's a video showing this problem: https://www.youtube.com/watch?v=l8x5Vii5jTQ

Sysinfo: Same as in this thread: https://forum.up-community.org/discussion/2852/nmi-watchdog-bug-soft-lockup-cpu-3-stuck-for-22s#latest (log: https://us.v-cdn.net/6030431/uploads/editor/g0/g8azf7z7yt2t.log)

Comments

  • Javier Arteaga
    Javier Arteaga Emutex Posts: 163 mod

    Hi Jesse,

    Another tough one :)

    There are a few USB-related system hang errata published for Cherry Trail SoCs, but we'll need to dig in some more to figure this out.

    • Just to make sure - can you confirm this never happens on the same boards if you re-plug USB multiple times without GPU or CPU load?
    • Can you reproduce it while playing hardware-accelerated video? In the past I've tested with mpv and x264 full HD video from http://bbb3d.renderfarming.net/download.html.
    • As with your other boot issues, it'd be helpful to recreate the same testing setup based on a recent Ubuntu and try to reproduce the issue. This would help determine whether this is a hardware or a software bug.
    • My suggestion is to test only on known-good 5V / 6A PSUs.
  • Jesse Kaukonen
    Jesse Kaukonen New Member Posts: 42 ✭✭

    Tests with ubilinux 4. All Up boards had their BIOS upgraded to UPC1DM07. The same 5V / 6A PSU was used on all Ups; it is the official AAEON PSU, bought from Mouser: https://www.mouser.fi/ProductDetail/AAEON-UP/EP-PS5V6A65WUPS?qs=sGAEpiMZZMve4/bfQkoj+Po0V0IgBFDNrLMM9H84DIc=

    Tests done with monitor + mouse + keyboard + ethernet connected

    u7006

    • Playing BBB via mpv normally. No issues.
    • Same, but plugging Astra in and out for the duration of the film. No issues.
    • memtestCL + plugging Astra in and out. No issues.

    u7023

    • Plugging in Astra Pro with nothing going on: 40 plugs in and out, no issues
    • Plugging in Astra Pro while playing BBB with mpv, constantly plugging the Astra in and out while the movie plays: no issues
    • Compiled memtestCL + beignet driver, memtestCL running and replugging USB: No issues
    • Tested with only UART + power connected, memtestCL and replugging Astra: No issues

    u7024, the problematic UP from the video

    • Plugging Astra in and out 70 times with no CPU or GPU load: No issues
    • Playing BBB via mpv normally: No issues for two whole playthroughs
    • Playing BBB via mpv + plugging Astra in and out: Screen went green and the Up froze after a couple of minutes. I guess this tells us this is not OpenCL specific, but GPU load specific. Tried twice, happened twice. Kernel logs don't say anything meaningful - they simply stop while information about the USB device is being written.
    • memtestCL not retested - known to be a problem as per the original post

    u7022

    • Plugging Astra in and out 70 times with no CPU or GPU load: No issues
    • Playing BBB via mpv normally: No issues for one whole playthrough
    • Playing BBB via mpv + plugging Astra in and out: No issues for two playthroughs.
    • memtestCL + plugging Astra in and out: Display went black after some time, around test iteration 8, and the system became unresponsive. The Up then failed to boot when power was plugged back in and required a second replug of the power cable. Reproduced twice. u7022_gpu_kern.log.

    u7022 with Ubuntu 17

    • Latest updates installed, rebooted. Apparently the Intel GPU driver update tool is deprecated. glxgears -info shows "GL_RENDERER: Mesa DRI Intel HD Graphics Cherrytrail". Installed the OpenCL beignet icd.
    • Plugging in USB alone without anything else going on: No issues
    • Playing BBB using mpv normally: Loading the video took about 30-60 seconds, after that smooth playback without issues.
    • Playing BBB using mpv and plugging Astra in and out: No issues
    • memtestCL + plugging Astra in and out: System crash and reboot. System also froze on the login screen after reboot was done. u7022_gpu_ubuntu.log
    • memtestCL alone with no USB replugging. No issues.

    u7024 with Ubuntu 17

    • Latest updates + beignet icd installed, glxgears output identical to u7022
    • Plugging in USB alone without anything else going on: No issues
    • Playing BBB using mpv normally: Loading the video took about 30-60 seconds, after that smooth playback without issues.
    • Playing BBB using mpv and plugging Astra in and out: After a couple of minutes the screen went gray and the Up rebooted. Ubuntu then froze at the login screen. u7024_gpu_ubuntu.log
    • memtestCL without USB devices: No issues
    • memtestCL while plugging Astra in and out: System crash on second replug. Ubuntu rebooted and froze at the login screen.

    We have seen this behavior on several of our Ups in deployment, although not all. The solution to the Ups rebooting constantly has generally been to disable the program utilizing OpenCL, leaving only CPU + USB usage active. After this the Ups have become stable.
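
The results reported in the comment above can be tabulated to check which conditions the crashes have in common. This is a hedged sketch: the rows are transcribed by hand from that comment, the tuple layout and function names are hypothetical, the UART-only run on u7023 and the untested memtestCL case on u7024 with ubilinux 4 are left out, and "idle" rows treat plain plugging as a replug test.

```python
# Each row: (board, os, workload, usb_replug, crashed).
# "mpv" and "memtestCL" are GPU workloads; "idle" means no CPU or GPU load.
RESULTS = [
    ("u7006", "ubilinux4", "mpv", False, False),
    ("u7006", "ubilinux4", "mpv", True, False),
    ("u7006", "ubilinux4", "memtestCL", True, False),
    ("u7023", "ubilinux4", "idle", True, False),
    ("u7023", "ubilinux4", "mpv", True, False),
    ("u7023", "ubilinux4", "memtestCL", True, False),
    ("u7024", "ubilinux4", "idle", True, False),
    ("u7024", "ubilinux4", "mpv", False, False),
    ("u7024", "ubilinux4", "mpv", True, True),
    ("u7022", "ubilinux4", "idle", True, False),
    ("u7022", "ubilinux4", "mpv", False, False),
    ("u7022", "ubilinux4", "mpv", True, False),
    ("u7022", "ubilinux4", "memtestCL", True, True),
    ("u7022", "ubuntu17", "idle", True, False),
    ("u7022", "ubuntu17", "mpv", False, False),
    ("u7022", "ubuntu17", "mpv", True, False),
    ("u7022", "ubuntu17", "memtestCL", True, True),
    ("u7022", "ubuntu17", "memtestCL", False, False),
    ("u7024", "ubuntu17", "idle", True, False),
    ("u7024", "ubuntu17", "mpv", False, False),
    ("u7024", "ubuntu17", "mpv", True, True),
    ("u7024", "ubuntu17", "memtestCL", False, False),
    ("u7024", "ubuntu17", "memtestCL", True, True),
]


def crashes(results):
    """Return only the rows where the board crashed or froze."""
    return [row for row in results if row[4]]


def crash_conditions(results):
    """Return the set of (workload, usb_replug) pairs that produced a crash."""
    return {(workload, replug) for _, _, workload, replug, crashed in results if crashed}


for board, os_name, workload, replug, _ in crashes(RESULTS):
    print(board, os_name, workload, "replug" if replug else "no replug")
```

In this transcription every crash row combines a GPU workload with USB replugging, and no crash appears with GPU load alone or replugging alone, which matches the observation that the problem is tied to GPU load rather than to OpenCL specifically.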

  • Javier Arteaga
    Javier Arteaga Emutex Posts: 163 mod

    Hi Jesse,

    Excellent report - thank you.

    So it happens on a recent kernel from 17.10 too. My gut feeling is "SoC issue" but I'll have to look around for clues on this one. I'll keep you posted.

  • Jesse Kaukonen
    Jesse Kaukonen New Member Posts: 42 ✭✭

    We may have a solution.

    We did tests on 5 of our Upboards. We have been using a particular extension cable with the Up power plug, to get a power cable out from a device casing. All 5 of the Ups reproduced OpenCL + USB crashes with one of these cables, and all 5 were stable without it (directly plugging in the 5V / 6A PSU).

    We are continuing our tests on more devices, but it seems like these extension cables are of varying quality and some of them cause our devices to crash.

    Unfortunately, the other boot hanging issue seems to be different and happens without these extension cables.

  • ccalde
    ccalde New Member Posts: 348 ✭✭✭

    Hi @Jesse Kaukonen ,

    Good to know that you found a solution for the extension cable.

    Any update on the other issues?

    Thank you!