sdhci: Timeout waiting for Buffer Read Ready interrupt during tuning procedure

Options
Peter
Peter New Member Posts: 7
Hi,

I think I have a similar problem as described in this thread:

https://up-community.org/forum/public-upboardhw/1170-emmc-error-kernel-remounting-filesystem-to-read-only

I just got new UP board with 4 G RAM / 32 G eMMC and installed latest CentOS 7 with updated kernel 3.10.0-514.26.2.el7.x86_64. It has been running without a problem for a couple of days acting as an asterisk/FreePBX box. Today I started observing (3 times as of now) the following kernel warning:

[ 9854.847519] sdhci: Timeout waiting for Buffer Read Ready interrupt during tuning procedure, falling back to fixed sampling clock
[ 9864.869413] mmc0: Timeout waiting for hardware interrupt.
[ 9864.871688]
[ cut here ]
[ 9864.871764] WARNING: at drivers/mmc/host/sdhci.c:1004 sdhci_send_command+0x355/0x3b0 [sdhci]()
[ 9864.871775] Modules linked in: ip6t_rpfilter ipt_REJECT nf_reject_ipv4 ip6t_REJECT nf_reject_ipv6 xt_conntrack ip_set nfnetlink ebtable_nat ebtable_broute bridge stp llc ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 ip6table_mangle ip6table_security ip6table_raw iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack iptable_mangle iptable_security iptable_raw ebtable_filter ebtables ip6table_filter ip6_tables iptable_filter vfat fat r8712u(C) intel_powerclamp coretemp intel_rapl iosf_mbi snd_intel_sst_acpi snd_intel_sst_core kvm_intel snd_soc_sst_mfld_platform kvm snd_soc_sst_match snd_soc_core irqbypass snd_compress crc32_pclmul ghash_clmulni_intel snd_seq aesni_intel snd_seq_device lrw gf128mul snd_pcm glue_helper ablk_helper cryptd snd_timer snd pcspkr soundcore
[ 9864.871995] shpchp acpi_pad tpm_crb int3400_thermal acpi_cpufreq ip_tables xfs libcrc32c i915 mmc_block i2c_algo_bit drm_kms_helper syscopyarea sysfillrect crct10dif_pclmul crct10dif_common sysimgblt fb_sys_fops crc32c_intel drm r8169 mii i2c_core video fjes sdhci_acpi sdhci mmc_core dm_mirror dm_region_hash dm_log dm_mod
[ 9864.872121] CPU: 1 PID: 0 Comm: swapper/1 Tainted: G C
3.10.0-514.26.2.el7.x86_64 #1
[ 9864.872133] Hardware name: AAEON UP-CHT01/UP-CHT01, BIOS UPC1BM0S 06/04/2016
[ 9864.872146] 0000000000000000 4351379fe9d8f3ea ffff88017fc83d28 ffffffff81687133
[ 9864.872169] ffff88017fc83d60 ffffffff81085cb0 ffff880179dbd500 ffff880179dbd500
[ 9864.872187] 0000000000000282 ffff880179dc1ad8 ffff880179dbd500 ffff88017fc83d70
[ 9864.872208] Call Trace:
[ 9864.872221] <IRQ> [<ffffffff81687133>] dump_stack+0x19/0x1b
[ 9864.872275] [<ffffffff81085cb0>] warn_slowpath_common+0x70/0xb0
[ 9864.872294] [<ffffffff81085dfa>] warn_slowpath_null+0x1a/0x20
[ 9864.872331] [<ffffffffa00c8f45>] sdhci_send_command+0x355/0x3b0 [sdhci]
[ 9864.872369] [<ffffffffa00c9310>] ? sdhci_finish_data+0x370/0x370 [sdhci]
[ 9864.872403] [<ffffffffa00c90b0>] sdhci_finish_data+0x110/0x370 [sdhci]
[ 9864.872437] [<ffffffffa00c9310>] ? sdhci_finish_data+0x370/0x370 [sdhci]
[ 9864.872471] [<ffffffffa00c938d>] sdhci_timeout_timer+0x7d/0xc0 [sdhci]
[ 9864.872493] [<ffffffff81095eb6>] call_timer_fn+0x36/0x110
[ 9864.872527] [<ffffffffa00c9310>] ? sdhci_finish_data+0x370/0x370 [sdhci]
[ 9864.872546] [<ffffffff81098ba7>] run_timer_softirq+0x237/0x340
[ 9864.872571] [<ffffffff8108f63f>] __do_softirq+0xef/0x280
[ 9864.872592] [<ffffffff8169929c>] call_softirq+0x1c/0x30
[ 9864.872611] [<ffffffff8102d365>] do_softirq+0x65/0xa0
[ 9864.872630] [<ffffffff8108f9d5>] irq_exit+0x115/0x120
[ 9864.872651] [<ffffffff81699f15>] smp_apic_timer_interrupt+0x45/0x60
[ 9864.872670] [<ffffffff8169845d>] apic_timer_interrupt+0x6d/0x80
[ 9864.872681] <EOI> [<ffffffff81060fe6>] ? native_safe_halt+0x6/0x10
[ 9864.872721] [<ffffffff810347ff>] default_idle+0x1f/0xc0
[ 9864.872742] [<ffffffff81035146>] arch_cpu_idle+0x26/0x30
[ 9864.872765] [<ffffffff810e82f5>] cpu_startup_entry+0x245/0x290
[ 9864.872789] [<ffffffff8104f0da>] start_secondary+0x1ba/0x230
[ 9864.872807] ---[ end trace 0ccd1d36857b1e55 ]---
[ 9864.875011] mmcblk0: error -110 sending stop command, original cmd response 0x0, card status 0x400900
[ 9864.875118] mmcblk0: error -110 transferring data, sector 3924720, nr 576, cmd response 0x0, card status 0x0
[ 9864.875585] blk_update_request: I/O error, dev mmcblk0, sector 3924720
[ 9864.875873] blk_update_request: I/O error, dev mmcblk0, sector 3924728
[ 9864.876154] blk_update_request: I/O error, dev mmcblk0, sector 3924736
[ 9864.876468] blk_update_request: I/O error, dev mmcblk0, sector 3924744
[ 9864.876754] blk_update_request: I/O error, dev mmcblk0, sector 3924752
[ 9864.877031] blk_update_request: I/O error, dev mmcblk0, sector 3924760
[ 9864.877311] blk_update_request: I/O error, dev mmcblk0, sector 3924768
[ 9864.877632] blk_update_request: I/O error, dev mmcblk0, sector 3924776
[ 9864.877921] blk_update_request: I/O error, dev mmcblk0, sector 3924784
[ 9864.878198] blk_update_request: I/O error, dev mmcblk0, sector 3924792
[ 9864.878689] Buffer I/O error on dev dm-0, logical block 176734, lost async page write
[ 9864.879050] Buffer I/O error on dev dm-0, logical block 176735, lost async page write
[ 9864.879424] Buffer I/O error on dev dm-0, logical block 176736, lost async page write
[ 9864.879772] Buffer I/O error on dev dm-0, logical block 176737, lost async page write
[ 9864.880112] Buffer I/O error on dev dm-0, logical block 176738, lost async page write
[ 9864.880481] Buffer I/O error on dev dm-0, logical block 176739, lost async page write
[ 9864.880827] Buffer I/O error on dev dm-0, logical block 176740, lost async page write
[ 9864.881163] Buffer I/O error on dev dm-0, logical block 176741, lost async page write
[ 9864.881559] Buffer I/O error on dev dm-0, logical block 176742, lost async page write
[ 9864.881906] Buffer I/O error on dev dm-0, logical block 176743, lost async page write

The filesystem does not get remounted as readonly as in abovementioned thread (perhaps because the default filesystem for CentOS is XFS) but the warning indicates that it might be a problem that this patch has been designed for:

http://www.spinics.net/lists/linux-mmc/msg40826.html

I have a couple of questions:

- What is your opinion? Will this patch solve the problem? (I'm planing to build my own patched CentOS kernel)
- Is it possible for filesystem to get corrupted because of this problem/warning occuring?
- Do you plan to submit those patches upstream so that other distributions would run on UP board without problems?

Thanks,

Peter

Comments

  • Peter
    Peter New Member Posts: 7
    Options
    I'm happy to inform that the abovementioned patch helped solve the problem. With just that patch applied to latest CentOS 7 kernel, I built the kernel and am now running it on the UP board. Since than I saw the following in the kernel messages:

    [ 8842.316198] sdhci: Timeout waiting for Buffer Read Ready interrupt during tuning procedure, falling back to fixed sampling clock

    ...but there were no I/O errors as a consequence. Building CentOS 7 kernel was easy following these instructions:

    https://wiki.centos.org/HowTos/Custom_Kernel

    I didn't build it on UP board - it would take too long. I created a virtual machine on my fast i7 workstation with minimal CentOS installation (1 minute) and just followed further instructions from CentOS wiki...