    USB: class: cdc-wdm: Fix CPU lockup caused by excessive log messages · 22f00812
    Alan Stern authored
    The syzbot fuzzer found that the interrupt-URB completion callback in
    the cdc-wdm driver was taking too long, and the driver's immediate
    resubmission of interrupt URBs with -EPROTO status combined with the
    dummy-hcd emulation to cause a CPU lockup:
    
    cdc_wdm 1-1:1.0: nonzero urb status received: -71
    cdc_wdm 1-1:1.0: wdm_int_callback - 0 bytes
    watchdog: BUG: soft lockup - CPU#0 stuck for 26s! [syz-executor782:6625]
    CPU#0 Utilization every 4s during lockup:
    	#1:  98% system,	  0% softirq,	  3% hardirq,	  0% idle
    	#2:  98% system,	  0% softirq,	  3% hardirq,	  0% idle
    	#3:  98% system,	  0% softirq,	  3% hardirq,	  0% idle
    	#4:  98% system,	  0% softirq,	  3% hardirq,	  0% idle
    	#5:  98% system,	  1% softirq,	  3% hardirq,	  0% idle
    Modules linked in:
    irq event stamp: 73096
    hardirqs last  enabled at (73095): [<ffff80008037bc00>] console_emit_next_record kernel/printk/printk.c:2935 [inline]
    hardirqs last  enabled at (73095): [<ffff80008037bc00>] console_flush_all+0x650/0xb74 kernel/printk/printk.c:2994
    hardirqs last disabled at (73096): [<ffff80008af10b00>] __el1_irq arch/arm64/kernel/entry-common.c:533 [inline]
    hardirqs last disabled at (73096): [<ffff80008af10b00>] el1_interrupt+0x24/0x68 arch/arm64/kernel/entry-common.c:551
    softirqs last  enabled at (73048): [<ffff8000801ea530>] softirq_handle_end kernel/softirq.c:400 [inline]
    softirqs last  enabled at (73048): [<ffff8000801ea530>] handle_softirqs+0xa60/0xc34 kernel/softirq.c:582
    softirqs last disabled at (73043): [<ffff800080020de8>] __do_softirq+0x14/0x20 kernel/softirq.c:588
    CPU: 0 PID: 6625 Comm: syz-executor782 Tainted: G        W          6.10.0-rc2-syzkaller-g8867bbd4a056 #0
    Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 04/02/2024
    
    Testing showed that the problem did not occur if the two error
    messages -- the first two lines above -- were removed; apparently adding
    material to the kernel log takes a surprisingly large amount of time.
    
    In any case, the best way to prevent these lockups and to avoid
    spamming the log with thousands of error messages per second is to
    ratelimit the two dev_err() calls.  Therefore we replace them with
    dev_err_ratelimited().
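
To illustrate why ratelimiting bounds the logging cost, here is a minimal userspace sketch of interval/burst rate limiting. It is not the kernel's implementation: `struct rl_state`, `rl_allow()` and `rl_count_allowed()` are hypothetical names invented for this example, and time is modelled as a plain tick counter. The interval and burst values in the usage note mirror the kernel's defaults (`DEFAULT_RATELIMIT_INTERVAL` is 5 seconds, `DEFAULT_RATELIMIT_BURST` is 10).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical userspace model of interval/burst rate limiting.
 * The names are illustrative only; they are not kernel APIs.
 */
struct rl_state {
	long interval;	/* window length, in arbitrary ticks */
	int burst;	/* messages allowed per window */
	long begin;	/* tick at which the current window opened */
	int printed;	/* messages emitted in the current window */
	int missed;	/* messages suppressed in the current window */
	bool started;	/* false until the first event arrives */
};

/* Return true if a message arriving at tick `now` may be emitted. */
static bool rl_allow(struct rl_state *rl, long now)
{
	assert(rl != NULL);

	/* Open a fresh window on the first event or once the old one expires. */
	if (!rl->started || now - rl->begin >= rl->interval) {
		rl->begin = now;
		rl->printed = 0;
		rl->missed = 0;
		rl->started = true;
	}

	if (rl->printed < rl->burst) {
		rl->printed++;
		return true;
	}

	/* Over budget for this window: drop the message and count it. */
	rl->missed++;
	return false;
}

/* Feed `count` events, one per tick starting at `start`; return how many pass. */
static int rl_count_allowed(struct rl_state *rl, long start, int count)
{
	int allowed = 0;

	for (int i = 0; i < count; i++)
		if (rl_allow(rl, start + i))
			allowed++;
	return allowed;
}
```

Treating one tick as a millisecond, a state initialised with `.interval = 5000, .burst = 10` lets through only the first 10 of 3000 back-to-back events. The remaining 2990 are counted but never formatted or written, which is the kind of saving that keeps a completion callback from stalling on console output.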
    
    Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
    Suggested-by: Greg KH <gregkh@linuxfoundation.org>
    Reported-and-tested-by: syzbot+5f996b83575ef4058638@syzkaller.appspotmail.com
    Closes: https://lore.kernel.org/linux-usb/00000000000073d54b061a6a1c65@google.com/
    Reported-and-tested-by: syzbot+1b2abad17596ad03dcff@syzkaller.appspotmail.com
    Closes: https://lore.kernel.org/linux-usb/000000000000f45085061aa9b37e@google.com/
    Fixes: 9908a32e ("USB: remove err() macro from usb class drivers")
    Link: https://lore.kernel.org/linux-usb/40dfa45b-5f21-4eef-a8c1-51a2f320e267@rowland.harvard.edu/
    Cc: stable@vger.kernel.org
    Link: https://lore.kernel.org/r/29855215-52f5-4385-b058-91f42c2bee18@rowland.harvard.edu
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
cdc-wdm.c 32.8 KB