• Tejun Heo's avatar
    workqueue: workqueue_congested() shouldn't translate WORK_CPU_UNBOUND into node number · d3251859
    Tejun Heo authored
    df2d5ae4 ("workqueue: map an unbound workqueues to multiple per-node
    pool_workqueues") made unbound workqueues to map to multiple per-node
    pool_workqueues and accordingly updated workqueue_contested() so that,
    for unbound workqueues, it maps the specified @cpu to the NUMA node
    number to obtain the matching pool_workqueue to query the congested
    state.
    
    Before this change, workqueue_congested() ignored @cpu for unbound
    workqueues as there was only one pool_workqueue and some users
    (fscache) called it with WORK_CPU_UNBOUND.  After the commit, this
    causes the following oops as WORK_CPU_UNBOUND gets translated to
    garbage by cpu_to_node().
    
      BUG: unable to handle kernel paging request at ffff8803598d98b8
      IP: [<ffffffff81043b7e>] unbound_pwq_by_node+0xa1/0xfa
      PGD 2421067 PUD 0
      Oops: 0000 [#1] SMP
      CPU: 1 PID: 2689 Comm: cat Tainted: GF            3.9.0-fsdevel+ #4
      task: ffff88003d801040 ti: ffff880025806000 task.ti: ffff880025806000
      RIP: 0010:[<ffffffff81043b7e>]  [<ffffffff81043b7e>] unbound_pwq_by_node+0xa1/0xfa
      RSP: 0018:ffff880025807ad8  EFLAGS: 00010202
      RAX: 0000000000000001 RBX: ffff8800388a2400 RCX: 0000000000000003
      RDX: ffff880025807fd8 RSI: ffffffff81a31420 RDI: ffff88003d8016e0
      RBP: ffff880025807ae8 R08: ffff88003d801730 R09: ffffffffa00b4898
      R10: ffffffff81044217 R11: ffff88003d801040 R12: 0000000064206e97
      R13: ffff880036059d98 R14: ffff880038cc8080 R15: ffff880038cc82d0
      FS:  00007f21afd9c740(0000) GS:ffff88003d100000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
      CR2: ffff8803598d98b8 CR3: 000000003df49000 CR4: 00000000000007e0
      DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
      Stack:
       ffff8800388a2400 0000000000000002 ffff880025807b18 ffffffff810442ce
       ffffffff81044217 ffff880000000002 ffff8800371b4080 ffff88003d112ec0
       ffff880025807b38 ffffffffa00810b0 ffff880036059d88 ffff880036059be8
      Call Trace:
       [<ffffffff810442ce>] workqueue_congested+0xb7/0x12c
       [<ffffffffa00810b0>] fscache_enqueue_object+0xb2/0xe8 [fscache]
       [<ffffffffa007facd>] __fscache_acquire_cookie+0x3b9/0x56c [fscache]
       [<ffffffffa00ad8fe>] nfs_fscache_set_inode_cookie+0xee/0x132 [nfs]
       [<ffffffffa009e112>] do_open+0x9/0xd [nfs]
       [<ffffffff810e804a>] do_dentry_open+0x175/0x24b
       [<ffffffff810e8298>] finish_open+0x41/0x51
    
    Fix it by using smp_processor_id() if @cpu is WORK_CPU_UNBOUND.
    Signed-off-by: default avatarTejun Heo <tj@kernel.org>
    Reported-by: default avatarDavid Howells <dhowells@redhat.com>
    Tested-and-Acked-by: default avatarDavid Howells <dhowells@redhat.com>
    d3251859
workqueue.c 137 KB