1. 01 Jun, 2016 25 commits
  2. 19 May, 2016 15 commits
    • Greg Kroah-Hartman's avatar
      Linux 4.4.11 · 544ec5b0
      Greg Kroah-Hartman authored
      544ec5b0
    • Linus Torvalds's avatar
      nf_conntrack: avoid kernel pointer value leak in slab name · 6ff8315a
      Linus Torvalds authored
      commit 31b0b385 upstream.
      
      The slab name ends up being visible in the directory structure under
      /sys, and even if you don't have access rights to the file you can see
      the filenames.
      
      Just use a 64-bit counter instead of the pointer to the 'net' structure
      to generate a unique name.
      
      This code will go away in 4.7 when the conntrack code moves to a single
      kmemcache, but this is the backportable simple solution to avoiding
      leaking kernel pointers to user space.
      
      Fixes: 5b3501fa ("netfilter: nf_conntrack: per netns nf_conntrack_cachep")
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Acked-by: default avatarEric Dumazet <eric.dumazet@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      6ff8315a
    • Arindam Nath's avatar
      drm/radeon: fix DP link training issue with second 4K monitor · 62b68367
      Arindam Nath authored
      commit 1a738347 upstream.
      
      There is an issue observed when we hotplug a second DP
      4K monitor to the system. Sometimes, the link training
      fails for the second monitor after HPD interrupt
      generation.
      
      The issue happens when some queued or deferred transactions
      are already present on the AUX channel when we initiate
      a new transcation to (say) get DPCD or during link training.
      
      We set AUX_IGNORE_HPD_DISCON bit in the AUX_CONTROL
      register so that we can ignore any such deferred
      transactions when a new AUX transaction is initiated.
      Signed-off-by: default avatarArindam Nath <arindam.nath@amd.com>
      Reviewed-by: default avatarAlex Deucher <alexander.deucher@amd.com>
      Signed-off-by: default avatarAlex Deucher <alexander.deucher@amd.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      62b68367
    • Imre Deak's avatar
      drm/i915/bdw: Add missing delay during L3 SQC credit programming · bafa4fbc
      Imre Deak authored
      commit d6a862fe upstream.
      
      BSpec requires us to wait ~100 clocks before re-enabling clock gating,
      so make sure we do this.
      
      CC: Ville Syrjälä <ville.syrjala@linux.intel.com>
      Signed-off-by: default avatarImre Deak <imre.deak@intel.com>
      Reviewed-by: default avatarVille Syrjälä <ville.syrjala@linux.intel.com>
      Link: http://patchwork.freedesktop.org/patch/msgid/1462280061-1457-2-git-send-email-imre.deak@intel.com
      (cherry picked from commit 48e5d68d)
      Signed-off-by: default avatarJani Nikula <jani.nikula@intel.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      bafa4fbc
    • Daniel Vetter's avatar
      drm/i915: Bail out of pipe config compute loop on LPT · bf12e894
      Daniel Vetter authored
      commit 2700818a upstream.
      
      LPT is pch, so might run into the fdi bandwidth constraint (especially
      since it has only 2 lanes). But right now we just force pipe_bpp back
      to 24, resulting in a nice loop (which we bail out with a loud
      WARN_ON). Fix this.
      
      Cc: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
      References: https://bugs.freedesktop.org/show_bug.cgi?id=93477Signed-off-by: default avatarDaniel Vetter <daniel.vetter@intel.com>
      Tested-by: default avatarChris Wilson <chris@chris-wilson.co.uk>
      Signed-off-by: default avatarMaarten Lankhorst <maarten.lankhorst@linux.intel.com>
      Signed-off-by: default avatarDaniel Vetter <daniel.vetter@ffwll.ch>
      Link: http://patchwork.freedesktop.org/patch/msgid/1462264381-7573-1-git-send-email-daniel.vetter@ffwll.ch
      (cherry picked from commit f58a1acc)
      Signed-off-by: default avatarJani Nikula <jani.nikula@intel.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      bf12e894
    • Lucas Stach's avatar
      drm/radeon: fix PLL sharing on DCE6.1 (v2) · 472f52f5
      Lucas Stach authored
      commit e3c00d87 upstream.
      
      On DCE6.1 PPLL2 is exclusively available to UNIPHYA, so it should not
      be taken into consideration when looking for an already enabled PLL
      to be shared with other outputs.
      
      This fixes the broken VGA port (TRAVIS DP->VGA bridge) on my Richland
      based laptop, where the internal display is connected to UNIPHYA through
      a TRAVIS DP->LVDS bridge.
      
      Bug:
      https://bugs.freedesktop.org/show_bug.cgi?id=78987
      
      v2: agd: add check in radeon_get_shared_nondp_ppll as well, drop
          extra parameter.
      Signed-off-by: default avatarLucas Stach <dev@lynxeye.de>
      Signed-off-by: default avatarAlex Deucher <alexander.deucher@amd.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      472f52f5
    • Mauro Carvalho Chehab's avatar
      Revert "[media] videobuf2-v4l2: Verify planes array in buffer dequeueing" · 9df2dc6c
      Mauro Carvalho Chehab authored
      commit 93f0750d upstream.
      
      This patch causes a Kernel panic when called on a DVB driver.
      
      This was also reported by David R <david@unsolicited.net>:
      
      May  7 14:47:35 server kernel: [  501.247123] BUG: unable to handle kernel NULL pointer dereference at 0000000000000004
      May  7 14:47:35 server kernel: [  501.247239] IP: [<ffffffffa0222c71>] __verify_planes_array.isra.3+0x1/0x80 [videobuf2_v4l2]
      May  7 14:47:35 server kernel: [  501.247354] PGD cae6f067 PUD ca99c067 PMD 0
      May  7 14:47:35 server kernel: [  501.247426] Oops: 0000 [#1] SMP
      May  7 14:47:35 server kernel: [  501.247482] Modules linked in: xfs tun xt_connmark xt_TCPMSS xt_tcpmss xt_owner xt_REDIRECT nf_nat_redirect xt_nat ipt_MASQUERADE nf_nat_masquerade_ipv4 ts_kmp ts_bm xt_string ipt_REJECT nf_reject_ipv4 xt_recent xt_conntrack xt_multiport xt_pkttype xt_tcpudp xt_mark nf_log_ipv4 nf_log_common xt_LOG xt_limit iptable_mangle iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack iptable_filter ip_tables ip6table_filter ip6_tables x_tables pppoe pppox dm_crypt ts2020 regmap_i2c ds3000 cx88_dvb dvb_pll cx88_vp3054_i2c mt352 videobuf2_dvb cx8800 cx8802 cx88xx pl2303 tveeprom videobuf2_dma_sg ppdev videobuf2_memops videobuf2_v4l2 videobuf2_core dvb_usb_digitv snd_hda_codec_via snd_hda_codec_hdmi snd_hda_codec_generic radeon dvb_usb snd_hda_intel amd64_edac_mod serio_raw snd_hda_codec edac_core fbcon k10temp bitblit softcursor snd_hda_core font snd_pcm_oss i2c_piix4 snd_mixer_oss tileblit drm_kms_helper syscopyarea snd_pcm snd_seq_dummy sysfillrect snd_seq_oss sysimgblt fb_sys_fops ttm snd_seq_midi r8169 snd_rawmidi drm snd_seq_midi_event e1000e snd_seq snd_seq_device snd_timer snd ptp pps_core i2c_algo_bit soundcore parport_pc ohci_pci shpchp tpm_tis tpm nfsd auth_rpcgss oid_registry hwmon_vid exportfs nfs_acl mii nfs bonding lockd grace lp sunrpc parport
      May  7 14:47:35 server kernel: [  501.249564] CPU: 1 PID: 6889 Comm: vb2-cx88[0] Not tainted 4.5.3 #3
      May  7 14:47:35 server kernel: [  501.249644] Hardware name: System manufacturer System Product Name/M4A785TD-V EVO, BIOS 0211    07/08/2009
      May  7 14:47:35 server kernel: [  501.249767] task: ffff8800aebf3600 ti: ffff8801e07a0000 task.ti: ffff8801e07a0000
      May  7 14:47:35 server kernel: [  501.249861] RIP: 0010:[<ffffffffa0222c71>]  [<ffffffffa0222c71>] __verify_planes_array.isra.3+0x1/0x80 [videobuf2_v4l2]
      May  7 14:47:35 server kernel: [  501.250002] RSP: 0018:ffff8801e07a3de8  EFLAGS: 00010086
      May  7 14:47:35 server kernel: [  501.250071] RAX: 0000000000000283 RBX: ffff880210dc5000 RCX: 0000000000000283
      May  7 14:47:35 server kernel: [  501.250161] RDX: ffffffffa0222cf0 RSI: 0000000000000000 RDI: ffff880210dc5014
      May  7 14:47:35 server kernel: [  501.250251] RBP: ffff8801e07a3df8 R08: ffff8801e07a0000 R09: 0000000000000000
      May  7 14:47:35 server kernel: [  501.250348] R10: 0000000000000000 R11: 0000000000000001 R12: ffff8800cda2a9d8
      May  7 14:47:35 server kernel: [  501.250438] R13: ffff880210dc51b8 R14: 0000000000000000 R15: ffff8800cda2a828
      May  7 14:47:35 server kernel: [  501.250528] FS:  00007f5b77fff700(0000) GS:ffff88021fc40000(0000) knlGS:00000000adaffb40
      May  7 14:47:35 server kernel: [  501.250631] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
      May  7 14:47:35 server kernel: [  501.250704] CR2: 0000000000000004 CR3: 00000000ca19d000 CR4: 00000000000006e0
      May  7 14:47:35 server kernel: [  501.250794] Stack:
      May  7 14:47:35 server kernel: [  501.250822]  ffff8801e07a3df8 ffffffffa0222cfd ffff8801e07a3e70 ffffffffa0236beb
      May  7 14:47:35 server kernel: [  501.250937]  0000000000000283 ffff8801e07a3e94 0000000000000000 0000000000000000
      May  7 14:47:35 server kernel: [  501.251051]  ffff8800aebf3600 ffffffff8108d8e0 ffff8801e07a3e38 ffff8801e07a3e38
      May  7 14:47:35 server kernel: [  501.251165] Call Trace:
      May  7 14:47:35 server kernel: [  501.251200]  [<ffffffffa0222cfd>] ? __verify_planes_array_core+0xd/0x10 [videobuf2_v4l2]
      May  7 14:47:35 server kernel: [  501.251306]  [<ffffffffa0236beb>] vb2_core_dqbuf+0x2eb/0x4c0 [videobuf2_core]
      May  7 14:47:35 server kernel: [  501.251398]  [<ffffffff8108d8e0>] ? prepare_to_wait_event+0x100/0x100
      May  7 14:47:35 server kernel: [  501.251482]  [<ffffffffa023855b>] vb2_thread+0x1cb/0x220 [videobuf2_core]
      May  7 14:47:35 server kernel: [  501.251569]  [<ffffffffa0238390>] ? vb2_core_qbuf+0x230/0x230 [videobuf2_core]
      May  7 14:47:35 server kernel: [  501.251662]  [<ffffffffa0238390>] ? vb2_core_qbuf+0x230/0x230 [videobuf2_core]
      May  7 14:47:35 server kernel: [  501.255982]  [<ffffffff8106f984>] kthread+0xc4/0xe0
      May  7 14:47:35 server kernel: [  501.260292]  [<ffffffff8106f8c0>] ? kthread_park+0x50/0x50
      May  7 14:47:35 server kernel: [  501.264615]  [<ffffffff81697a5f>] ret_from_fork+0x3f/0x70
      May  7 14:47:35 server kernel: [  501.268962]  [<ffffffff8106f8c0>] ? kthread_park+0x50/0x50
      May  7 14:47:35 server kernel: [  501.273216] Code: 0d 01 74 16 48 8b 46 28 48 8b 56 30 48 89 87 d0 01 00 00 48 89 97 d8 01 00 00 5d c3 66 66 66 66 66 2e 0f 1f 84 00 00 00 00 00 55 <8b> 46 04 48 89 e5 8d 50 f7 31 c0 83 fa 01 76 02 5d c3 48 83 7e
      May  7 14:47:35 server kernel: [  501.282146] RIP  [<ffffffffa0222c71>] __verify_planes_array.isra.3+0x1/0x80 [videobuf2_v4l2]
      May  7 14:47:35 server kernel: [  501.286391]  RSP <ffff8801e07a3de8>
      May  7 14:47:35 server kernel: [  501.290619] CR2: 0000000000000004
      May  7 14:47:35 server kernel: [  501.294786] ---[ end trace b2b354153ccad110 ]---
      
      This reverts commit 2c1f6951.
      
      Cc: Sakari Ailus <sakari.ailus@linux.intel.com>
      Cc: Hans Verkuil <hans.verkuil@cisco.com>
      Fixes: 2c1f6951 ("[media] videobuf2-v4l2: Verify planes array in buffer dequeueing")
      Signed-off-by: default avatarMauro Carvalho Chehab <mchehab@osg.samsung.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      9df2dc6c
    • Marek Szyprowski's avatar
      Input: max8997-haptic - fix NULL pointer dereference · 1abbf804
      Marek Szyprowski authored
      commit 6ae645d5 upstream.
      
      NULL pointer derefence happens when booting with DTB because the
      platform data for haptic device is not set in supplied data from parent
      MFD device.
      
      The MFD device creates only platform data (from Device Tree) for itself,
      not for haptic child.
      
      Unable to handle kernel NULL pointer dereference at virtual address 0000009c
      pgd = c0004000
      	[0000009c] *pgd=00000000
      	Internal error: Oops: 5 [#1] PREEMPT SMP ARM
      	(max8997_haptic_probe) from [<c03f9cec>] (platform_drv_probe+0x4c/0xb0)
      	(platform_drv_probe) from [<c03f8440>] (driver_probe_device+0x214/0x2c0)
      	(driver_probe_device) from [<c03f8598>] (__driver_attach+0xac/0xb0)
      	(__driver_attach) from [<c03f67ac>] (bus_for_each_dev+0x68/0x9c)
      	(bus_for_each_dev) from [<c03f7a38>] (bus_add_driver+0x1a0/0x218)
      	(bus_add_driver) from [<c03f8db0>] (driver_register+0x78/0xf8)
      	(driver_register) from [<c0101774>] (do_one_initcall+0x90/0x1d8)
      	(do_one_initcall) from [<c0a00dbc>] (kernel_init_freeable+0x15c/0x1fc)
      	(kernel_init_freeable) from [<c06bb5b4>] (kernel_init+0x8/0x114)
      	(kernel_init) from [<c0107938>] (ret_from_fork+0x14/0x3c)
      Signed-off-by: default avatarMarek Szyprowski <m.szyprowski@samsung.com>
      Fixes: 104594b0 ("Input: add driver support for MAX8997-haptic")
      [k.kozlowski: Write commit message, add CC-stable]
      Signed-off-by: default avatarKrzysztof Kozlowski <k.kozlowski@samsung.com>
      Signed-off-by: default avatarDmitry Torokhov <dmitry.torokhov@gmail.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      1abbf804
    • Al Viro's avatar
      get_rock_ridge_filename(): handle malformed NM entries · 007796c0
      Al Viro authored
      commit 99d82582 upstream.
      
      Payloads of NM entries are not supposed to contain NUL.  When we run
      into such, only the part prior to the first NUL goes into the
      concatenation (i.e. the directory entry name being encoded by a bunch
      of NM entries).  We do stop when the amount collected so far + the
      claimed amount in the current NM entry exceed 254.  So far, so good,
      but what we return as the total length is the sum of *claimed*
      sizes, not the actual amount collected.  And that can grow pretty
      large - not unlimited, since you'd need to put CE entries in
      between to be able to get more than the maximum that could be
      contained in one isofs directory entry / continuation chunk and
      we are stop once we'd encountered 32 CEs, but you can get about 8Kb
      easily.  And that's what will be passed to readdir callback as the
      name length.  8Kb __copy_to_user() from a buffer allocated by
      __get_free_page()
      Signed-off-by: default avatarAl Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      007796c0
    • Steven Rostedt's avatar
      tools lib traceevent: Do not reassign parg after collapse_tree() · 35eb30c2
      Steven Rostedt authored
      commit 106b816c upstream.
      
      At the end of process_filter(), collapse_tree() was changed to update
      the parg parameter, but the reassignment after the call wasn't removed.
      
      What happens is that the "current_op" gets modified and freed and parg
      is assigned to the new allocated argument. But after the call to
      collapse_tree(), parg is assigned again to the just freed "current_op",
      and this causes the tool to crash.
      
      The current_op variable must also be assigned to NULL in case of error,
      otherwise it will cause it to be free()ed twice.
      Signed-off-by: default avatarSteven Rostedt <rostedt@goodmis.org>
      Acked-by: default avatarNamhyung Kim <namhyung@kernel.org>
      Fixes: 42d6194d ("tools lib traceevent: Refactor process_filter()")
      Link: http://lkml.kernel.org/r/20160511150936.678c18a1@gandalf.local.homeSigned-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      35eb30c2
    • Johannes Thumshirn's avatar
      qla1280: Don't allocate 512kb of host tags · 4c127a3e
      Johannes Thumshirn authored
      commit 2bcbc814 upstream.
      
      The qla1280 driver sets the scsi_host_template's can_queue field to 0xfffff
      which results in an allocation failure when allocating the block layer tags
      for the driver's queues. This was introduced with the change for host wide
      tags in commit 64d513ac - "scsi: use host wide tags by default".
      
      Reduce can_queue to MAX_OUTSTANDING_COMMANDS (512) to solve the allocation
      error.
      Signed-off-by: default avatarJohannes Thumshirn <jthumshirn@suse.de>
      Fixes: 64d513ac - "scsi: use host wide tags by default"
      Cc: Laura Abbott <labbott@redhat.com>
      Cc: Michael Reed <mdr@sgi.com>
      Reviewed-by: default avatarLaurence Oberman <loberman@redhat.com>
      Reviewed-by: default avatarLee Duncan <lduncan@suse.com>
      Signed-off-by: default avatarMartin K. Petersen <martin.petersen@oracle.com>
      Signed-off-by: default avatarJames Bottomley <jejb@linux.vnet.ibm.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      4c127a3e
    • Al Viro's avatar
      atomic_open(): fix the handling of create_error · 4549fc71
      Al Viro authored
      commit 10c64cea upstream.
      
      * if we have a hashed negative dentry and either CREAT|EXCL on
      r/o filesystem, or CREAT|TRUNC on r/o filesystem, or CREAT|EXCL
      with failing may_o_create(), we should fail with EROFS or the
      error may_o_create() has returned, but not ENOENT.  Which is what
      the current code ends up returning.
      
      * if we have CREAT|TRUNC hitting a regular file on a read-only
      filesystem, we can't fail with EROFS here.  At the very least,
      not until we'd done follow_managed() - we might have a writable
      file (or a device, for that matter) bound on top of that one.
      Moreover, the code downstream will see that O_TRUNC and attempt
      to grab the write access (*after* following possible mount), so
      if we really should fail with EROFS, it will happen.  No need
      to do that inside atomic_open().
      
      The real logics is much simpler than what the current code is
      trying to do - if we decided to go for simple lookup, ended
      up with a negative dentry *and* had create_error set, fail with
      create_error.  No matter whether we'd got that negative dentry
      from lookup_real() or had found it in dcache.
      Acked-by: default avatarMiklos Szeredi <mszeredi@redhat.com>
      Signed-off-by: default avatarAl Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      4549fc71
    • Hans de Goede's avatar
      regulator: axp20x: Fix axp22x ldo_io voltage ranges · b6570278
      Hans de Goede authored
      commit a2262e5a upstream.
      
      The minium voltage of 1800mV is a copy and paste error from the axp20x
      regulator info. The correct minimum voltage for the ldo_io regulators
      on the axp22x is 700mV.
      
      Fixes: 1b82b4e4 ("regulator: axp20x: Add support for AXP22X regulators")
      Signed-off-by: default avatarHans de Goede <hdegoede@redhat.com>
      Acked-by: default avatarChen-Yu Tsai <wens@csie.org>
      Signed-off-by: default avatarMark Brown <broonie@kernel.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      b6570278
    • Krzysztof Kozlowski's avatar
      regulator: s2mps11: Fix invalid selector mask and voltages for buck9 · fc2d8c98
      Krzysztof Kozlowski authored
      commit 3b672623 upstream.
      
      The buck9 regulator of S2MPS11 PMIC had incorrect vsel_mask (0xff
      instead of 0x1f) thus reading entire register as buck9's voltage. This
      effectively caused regulator core to interpret values as higher voltages
      than they were and then to set real voltage much lower than intended.
      
      The buck9 provides power to other regulators, including LDO13
      and LDO19 which supply the MMC2 (SD card). On Odroid XU3/XU4 the lower
      voltage caused SD card detection errors on Odroid XU3/XU4:
      	mmc1: card never left busy state
      	mmc1: error -110 whilst initialising SD card
      
      During driver probe the regulator core was checking whether initial
      voltage matches the constraints. With incorrect vsel_mask of 0xff and
      default value of 0x50, the core interpreted this as 5 V which is outside
      of constraints (3-3.775 V). Then the regulator core was adjusting the
      voltage to match the constraints. With incorrect vsel_mask this new
      voltage mapped to a vere low voltage in the driver.
      Signed-off-by: default avatarKrzysztof Kozlowski <k.kozlowski@samsung.com>
      Reviewed-by: default avatarJavier Martinez Canillas <javier@osg.samsung.com>
      Tested-by: default avatarJavier Martinez Canillas <javier@osg.samsung.com>
      Signed-off-by: default avatarMark Brown <broonie@kernel.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      fc2d8c98
    • Wanpeng Li's avatar
      workqueue: fix rebind bound workers warning · cf73d8ad
      Wanpeng Li authored
      commit f7c17d26 upstream.
      
      ------------[ cut here ]------------
      WARNING: CPU: 0 PID: 16 at kernel/workqueue.c:4559 rebind_workers+0x1c0/0x1d0
      Modules linked in:
      CPU: 0 PID: 16 Comm: cpuhp/0 Not tainted 4.6.0-rc4+ #31
      Hardware name: IBM IBM System x3550 M4 Server -[7914IUW]-/00Y8603, BIOS -[D7E128FUS-1.40]- 07/23/2013
       0000000000000000 ffff881037babb58 ffffffff8139d885 0000000000000010
       0000000000000000 0000000000000000 0000000000000000 ffff881037babba8
       ffffffff8108505d ffff881037ba0000 000011cf3e7d6e60 0000000000000046
      Call Trace:
       dump_stack+0x89/0xd4
       __warn+0xfd/0x120
       warn_slowpath_null+0x1d/0x20
       rebind_workers+0x1c0/0x1d0
       workqueue_cpu_up_callback+0xf5/0x1d0
       notifier_call_chain+0x64/0x90
       ? trace_hardirqs_on_caller+0xf2/0x220
       ? notify_prepare+0x80/0x80
       __raw_notifier_call_chain+0xe/0x10
       __cpu_notify+0x35/0x50
       notify_down_prepare+0x5e/0x80
       ? notify_prepare+0x80/0x80
       cpuhp_invoke_callback+0x73/0x330
       ? __schedule+0x33e/0x8a0
       cpuhp_down_callbacks+0x51/0xc0
       cpuhp_thread_fun+0xc1/0xf0
       smpboot_thread_fn+0x159/0x2a0
       ? smpboot_create_threads+0x80/0x80
       kthread+0xef/0x110
       ? wait_for_completion+0xf0/0x120
       ? schedule_tail+0x35/0xf0
       ret_from_fork+0x22/0x50
       ? __init_kthread_worker+0x70/0x70
      ---[ end trace eb12ae47d2382d8f ]---
      notify_down_prepare: attempt to take down CPU 0 failed
      
      This bug can be reproduced by below config w/ nohz_full= all cpus:
      
      CONFIG_BOOTPARAM_HOTPLUG_CPU0=y
      CONFIG_DEBUG_HOTPLUG_CPU0=y
      CONFIG_NO_HZ_FULL=y
      
      As Thomas pointed out:
      
      | If a down prepare callback fails, then DOWN_FAILED is invoked for all
      | callbacks which have successfully executed DOWN_PREPARE.
      |
      | But, workqueue has actually two notifiers. One which handles
      | UP/DOWN_FAILED/ONLINE and one which handles DOWN_PREPARE.
      |
      | Now look at the priorities of those callbacks:
      |
      | CPU_PRI_WORKQUEUE_UP        = 5
      | CPU_PRI_WORKQUEUE_DOWN      = -5
      |
      | So the call order on DOWN_PREPARE is:
      |
      | CB 1
      | CB ...
      | CB workqueue_up() -> Ignores DOWN_PREPARE
      | CB ...
      | CB X ---> Fails
      |
      | So we call up to CB X with DOWN_FAILED
      |
      | CB 1
      | CB ...
      | CB workqueue_up() -> Handles DOWN_FAILED
      | CB ...
      | CB X-1
      |
      | So the problem is that the workqueue stuff handles DOWN_FAILED in the up
      | callback, while it should do it in the down callback. Which is not a good idea
      | either because it wants to be called early on rollback...
      |
      | Brilliant stuff, isn't it? The hotplug rework will solve this problem because
      | the callbacks become symetric, but for the existing mess, we need some
      | workaround in the workqueue code.
      
      The boot CPU handles housekeeping duty(unbound timers, workqueues,
      timekeeping, ...) on behalf of full dynticks CPUs. It must remain
      online when nohz full is enabled. There is a priority set to every
      notifier_blocks:
      
      workqueue_cpu_up > tick_nohz_cpu_down > workqueue_cpu_down
      
      So tick_nohz_cpu_down callback failed when down prepare cpu 0, and
      notifier_blocks behind tick_nohz_cpu_down will not be called any
      more, which leads to workers are actually not unbound. Then hotplug
      state machine will fallback to undo and online cpu 0 again. Workers
      will be rebound unconditionally even if they are not unbound and
      trigger the warning in this progress.
      
      This patch fix it by catching !DISASSOCIATED to avoid rebind bound
      workers.
      
      Cc: Tejun Heo <tj@kernel.org>
      Cc: Lai Jiangshan <jiangshanlai@gmail.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Frédéric Weisbecker <fweisbec@gmail.com>
      Suggested-by: default avatarLai Jiangshan <jiangshanlai@gmail.com>
      Signed-off-by: default avatarWanpeng Li <wanpeng.li@hotmail.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      cf73d8ad