1. 18 Oct, 2016 1 commit
  2. 17 Oct, 2016 5 commits
  3. 13 Oct, 2016 2 commits
  4. 12 Oct, 2016 7 commits
    • Johannes Berg's avatar
      mac80211: preserve more bits when building QoS header · 32910bb9
      Johannes Berg authored
      Michael Braun reported that when trying to inject A-MSDUs over
      monitor interfaces, the frame doesn't come out right since the
      QoS header A-MSDU bit is overwritten.
      
      Rather than adding that bit specifically simply preserve those
      bits that we don't set here, since we typically get here with
      a zeroed-out QoS header anyway.
      Signed-off-by: default avatarJohannes Berg <johannes.berg@intel.com>
      32910bb9
    • Michael Braun's avatar
      mac80211: filter multicast data packets on AP / AP_VLAN · 72f15d53
      Michael Braun authored
      This patch adds filtering for multicast data packets on AP_VLAN
      interfaces that have no authorized station connected and changes
      filtering on AP interfaces to not count stations assigned to
      AP_VLAN interfaces.
      
      This saves airtime and avoids waking up other stations currently
      authorized in this BSS. When using WPA, the packets dropped could
      not be decrypted by any station.
      
      The behaviour when there are no AP_VLAN interfaces is left unchanged.
      
      When there are AP_VLAN interfaces, this patch
      1. adds filtering multicast data packets sent on AP_VLAN interfaces
         that have no authorized station connected.
         No filtering happens on 4addr AP_VLAN interfaces.
      2. makes filtering of multicast data packets sent on AP interfaces
         depend on the number of authorized stations in this bss not
         assigned to an AP_VLAN interface.
      
      Therefore, a new num_mcast_sta counter is added for AP_VLAN interfaces.
      The existing one for AP interfaces is altered to not track stations
      assigned to an AP_VLAN interface.
      
      The new counter is exposed in debugfs.
      Signed-off-by: default avatarMichael Braun <michael-dev@fami-braun.de>
      [reformat commit message a bit, unline ieee80211_vif_{inc,dec}_num_mcast]
      Signed-off-by: default avatarJohannes Berg <johannes.berg@intel.com>
      72f15d53
    • Michael Braun's avatar
      mac80211: remove unnecessary num_mcast_sta check · 5f9994bd
      Michael Braun authored
      Checking for num_mcast_sta in __ieee80211_request_smps_ap() is
      unnecessary as sta list will be empty in this case anyway, so
      the list iteration will just exit immediately. Since this isn't
      a "hot" code path, it doesn't really matter, and the next patch
      will redefine num_mcast_sta to make this check invalid.
      Signed-off-by: default avatarMichael Braun <michael-dev@fami-braun.de>
      [change commit message]
      Signed-off-by: default avatarJohannes Berg <johannes.berg@intel.com>
      5f9994bd
    • Johannes Berg's avatar
      mac80211_hwsim: make multi-channel ops const · 246ad56e
      Johannes Berg authored
      Instead of building the multi-channel ops at runtime, declare
      the common ops with a macro and build both that way, so that
      the multi-channel ops can also be const.
      
      As a side effect, due to the removed code, this decreases the
      size of the module (while shifting data from .bss to .text
      due to the newly added const).
      Signed-off-by: default avatarJohannes Berg <johannes.berg@intel.com>
      246ad56e
    • Johannes Berg's avatar
      mac80211: remove unnecessary mesh check · 850092db
      Johannes Berg authored
      sta_info_get_bss() is equivalent to sta_info_get() in the
      mesh case, since sta->sdata->bss will be NULL (it's only
      set for AP/AP_VLAN interfaces.) Thus, the mesh check here
      isn't actually needed - remove it.
      Signed-off-by: default avatarJohannes Berg <johannes.berg@intel.com>
      850092db
    • Linus Torvalds's avatar
      Merge tag 'drm-for-v4.9' of git://people.freedesktop.org/~airlied/linux · 6b25e21f
      Linus Torvalds authored
      Pull drm updates from Dave Airlie:
       "Core:
         - Fence destaging work
         - DRIVER_LEGACY to split off legacy drm drivers
         - drm_mm refactoring
         - Splitting drm_crtc.c into chunks and documenting better
         - Display info fixes
         - rbtree support for prime buffer lookup
         - Simple VGA DAC driver
      
        Panel:
         - Add Nexus 7 panel
         - More simple panels
      
        i915:
         - Refactoring GEM naming
         - Refactored vma/active tracking
         - Lockless request lookups
         - Better stolen memory support
         - FBC fixes
         - SKL watermark fixes
         - VGPU improvements
         - dma-buf fencing support
         - Better DP dongle support
      
        amdgpu:
         - Powerplay for Iceland asics
         - Improved GPU reset support
         - UVD/VEC powergating support for CZ/ST
         - Preinitialised VRAM buffer support
         - Virtual display support
         - Initial SI support
         - GTT rework
         - PCI shutdown callback support
         - HPD IRQ storm fixes
      
        amdkfd:
         - bugfixes
      
        tilcdc:
         - Atomic modesetting support
      
        mediatek:
         - AAL + GAMMA engine support
         - Hook up gamma LUT
         - Temporal dithering support
      
        imx:
         - Pixel clock from devicetree
         - drm bridge support for LVDS bridges
         - active plane reconfiguration
         - VDIC deinterlacer support
         - Frame synchronisation unit support
         - Color space conversion support
      
        analogix:
         - PSR support
         - Better panel on/off support
      
        rockchip:
         - rk3399 vop/crtc support
         - PSR support
      
        vc4:
         - Interlaced vblank timing
         - 3D rendering CPU overhead reduction
         - HDMI output fixes
      
        tda998x:
         - HDMI audio ASoC support
      
        sunxi:
         - Allwinner A33 support
         - better TCON support
      
        msm:
         - DT binding cleanups
         - Explicit fence-fd support
      
        sti:
         - remove sti415/416 support
      
        etnaviv:
         - MMUv2 refactoring
         - GC3000 support
      
        exynos:
         - Refactoring HDMI DCC/PHY
         - G2D pm regression fix
         - Page fault issues with wait for vblank
      
        There is no nouveau work in this tree, as Ben didn't get a pull
        request in, and he was fighting moving to atomic and adding mst
        support, so maybe best it waits for a cycle"
      
      * tag 'drm-for-v4.9' of git://people.freedesktop.org/~airlied/linux: (1412 commits)
        drm/crtc: constify drm_crtc_index parameter
        drm/i915: Fix conflict resolution from backmerge of v4.8-rc8 to drm-next
        drm/i915/guc: Unwind GuC workqueue reservation if request construction fails
        drm/i915: Reset the breadcrumbs IRQ more carefully
        drm/i915: Force relocations via cpu if we run out of idle aperture
        drm/i915: Distinguish last emitted request from last submitted request
        drm/i915: Allow DP to work w/o EDID
        drm/i915: Move long hpd handling into the hotplug work
        drm/i915/execlists: Reinitialise context image after GPU hang
        drm/i915: Use correct index for backtracking HUNG semaphores
        drm/i915: Unalias obj->phys_handle and obj->userptr
        drm/i915: Just clear the mmiodebug before a register access
        drm/i915/gen9: only add the planes actually affected by ddb changes
        drm/i915: Allow PCH DPLL sharing regardless of DPLL_SDVO_HIGH_SPEED
        drm/i915/bxt: Fix HDMI DPLL configuration
        drm/i915/gen9: fix the watermark res_blocks value
        drm/i915/gen9: fix plane_blocks_per_line on watermarks calculations
        drm/i915/gen9: minimum scanlines for Y tile is not always 4
        drm/i915/gen9: fix the WaWmMemoryReadLatency implementation
        drm/i915/kbl: KBL also needs to run the SAGV code
        ...
      6b25e21f
    • Linus Torvalds's avatar
      Merge branch 'akpm' (patches from Andrew) · a379f71a
      Linus Torvalds authored
      Merge more updates from Andrew Morton:
      
       - a few block updates that fell in my lap
      
       - lib/ updates
      
       - checkpatch
      
       - autofs
      
       - ipc
      
       - a ton of misc other things
      
      * emailed patches from Andrew Morton <akpm@linux-foundation.org>: (100 commits)
        mm: split gfp_mask and mapping flags into separate fields
        fs: use mapping_set_error instead of opencoded set_bit
        treewide: remove redundant #include <linux/kconfig.h>
        hung_task: allow hung_task_panic when hung_task_warnings is 0
        kthread: add kerneldoc for kthread_create()
        kthread: better support freezable kthread workers
        kthread: allow to modify delayed kthread work
        kthread: allow to cancel kthread work
        kthread: initial support for delayed kthread work
        kthread: detect when a kthread work is used by more workers
        kthread: add kthread_destroy_worker()
        kthread: add kthread_create_worker*()
        kthread: allow to call __kthread_create_on_node() with va_list args
        kthread/smpboot: do not park in kthread_create_on_cpu()
        kthread: kthread worker API cleanup
        kthread: rename probe_kthread_data() to kthread_probe_data()
        scripts/tags.sh: enable code completion in VIM
        mm: kmemleak: avoid using __va() on addresses that don't have a lowmem mapping
        kdump, vmcoreinfo: report memory sections virtual addresses
        ipc/sem.c: add cond_resched in exit_sme
        ...
      a379f71a
  5. 11 Oct, 2016 25 commits
    • Michal Hocko's avatar
      mm: split gfp_mask and mapping flags into separate fields · 9c5d760b
      Michal Hocko authored
      mapping->flags currently encodes two different things into a single flag.
      It contains sticky gfp_mask for page cache allocations and AS_ codes used
      to report errors/enospace and other states which are mapping specific.
      Condensing the two semantically unrelated things saves few bytes but it
      also complicates other things.  For one thing the gfp flags space is
      reduced and in fact we are already running out of available bits.  It can
      be assumed that more gfp flags will be necessary later on.
      
      To not introduce the address_space grow (at least on x86_64) we can stick
      it right after private_lock because we have a hole there.
      
      struct address_space {
              struct inode *             host;                 /*     0     8 */
              struct radix_tree_root     page_tree;            /*     8    16 */
              spinlock_t                 tree_lock;            /*    24     4 */
              atomic_t                   i_mmap_writable;      /*    28     4 */
              struct rb_root             i_mmap;               /*    32     8 */
              struct rw_semaphore        i_mmap_rwsem;         /*    40    40 */
              /* --- cacheline 1 boundary (64 bytes) was 16 bytes ago --- */
              long unsigned int          nrpages;              /*    80     8 */
              long unsigned int          nrexceptional;        /*    88     8 */
              long unsigned int          writeback_index;      /*    96     8 */
              const struct address_space_operations  * a_ops;  /*   104     8 */
              long unsigned int          flags;                /*   112     8 */
              spinlock_t                 private_lock;         /*   120     4 */
      
              /* XXX 4 bytes hole, try to pack */
      
              /* --- cacheline 2 boundary (128 bytes) --- */
              struct list_head           private_list;         /*   128    16 */
              void *                     private_data;         /*   144     8 */
      
              /* size: 152, cachelines: 3, members: 14 */
              /* sum members: 148, holes: 1, sum holes: 4 */
              /* last cacheline: 24 bytes */
      };
      
      Link: http://lkml.kernel.org/r/20160912114852.GI14524@dhcp22.suse.czSigned-off-by: default avatarMichal Hocko <mhocko@suse.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      9c5d760b
    • Michal Hocko's avatar
      fs: use mapping_set_error instead of opencoded set_bit · 5114a97a
      Michal Hocko authored
      The mapping_set_error() helper sets the correct AS_ flag for the mapping
      so there is no reason to open code it.  Use the helper directly.
      
      [akpm@linux-foundation.org: be honest about conversion from -ENXIO to -EIO]
      Link: http://lkml.kernel.org/r/20160912111608.2588-2-mhocko@kernel.orgSigned-off-by: default avatarMichal Hocko <mhocko@suse.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      5114a97a
    • Masahiro Yamada's avatar
      treewide: remove redundant #include <linux/kconfig.h> · 97139d4a
      Masahiro Yamada authored
      Kernel source files need not include <linux/kconfig.h> explicitly
      because the top Makefile forces to include it with:
      
        -include $(srctree)/include/linux/kconfig.h
      
      This commit removes explicit includes except the following:
      
        * arch/s390/include/asm/facilities_src.h
        * tools/testing/radix-tree/linux/kernel.h
      
      These two are used for host programs.
      
      Link: http://lkml.kernel.org/r/1473656164-11929-1-git-send-email-yamada.masahiro@socionext.comSigned-off-by: default avatarMasahiro Yamada <yamada.masahiro@socionext.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      97139d4a
    • John Siddle's avatar
      hung_task: allow hung_task_panic when hung_task_warnings is 0 · 48a6d64e
      John Siddle authored
      Previously hung_task_panic would not be respected if enabled after
      hung_task_warnings had already been decremented to 0.
      
      Permit the kernel to panic if hung_task_panic is enabled after
      hung_task_warnings has already been decremented to 0 and another task
      hangs for hung_task_timeout_secs seconds.
      
      Check if hung_task_panic is enabled so we don't return prematurely, and
      check if hung_task_warnings is non-zero so we don't print the warning
      unnecessarily.
      
      [akpm@linux-foundation.org: fix off-by-one]
      Link: http://lkml.kernel.org/r/1473450214-4049-1-git-send-email-jsiddle@redhat.comSigned-off-by: default avatarJohn Siddle <jsiddle@redhat.com>
      Cc: Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      48a6d64e
    • Jonathan Corbet's avatar
      kthread: add kerneldoc for kthread_create() · e154ccc8
      Jonathan Corbet authored
      This macro is referenced in other kerneldoc comments, but lacks one of its
      own; fix that.
      
      Link: http://lkml.kernel.org/r/20160826072313.726a3485@lwn.netSigned-off-by: default avatarJonathan Corbet <corbet@lwn.net>
      Reported-by: default avatarMauro Carvalho Chehab <mchehab@s-opensource.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      e154ccc8
    • Petr Mladek's avatar
      kthread: better support freezable kthread workers · dbf52682
      Petr Mladek authored
      This patch allows to make kthread worker freezable via a new @flags
      parameter. It will allow to avoid an init work in some kthreads.
      
      It currently does not affect the function of kthread_worker_fn()
      but it might help to do some optimization or fixes eventually.
      
      I currently do not know about any other use for the @flags
      parameter but I believe that we will want more flags
      in the future.
      
      Finally, I hope that it will not cause confusion with @flags member
      in struct kthread. Well, I guess that we will want to rework the
      basic kthreads implementation once all kthreads are converted into
      kthread workers or workqueues. It is possible that we will merge
      the two structures.
      
      Link: http://lkml.kernel.org/r/1470754545-17632-12-git-send-email-pmladek@suse.comSigned-off-by: default avatarPetr Mladek <pmladek@suse.com>
      Acked-by: default avatarTejun Heo <tj@kernel.org>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
      Cc: Josh Triplett <josh@joshtriplett.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Jiri Kosina <jkosina@suse.cz>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: Michal Hocko <mhocko@suse.cz>
      Cc: Vlastimil Babka <vbabka@suse.cz>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      dbf52682
    • Petr Mladek's avatar
      kthread: allow to modify delayed kthread work · 9a6b06c8
      Petr Mladek authored
      There are situations when we need to modify the delay of a delayed kthread
      work. For example, when the work depends on an event and the initial delay
      means a timeout. Then we want to queue the work immediately when the event
      happens.
      
      This patch implements kthread_mod_delayed_work() as inspired workqueues.
      It cancels the timer, removes the work from any worker list and queues it
      again with the given timeout.
      
      A very special case is when the work is being canceled at the same time.
      It might happen because of the regular kthread_cancel_delayed_work_sync()
      or by another kthread_mod_delayed_work(). In this case, we do nothing and
      let the other operation win. This should not normally happen as the caller
      is supposed to synchronize these operations a reasonable way.
      
      Link: http://lkml.kernel.org/r/1470754545-17632-11-git-send-email-pmladek@suse.comSigned-off-by: default avatarPetr Mladek <pmladek@suse.com>
      Acked-by: default avatarTejun Heo <tj@kernel.org>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
      Cc: Josh Triplett <josh@joshtriplett.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Jiri Kosina <jkosina@suse.cz>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: Michal Hocko <mhocko@suse.cz>
      Cc: Vlastimil Babka <vbabka@suse.cz>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      9a6b06c8
    • Petr Mladek's avatar
      kthread: allow to cancel kthread work · 37be45d4
      Petr Mladek authored
      We are going to use kthread workers more widely and sometimes we will need
      to make sure that the work is neither pending nor running.
      
      This patch implements cancel_*_sync() operations as inspired by
      workqueues.  Well, we are synchronized against the other operations via
      the worker lock, we use del_timer_sync() and a counter to count parallel
      cancel operations.  Therefore the implementation might be easier.
      
      First, we check if a worker is assigned.  If not, the work has newer been
      queued after it was initialized.
      
      Second, we take the worker lock.  It must be the right one.  The work must
      not be assigned to another worker unless it is initialized in between.
      
      Third, we try to cancel the timer when it exists.  The timer is deleted
      synchronously to make sure that the timer call back is not running.  We
      need to temporary release the worker->lock to avoid a possible deadlock
      with the callback.  In the meantime, we set work->canceling counter to
      avoid any queuing.
      
      Fourth, we try to remove the work from a worker list. It might be
      the list of either normal or delayed works.
      
      Fifth, if the work is running, we call kthread_flush_work().  It might
      take an arbitrary time.  We need to release the worker-lock again.  In the
      meantime, we again block any queuing by the canceling counter.
      
      As already mentioned, the check for a pending kthread work is done under a
      lock.  In compare with workqueues, we do not need to fight for a single
      PENDING bit to block other operations.  Therefore we do not suffer from
      the thundering storm problem and all parallel canceling jobs might use
      kthread_flush_work().  Any queuing is blocked until the counter gets zero.
      
      Link: http://lkml.kernel.org/r/1470754545-17632-10-git-send-email-pmladek@suse.comSigned-off-by: default avatarPetr Mladek <pmladek@suse.com>
      Acked-by: default avatarTejun Heo <tj@kernel.org>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
      Cc: Josh Triplett <josh@joshtriplett.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Jiri Kosina <jkosina@suse.cz>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: Michal Hocko <mhocko@suse.cz>
      Cc: Vlastimil Babka <vbabka@suse.cz>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      37be45d4
    • Petr Mladek's avatar
      kthread: initial support for delayed kthread work · 22597dc3
      Petr Mladek authored
      We are going to use kthread_worker more widely and delayed works
      will be pretty useful.
      
      The implementation is inspired by workqueues.  It uses a timer to queue
      the work after the requested delay.  If the delay is zero, the work is
      queued immediately.
      
      In compare with workqueues, each work is associated with a single worker
      (kthread).  Therefore the implementation could be much easier.  In
      particular, we use the worker->lock to synchronize all the operations with
      the work.  We do not need any atomic operation with a flags variable.
      
      In fact, we do not need any state variable at all.  Instead, we add a list
      of delayed works into the worker.  Then the pending work is listed either
      in the list of queued or delayed works.  And the existing check of pending
      works is the same even for the delayed ones.
      
      A work must not be assigned to another worker unless reinitialized.
      Therefore the timer handler might expect that dwork->work->worker is valid
      and it could simply take the lock.  We just add some sanity checks to help
      with debugging a potential misuse.
      
      Link: http://lkml.kernel.org/r/1470754545-17632-9-git-send-email-pmladek@suse.comSigned-off-by: default avatarPetr Mladek <pmladek@suse.com>
      Acked-by: default avatarTejun Heo <tj@kernel.org>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
      Cc: Josh Triplett <josh@joshtriplett.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Jiri Kosina <jkosina@suse.cz>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: Michal Hocko <mhocko@suse.cz>
      Cc: Vlastimil Babka <vbabka@suse.cz>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      22597dc3
    • Petr Mladek's avatar
      kthread: detect when a kthread work is used by more workers · 8197b3d4
      Petr Mladek authored
      Nothing currently prevents a work from queuing for a kthread worker when
      it is already running on another one.  This means that the work might run
      in parallel on more than one worker.  Also some operations are not
      reliable, e.g.  flush.
      
      This problem will be even more visible after we add kthread_cancel_work()
      function.  It will only have "work" as the parameter and will use
      worker->lock to synchronize with others.
      
      Well, normally this is not a problem because the API users are sane.
      But bugs might happen and users also might be crazy.
      
      This patch adds a warning when we try to insert the work for another
      worker.  It does not fully prevent the misuse because it would make the
      code much more complicated without a big benefit.
      
      It adds the same warning also into kthread_flush_work() instead of the
      repeated attempts to get the right lock.
      
      A side effect is that one needs to explicitly reinitialize the work if it
      must be queued into another worker.  This is needed, for example, when the
      worker is stopped and started again.  It is a bit inconvenient.  But it
      looks like a good compromise between the stability and complexity.
      
      I have double checked all existing users of the kthread worker API and
      they all seems to initialize the work after the worker gets started.
      
      Just for completeness, the patch adds a check that the work is not already
      in a queue.
      
      The patch also puts all the checks into a separate function.  It will be
      reused when implementing delayed works.
      
      Link: http://lkml.kernel.org/r/1470754545-17632-8-git-send-email-pmladek@suse.comSigned-off-by: default avatarPetr Mladek <pmladek@suse.com>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Tejun Heo <tj@kernel.org>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
      Cc: Josh Triplett <josh@joshtriplett.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Jiri Kosina <jkosina@suse.cz>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: Michal Hocko <mhocko@suse.cz>
      Cc: Vlastimil Babka <vbabka@suse.cz>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      8197b3d4
    • Petr Mladek's avatar
      kthread: add kthread_destroy_worker() · 35033fe9
      Petr Mladek authored
      The current kthread worker users call flush() and stop() explicitly.
      This function does the same plus it frees the kthread_worker struct
      in one call.
      
      It is supposed to be used together with kthread_create_worker*() that
      allocates struct kthread_worker.
      
      Link: http://lkml.kernel.org/r/1470754545-17632-7-git-send-email-pmladek@suse.comSigned-off-by: default avatarPetr Mladek <pmladek@suse.com>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Tejun Heo <tj@kernel.org>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
      Cc: Josh Triplett <josh@joshtriplett.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Jiri Kosina <jkosina@suse.cz>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: Michal Hocko <mhocko@suse.cz>
      Cc: Vlastimil Babka <vbabka@suse.cz>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      35033fe9
    • Petr Mladek's avatar
      kthread: add kthread_create_worker*() · fbae2d44
      Petr Mladek authored
      Kthread workers are currently created using the classic kthread API,
      namely kthread_run().  kthread_worker_fn() is passed as the @threadfn
      parameter.
      
      This patch defines kthread_create_worker() and
      kthread_create_worker_on_cpu() functions that hide implementation details.
      
      They enforce using kthread_worker_fn() for the main thread.  But I doubt
      that there are any plans to create any alternative.  In fact, I think that
      we do not want any alternative main thread because it would be hard to
      support consistency with the rest of the kthread worker API.
      
      The naming and function of kthread_create_worker() is inspired by the
      workqueues API like the rest of the kthread worker API.
      
      The kthread_create_worker_on_cpu() variant is motivated by the original
      kthread_create_on_cpu().  Note that we need to bind per-CPU kthread
      workers already when they are created.  It makes the life easier.
      kthread_bind() could not be used later for an already running worker.
      
      This patch does _not_ convert existing kthread workers.  The kthread
      worker API need more improvements first, e.g.  a function to destroy the
      worker.
      
      IMPORTANT:
      
      kthread_create_worker_on_cpu() allows to use any format of the worker
      name, in compare with kthread_create_on_cpu().  The good thing is that it
      is more generic.  The bad thing is that most users will need to pass the
      cpu number in two parameters, e.g.  kthread_create_worker_on_cpu(cpu,
      "helper/%d", cpu).
      
      To be honest, the main motivation was to avoid the need for an empty
      va_list.  The only legal way was to create a helper function that would be
      called with an empty list.  Other attempts caused compilation warnings or
      even errors on different architectures.
      
      There were also other alternatives, for example, using #define or
      splitting __kthread_create_worker().  The used solution looked like the
      least ugly.
      
      Link: http://lkml.kernel.org/r/1470754545-17632-6-git-send-email-pmladek@suse.comSigned-off-by: default avatarPetr Mladek <pmladek@suse.com>
      Acked-by: default avatarTejun Heo <tj@kernel.org>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
      Cc: Josh Triplett <josh@joshtriplett.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Jiri Kosina <jkosina@suse.cz>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: Michal Hocko <mhocko@suse.cz>
      Cc: Vlastimil Babka <vbabka@suse.cz>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      fbae2d44
    • Petr Mladek's avatar
      kthread: allow to call __kthread_create_on_node() with va_list args · 255451e4
      Petr Mladek authored
      kthread_create_on_node() implements a bunch of logic to create the
      kthread.  It is already called by kthread_create_on_cpu().
      
      We are going to extend the kthread worker API and will need to call
      kthread_create_on_node() with va_list args there.
      
      This patch does only a refactoring and does not modify the existing
      behavior.
      
      Link: http://lkml.kernel.org/r/1470754545-17632-5-git-send-email-pmladek@suse.comSigned-off-by: default avatarPetr Mladek <pmladek@suse.com>
      Acked-by: default avatarTejun Heo <tj@kernel.org>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
      Cc: Josh Triplett <josh@joshtriplett.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Jiri Kosina <jkosina@suse.cz>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: Michal Hocko <mhocko@suse.cz>
      Cc: Vlastimil Babka <vbabka@suse.cz>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      255451e4
    • Petr Mladek's avatar
      kthread/smpboot: do not park in kthread_create_on_cpu() · a65d4096
      Petr Mladek authored
      kthread_create_on_cpu() was added by the commit 2a1d4460
      ("kthread: Implement park/unpark facility").  It is currently used only
      when enabling new CPU.  For this purpose, the newly created kthread has to
      be parked.
      
      The CPU binding is a bit tricky.  The kthread is parked when the CPU has
      not been allowed yet.  And the CPU is bound when the kthread is unparked.
      
      The function would be useful for more per-CPU kthreads, e.g.
      bnx2fc_thread, fcoethread.  For this purpose, the newly created kthread
      should stay in the uninterruptible state.
      
      This patch moves the parking into smpboot.  It binds the thread already
      when created.  Then the function might be used universally.  Also the
      behavior is consistent with kthread_create() and kthread_create_on_node().
      
      Link: http://lkml.kernel.org/r/1470754545-17632-4-git-send-email-pmladek@suse.comSigned-off-by: default avatarPetr Mladek <pmladek@suse.com>
      Reviewed-by: default avatarThomas Gleixner <tglx@linutronix.de>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Tejun Heo <tj@kernel.org>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
      Cc: Josh Triplett <josh@joshtriplett.org>
      Cc: Jiri Kosina <jkosina@suse.cz>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: Michal Hocko <mhocko@suse.cz>
      Cc: Vlastimil Babka <vbabka@suse.cz>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      a65d4096
    • Petr Mladek's avatar
      kthread: kthread worker API cleanup · 3989144f
      Petr Mladek authored
      A good practice is to prefix the names of functions by the name
      of the subsystem.
      
      The kthread worker API is a mix of classic kthreads and workqueues.  Each
      worker has a dedicated kthread.  It runs a generic function that process
      queued works.  It is implemented as part of the kthread subsystem.
      
      This patch renames the existing kthread worker API to use
      the corresponding name from the workqueues API prefixed by
      kthread_:
      
      __init_kthread_worker()		-> __kthread_init_worker()
      init_kthread_worker()		-> kthread_init_worker()
      init_kthread_work()		-> kthread_init_work()
      insert_kthread_work()		-> kthread_insert_work()
      queue_kthread_work()		-> kthread_queue_work()
      flush_kthread_work()		-> kthread_flush_work()
      flush_kthread_worker()		-> kthread_flush_worker()
      
      Note that the names of DEFINE_KTHREAD_WORK*() macros stay
      as they are. It is common that the "DEFINE_" prefix has
      precedence over the subsystem names.
      
      Note that INIT() macros and init() functions use different
      naming scheme. There is no good solution. There are several
      reasons for this solution:
      
        + "init" in the function names stands for the verb "initialize"
          aka "initialize worker". While "INIT" in the macro names
          stands for the noun "INITIALIZER" aka "worker initializer".
      
        + INIT() macros are used only in DEFINE() macros
      
        + init() functions are used close to the other kthread()
          functions. It looks much better if all the functions
          use the same scheme.
      
        + There will be also kthread_destroy_worker() that will
          be used close to kthread_cancel_work(). It is related
          to the init() function. Again it looks better if all
          functions use the same naming scheme.
      
        + there are several precedents for such init() function
          names, e.g. amd_iommu_init_device(), free_area_init_node(),
          jump_label_init_type(),  regmap_init_mmio_clk(),
      
        + It is not an argument but it was inconsistent even before.
      
      [arnd@arndb.de: fix linux-next merge conflict]
       Link: http://lkml.kernel.org/r/20160908135724.1311726-1-arnd@arndb.de
      Link: http://lkml.kernel.org/r/1470754545-17632-3-git-send-email-pmladek@suse.comSuggested-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarPetr Mladek <pmladek@suse.com>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Tejun Heo <tj@kernel.org>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
      Cc: Josh Triplett <josh@joshtriplett.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Jiri Kosina <jkosina@suse.cz>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: Michal Hocko <mhocko@suse.cz>
      Cc: Vlastimil Babka <vbabka@suse.cz>
      Signed-off-by: default avatarArnd Bergmann <arnd@arndb.de>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      3989144f
    • Petr Mladek's avatar
      kthread: rename probe_kthread_data() to kthread_probe_data() · e700591a
      Petr Mladek authored
      Patch series "kthread: Kthread worker API improvements"
      
      The intention of this patchset is to make it easier to manipulate and
      maintain kthreads.  Especially, I want to replace all the custom main
      cycles with a generic one.  Also I want to make the kthreads sleep in a
      consistent state in a common place when there is no work.
      
      This patch (of 11):
      
      A good practice is to prefix the names of functions by the name of the
      subsystem.
      
      This patch fixes the name of probe_kthread_data().  The other wrong
      functions names are part of the kthread worker API and will be fixed
      separately.
      
      Link: http://lkml.kernel.org/r/1470754545-17632-2-git-send-email-pmladek@suse.comSigned-off-by: default avatarPetr Mladek <pmladek@suse.com>
      Suggested-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Acked-by: default avatarTejun Heo <tj@kernel.org>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
      Cc: Josh Triplett <josh@joshtriplett.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Jiri Kosina <jkosina@suse.cz>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: Michal Hocko <mhocko@suse.cz>
      Cc: Vlastimil Babka <vbabka@suse.cz>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      e700591a
    • Mathieu Maret's avatar
      scripts/tags.sh: enable code completion in VIM · d0c75f33
      Mathieu Maret authored
      Vim, with the omnicppcomplete(#1) plugin, can do code completion using
      information build by ctags.  Add flags needed by omnicppcomplete(#2) to
      have completion on member of structure.
      
      1: https://github.com/vim-scripts/omnicppcomplete
      2: https://github.com/vim-scripts/OmniCppComplete/blob/master/doc/omnicppcomplete.txt#L93
      
      Link: http://lkml.kernel.org/r/20160830191546.4469-1-mathieu.maret@gmail.comSigned-off-by: default avatarMathieu Maret <mathieu.maret@gmail.com>
      Cc: Michal Marek <mmarek@suse.cz>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      d0c75f33
    • Catalin Marinas's avatar
      mm: kmemleak: avoid using __va() on addresses that don't have a lowmem mapping · 9099daed
      Catalin Marinas authored
      Some of the kmemleak_*() callbacks in memblock, bootmem, CMA convert a
      physical address to a virtual one using __va().  However, such physical
      addresses may sometimes be located in highmem and using __va() is
      incorrect, leading to inconsistent object tracking in kmemleak.
      
      The following functions have been added to the kmemleak API and they take
      a physical address as the object pointer.  They only perform the
      corresponding action if the address has a lowmem mapping:
      
      kmemleak_alloc_phys
      kmemleak_free_part_phys
      kmemleak_not_leak_phys
      kmemleak_ignore_phys
      
      The affected calling places have been updated to use the new kmemleak
      API.
      
      Link: http://lkml.kernel.org/r/1471531432-16503-1-git-send-email-catalin.marinas@arm.comSigned-off-by: default avatarCatalin Marinas <catalin.marinas@arm.com>
      Reported-by: default avatarVignesh R <vigneshr@ti.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      9099daed
    • Thomas Garnier's avatar
      kdump, vmcoreinfo: report memory sections virtual addresses · 0549a3c0
      Thomas Garnier authored
      KASLR memory randomization can randomize the base of the physical memory
      mapping (PAGE_OFFSET), vmalloc (VMALLOC_START) and vmemmap
      (VMEMMAP_START).  Adding these variables on VMCOREINFO so tools can easily
      identify the base of each memory section.
      
      Link: http://lkml.kernel.org/r/1471531632-23003-1-git-send-email-thgarnie@google.comSigned-off-by: default avatarThomas Garnier <thgarnie@google.com>
      Acked-by: default avatarBaoquan He <bhe@redhat.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: "H . Peter Anvin" <hpa@zytor.com>
      Cc: Eric Biederman <ebiederm@xmission.com>
      Cc: Xunlei Pang <xlpang@redhat.com>
      Cc: HATAYAMA Daisuke <d.hatayama@jp.fujitsu.com>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: Eugene Surovegin <surovegin@google.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      0549a3c0
    • Nikolay Borisov's avatar
      ipc/sem.c: add cond_resched in exit_sme · 2a1613a5
      Nikolay Borisov authored
      In CONFIG_PREEMPT=n kernel a softlockup was observed while the for loop in
      exit_sem.  Apparently it's possible for the loop to take quite a long time
      and it doesn't have a scheduling point in it.  Since the codes is
      executing under an rcu read section this may also cause rcu stalls, which
      in turn block synchronize_rcu operations, which more or less de-stabilises
      the whole system.
      
      Fix this by introducing a cond_resched() at the beginning of the loop.
      
      So this patch fixes the following:
      
        NMI watchdog: BUG: soft lockup - CPU#10 stuck for 23s! [httpd:18119]
        CPU: 10 PID: 18119 Comm: httpd Tainted: G           O    4.4.20-clouder2 #6
        Hardware name: Supermicro X10DRi/X10DRi, BIOS 1.1 04/14/2015
        task: ffff88348d695280 ti: ffff881c95550000 task.ti: ffff881c95550000
        RIP: 0010:[<ffffffff81614bc7>]  [<ffffffff81614bc7>] _raw_spin_lock+0x17/0x30
        RSP: 0018:ffff881c95553e40  EFLAGS: 00000246
        RAX: 0000000000000000 RBX: ffff883161b1eea8 RCX: 000000000000000d
        RDX: 0000000000000001 RSI: 000000000000000e RDI: ffff883161b1eea4
        RBP: ffff881c95553ea0 R08: ffff881c95553e68 R09: ffff883fef376f88
        R10: ffff881fffb58c20 R11: ffffea0072556600 R12: ffff883161b1eea0
        R13: ffff88348d695280 R14: ffff883dec427000 R15: ffff8831621672a0
        FS:  0000000000000000(0000) GS:ffff881fffb40000(0000) knlGS:0000000000000000
        CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
        CR2: 00007f3b3723e020 CR3: 0000000001c0a000 CR4: 00000000001406e0
        Call Trace:
          ? exit_sem+0x7c/0x280
          do_exit+0x338/0xb40
          do_group_exit+0x43/0xd0
          SyS_exit_group+0x14/0x20
          entry_SYSCALL_64_fastpath+0x16/0x6e
      
      Link: http://lkml.kernel.org/r/1475154992-6363-1-git-send-email-kernel@kyup.comSigned-off-by: default avatarNikolay Borisov <kernel@kyup.com>
      Cc: Herton R. Krzesinski <herton@redhat.com>
      Cc: Fabian Frederick <fabf@skynet.be>
      Cc: Davidlohr Bueso <dave@stgolabs.net>
      Cc: Manfred Spraul <manfred@colorfullife.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      2a1613a5
    • Davidlohr Bueso's avatar
      ipc/msg: avoid waking sender upon full queue · ed27f912
      Davidlohr Bueso authored
      Blocked tasks queued in q_senders waiting for their message to fit in the
      queue are blindly awoken every time we think there's a remote chance this
      might happen.  This could cause numerous (and expensive -- thundering
      herd-ish) bogus wakeups if the queue is still really full.  Adding to the
      scheduling cost/overhead, there's also the fact that we need to take the
      ipc object lock and requeue ourselves in the q_senders list.
      
      By keeping track of the blocked sender's message size, we can know
      previously if the wakeup ought to occur or not.  Otherwise, to maintain
      the current wakeup order we just move it to the tail.  This is exactly
      what occurs right now if the sender needs to go back to sleep.
      
      The case of EIDRM is left completely untouched, as we need to wakeup all
      the tasks, and shouldn't be playing games in the first place.
      
      This patch was seen to save on the 'msgctl10' ltp testcase ~15% in context
      switches (avg out of ten runs).  Although these tests are really about
      functionality (as opposed to performance), is does show the direct
      benefits of the optimization.
      
      [akpm@linux-foundation.org: coding-style fixes]
      Link: http://lkml.kernel.org/r/1469748819-19484-6-git-send-email-dave@stgolabs.netSigned-off-by: default avatarDavidlohr Bueso <dbueso@suse.de>
      Acked-by: default avatarPeter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Manfred Spraul <manfred@colorfullife.com>
      Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      ed27f912
    • Davidlohr Bueso's avatar
      ipc/msg: make ss_wakeup() kill arg boolean · d0d6a2a9
      Davidlohr Bueso authored
      ... 'tis annoying.
      
      Link: http://lkml.kernel.org/r/1469748819-19484-4-git-send-email-dave@stgolabs.netSigned-off-by: default avatarDavidlohr Bueso <dave@stgolabs.net>
      Acked-by: default avatarPeter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Manfred Spraul <manfred@colorfullife.com>
      Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      d0d6a2a9
    • Davidlohr Bueso's avatar
      ipc/msg: batch queue sender wakeups · e3658538
      Davidlohr Bueso authored
      Currently the use of wake_qs in sysv msg queues are only for the receiver
      tasks that are blocked on the queue.  But blocked sender tasks (due to
      queue size constraints) still are awoken with the ipc object lock held,
      which can be a problem particularly for small sized queues and far from
      gracious for -rt (just like it was for the receiver side).
      
      The paths that actually wakeup a sender are obviously related to when we
      are either getting rid of the queue or after (some) space is freed-up
      after a receiver takes the msg (msgrcv).  Furthermore, with the exception
      of msgrcv, we can always piggy-back on expunge_all that has its own tasks
      lined-up for waking.  Finally, upon unlinking the message, it should be no
      problem delaying the wakeups a bit until after we've released the lock.
      
      Link: http://lkml.kernel.org/r/1469748819-19484-3-git-send-email-dave@stgolabs.netSigned-off-by: default avatarDavidlohr Bueso <dbueso@suse.de>
      Acked-by: default avatarPeter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Manfred Spraul <manfred@colorfullife.com>
      Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      e3658538
    • Sebastian Andrzej Siewior's avatar
      ipc/msg: implement lockless pipelined wakeups · ee51636c
      Sebastian Andrzej Siewior authored
      This patch moves the wakeup_process() invocation so it is not done under
      the ipc global lock by making use of a lockless wake_q.  With this change,
      the waiter is woken up once the message has been assigned and it does not
      need to loop on SMP if the message points to NULL.  In the signal case we
      still need to check the pointer under the lock to verify the state.
      
      This change should also avoid the introduction of preempt_disable() in -RT
      which avoids a busy-loop which pools for the NULL -> !NULL change if the
      waiter has a higher priority compared to the waker.
      
      By making use of wake_qs, the logic of sysv msg queues is greatly
      simplified (and very well suited as we can batch lockless wakeups),
      particularly around the lockless receive algorithm.
      
      This has been tested with Manred's pmsg-shared tool on a "AMD A10-7800
      Radeon R7, 12 Compute Cores 4C+8G":
      
      test             |   before   |   after    | diff
      -----------------|------------|------------|----------
      pmsg-shared 8 60 | 19,347,422 | 30,442,191 | + ~57.34 %
      pmsg-shared 4 60 | 21,367,197 | 35,743,458 | + ~67.28 %
      pmsg-shared 2 60 | 22,884,224 | 24,278,200 | +  ~6.09 %
      
      Link: http://lkml.kernel.org/r/1469748819-19484-2-git-send-email-dave@stgolabs.netSigned-off-by: default avatarSebastian Andrzej Siewior <bigeasy@linutronix.de>
      Signed-off-by: default avatarDavidlohr Bueso <dbueso@suse.de>
      Acked-by: default avatarPeter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Manfred Spraul <manfred@colorfullife.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      ee51636c
    • Manfred Spraul's avatar
      ipc/sem.c: fix complex_count vs. simple op race · 5864a2fd
      Manfred Spraul authored
      Commit 6d07b68c ("ipc/sem.c: optimize sem_lock()") introduced a
      race:
      
      sem_lock has a fast path that allows parallel simple operations.
      There are two reasons why a simple operation cannot run in parallel:
       - a non-simple operations is ongoing (sma->sem_perm.lock held)
       - a complex operation is sleeping (sma->complex_count != 0)
      
      As both facts are stored independently, a thread can bypass the current
      checks by sleeping in the right positions.  See below for more details
      (or kernel bugzilla 105651).
      
      The patch fixes that by creating one variable (complex_mode)
      that tracks both reasons why parallel operations are not possible.
      
      The patch also updates stale documentation regarding the locking.
      
      With regards to stable kernels:
      The patch is required for all kernels that include the
      commit 6d07b68c ("ipc/sem.c: optimize sem_lock()") (3.10?)
      
      The alternative is to revert the patch that introduced the race.
      
      The patch is safe for backporting, i.e. it makes no assumptions
      about memory barriers in spin_unlock_wait().
      
      Background:
      Here is the race of the current implementation:
      
      Thread A: (simple op)
      - does the first "sma->complex_count == 0" test
      
      Thread B: (complex op)
      - does sem_lock(): This includes an array scan. But the scan can't
        find Thread A, because Thread A does not own sem->lock yet.
      - the thread does the operation, increases complex_count,
        drops sem_lock, sleeps
      
      Thread A:
      - spin_lock(&sem->lock), spin_is_locked(sma->sem_perm.lock)
      - sleeps before the complex_count test
      
      Thread C: (complex op)
      - does sem_lock (no array scan, complex_count==1)
      - wakes up Thread B.
      - decrements complex_count
      
      Thread A:
      - does the complex_count test
      
      Bug:
      Now both thread A and thread C operate on the same array, without
      any synchronization.
      
      Fixes: 6d07b68c ("ipc/sem.c: optimize sem_lock()")
      Link: http://lkml.kernel.org/r/1469123695-5661-1-git-send-email-manfred@colorfullife.com
      Reported-by: <felixh@informatik.uni-bremen.de>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Davidlohr Bueso <dave@stgolabs.net>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: <1vier1@web.de>
      Cc: <stable@vger.kernel.org>	[3.10+]
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      5864a2fd