1. 24 Jun, 2017 1 commit
    • x86/mmap, ASLR: Do not treat unlimited-stack tasks as legacy mmap · 4a06370b
      Michal Hocko authored
      Since the following commit in 2008:
      
        cc503c1b ("x86: PIE executable randomization")
      
      we have used a heuristic that treats applications with RLIMIT_STACK
      configured to unlimited as legacy. This means:
      
       a) set the mmap_base to 1/3 of address space + randomization and
       b) mmap from bottom to top.
      
      This makes some sense as it allows the stack to grow really large. On the
      other hand it reduces the address space usable for default mmaps
      (without address hint) quite a lot.
      
      We have received a bug report that a SAP HANA workload has run into this
      limitation.
      
      We could argue that the user just got what he asked for when setting
      up the unlimited stack, but to be realistic, growing the stack up to 1/6
      TASK_SIZE (allowed by mmap_base) is pretty much unlimited in real
      life. This would give mmap 20TB of additional address space, which is
      quite nice, especially as that address space is much more likely to be
      used than the reserved stack.
      
      Digging into the history, the original implementation of the randomization:
      
        8817210d ("[PATCH] x86_64: Flexmap for 32bit and randomized mappings for 64bit")
      
      didn't have this restriction.
      
      So let's try and remove this assumption - hopefully nothing breaks.
      Signed-off-by: Michal Hocko <mhocko@suse.com>
      Acked-by: Jiri Kosina <jkosina@suse.cz>
      Acked-by: Oleg Nesterov <oleg@redhat.com>
      Cc: Dave Jones <davej@codemonkey.org.uk>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: akpm@linux-foundation.org
      Cc: hughd@google.com
      Cc: linux-mm@kvack.org
      Cc: will.deacon@arm.com
      Link: http://lkml.kernel.org/r/tip-86b110d2ae6365ce91cabd37588bc8611770421a@git.kernel.org
      [ So I've applied this to tip:x86/mm with a wider Cc: list - if anyone objects to this change please holler. ]
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
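      The heuristic being removed can be sketched in userspace C as follows. This is a minimal model, not the kernel source: struct layout_inputs and both function names are hypothetical, and the two other inputs shown (a compatibility-layout personality flag and a legacy-layout sysctl) are assumptions about what else feeds the decision. Only the RLIMIT_STACK check is taken from the commit message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/resource.h>

/* Hypothetical stand-in for the inputs consulted when choosing a layout. */
struct layout_inputs {
    bool compat_layout;        /* assumed: compatibility-layout personality flag */
    bool sysctl_legacy_layout; /* assumed: administrator-selected legacy layout */
    rlim_t stack_limit;        /* models the task's RLIMIT_STACK soft limit */
};

/* Heuristic as described for the 2008 change: unlimited stack implies legacy. */
static bool legacy_layout_before(const struct layout_inputs *in)
{
    assert(in != NULL);
    if (in->compat_layout)
        return true;
    if (in->stack_limit == RLIM_INFINITY)
        return true;
    return in->sysctl_legacy_layout;
}

/* After this commit: the stack limit no longer forces the legacy layout. */
static bool legacy_layout_after(const struct layout_inputs *in)
{
    assert(in != NULL);
    if (in->compat_layout)
        return true;
    return in->sysctl_legacy_layout;
}
```

      With an unlimited stack limit and both flags clear, legacy_layout_before() reports legacy while legacy_layout_after() does not, which is the behavioural change the commit describes.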
  2. 22 Jun, 2017 4 commits
    • x86/mm: Remove reset_lazy_tlbstate() · d5436812
      Andy Lutomirski authored
      The only call site also calls idle_task_exit(), and idle_task_exit()
      puts us into a clean state by explicitly switching to init_mm.
      Signed-off-by: Andy Lutomirski <luto@kernel.org>
      Reviewed-by: Rik van Riel <riel@redhat.com>
      Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
      Reviewed-by: Borislav Petkov <bp@suse.de>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Arjan van de Ven <arjan@linux.intel.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Dave Hansen <dave.hansen@intel.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Mel Gorman <mgorman@suse.de>
      Cc: Nadav Amit <nadav.amit@gmail.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: linux-mm@kvack.org
      Link: http://lkml.kernel.org/r/3acc7ad02a2ec060d2321a1e0f6de1cb90069517.1498022414.git.luto@kernel.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • x86/ldt: Simplify the LDT switching logic · 73534258
      Andy Lutomirski authored
      Originally, Linux reloaded the LDT whenever the prev mm or the next
      mm had an LDT. It was changed in 2002 in:
      
        0bbed3be ("[PATCH] Thread-Local Storage (TLS) support")
      
      (commit from the historical tree), like this:
      
      -		/* load_LDT, if either the previous or next thread
      -		 * has a non-default LDT.
      +		/*
      +		 * load the LDT, if the LDT is different:
      		 */
      -		if (next->context.size+prev->context.size)
      +		if (unlikely(prev->context.ldt != next->context.ldt))
      			load_LDT(&next->context);
      
      The current code is unlikely to avoid any LDT reloads, since different
      mms won't share an LDT.
      
      When we redo lazy mode to stop flush IPIs without switching to
      init_mm, though, the current logic would become incorrect: it will
      be possible to have real_prev == next but nonetheless have a stale
      LDT descriptor.
      
      Simplify the code to update LDTR if either the previous or the next
      mm has an LDT, i.e. effectively restore the historical logic.
      While we're at it, clean up the code by moving all the ifdeffery to
      a header where it belongs.
      Signed-off-by: Andy Lutomirski <luto@kernel.org>
      Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
      Reviewed-by: Borislav Petkov <bp@suse.de>
      Acked-by: Rik van Riel <riel@redhat.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Arjan van de Ven <arjan@linux.intel.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Dave Hansen <dave.hansen@intel.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Mel Gorman <mgorman@suse.de>
      Cc: Nadav Amit <nadav.amit@gmail.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: linux-mm@kvack.org
      Link: http://lkml.kernel.org/r/2a859ac01245f9594c58f9d0a8b2ed8a7cd2507e.1498022414.git.luto@kernel.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
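      The difference between the two reload conditions can be illustrated with a small userspace model. The struct and function names are hypothetical stand-ins, not kernel definitions; the two predicates only paraphrase the conditions described in the message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct ldt { int nr_entries; };   /* hypothetical stand-in for an LDT */
struct mm  { struct ldt *ldt; };  /* models the per-mm LDT pointer */

/* 2002-era rule: reload only when the two LDT pointers differ. */
static bool reload_if_different(const struct mm *prev, const struct mm *next)
{
    assert(prev != NULL && next != NULL);
    return prev->ldt != next->ldt;
}

/* Rule described in this commit: reload whenever either mm has an LDT. */
static bool reload_if_any(const struct mm *prev, const struct mm *next)
{
    assert(prev != NULL && next != NULL);
    return prev->ldt != NULL || next->ldt != NULL;
}
```

      When prev and next are the same mm and that mm has an LDT (the real_prev == next case the message mentions), the pointer comparison skips the reload while the either-has-an-LDT rule performs it.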
    • a4eb8b99
      Ingo Molnar authored
    • Merge branch 'for-linus' of git://git.kernel.dk/linux-block · 8d829b9b
      Linus Torvalds authored
      Pull block fixes from Jens Axboe:
       "This contains a set of fixes for xen-blkback by way of Konrad, and a
        performance regression fix for blk-mq for shared tags.
      
        The latter could account for as much as a 50x reduction in
        performance, with the test case from the user with 500 name spaces. A
        more realistic setup on my end with 32 drives showed a 3.5x drop. The
        fix has been thoroughly tested before being committed"
      
      * 'for-linus' of git://git.kernel.dk/linux-block:
        blk-mq: fix performance regression with shared tags
        xen-blkback: don't leak stack data via response ring
        xen/blkback: don't use xen_blkif_get() in xen-blkback kthread
        xen/blkback: don't free be structure too early
        xen/blkback: fix disconnect while I/Os in flight
  3. 21 Jun, 2017 8 commits
    • Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net · 48b6bbef
      Linus Torvalds authored
      Pull networking fixes from David Miller:
      
       1) Fix refcounting wrt timers which hold onto inet6 address objects,
          from Xin Long.
      
       2) Fix an ancient bug in wireless wext ioctls, from Johannes Berg.
      
       3) Firmware handling fixes in brcm80211 driver, from Arend Van Spriel.
      
       4) Several mlx5 driver fixes (firmware readiness, timestamp cap
          reporting, devlink command validity checking, tc offloading, etc.)
          From Eli Cohen, Maor Dickman, Chris Mi, and Or Gerlitz.
      
       5) Fix dst leak in IP/IP6 tunnels, from Haishuang Yan.
      
       6) Fix dst refcount bug in decnet, from Wei Wang.
      
       7) Netdev can be double freed in register_vlan_device(). Fix from Gao
          Feng.
      
       8) Don't allow object to be destroyed while it is being dumped in SCTP,
          from Xin Long.
      
       9) Fix dpaa_eth build when modular, from Madalin Bucur.
      
      10) Fix throw route leaks, from Serhey Popovych.
      
      11) IFLA_GROUP missing from if_nlmsg_size() and ifla_policy[] table,
          also from Serhey Popovych.
      
      12) Fix premature TX SKB free in stmmac, from Niklas Cassel.
      
      * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (36 commits)
        igmp: add a missing spin_lock_init()
        net: stmmac: free an skb first when there are no longer any descriptors using it
        sfc: remove duplicate up_write on VF filter_sem
        rtnetlink: add IFLA_GROUP to ifla_policy
        ipv6: Do not leak throw route references
        dt-bindings: net: sms911x: Add missing optional VDD regulators
        dpaa_eth: reuse the dma_ops provided by the FMan MAC device
        fsl/fman: propagate dma_ops
        net/core: remove explicit do_softirq() from busy_poll_stop()
        fib_rules: Resolve goto rules target on delete
        sctp: ensure ep is not destroyed before doing the dump
        net/hns:bugfix of ethtool -t phy self_test
        net: 8021q: Fix one possible panic caused by BUG_ON in free_netdev
        cxgb4: notify uP to route ctrlq compl to rdma rspq
        ip6_tunnel: Correct tos value in collect_md mode
        decnet: always not take dst->__refcnt when inserting dst into hash table
        ip6_tunnel: fix potential issue in __ip6_tnl_rcv
        ip_tunnel: fix potential issue in ip_tunnel_rcv
        brcmfmac: fix uninitialized warning in brcmf_usb_probe_phase2()
        net/mlx5e: Avoid doing a cleanup call if the profile doesn't have it
        ...
    • Merge tag 'pinctrl-v4.12-3' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl · ce879b64
      Linus Torvalds authored
      Pull more pin control fixes from Linus Walleij:
       "Some late arriving fixes. I should have sent earlier, just swamped
        with work as usual. Thomas' patch makes AMD systems usable despite
        firmware bugs so it is fairly important.
      
         - Make the AMD driver use a regular interrupt rather than a chained
           one, so the system does not lock up.
      
         - Fix a function call error deep inside the STM32 driver"
      
      * tag 'pinctrl-v4.12-3' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl:
        pinctrl: stm32: Fix bad function call
        pinctrl/amd: Use regular interrupt instead of chained
    • Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/hid · db1b5ccd
      Linus Torvalds authored
      Pull HID fixes from Jiri Kosina:
      
       - revert of a commit to magicmouse driver that regresses certain
         devices, from Daniel Stone
      
       - quirk for a specific Dell mouse, from Sebastian Parschauer
      
      * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/hid:
        Revert "HID: magicmouse: Set multi-touch keybits for Magic Mouse"
        HID: Add quirk for Dell PIXART OEM mouse
    • Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/livepatching · dcba7108
      Linus Torvalds authored
      Pull livepatching fix from Jiri Kosina:
       "Fix the way livepatches are stacked with respect to RCU,
        from Petr Mladek"
      
      * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/livepatching:
        livepatch: Fix stacking of patches with respect to RCU
    • Merge branch 'ufs-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs · 021f6019
      Linus Torvalds authored
      Pull more ufs fixes from Al Viro:
       "More UFS fixes, unfortunately including build regression fix for the
        64-bit s_dsize commit. Fixed in this pile:
      
         - trivial bug in signedness of 32bit timestamps on ufs1
      
         - ESTALE instead of ufs_error() when doing open-by-fhandle on
           something deleted
      
         - build regression on 32bit in ufs_new_fragments() - calculating that
           many percents of u64 pulls libgcc stuff on some of those. Mea
           culpa.
      
         - fix hysteresis loop broken by typo in 2.4.14.7 (right next to the
           location of previous bug).
      
         - fix the insane limits of said hysteresis loop on filesystems with
           very low percentage of reserved blocks. If it's 5% or less, just
           use the OPTSPACE policy.
      
         - calculate those limits once, at mount time.
      
        This tree does pass xfstests clean (both ufs1 and ufs2) and it _does_
        survive cross-builds.
      
        Again, my apologies for missing that, especially since I have noticed
        a related percentage-of-64bit issue in earlier patches (when dealing
        with amount of reserved blocks). Self-LART applied..."
      
      * 'ufs-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
        ufs: fix the logics for tail relocation
        ufs_iget(): fail with -ESTALE on deleted inode
        fix signedness of timestamps on ufs1
    • Allow stack to grow up to address space limit · bd726c90
      Helge Deller authored
      Fix expand_upwards() on architectures with an upward-growing stack (parisc,
      metag and partly IA-64) to allow the stack to reliably grow exactly up to
      the address space limit given by TASK_SIZE.
      Signed-off-by: Helge Deller <deller@gmx.de>
      Acked-by: Hugh Dickins <hughd@google.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • mm: fix new crash in unmapped_area_topdown() · f4cb767d
      Hugh Dickins authored
      Trinity gets kernel BUG at mm/mmap.c:1963! in about 3 minutes of
      mmap testing.  That's the VM_BUG_ON(gap_end < gap_start) at the
      end of unmapped_area_topdown().  Linus points out how MAP_FIXED
      (which does not have to respect our stack guard gap intentions)
      could result in gap_end below gap_start there.  Fix that, and
      the similar case in its alternative, unmapped_area().
      
      Cc: stable@vger.kernel.org
      Fixes: 1be7107f ("mm: larger stack guard gap, between vmas")
      Reported-by: Dave Jones <davej@codemonkey.org.uk>
      Debugged-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Hugh Dickins <hughd@google.com>
      Acked-by: Michal Hocko <mhocko@suse.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
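      Why an inverted gap is dangerous can be shown with plain unsigned arithmetic. The sketch below is an illustration under assumptions, not the kernel code: both function names are hypothetical, and the exact condition used in the actual patch is not quoted in the message. It only demonstrates that a size test alone accepts a gap whose end lies below its start, because the subtraction wraps around.

```c
#include <assert.h>
#include <stdbool.h>

/* Size-only test: unsigned subtraction wraps, so an inverted gap looks huge. */
static bool gap_fits_unchecked(unsigned long gap_start, unsigned long gap_end,
                               unsigned long length)
{
    assert(length > 0);
    return gap_end - gap_start >= length;
}

/* Ordering is verified first, so an inverted gap is rejected. */
static bool gap_fits_checked(unsigned long gap_start, unsigned long gap_end,
                             unsigned long length)
{
    assert(length > 0);
    return gap_end > gap_start && gap_end - gap_start >= length;
}
```

      For example, if a guard gap pushes gap_start to 0x2000 while a MAP_FIXED mapping leaves gap_end at 0x1000, the unchecked test accepts a 0x1000-byte request and the checked one rejects it.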
    • blk-mq: fix performance regression with shared tags · 8e8320c9
      Jens Axboe authored
      If we have shared tags enabled, then every IO completion will trigger
      a full loop of every queue belonging to a tag set, and every hardware
      queue for each of those queues, even if nothing needs to be done.
      This causes a massive performance regression if you have a lot of
      shared devices.
      
      Instead of doing this huge full scan on every IO, add an atomic
      counter to the main queue that tracks how many hardware queues have
      been marked as needing a restart. With that, we can avoid looking for
      restartable queues, if we don't have to.
      
      Max reports that this restores performance. Before this patch, 4K
      IOPS was limited to 22-23K IOPS. With the patch, we are running at
      950-970K IOPS.
      
      Fixes: 6d8c6c0f ("blk-mq: Restart a single queue if tag sets are shared")
      Reported-by: Max Gurtovoy <maxg@mellanox.com>
      Tested-by: Max Gurtovoy <maxg@mellanox.com>
      Reviewed-by: Bart Van Assche <bart.vanassche@sandisk.com>
      Tested-by: Bart Van Assche <bart.vanassche@wdc.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
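      The counter-based fast path can be sketched with C11 atomics. This is a simplified userspace model under assumptions: the struct layout and function names are hypothetical, and the real code restarts queues rather than just clearing a flag. It shows only the idea from the message, that a shared counter lets each completion skip the full scan when no hardware queue is marked.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

struct hw_queue {
    atomic_bool needs_restart;   /* set when this queue must be re-run */
};

struct tag_set {
    atomic_int restart_pending;  /* number of queues currently marked */
    struct hw_queue *queues;
    size_t nr_queues;
};

/* Mark a queue; count only the unmarked -> marked transition. */
static void mark_restart(struct tag_set *set, struct hw_queue *hq)
{
    assert(set != NULL && hq != NULL);
    if (!atomic_exchange(&hq->needs_restart, true))
        atomic_fetch_add(&set->restart_pending, 1);
}

/* Called on every completion; returns how many queues were scanned. */
static size_t restart_on_completion(struct tag_set *set)
{
    size_t scanned = 0;

    assert(set != NULL);
    if (atomic_load(&set->restart_pending) == 0)
        return 0;                /* fast path: nothing is marked */

    for (size_t i = 0; i < set->nr_queues; i++) {
        scanned++;
        if (atomic_exchange(&set->queues[i].needs_restart, false))
            atomic_fetch_sub(&set->restart_pending, 1);
    }
    return scanned;
}
```

      In this model a completion with no marked queues costs one atomic load, regardless of how many queues share the tag set.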
  4. 20 Jun, 2017 19 commits
    • igmp: add a missing spin_lock_init() · b4846fc3
      WANG Cong authored
      Andrey reported a lockdep warning on non-initialized
      spinlock:
      
       INFO: trying to register non-static key.
       the code is fine but needs lockdep annotation.
       turning off the locking correctness validator.
       CPU: 1 PID: 4099 Comm: a.out Not tainted 4.12.0-rc6+ #9
       Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
       Call Trace:
        __dump_stack lib/dump_stack.c:16
        dump_stack+0x292/0x395 lib/dump_stack.c:52
        register_lock_class+0x717/0x1aa0 kernel/locking/lockdep.c:755
        ? 0xffffffffa0000000
        __lock_acquire+0x269/0x3690 kernel/locking/lockdep.c:3255
        lock_acquire+0x22d/0x560 kernel/locking/lockdep.c:3855
        __raw_spin_lock_bh ./include/linux/spinlock_api_smp.h:135
        _raw_spin_lock_bh+0x36/0x50 kernel/locking/spinlock.c:175
        spin_lock_bh ./include/linux/spinlock.h:304
        ip_mc_clear_src+0x27/0x1e0 net/ipv4/igmp.c:2076
        igmpv3_clear_delrec+0xee/0x4f0 net/ipv4/igmp.c:1194
        ip_mc_destroy_dev+0x4e/0x190 net/ipv4/igmp.c:1736
      
      We miss a spin_lock_init() in igmpv3_add_delrec(), probably
      because previously we never used it on this code path. Since
      we already unlink it from the global mc_tomb list, it is
      probably safe not to acquire this spinlock here. It does no
      harm to have it, though, and it avoids conditional locking.
      
      Fixes: c38b7d32 ("igmp: acquire pmc lock for ip_mc_clear_src()")
      Reported-by: Andrey Konovalov <andreyknvl@google.com>
      Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge tag 'wireless-drivers-for-davem-2017-06-20' of... · afd64631
      David S. Miller authored
      Merge tag 'wireless-drivers-for-davem-2017-06-20' of git://git.kernel.org/pub/scm/linux/kernel/git/kvalo/wireless-drivers
      
      Kalle Valo says:
      
      ====================
      wireless-drivers fixes for 4.12
      
      Two important fixes for brcmfmac. The rest of the brcmfmac patches are
      either code preparation or a fix for a new build warning.
      
      brcmfmac
      
      * fix a NULL pointer dereference during resume
      
      * fix a NULL pointer dereference with USB devices, a regression from
        v4.12-rc1
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: stmmac: free an skb first when there are no longer any descriptors using it · 05cf0d1b
      Niklas Cassel authored
      When having the skb pointer in the first descriptor, stmmac_tx_clean
      can get called at a moment where the IP has only cleared the own bit
      of the first descriptor, thus freeing the skb, even though there can
      be several descriptors whose buffers point into the same skb.
      
      By simply moving the skb pointer from the first descriptor to the last
      descriptor, a skb will get freed only when the IP has cleared the
      own bit of all the descriptors that are using that skb.
      Signed-off-by: Niklas Cassel <niklas.cassel@axis.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
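      The effect of keeping the skb pointer on the last descriptor can be modelled with a small ring. Everything here is a hypothetical userspace sketch (struct tx_ring, queue_packet() and tx_clean() are illustrative names, and a counter stands in for actually freeing the buffer); it is not the stmmac driver code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define RING_SIZE 8

struct tx_desc { bool own; };    /* own == true: hardware still owns the slot */

struct tx_ring {
    struct tx_desc desc[RING_SIZE];
    void *skb[RING_SIZE];        /* per-slot buffer pointer, NULL if none */
    unsigned int dirty;          /* next slot to reclaim */
    unsigned int cur;            /* next slot to fill */
    unsigned int freed;          /* stands in for the number of skbs freed */
};

/* Queue one packet spanning nfrags slots; record the skb on the LAST slot. */
static void queue_packet(struct tx_ring *r, void *skb, unsigned int nfrags)
{
    assert(nfrags > 0 && nfrags < RING_SIZE);
    for (unsigned int i = 0; i < nfrags; i++) {
        unsigned int slot = (r->cur + i) % RING_SIZE;

        r->desc[slot].own = true;
        r->skb[slot] = (i == nfrags - 1) ? skb : NULL;
    }
    r->cur = (r->cur + nfrags) % RING_SIZE;
}

/* Reclaim completed slots in order; stop at the first one still owned. */
static void tx_clean(struct tx_ring *r)
{
    while (r->dirty != r->cur && !r->desc[r->dirty].own) {
        if (r->skb[r->dirty] != NULL) {
            r->skb[r->dirty] = NULL;
            r->freed++;          /* a driver would free the skb here */
        }
        r->dirty = (r->dirty + 1) % RING_SIZE;
    }
}
```

      Because the cleaner walks slots in order, it can only reach the slot holding the pointer after every earlier slot of the same packet has been released by the hardware.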
    • sfc: remove duplicate up_write on VF filter_sem · 57f0c9cf
      Edward Cree authored
      Somehow two copies of the line 'up_write(&vf->efx->filter_sem);' got into
      efx_ef10_sriov_set_vf_vlan().  This would put the mutex in a bad state and
      cause all subsequent down attempts to hang.
      
      Fixes: 671b53ee ("sfc: Ensure down_write(&filter_sem) and up_write() are matched before calling efx_net_open()")
      Signed-off-by: Edward Cree <ecree@solarflare.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • rtnetlink: add IFLA_GROUP to ifla_policy · db833d40
      Serhey Popovych authored
      Support for network interface groups was added a while ago; however,
      there has been no IFLA_GROUP attribute description in the policy
      and netlink message size calculations until now.
      
      Add IFLA_GROUP attribute to the policy.
      
      Fixes: cbda10fa ("net_device: add support for network device groups")
      Signed-off-by: Serhey Popovych <serhe.popovych@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • ipv6: Do not leak throw route references · 07f61557
      Serhey Popovych authored
      While commit 73ba57bf ("ipv6: fix backtracking for throw routes")
      does a good job of propagating errors to fib_rules_lookup()
      in the fib rules core framework, which also corrects throw route
      handling, it does not solve the route reference leak that
      occurs when we return -EAGAIN to fib_rules_lookup()
      and leave the routing table entry referenced in arg->result.
      
      If the rule with the matched throw route isn't the last matched in the
      list, we overwrite arg->result, losing the reference on the throw
      route stored previously forever.
      
      We also partially revert commit ab997ad4 ("ipv6: fix the
      incorrect return value of throw route") since we never return
      a routing table entry with dst.error == -EAGAIN when
      CONFIG_IPV6_MULTIPLE_TABLES is on. Also, there is no point
      in checking for the RTF_REJECT flag since it is always set for
      throw routes.
      
      Fixes: 73ba57bf ("ipv6: fix backtracking for throw routes")
      Signed-off-by: Serhey Popovych <serhe.popovych@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • dt-bindings: net: sms911x: Add missing optional VDD regulators · 7e113321
      Krzysztof Kozlowski authored
      The lan911x family of devices must be supplied from 3.3 V power
      supplies (connected to the VDD_IO, VDD_A and VREG_3.3 pins).  The existing
      driver, however, obtains only the VDD_IO and VDD_A regulators, and does so
      optionally, so document this in the bindings.
      Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org>
      Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch 'net-fix-loadable-module-for-DPAA-Ethernet' · 73b098d6
      David S. Miller authored
      Madalin Bucur says:
      
      ====================
      net: fix loadable module for DPAA Ethernet
      
      The DPAA Ethernet makes use of a symbol that is not exported.
      Address the issue by propagating the dma_ops rather than calling
      arch_setup_dma_ops().
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • dpaa_eth: reuse the dma_ops provided by the FMan MAC device · fb52728a
      Madalin Bucur authored
      Remove the use of arch_setup_dma_ops() that was not exported
      and was breaking loadable module compilation.
      Signed-off-by: Madalin Bucur <madalin.bucur@nxp.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • fsl/fman: propagate dma_ops · 5567e989
      Madalin Bucur authored
      Make sure dma_ops are set, to be later used by the Ethernet driver.
      Signed-off-by: Madalin Bucur <madalin.bucur@nxp.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net/core: remove explicit do_softirq() from busy_poll_stop() · fe420d87
      Sebastian Siewior authored
      Since commit 217f6974 ("net: busy-poll: allow preemption in
      sk_busy_loop()") there is an explicit do_softirq() invocation after
      local_bh_enable() has been invoked.
      I don't understand why we need this because local_bh_enable() will
      invoke do_softirq() once the softirq counter reaches zero and we have
      softirq-related work pending.
      Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
      Signed-off-by: David S. Miller <davem@davemloft.net>
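      The reasoning in the message can be checked against a toy model of the behaviour it describes. The struct and the model_* functions below are hypothetical and greatly simplified; they only encode the stated rule that enabling bottom halves runs pending softirq work once the disable counter reaches zero.

```c
#include <assert.h>
#include <stdbool.h>

struct cpu_state {
    int bh_disable_count;   /* nesting depth of bottom-half disabling */
    bool softirq_pending;   /* softirq work is waiting */
    int softirqs_run;       /* how many times pending work was processed */
};

static void model_local_bh_disable(struct cpu_state *s)
{
    s->bh_disable_count++;
}

/* Process pending work, if any; a no-op when nothing is pending. */
static void model_do_softirq(struct cpu_state *s)
{
    if (s->softirq_pending) {
        s->softirq_pending = false;
        s->softirqs_run++;
    }
}

/* Enabling runs pending work once the outermost disable is undone. */
static void model_local_bh_enable(struct cpu_state *s)
{
    assert(s->bh_disable_count > 0);
    s->bh_disable_count--;
    if (s->bh_disable_count == 0 && s->softirq_pending)
        model_do_softirq(s);
}
```

      In this model an explicit model_do_softirq() call placed right after model_local_bh_enable() finds nothing pending, which mirrors the argument for dropping the explicit call.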
    • fib_rules: Resolve goto rules target on delete · bdaf32c3
      Serhey Popovych authored
      We should avoid marking goto rules unresolved when their
      target is actually reachable after rule deletion.
      
      Consider the following sample scenario:
      
        # ip -4 ru sh
        0:      from all lookup local
        32000:  from all goto 32100
        32100:  from all lookup main
        32100:  from all lookup default
        32766:  from all lookup main
        32767:  from all lookup default
      
        # ip -4 ru del pref 32100 table main
        # ip -4 ru sh
        0:      from all lookup local
        32000:  from all goto 32100 [unresolved]
        32100:  from all lookup default
        32766:  from all lookup main
        32767:  from all lookup default
      
      After removal of the first rule with preference 32100 we
      mark all goto rules as unreachable, even when a rule with the
      same preference as the removed one is still present.
      
      Check if a next rule with the same preference is available
      and make all rules with a goto action point to it.
      Signed-off-by: Serhey Popovych <serhe.popovych@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
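      The retargeting step can be sketched on a singly linked rule list sorted by preference. This is a hypothetical userspace model (struct rule and delete_rule() are illustrative, not the fib_rules implementation): on delete, goto rules that pointed at the removed rule are moved to the next rule if it has the same preference, and are left unresolved otherwise.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct rule {
    unsigned int pref;         /* rule preference */
    bool is_goto;              /* true for "goto" rules */
    unsigned int target_pref;  /* preference a goto rule jumps to */
    struct rule *target;       /* resolved target, NULL means unresolved */
    struct rule *next;         /* next rule, list is sorted by pref */
};

/* Unlink victim and fix up goto rules that pointed at it. */
static void delete_rule(struct rule **head, struct rule *victim)
{
    struct rule **pp = head;
    struct rule *replacement = NULL;

    while (*pp != NULL && *pp != victim)
        pp = &(*pp)->next;
    assert(*pp == victim);     /* victim must be on the list */
    *pp = victim->next;

    /* A following rule with the same preference takes over as target. */
    if (victim->next != NULL && victim->next->pref == victim->pref)
        replacement = victim->next;

    for (struct rule *r = *head; r != NULL; r = r->next)
        if (r->is_goto && r->target == victim)
            r->target = replacement;
}
```

      Applied to the scenario in the message, deleting the first rule with preference 32100 leaves the goto rule at 32000 pointing at the remaining 32100 rule; deleting that one as well leaves it unresolved.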
    • Merge branch 'stable/for-jens-4.12' of... · ec2f0fad
      Jens Axboe authored
      Merge branch 'stable/for-jens-4.12' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen into for-linus
      
      Pull xen-blkback fixes from Konrad:
      
      "Security and memory leak fixes in xen block driver."
    • x86/boot/64: Put __startup_64() into .head.text · 26179670
      Kirill A. Shutemov authored
      Put __startup_64() and fixup_pointer() into .head.text section to make
      sure it's always near startup_64() and always callable.
      Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: kernel test robot <fengguang.wu@intel.com>
      Cc: wfg@linux.intel.com
      Link: http://lkml.kernel.org/r/20170616113024.ajmif63cmcszry5a@black.fi.intel.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • 900a88ef
      Jiri Kosina authored
    • livepatch: Fix stacking of patches with respect to RCU · 842c0884
      Petr Mladek authored
      rcu_read_(un)lock(), list_*_rcu(), and synchronize_rcu() are used for safe
      access to and manipulation of the list of patches that modify the same function.
      In particular, it is the variable func_stack that is accessible from the ftrace
      handler via struct ftrace_ops and klp_ops.
      
      Of course, it also synchronizes some states of the patch on the top of the
      stack, e.g. func->transition in klp_ftrace_handler.
      
      At the same time, this mechanism also guards the manipulation of
      task->patch_state. It is modified according to the state of the transition and
      the state of the process.
      
      Now, all this works well as long as RCU works well. Sadly livepatching might
      get into some corner cases when this is not true. For example, RCU is not
      watching when rcu_read_lock() is taken in idle threads.  It is because they
      might sleep and prevent reaching the grace period for too long.
      
      There are ways to make RCU watch even in idle threads, see
      rcu_irq_enter(). But there is a small location inside the RCU infrastructure
      where even this does not work.
      
      This small problematic location can be detected either before calling
      rcu_irq_enter() by rcu_irq_enter_disabled() or later by rcu_is_watching().
      Sadly, there is no safe way to handle it.  Once we detect that RCU was not
      watching, we might see an inconsistent state of the function stack and the
      related variables in klp_ftrace_handler(). Then we could make a wrong decision,
      use an incompatible implementation of the function and break the consistency
      of the system. We could warn but we could not avoid the damage.
      
      Fortunately, ftrace has similar problems and they seem to be solved well there.
      It uses a heavy weight implementation of some RCU operations. In particular, it
      replaces:
      
        + rcu_read_lock() with preempt_disable_notrace()
        + rcu_read_unlock() with preempt_enable_notrace()
        + synchronize_rcu() with schedule_on_each_cpu(sync_work)
      
      My understanding is that this is an RCU implementation from the stone age. It
      meets the core RCU requirements but it is rather inefficient. In particular, it
      does not allow batching or speeding up the synchronize calls.
      
      On the other hand, it is very trivial. It allows safely tracing and/or
      livepatching even the RCU core infrastructure.  And the efficiency is not a
      big issue because using ftrace or livepatches on production systems is a rare
      operation.  The safety is much more important than a negligible extra load.
      
      Note that the alternative implementation follows the RCU principles. Therefore,
      we could and actually must use list_*_rcu() variants when manipulating the
      func_stack.  These functions allow accessing the pointers in the right
      order and with the right barriers. But they do not use any other
      information that would be set only by rcu_read_lock().
      
      Also note that there are actually two problems solved in ftrace:
      
      First, it cares about the consistency of RCU read sections.  This is solved
      in the way described and used in this patch.
      
      Second, ftrace needs to make sure that nobody is inside the dynamic trampoline
      when it is being freed. For this, it also calls synchronize_rcu_tasks() in
      a preemptive kernel in ftrace_shutdown().
      
      Livepatch has a similar problem but it is solved by ftrace for free.
      klp_ftrace_handler() is a good guy and never sleeps. In addition, it is
      registered with FTRACE_OPS_FL_DYNAMIC. This causes
      unregister_ftrace_function() to call:
      
      	* schedule_on_each_cpu(ftrace_sync) - always
      	* synchronize_rcu_tasks() - in preemptive kernel
      
      The effect is that nobody is inside either the dynamic trampoline or
      the ftrace handler after unregister_ftrace_function() returns.
      
      [jkosina@suse.cz: reformat changelog, fix comment]
      Signed-off-by: Petr Mladek <pmladek@suse.com>
      Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
      Acked-by: Miroslav Benes <mbenes@suse.cz>
      Signed-off-by: Jiri Kosina <jkosina@suse.cz>
    • Revert "HID: magicmouse: Set multi-touch keybits for Magic Mouse" · 53145c2e
      Daniel Stone authored
      Setting these bits causes libinput to fail to initialize the device;
      setting BTN_TOUCH and BTN_TOOL_FINGER causes it to treat the mouse as a
      touchpad, and it then refuses to continue when it discovers ABS_X is not
      set.
      
      This breaks all known Wayland compositors, as well as Xorg when the
      libinput driver is being used.
      
      This reverts commit f4b65b95.
      Signed-off-by: Daniel Stone <daniels@collabora.com>
      Cc: Che-Liang Chiou <clchiou@chromium.org>
      Cc: Thierry Escande <thierry.escande@collabora.com>
      Cc: Jiri Kosina <jkosina@suse.cz>
      Cc: Benjamin Tissoires <benjamin.tissoires@redhat.com>
      Acked-by: Benjamin Tissoires <benjamin.tissoires@redhat.com>
      Signed-off-by: Jiri Kosina <jkosina@suse.cz>
    • Merge tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux · 9705596d
      Linus Torvalds authored
      Pull clk fixes from Stephen Boyd:
       "One build fix for an Amlogic clk driver and a handful of Allwinner clk
        driver fixes for some DT bindings and a randconfig build error that
        all came in this merge window"
      
      * tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux:
        clk: sunxi-ng: a64: Export PLL_PERIPH0 clock for the PRCM
        clk: sunxi-ng: h3: Export PLL_PERIPH0 clock for the PRCM
        dt-bindings: clock: sunxi-ccu: Add pll-periph to PRCM's needed clocks
        clk: sunxi-ng: sun5i: Fix ahb_bist_clk definition
        clk: sunxi-ng: enable SUNXI_CCU_MP for PRCM
        clk: meson: gxbb: fix build error without RESET_CONTROLLER
        clk: sunxi-ng: v3s: Fix usb otg device reset bit
        clk: sunxi-ng: a31: Correct lcd1-ch1 clock register offset
    • Merge tag 'ntb-4.12-bugfixes' of git://github.com/jonmason/ntb · 865be780
      Linus Torvalds authored
      Pull NTB fixes from Jon Mason:
       "NTB bug fixes to address the modinfo in ntb_perf, a couple of bugs in
        the NTB transport QP calculations, skx doorbells, and sleeping in
        ntb_async_tx_submit"
      
      * tag 'ntb-4.12-bugfixes' of git://github.com/jonmason/ntb:
        ntb: no sleep in ntb_async_tx_submit
        ntb: ntb_hw_intel: Skylake doorbells should be 32bits, not 64bits
        ntb_transport: fix bug calculating num_qps_mw
        ntb_transport: fix qp count bug
        NTB: ntb_test: fix bug printing ntb_perf results
        ntb: Correct modinfo usage statement for ntb_perf
  5. 19 Jun, 2017 8 commits