1. 10 Aug, 2017 9 commits
    • cpuset: Make nr_cpusets private · be040bea
      Paolo Bonzini authored
      Any use of key->enabled (that is static_key_enabled and static_key_count)
      outside jump_label_lock should handle its own serialization.  In the case
      of cpusets_enabled_key, the key is always incremented/decremented under
      cpuset_mutex, and hence the same rule applies to nr_cpusets.  The rule
      *is* respected currently, but the mutex is static so nr_cpusets should
      be static too.
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Acked-by: Zefan Li <lizefan@huawei.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/1501601046-35683-4-git-send-email-pbonzini@redhat.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • jump_label: Do not use unserialized static_key_enabled() · 7a34bcb8
      Paolo Bonzini authored
      Any use of key->enabled (that is static_key_enabled and static_key_count)
      outside jump_label_lock should handle its own serialization.  The only
      two that are not doing so are the UDP encapsulation static keys.  Change
      them to use static_key_enable, which now correctly tests key->enabled under
      the jump label lock.
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Eric Dumazet <eric.dumazet@gmail.com>
      Cc: Jason Baron <jbaron@akamai.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/1501601046-35683-3-git-send-email-pbonzini@redhat.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • jump_label: Fix concurrent static_key_enable/disable() · 1dbb6704
      Paolo Bonzini authored
      static_key_enable/disable are trying to cap the static key count to
      0/1.  However, their use of key->enabled is outside jump_label_lock
      so they do not really ensure that.
      
      Rewrite them to do a quick check for an already enabled (respectively,
      already disabled) key, and then recheck under the jump label lock.  Unlike
      static_key_slow_inc/dec, a failed check under the jump label lock does
      not modify key->enabled.
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Eric Dumazet <eric.dumazet@gmail.com>
      Cc: Jason Baron <jbaron@akamai.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/1501601046-35683-2-git-send-email-pbonzini@redhat.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
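      The check-then-recheck pattern described in the commit above can be
      sketched in user space. This is only an illustration under stated
      assumptions: toy_key, toy_jump_label_lock, toy_key_enable() and
      toy_key_disable() are hypothetical names, a pthread mutex stands in
      for the jump label lock, and C11 atomics stand in for the kernel's
      atomic_t. It is not the kernel implementation.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>

/* Hypothetical stand-in for struct static_key: only the count matters here. */
struct toy_key {
    atomic_int enabled;
    int patch_count;    /* how many times the "code patching" step ran */
};

/* Hypothetical stand-in for the jump label lock. */
static pthread_mutex_t toy_jump_label_lock = PTHREAD_MUTEX_INITIALIZER;

static void toy_key_enable(struct toy_key *key)
{
    assert(key);

    /* Quick unlocked check: nothing to do if the key is already enabled. */
    if (atomic_load(&key->enabled) > 0)
        return;

    pthread_mutex_lock(&toy_jump_label_lock);
    /* Recheck under the lock; a failed check leaves the count untouched. */
    if (atomic_load(&key->enabled) == 0) {
        key->patch_count++;               /* placeholder for patching branch sites */
        atomic_store(&key->enabled, 1);
    }
    pthread_mutex_unlock(&toy_jump_label_lock);
}

static void toy_key_disable(struct toy_key *key)
{
    assert(key);

    /* Quick unlocked check: nothing to do if the key is already disabled. */
    if (atomic_load(&key->enabled) == 0)
        return;

    pthread_mutex_lock(&toy_jump_label_lock);
    /* Recheck under the lock before flipping the key back to 0. */
    if (atomic_load(&key->enabled) == 1) {
        key->patch_count++;
        atomic_store(&key->enabled, 0);
    }
    pthread_mutex_unlock(&toy_jump_label_lock);
}
```

      Calling toy_key_enable() twice in a row patches once and leaves the
      count at 1, which is the 0/1 capping the commit message asks for.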
    • locking/rwsem-xadd: Add killable versions of rwsem_down_read_failed() · 83ced169
      Kirill Tkhai authored
      Rename rwsem_down_read_failed() to __rwsem_down_read_failed_common()
      and teach it to abort waiting when a signal is pending and a killable
      state argument is passed.

      Note that we shouldn't wake anybody up in the EINTR path, because:

      We check for (waiter.task) under the spinlock before we go to the
      out_nolock path. The current task could not have been woken up, so
      either a writer owns the sem or a writer is the first waiter.
      In both cases we shouldn't wake anybody. If a writer owns the sem
      and we were the only waiter, remove RWSEM_WAITING_BIAS, as there
      are no waiters anymore.
      Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: arnd@arndb.de
      Cc: avagin@virtuozzo.com
      Cc: davem@davemloft.net
      Cc: fenghua.yu@intel.com
      Cc: gorcunov@virtuozzo.com
      Cc: heiko.carstens@de.ibm.com
      Cc: hpa@zytor.com
      Cc: ink@jurassic.park.msu.ru
      Cc: mattst88@gmail.com
      Cc: rth@twiddle.net
      Cc: schwidefsky@de.ibm.com
      Cc: tony.luck@intel.com
      Link: http://lkml.kernel.org/r/149789534632.9059.2901382369609922565.stgit@localhost.localdomain
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
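      The accounting rule for the EINTR path (drop the waiting bias only
      when the interrupted reader was the last waiter, and wake nobody)
      can be illustrated with a toy model. Everything here is an
      assumption made for illustration: toy_rwsem, TOY_WAITING_BIAS,
      toy_queue_reader() and toy_abort_read_wait() are hypothetical, the
      wait list is reduced to a counter, and the caller is assumed to hold
      the semaphore's wait lock.

```c
#include <assert.h>
#include <errno.h>

#define TOY_WAITING_BIAS (-65536L)  /* hypothetical value, for illustration only */

struct toy_rwsem {
    long count;       /* includes TOY_WAITING_BIAS while any waiter is queued */
    int nr_waiters;   /* stands in for the wait list */
    int wakeups;      /* wakeups issued; stays 0 on the abort path */
};

/* A reader joins the wait queue; the first waiter adds the waiting bias. */
static void toy_queue_reader(struct toy_rwsem *sem)
{
    if (sem->nr_waiters++ == 0)
        sem->count += TOY_WAITING_BIAS;
}

/* A killable reader gives up waiting after a fatal signal. */
static int toy_abort_read_wait(struct toy_rwsem *sem)
{
    assert(sem->nr_waiters > 0);

    sem->nr_waiters--;
    if (sem->nr_waiters == 0)
        sem->count -= TOY_WAITING_BIAS;   /* no waiters left: remove the bias */

    /* Deliberately no wakeup: a writer owns the sem or is the first waiter. */
    return -EINTR;
}
```

      With two queued readers, the first abort leaves the bias in place and
      the second one removes it, while the wakeup counter never changes.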
    • locking/rwsem-spinlock: Add killable versions of __down_read() · 0aa1125f
      Kirill Tkhai authored
      Rename __down_read() to __down_read_common() and teach it
      to abort waiting when a signal is pending and a killable
      state argument is passed.

      Note that we shouldn't wake anybody up in the EINTR path, because:

      We check signal_pending_state() after the (!waiter.task)
      test and under the spinlock. So the current task could not
      have been woken up. This can happen in two cases: a writer owns
      the sem, or a writer is the first waiter of the sem.

      If a writer owns the sem, no one else may work
      with it in parallel. It will wake somebody when it
      calls up_write() or downgrade_write().

      If a writer is the first waiter, it will be woken up
      when the last active reader releases the sem and
      sem->count becomes 0.

      Also note that set_current_state() may be moved down
      to just before schedule() (after the !waiter.task check), as all
      assignments in this type of semaphore (including wake_up)
      occur under the spinlock, so we can't miss anything.
      Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: arnd@arndb.de
      Cc: avagin@virtuozzo.com
      Cc: davem@davemloft.net
      Cc: fenghua.yu@intel.com
      Cc: gorcunov@virtuozzo.com
      Cc: heiko.carstens@de.ibm.com
      Cc: hpa@zytor.com
      Cc: ink@jurassic.park.msu.ru
      Cc: mattst88@gmail.com
      Cc: rth@twiddle.net
      Cc: schwidefsky@de.ibm.com
      Cc: tony.luck@intel.com
      Link: http://lkml.kernel.org/r/149789533283.9059.9829416940494747182.stgit@localhost.localdomain
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • locking/osq_lock: Fix osq_lock queue corruption · 50972fe7
      Prateek Sood authored
      Fix ordering of link creation between node->prev and prev->next in
      osq_lock(). Consider a case in which the optimistic spin queue is
      CPU6->CPU2 and CPU6 has acquired the lock.
      
              tail
                v
        ,-. <- ,-.
        |6|    |2|
        `-' -> `-'
      
      At this point if CPU0 comes in to acquire osq_lock, it will update the
      tail count.
      
        CPU2			CPU0
        ----------------------------------
      
      				       tail
      				         v
      			  ,-. <- ,-.    ,-.
      			  |6|    |2|    |0|
      			  `-' -> `-'    `-'
      
      After the tail count update, if CPU2 starts to unqueue itself from the
      optimistic spin queue, it will find the tail count updated to CPU0 and
      set CPU2 node->next to NULL in osq_wait_next().
      
        unqueue-A
      
      	       tail
      	         v
        ,-. <- ,-.    ,-.
        |6|    |2|    |0|
        `-'    `-'    `-'
      
        unqueue-B
      
        ->tail != curr && !node->next
      
      If the following stores are reordered, then prev->next, where prev
      is CPU2, would be updated to point to the CPU0 node:
      
      				       tail
      				         v
      			  ,-. <- ,-.    ,-.
      			  |6|    |2|    |0|
      			  `-'    `-' -> `-'
      
        osq_wait_next()
          node->next <- 0
          xchg(node->next, NULL)
      
      	       tail
      	         v
        ,-. <- ,-.    ,-.
        |6|    |2|    |0|
        `-'    `-'    `-'
      
        unqueue-C
      
      At this point, if the next instruction
      	WRITE_ONCE(next->prev, prev);
      in the CPU2 path is committed before the update of CPU0 node->prev = prev,
      then CPU0 node->prev will point to the CPU6 node.
      
      	       tail
          v----------. v
        ,-. <- ,-.    ,-.
        |6|    |2|    |0|
        `-'    `-'    `-'
           `----------^
      
      At this point, if the CPU0 path's node->prev = prev is committed, CPU0's
      prev changes back to the CPU2 node. CPU2 node->next is currently
      NULL,
      
      				       tail
      			                 v
      			  ,-. <- ,-. <- ,-.
      			  |6|    |2|    |0|
      			  `-'    `-'    `-'
      			     `----------^
      
      so if CPU0 gets into the unqueue path of osq_lock() it will spin in an
      infinite loop, as the condition prev->next == node will never be true.
      Signed-off-by: Prateek Sood <prsood@codeaurora.org>
      [ Added pictures, rewrote comments. ]
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: sramana@codeaurora.org
      Link: http://lkml.kernel.org/r/1500040076-27626-1-git-send-email-prsood@codeaurora.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
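      The ordering requirement in this commit can be sketched with C11
      atomics: the new node's prev pointer has to be visible before the
      node is published through prev->next. This is a user-space analogue
      under assumptions, not the kernel code: toy_node and toy_link_after()
      are hypothetical names, and a release store is swapped in for
      whatever barrier the kernel fix places between the two stores.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-in for an optimistic spin queue node. */
struct toy_node {
    _Atomic(struct toy_node *) next;
    _Atomic(struct toy_node *) prev;
};

/* Link node behind prev, which is assumed to be the old tail. */
static void toy_link_after(struct toy_node *prev, struct toy_node *node)
{
    assert(prev && node);

    atomic_store_explicit(&node->next, NULL, memory_order_relaxed);

    /* Step 1: record the predecessor in the new node. */
    atomic_store_explicit(&node->prev, prev, memory_order_relaxed);

    /*
     * Step 2: publish the node. Release ordering keeps the store to
     * node->prev from becoming visible after this store, so a CPU that
     * finds the node through prev->next cannot have its write to
     * node->prev overwritten by a late step 1.
     */
    atomic_store_explicit(&prev->next, node, memory_order_release);
}
```

      A reader that follows prev->next with an acquire load is then
      guaranteed to observe the prev pointer written in step 1.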
    • locking/atomic: Fix atomic_set_release() for 'funny' architectures · 9d664c0a
      Peter Zijlstra authored
      Those architectures that have a special atomic_set implementation also
      need a special atomic_set_release(), because for the very same reason
      WRITE_ONCE() is broken for them, smp_store_release() is too.
      
      The vast majority are architectures that have a spinlock-hash-based atomic
      implementation, except hexagon, which seems to have a hardware 'feature'.
      
      The spinlock based atomics should be SC, that is, none of them appear to
      place extra barriers in atomic_cmpxchg() or any of the other SC atomic
      primitives and therefore seem to rely on their spinlock implementation
      being SC (I did not fully validate all that).
      
      Therefore, the normal atomic_set() is SC and can be used as
      atomic_set_release().
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Acked-by: Chris Metcalf <cmetcalf@mellanox.com> [for tile]
      Cc: Boqun Feng <boqun.feng@gmail.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Paul McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: davem@davemloft.net
      Cc: james.hogan@imgtec.com
      Cc: jejb@parisc-linux.org
      Cc: rkuo@codeaurora.org
      Cc: vgupta@synopsys.com
      Link: http://lkml.kernel.org/r/20170609110506.yod47flaav3wgoj5@hirez.programming.kicks-ass.net
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
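      A rough sketch of why a spinlock-hash-based implementation needs a
      locked atomic_set(), and why the release variant can reuse it. The
      names toy_atomic_t, toy_lock_for(), toy_atomic_add_return(),
      toy_atomic_set() and toy_atomic_set_release() are hypothetical, and
      pthread mutexes stand in for the architecture's hashed spinlocks.

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>

#define TOY_HASH_SIZE 4

typedef struct { int counter; } toy_atomic_t;

/* Hypothetical lock table; each atomic variable hashes to one lock. */
static pthread_mutex_t toy_locks[TOY_HASH_SIZE] = {
    PTHREAD_MUTEX_INITIALIZER, PTHREAD_MUTEX_INITIALIZER,
    PTHREAD_MUTEX_INITIALIZER, PTHREAD_MUTEX_INITIALIZER,
};

static pthread_mutex_t *toy_lock_for(toy_atomic_t *v)
{
    assert(v);
    return &toy_locks[((uintptr_t)v >> 4) % TOY_HASH_SIZE];
}

/* Read-modify-write done entirely under the hashed lock. */
static int toy_atomic_add_return(int i, toy_atomic_t *v)
{
    pthread_mutex_t *lock = toy_lock_for(v);
    int ret;

    pthread_mutex_lock(lock);
    ret = (v->counter += i);
    pthread_mutex_unlock(lock);
    return ret;
}

/*
 * A plain store could land in the middle of a locked read-modify-write
 * and be lost, so the set takes the same lock.
 */
static void toy_atomic_set(toy_atomic_t *v, int i)
{
    pthread_mutex_t *lock = toy_lock_for(v);

    pthread_mutex_lock(lock);
    v->counter = i;
    pthread_mutex_unlock(lock);
}

/* The locked set already orders at least as strongly as a release store. */
static void toy_atomic_set_release(toy_atomic_t *v, int i)
{
    toy_atomic_set(v, i);
}
```

      The design choice mirrors the commit message: once every access goes
      through the lock, the release variant needs no extra barrier.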
    • sched/wait: Remove the lockless swait_active() check in swake_up*() · 35a2897c
      Boqun Feng authored
      Steven Rostedt reported a potential race in RCU core because of
      swake_up():
      
              CPU0                            CPU1
              ----                            ----
                                      __call_rcu_core() {
      
                                       spin_lock(rnp_root)
                                       need_wake = __rcu_start_gp() {
                                        rcu_start_gp_advanced() {
                                         gp_flags = FLAG_INIT
                                        }
                                       }
      
       rcu_gp_kthread() {
         swait_event_interruptible(wq,
              gp_flags & FLAG_INIT) {
         spin_lock(q->lock)
      
                                      *fetch wq->task_list here! *
      
         list_add(wq->task_list, q->task_list)
         spin_unlock(q->lock);
      
         *fetch old value of gp_flags here *
      
                                       spin_unlock(rnp_root)
      
                                       rcu_gp_kthread_wake() {
                                        swake_up(wq) {
                                         swait_active(wq) {
                                          list_empty(wq->task_list)
      
                                         } * return false *
      
        if (condition) * false *
          schedule();
      
      In this case, a wakeup is missed, which could cause the rcu_gp_kthread
      to wait for a long time.
      
      The reason for this is that we do a lockless swait_active() check in
      swake_up(). To fix this, we can either 1) add a smp_mb() in swake_up()
      before swait_active() to provide the proper order or 2) simply remove
      the swait_active() in swake_up().

      Solution 2 not only fixes this problem but also keeps the swait and
      wait API as close as possible, as wake_up() doesn't provide a full
      barrier and doesn't do a lockless check of the wait queue either.
      Moreover, there are users already using swait_active() to do their quick
      checks for the wait queues, so it makes less sense that swake_up() and
      swake_up_all() do this on their own.
      
      This patch then removes the lockless swait_active() check in swake_up()
      and swake_up_all().
      Reported-by: Steven Rostedt <rostedt@goodmis.org>
      Signed-off-by: Boqun Feng <boqun.feng@gmail.com>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Krister Johansen <kjlx@templeofstupid.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/20170615041828.zk3a3sfyudm5p6nl@tardis
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
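      Solution 2 from this commit can be sketched as a toy wait queue whose
      wake-up path always takes the queue lock instead of first peeking at
      the list without it. The names toy_swait_queue, toy_prepare_to_wait()
      and toy_swake_up() are hypothetical, a pthread mutex stands in for
      q->lock, and waiters are modelled as flags rather than tasks.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

#define TOY_MAX_WAITERS 8

struct toy_swait_queue {
    pthread_mutex_t lock;
    bool *waiters[TOY_MAX_WAITERS];   /* each entry points at a waiter's "woken" flag */
    int nr;
};

/* Waiter side: enqueue under the lock, then (in the caller) test the condition. */
static void toy_prepare_to_wait(struct toy_swait_queue *q, bool *woken)
{
    pthread_mutex_lock(&q->lock);
    assert(q->nr < TOY_MAX_WAITERS);
    *woken = false;
    q->waiters[q->nr++] = woken;
    pthread_mutex_unlock(&q->lock);
}

/* Waker side: wake the oldest waiter. Returns the number woken (0 or 1). */
static int toy_swake_up(struct toy_swait_queue *q)
{
    int woken = 0;
    int i;

    /*
     * No lockless "is the list empty?" shortcut. Taking the lock orders
     * this check against a concurrent enqueue: either the waiter is
     * already on the list, or it enqueues later and then sees the
     * condition that the caller set before calling us.
     */
    pthread_mutex_lock(&q->lock);
    if (q->nr > 0) {
        *q->waiters[0] = true;
        for (i = 1; i < q->nr; i++)
            q->waiters[i - 1] = q->waiters[i];
        q->nr--;
        woken = 1;
    }
    pthread_mutex_unlock(&q->lock);
    return woken;
}
```

      Callers that still want a cheap early exit can do their own unlocked
      check first, provided they add the ordering that check needs.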
    • Ingo Molnar's avatar
      388f8e12
  2. 09 Aug, 2017 14 commits
    • Merge tag 'pinctrl-v4.13-2' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl · 8d31f80e
      Linus Torvalds authored
      Pull pin control fixes from Linus Walleij:
       "These are the pin control fixes I have gathered since the return from
        my vacation. They boiled in -next a while so let's get them in.
      
        Apart from the documentation build it is purely driver fixes. Which is
        nice. The Intel fixes seem kind of important.
      
         - Fix the documentation build as the docs were moved
      
         - Correct the UART pin list on the Intel Merrifield
      
         - Fix pin assignment and number of pins on the Marvell Armada 37xx
           pin controller
      
         - Cover the Setzer models in the Chromebook DMI quirk in the Intel
           cherryview driver so they start working
      
         - Add the missing "sim" function to the sunxi driver
      
         - Fix USB pin definitions on Uniphier Pro4
      
         - Smatch fix for invalid reference in the zx pin control driver"
      
      * tag 'pinctrl-v4.13-2' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl:
        pinctrl: generic: update references to Documentation/pinctrl.txt
        pinctrl: intel: merrifield: Correct UART pin lists
        pinctrl: armada-37xx: Fix number of pin in south bridge
        pinctrl: armada-37xx: Fix the pin 23 on south bridge
        pinctrl: cherryview: Add Setzer models to the Chromebook DMI quirk
        pinctrl: sunxi: add a missing function of A10/A20 pinctrl driver
        pinctrl: uniphier: fix USB3 pin assignment for Pro4
        pinctrl: zte: fix dereference of 'data' in zx_set_mux()
    • futex: Remove unnecessary warning from get_futex_key · 48fb6f4d
      Mel Gorman authored
      Commit 65d8fc77 ("futex: Remove requirement for lock_page() in
      get_futex_key()") removed an unnecessary lock_page() with the
      side-effect that page->mapping needed to be treated very carefully.
      
      Two defensive warnings were added in case any assumption was missed and
      the first warning assumed a correct application would not alter a
      mapping backing a futex key.  Since merging, it has not triggered for
      any unexpected case but Mark Rutland reported the following bug
      triggering due to the first warning.
      
        kernel BUG at kernel/futex.c:679!
        Internal error: Oops - BUG: 0 [#1] PREEMPT SMP
        Modules linked in:
        CPU: 0 PID: 3695 Comm: syz-executor1 Not tainted 4.13.0-rc3-00020-g307fec773ba3 #3
        Hardware name: linux,dummy-virt (DT)
        task: ffff80001e271780 task.stack: ffff000010908000
        PC is at get_futex_key+0x6a4/0xcf0 kernel/futex.c:679
        LR is at get_futex_key+0x6a4/0xcf0 kernel/futex.c:679
        pc : [<ffff00000821ac14>] lr : [<ffff00000821ac14>] pstate: 80000145
      
      The fact that it's a bug instead of a warning was due to an unrelated
      arm64 problem, but the warning itself triggered because the underlying
      mapping changed.
      
      This is an application issue but from a kernel perspective it's a
      recoverable situation and the warning is unnecessary so this patch
      removes the warning.  The warning may potentially be triggered with the
      following test program from Mark although it may be necessary to adjust
      NR_FUTEX_THREADS to be a value smaller than the number of CPUs in the
      system.
      
          #include <linux/futex.h>
          #include <pthread.h>
          #include <stdio.h>
          #include <stdlib.h>
          #include <sys/mman.h>
          #include <sys/syscall.h>
          #include <sys/time.h>
          #include <unistd.h>
      
          #define NR_FUTEX_THREADS 16
          pthread_t threads[NR_FUTEX_THREADS];
      
          void *mem;
      
          #define MEM_PROT  (PROT_READ | PROT_WRITE)
          #define MEM_SIZE  65536
      
          static int futex_wrapper(int *uaddr, int op, int val,
                                   const struct timespec *timeout,
                                   int *uaddr2, int val3)
          {
              /* Thin wrapper: hand back the raw futex syscall result. */
              return syscall(SYS_futex, uaddr, op, val, timeout, uaddr2, val3);
          }
      
          void *poll_futex(void *unused)
          {
              for (;;) {
                  futex_wrapper(mem, FUTEX_CMP_REQUEUE_PI, 1, NULL, mem + 4, 1);
              }
          }
      
          int main(int argc, char *argv[])
          {
              int i;
      
              mem = mmap(NULL, MEM_SIZE, MEM_PROT,
                     MAP_SHARED | MAP_ANONYMOUS, -1, 0);
      
              printf("Mapping @ %p\n", mem);
      
              printf("Creating futex threads...\n");
      
              for (i = 0; i < NR_FUTEX_THREADS; i++)
                  pthread_create(&threads[i], NULL, poll_futex, NULL);
      
              printf("Flipping mapping...\n");
              for (;;) {
                  mmap(mem, MEM_SIZE, MEM_PROT,
                       MAP_FIXED | MAP_SHARED | MAP_ANONYMOUS, -1, 0);
              }
      
              return 0;
          }
      Reported-and-tested-by: Mark Rutland <mark.rutland@arm.com>
      Signed-off-by: Mel Gorman <mgorman@suse.de>
      Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: stable@vger.kernel.org # 4.7+
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • Merge branch 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux · 358f8c26
      Linus Torvalds authored
      Pull i2c fixes from Wolfram Sang:
       "The main thing is to allow empty id_tables for ACPI to make some
        drivers get probed again. It looks a bit bigger than usual because it
        needs some internal renaming, too.
      
        Other than that, there is a fix for broken DSTDs, a super simple
        enablement for ARM MPS, and two documentation fixes which I'd like to
        see in v4.13 already"
      
      * 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
        i2c: rephrase explanation of I2C_CLASS_DEPRECATED
        i2c: allow i2c-versatile for ARM MPS platforms
        i2c: designware: Some broken DSTDs use 1MiHz instead of 1MHz
        i2c: designware: Print clock freq on invalid clock freq error
        i2c: core: Allow empty id_table in ACPI case as well
        i2c: mux: pinctrl: mention correct module name in Kconfig help text
    • Merge branch 'for-linus' of git://git.kernel.dk/linux-block · 31cf92f3
      Linus Torvalds authored
      Pull block fixes from Jens Axboe:
       "Three patches that should go into this release.
      
        Two of them are from Paolo and fix up some corner cases with BFQ, and
        the last patch is from Ming and fixes up a potential usage count
        imbalance regression due to the recent NOWAIT work"
      
      * 'for-linus' of git://git.kernel.dk/linux-block:
        blk-mq: don't leak preempt counter/q_usage_counter when allocating rq failed
        block, bfq: consider also in_service_entity to state whether an entity is active
        block, bfq: reset in_service_entity if it becomes idle
    • Merge branch 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6 · d555eb6b
      Linus Torvalds authored
      Pull crypto fixes from Herbert Xu:
       "Fix two regressions in the inside-secure driver with respect to
        hmac(sha1)"
      
      * 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6:
        crypto: inside-secure - fix the sha state length in hmac_sha1_setkey
        crypto: inside-secure - fix invalidation check in hmac_sha1_setkey
    • Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net · 4530cca1
      Linus Torvalds authored
      Pull networking fixes from David Miller:
       "The pull requests are getting smaller, that's progress I suppose :-)
      
         1) Fix infinite loop in CIPSO option parsing, from Yujuan Qi.
      
         2) Fix remote checksum handling in VXLAN and GUE tunneling drivers,
            from Koichiro Den.
      
         3) Missing u64_stats_init() calls in several drivers, from Florian
            Fainelli.
      
         4) TCP can set the congestion window to an invalid ssthresh value
            after congestion window reductions, from Yuchung Cheng.
      
         5) Fix BPF jit branch generation on s390, from Daniel Borkmann.
      
         6) Correct MIPS ebpf JIT merge, from David Daney.
      
         7) Correct byte order test in BPF test_verifier.c, from Daniel
            Borkmann.
      
         8) Fix various crashes and leaks in ASIX driver, from Dean Jenkins.
      
         9) Handle SCTP checksums properly in mlx4 driver, from Davide
            Caratti.
      
        10) We can potentially enter tcp_connect() with a cached route
            already, due to fastopen, so we have to explicitly invalidate it.
      
        11) skb_warn_bad_offload() can bark in legitimate situations, fix from
            Willem de Bruijn"
      
      * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (52 commits)
        net: avoid skb_warn_bad_offload false positives on UFO
        qmi_wwan: fix NULL deref on disconnect
        ppp: fix xmit recursion detection on ppp channels
        rds: Reintroduce statistics counting
        tcp: fastopen: tcp_connect() must refresh the route
        net: sched: set xt_tgchk_param par.net properly in ipt_init_target
        net: dsa: mediatek: add adjust link support for user ports
        net/mlx4_en: don't set CHECKSUM_COMPLETE on SCTP packets
        qed: Fix a memory allocation failure test in 'qed_mcp_cmd_init()'
        hysdn: fix to a race condition in put_log_buffer
        s390/qeth: fix L3 next-hop in xmit qeth hdr
        asix: Fix small memory leak in ax88772_unbind()
        asix: Ensure asix_rx_fixup_info members are all reset
        asix: Add rx->ax_skb = NULL after usbnet_skb_return()
        bpf: fix selftest/bpf/test_pkt_md_access on s390x
        netvsc: fix race on sub channel creation
        bpf: fix byte order test in test_verifier
        xgene: Always get clk source, but ignore if it's missing for SGMII ports
        MIPS: Add missing file for eBPF JIT.
        bpf, s390: fix build for libbpf and selftest suite
        ...
    • net: avoid skb_warn_bad_offload false positives on UFO · 8d63bee6
      Willem de Bruijn authored
      skb_warn_bad_offload triggers a warning when an skb enters the GSO
      stack at __skb_gso_segment that does not have CHECKSUM_PARTIAL
      checksum offload set.
      
      Commit b2504a5d ("net: reduce skb_warn_bad_offload() noise")
      observed that SKB_GSO_DODGY producers can trigger the check and
      that passing those packets through the GSO handlers will fix it
      up. But, the software UFO handler will set ip_summed to
      CHECKSUM_NONE.
      
      When __skb_gso_segment is called from the receive path, this
      triggers the warning again.
      
      Make UFO set CHECKSUM_UNNECESSARY instead of CHECKSUM_NONE. On
      Tx these two are equivalent. On Rx, this better matches the
      skb state (checksum computed), as CHECKSUM_NONE here means no
      checksum computed.
      
      See also this thread for context:
      http://patchwork.ozlabs.org/patch/799015/
      
      Fixes: b2504a5d ("net: reduce skb_warn_bad_offload() noise")
      Signed-off-by: Willem de Bruijn <willemb@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • qmi_wwan: fix NULL deref on disconnect · bbae08e5
      Bjørn Mork authored
      qmi_wwan_disconnect is called twice when disconnecting devices with
      separate control and data interfaces.  The first invocation will set
      the interface data to NULL for both interfaces to flag that the
      disconnect has been handled.  But the matching NULL check was left
      out when qmi_wwan_disconnect was added, resulting in this oops:
      
        usb 2-1.4: USB disconnect, device number 4
        qmi_wwan 2-1.4:1.6 wwp0s29u1u4i6: unregister 'qmi_wwan' usb-0000:00:1d.0-1.4, WWAN/QMI device
        BUG: unable to handle kernel NULL pointer dereference at 00000000000000e0
        IP: qmi_wwan_disconnect+0x25/0xc0 [qmi_wwan]
        PGD 0
        P4D 0
        Oops: 0000 [#1] SMP
        Modules linked in: <stripped irrelevant module list>
        CPU: 2 PID: 33 Comm: kworker/2:1 Tainted: G            E   4.12.3-nr44-normandy-r1500619820+ #1
        Hardware name: LENOVO 4291LR7/4291LR7, BIOS CBET4000 4.6-810-g50522254fb 07/21/2017
        Workqueue: usb_hub_wq hub_event [usbcore]
        task: ffff8c882b716040 task.stack: ffffb8e800d84000
        RIP: 0010:qmi_wwan_disconnect+0x25/0xc0 [qmi_wwan]
        RSP: 0018:ffffb8e800d87b38 EFLAGS: 00010246
        RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
        RDX: 0000000000000001 RSI: ffff8c8824f3f1d0 RDI: ffff8c8824ef6400
        RBP: ffff8c8824ef6400 R08: 0000000000000000 R09: 0000000000000000
        R10: ffffb8e800d87780 R11: 0000000000000011 R12: ffffffffc07ea0e8
        R13: ffff8c8824e2e000 R14: ffff8c8824e2e098 R15: 0000000000000000
        FS:  0000000000000000(0000) GS:ffff8c8835300000(0000) knlGS:0000000000000000
        CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
        CR2: 00000000000000e0 CR3: 0000000229ca5000 CR4: 00000000000406e0
        Call Trace:
         ? usb_unbind_interface+0x71/0x270 [usbcore]
         ? device_release_driver_internal+0x154/0x210
         ? qmi_wwan_unbind+0x6d/0xc0 [qmi_wwan]
         ? usbnet_disconnect+0x6c/0xf0 [usbnet]
         ? qmi_wwan_disconnect+0x87/0xc0 [qmi_wwan]
         ? usb_unbind_interface+0x71/0x270 [usbcore]
         ? device_release_driver_internal+0x154/0x210
      Reported-and-tested-by: Nathaniel Roach <nroach44@gmail.com>
      Fixes: c6adf779 ("net: usb: qmi_wwan: add qmap mux protocol support")
      Cc: Daniele Palmas <dnlplm@gmail.com>
      Signed-off-by: Bjørn Mork <bjorn@mork.no>
      Signed-off-by: David S. Miller <davem@davemloft.net>
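      The missing guard described in this commit amounts to an early return
      when the interface data has already been cleared. A minimal sketch,
      assuming hypothetical toy_state, toy_intf and toy_disconnect() in
      place of the USB core and driver structures:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical shared driver state for a control/data interface pair. */
struct toy_state {
    int teardown_count;
};

/* Hypothetical stand-in for a USB interface and its interface data. */
struct toy_intf {
    struct toy_state *data;
};

/* Invoked once per interface; both interfaces share one toy_state. */
static void toy_disconnect(struct toy_intf *intf, struct toy_intf *sibling)
{
    struct toy_state *st;

    assert(intf && sibling);
    st = intf->data;

    /* Second invocation: the first one already handled the disconnect. */
    if (st == NULL)
        return;

    st->teardown_count++;

    /* Flag both interfaces as handled so the next call returns early. */
    intf->data = NULL;
    sibling->data = NULL;
}
```

      Without the NULL check, the second call would dereference the cleared
      pointer, which matches the oops shown in the commit message.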
    • ppp: fix xmit recursion detection on ppp channels · 0a0e1a85
      Guillaume Nault authored
      Commit e5dadc65 ("ppp: Fix false xmit recursion detect with two ppp
      devices") dropped the xmit_recursion counter incrementation in
      ppp_channel_push() and relied on ppp_xmit_process() for this task.
      But __ppp_channel_push() can also send packets directly (using the
      .start_xmit() channel callback), in which case the xmit_recursion
      counter isn't incremented anymore. If such packets get routed back to
      the parent ppp unit, ppp_xmit_process() won't notice the recursion and
      will call ppp_channel_push() on the same channel, effectively creating
      the deadlock situation that the xmit_recursion mechanism was supposed
      to prevent.
      
      This patch re-introduces the xmit_recursion counter incrementation in
      ppp_channel_push(). Since the xmit_recursion variable is now part of
      the parent ppp unit, incrementation is skipped if the channel doesn't
      have any. This is fine because only packets routed through the parent
      unit may enter the channel recursively.
      
      Finally, we have to ensure that pch->ppp is not going to be modified
      while executing ppp_channel_push(). Instead of taking this lock only
      while calling ppp_xmit_process(), we now have to hold it for the full
      ppp_channel_push() execution. This respects the ppp locks ordering
      which requires locking ->upl before ->downl.
      
      Fixes: e5dadc65 ("ppp: Fix false xmit recursion detect with two ppp devices")
      Signed-off-by: Guillaume Nault <g.nault@alphalink.fr>
      Signed-off-by: David S. Miller <davem@davemloft.net>
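      The recursion guard can be modelled in a few lines. This is a toy
      sketch under assumptions, not the driver code: toy_ppp, toy_channel,
      toy_channel_push() and toy_xmit_process() are hypothetical, the
      counter is a plain int instead of per-CPU data, locking is omitted,
      and the loops_back flag simulates a packet being routed back to the
      parent unit.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical parent ppp unit holding the recursion counter. */
struct toy_ppp {
    int xmit_recursion;
};

/* Hypothetical channel; ppp is NULL when the channel has no parent unit. */
struct toy_channel {
    struct toy_ppp *ppp;
    bool loops_back;   /* the sent packet is routed back to the parent unit */
    int sent;
    int dropped;
};

static void toy_xmit_process(struct toy_ppp *ppp, struct toy_channel *pch);

/* Channel push: counts itself as a transmit level when a parent unit exists. */
static void toy_channel_push(struct toy_channel *pch)
{
    assert(pch);

    if (pch->ppp)
        pch->ppp->xmit_recursion++;

    pch->sent++;                             /* direct send through the channel */
    if (pch->loops_back && pch->ppp)
        toy_xmit_process(pch->ppp, pch);     /* packet comes back to the unit */

    if (pch->ppp)
        pch->ppp->xmit_recursion--;
}

/* Unit transmit path: refuses to re-enter while a transmit is in progress. */
static void toy_xmit_process(struct toy_ppp *ppp, struct toy_channel *pch)
{
    if (ppp->xmit_recursion) {
        pch->dropped++;                      /* recursion detected: drop the packet */
        return;
    }
    ppp->xmit_recursion++;
    toy_channel_push(pch);
    ppp->xmit_recursion--;
}
```

      If toy_channel_push() skipped the increment, the looped-back packet
      would re-enter the same channel without being noticed, which is the
      situation the commit message describes.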
    • rds: Reintroduce statistics counting · 05bfd7db
      Håkon Bugge authored
      In commit 7e3f2952 ("rds: don't let RDS shutdown a connection
      while senders are present"), refilling the receive queue was removed
      from rds_ib_recv(), along with the increment of
      s_ib_rx_refill_from_thread.
      
      Commit 73ce4317 ("RDS: make sure we post recv buffers")
      re-introduces filling the receive queue from rds_ib_recv(), but does
      not add the statistics counter. rds_ib_recv() was later renamed to
      rds_ib_recv_path().
      
      This commit reintroduces the statistics counting of
      s_ib_rx_refill_from_thread and s_ib_rx_refill_from_cq.
      Signed-off-by: Håkon Bugge <haakon.bugge@oracle.com>
      Reviewed-by: Knut Omang <knut.omang@oracle.com>
      Reviewed-by: Wei Lin Guay <wei.lin.guay@oracle.com>
      Reviewed-by: Shamir Rabinovitch <shamir.rabinovitch@oracle.com>
      Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Eric Dumazet's avatar
      tcp: fastopen: tcp_connect() must refresh the route · 8ba60924
      Eric Dumazet authored
      With new TCP_FASTOPEN_CONNECT socket option, there is a possibility
      to call tcp_connect() while socket sk_dst_cache is either NULL
      or invalid.
      
       +0 socket(..., SOCK_STREAM, IPPROTO_TCP) = 4
       +0 fcntl(4, F_SETFL, O_RDWR|O_NONBLOCK) = 0
       +0 setsockopt(4, SOL_TCP, TCP_FASTOPEN_CONNECT, [1], 4) = 0
       +0 connect(4, ..., ...) = 0
      
      << sk->sk_dst_cache becomes obsolete, or even set to NULL >>
      
       +1 sendto(4, ..., 1000, MSG_FASTOPEN, ..., ...) = 1000
      
      We need to refresh the route otherwise bad things can happen,
      especially when syzkaller is running on the host :/
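
      The idea of refreshing a cached route before use can be sketched as
      below. This is not the actual fix: struct route, struct sock_model, the
      generation counter and route_for_connect() are hypothetical names
      chosen for illustration, and the real kernel validates sk_dst_cache
      through its own helpers.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of a cached route; not the kernel's struct dst_entry. */
struct route {
    unsigned int generation;  /* routing-table generation this entry was built for */
};

struct sock_model {
    struct route *cached;     /* may be NULL or stale by the time data is sent */
    struct route storage;
    int lookups;              /* number of full route lookups performed */
};

static unsigned int table_generation = 1;

/* Return a route valid for the current table, looking it up again if needed. */
static struct route *route_for_connect(struct sock_model *sk)
{
    assert(sk != NULL);
    if (sk->cached && sk->cached->generation == table_generation)
        return sk->cached;

    /* Cache is missing or obsolete: perform a fresh lookup. */
    sk->storage.generation = table_generation;
    sk->cached = &sk->storage;
    sk->lookups++;
    return sk->cached;
}
```

      The point of the sketch is that the check happens at the time the route
      is used, not at the time connect() returned.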
      
      Fixes: 19f6d3f3 ("net/tcp-fastopen: Add new API support")
      Reported-by: Dmitry Vyukov <dvyukov@google.com>
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Cc: Wei Wang <weiwan@google.com>
      Cc: Yuchung Cheng <ycheng@google.com>
      Acked-by: Wei Wang <weiwan@google.com>
      Acked-by: Yuchung Cheng <ycheng@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      8ba60924
    • Xin Long's avatar
      net: sched: set xt_tgchk_param par.net properly in ipt_init_target · ec0acb09
      Xin Long authored
      The xt_tgchk_param par in ipt_init_target is now a local variable, and
      par.net is not initialized there. When xt_check_target later calls the
      target's checkentry, which may access par.net, this causes a kernel
      panic.
      
      Jaroslav found this panic when running:
      
        # ip link add TestIface type dummy
        # tc qd add dev TestIface ingress handle ffff:
        # tc filter add dev TestIface parent ffff: u32 match u32 0 0 \
          action xt -j CONNMARK --set-mark 4
      
      This patch is to pass net param into ipt_init_target and set
      par.net with it properly in there.
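
      A minimal sketch of the pattern, assuming simplified stand-in types:
      struct net_model and struct tgchk_param_model are hypothetical, and
      checkentry() returns -1 where the kernel would dereference an unset
      pointer.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins; the real types are struct net and struct xt_tgchk_param. */
struct net_model { int id; };

struct tgchk_param_model {
    struct net_model *net;
    const char *table;
};

/* A checkentry-style callback that relies on par->net being set. */
static int checkentry(const struct tgchk_param_model *par)
{
    if (par->net == NULL)
        return -1;            /* the kernel would dereference and panic here */
    return par->net->id;
}

/* Build the parameter block locally, passing the namespace in from the caller. */
static int init_target(struct net_model *net, const char *table)
{
    struct tgchk_param_model par;

    assert(table != NULL);
    memset(&par, 0, sizeof(par));
    par.net = net;            /* the field that was previously left unset */
    par.table = table;
    return checkentry(&par);
}
```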
      
      v1->v2:
        As Wang Cong pointed out, I missed ipt_net_id != xt_net_id, so fix
        it by also passing net_id to __tcf_ipt_init.
      v2->v3:
        Missed the fixes tag, so add it.
      
      Fixes: ecb2421b ("netfilter: add and use nf_ct_netns_get/put")
      Reported-by: Jaroslav Aster <jaster@redhat.com>
      Signed-off-by: Xin Long <lucien.xin@gmail.com>
      Acked-by: Jiri Pirko <jiri@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      ec0acb09
    • John Crispin's avatar
      net: dsa: mediatek: add adjust link support for user ports · 8e6f1521
      John Crispin authored
      Manually adjust the port settings of user ports once PHY polling has
      completed. This patch extends the adjust_link callback to configure the
      per port PMCR register, applying the proper values polled from the PHY.
      Without this patch, flow control was not always getting set up properly.
      Signed-off-by: Shashidhar Lakkavalli <shashidhar.lakkavalli@openmesh.com>
      Signed-off-by: Muciri Gatimu <muciri@openmesh.com>
      Signed-off-by: John Crispin <john@phrozen.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      8e6f1521
    • Davide Caratti's avatar
      net/mlx4_en: don't set CHECKSUM_COMPLETE on SCTP packets · e718fe45
      Davide Caratti authored
      If the NIC fails to validate the checksum on TCP/UDP, and validation of IP
      checksum is successful, the driver subtracts the pseudo-header checksum
      from the value obtained by the hardware and sets CHECKSUM_COMPLETE. Don't
      do that if protocol is IPPROTO_SCTP, otherwise CRC32c validation fails.
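
      The decision described above can be sketched as a small pure function.
      It follows the wording of this message rather than the driver source:
      rx_csum() and the CSUM_* names are hypothetical, and only the protocol
      numbers are real IANA values. SCTP is excluded because it is protected
      by CRC32c rather than the Internet checksum.

```c
#include <assert.h>
#include <stdbool.h>

/* Protocol numbers as assigned by IANA. */
enum { PROTO_TCP = 6, PROTO_UDP = 17, PROTO_SCTP = 132 };

/* Hypothetical result codes modelled on the kernel's ip_summed states. */
enum csum_result {
    CSUM_NONE,         /* let the stack verify the packet in software */
    CSUM_UNNECESSARY,  /* hardware already validated the L4 checksum */
    CSUM_COMPLETE      /* hardware supplies a checksum over the packet */
};

static enum csum_result rx_csum(bool ip_ok, bool l4_ok, int proto)
{
    assert(proto >= 0 && proto <= 255);
    if (!ip_ok)
        return CSUM_NONE;
    if (l4_ok)
        return CSUM_UNNECESSARY;
    if (proto == PROTO_SCTP)
        return CSUM_NONE;     /* never report CSUM_COMPLETE for SCTP */
    return CSUM_COMPLETE;
}
```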
      
      V2: don't test MLX4_CQE_STATUS_IPV6 if MLX4_CQE_STATUS_IPV4 is set
      Reported-by: Shuang Li <shuali@redhat.com>
      Fixes: f8c6455b ("net/mlx4_en: Extend checksum offloading by CHECKSUM COMPLETE")
      Signed-off-by: Davide Caratti <dcaratti@redhat.com>
      Acked-by: Saeed Mahameed <saeedm@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      e718fe45
  3. 08 Aug, 2017 12 commits
    • Linus Torvalds's avatar
      Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma · bfa738cf
      Linus Torvalds authored
      Pull rdma fixes from Doug Ledford:
       "Third set of -rc fixes for 4.13 cycle
      
         - small set of miscellaneous fixes
      
         - a reasonably sizable set of IPoIB fixes that deal with multiple
           long standing issues"
      
      * tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma:
        IB/hns: checking for IS_ERR() instead of NULL
        RDMA/mlx5: Fix existence check for extended address vector
        IB/uverbs: Fix device cleanup
        RDMA/uverbs: Prevent leak of reserved field
        IB/core: Fix race condition in resolving IP to MAC
        IB/ipoib: Notify on modify QP failure only when relevant
        Revert "IB/core: Allow QP state transition from reset to error"
        IB/ipoib: Remove double pointer assigning
        IB/ipoib: Clean error paths in add port
        IB/ipoib: Add get statistics support to SRIOV VF
        IB/ipoib: Add multicast packets statistics
        IB/ipoib: Set IPOIB_NEIGH_TBL_FLUSH after flushed completion initialization
        IB/ipoib: Prevent setting negative values to max_nonsrq_conn_qp
        IB/ipoib: Make sure no in-flight joins while leaving that mcast
        IB/ipoib: Use cancel_delayed_work_sync when needed
        IB/ipoib: Fix race between light events and interface restart
      bfa738cf
    • Joe Perches's avatar
      parse-maintainers: Move matching sections from MAINTAINERS · b95c29a2
      Joe Perches authored
      Allow any number of command line arguments to match either the
      section header or the section contents and create new files.
      
      Create MAINTAINERS.new and SECTION.new.
      
      This allows scripting of the movement of various sections from
      MAINTAINERS.
      Signed-off-by: Joe Perches <joe@perches.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      b95c29a2
    • Joe Perches's avatar
      parse-maintainers: Use perl hash references and specific filenames · fe909030
      Joe Perches authored
      Instead of reading STDIN and writing STDOUT, use specific filenames of
      MAINTAINERS and MAINTAINERS.new.
      
      Use hash references instead of global hash %hash so future modifications
      can read and write specific hashes to split up MAINTAINERS into multiple
      files using a script.
      Signed-off-by: Joe Perches <joe@perches.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      fe909030
    • Joe Perches's avatar
      parse-maintainers: Add section pattern sorting · 61f74164
      Joe Perches authored
      Section [A-Z]: patterns are not currently in any required sorting order.
      Add a specific sorting sequence to MAINTAINERS entries.
      Sort F: and X: patterns in alphabetic order.
      
      The preferred section ordering is:
      
        SECTION HEADER
        M:	Maintainers
        R:	Reviewers
        P:	Named persons without email addresses
        L:	Mailing list addresses
        S:	Status of this section (Supported, Maintained, Orphan, etc...)
        W:	Any relevant URLs
        T:	Source code control type (git, quilt, etc)
        Q:	Patchwork patch acceptance queue site
        B:	Bug tracking URIs
        C:	Chat URIs
        F:	Files with wildcard patterns (alphabetic ordered)
        X:	Excluded files with wildcard patterns (alphabetic ordered)
        N:	Files with regex patterns
        K:	Keyword regexes in source code for maintainership identification
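
      The ordering above can be expressed as a comparator. The script itself
      is written in Perl; the C sketch below only illustrates the rule, and
      tag_rank() and cmp_section_lines() are hypothetical names. As a
      simplifying assumption, ties between lines with the same tag are always
      broken alphabetically so that qsort(), which is not stable, gives a
      deterministic result; the commit only requires this for F: and X:.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Preferred order of the single-letter tags, as listed above. */
static const char tag_order[] = "MRPLSWTQBCFXNK";

/* Rank of a "T:\tvalue" line; unknown tags sort after all known ones. */
static int tag_rank(const char *line)
{
    const char *p;

    assert(line != NULL);
    p = line[0] ? strchr(tag_order, line[0]) : NULL;
    return p ? (int)(p - tag_order) : (int)strlen(tag_order);
}

/* qsort comparator: by tag rank first, then alphabetically within a tag. */
static int cmp_section_lines(const void *a, const void *b)
{
    const char *la = *(const char *const *)a;
    const char *lb = *(const char *const *)b;
    int ra = tag_rank(la), rb = tag_rank(lb);

    if (ra != rb)
        return ra - rb;
    return strcmp(la, lb);
}
```

      Passing cmp_section_lines to qsort() over an array of line pointers
      puts M: entries first and leaves the F: patterns in alphabetic order.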
      
      Miscellaneous perl neatening:
      
       - Rename %map to %hash, map has a different meaning in perl
       - Avoid using \& and local variables for function indirection
       - Use return for a little c like clarity
       - Use c-like function call style instead of &function
      Signed-off-by: Joe Perches <joe@perches.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      61f74164
    • Joe Perches's avatar
      get_maintainer: Prepare for separate MAINTAINERS files · 6f7d98ec
      Joe Perches authored
      Allow for MAINTAINERS to become a directory and if it is,
      read all the files in the directory for maintained sections.
      
      Optionally look for all files named MAINTAINERS in directories
      excluding the .git directory by using --find-maintainer-files.
      
      This optional feature adds ~.3 seconds of CPU on an Intel
      i5-6200 with an SSD.
      
      Miscellanea:
      
       - Create a read_maintainer_file subroutine from the existing code
       - Test only the existence of MAINTAINERS, not whether it's a file
      Signed-off-by: Joe Perches <joe@perches.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      6f7d98ec
    • Randy Dunlap's avatar
      MAINTAINERS: openbmc mailing list is moderated · 6209ef67
      Randy Dunlap authored
      The openbmc mailing list is moderated for non-subscribers.
      Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
      Acked-by: Brendan Higgins <brendanhiggins@google.com>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Joel Stanley <joel@jms.id.au>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      6209ef67
    • Sedat Dilek's avatar
      MAINTAINERS: greybus: Fix typo s/LOOBACK/LOOPBACK · a1ffc2d2
      Sedat Dilek authored
      Fixes: f47e07bc ("Fix up MAINTAINERS file problems")
      Cc: Joe Perches <joe@perches.com>
      Signed-off-by: Sedat Dilek <sedat.dilek@gmail.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      a1ffc2d2
    • Linus Torvalds's avatar
      Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi · de70be0a
      Linus Torvalds authored
      Pull SCSI fixes from James Bottomley:
       "Two small fixes, one re-fix of a previous fix and five patches sorting
        out hotplug in the bnx2X class of drivers. The latter is rather
        involved, but necessary because these drivers have started dropping
        lockdep recursion warnings on the hotplug lock because of its
        conversion to a percpu rwsem"
      
      * tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
        scsi: sg: only check for dxfer_len greater than 256M
        scsi: aacraid: reading out of bounds
        scsi: qedf: Limit number of CQs
        scsi: bnx2i: Simplify cpu hotplug code
        scsi: bnx2fc: Simplify CPU hotplug code
        scsi: bnx2i: Prevent recursive cpuhotplug locking
        scsi: bnx2fc: Prevent recursive cpuhotplug locking
        scsi: bnx2fc: Plug CPU hotplug race
      de70be0a
    • Helge Deller's avatar
      random: fix warning message on ia64 and parisc · 51d96dc2
      Helge Deller authored
      Fix the warning message on the parisc and IA64 architectures to show the
      correct function name of the caller by using %pS instead of %pF. The
      message is printed with the value of _RET_IP_ which calls
      __builtin_return_address(0) and as such returns the instruction address
      of the caller instead of a pointer to a function descriptor of the caller.
      
      The effect of this patch is visible on the parisc and ia64 architectures
      only since those are the ones which use function descriptors while on
      all others %pS and %pF will behave the same.
      
      Cc: Theodore Ts'o <tytso@mit.edu>
      Cc: Jason A. Donenfeld <Jason@zx2c4.com>
      Signed-off-by: Helge Deller <deller@gmx.de>
      Fixes: eecabf56 ("random: suppress spammy warnings about unseeded randomness")
      Fixes: d06bfd19 ("random: warn when kernel uses unseeded randomness")
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      51d96dc2
    • Linus Torvalds's avatar
      Merge tag 'xtensa-20170807' of git://github.com/jcmvbkbc/linux-xtensa · 623ce345
      Linus Torvalds authored
      Pull Xtensa fixes from Max Filippov:
      
       - use asm-generic instances of asm/param.h and asm/device.h instead of
         exact copies in arch/xtensa/include/asm;
      
       - fix build error for xtensa cores with aliasing WT cache: define cache
         flushing functions and copy_{to,from}_user_page;
      
       - add missing EXPORT_SYMBOLs for clear_user_highpage, copy_user_highpage,
         flush_dcache_page, local_flush_cache_range, local_flush_cache_page,
         csum_partial and csum_partial_copy_generic.
      
      * tag 'xtensa-20170807' of git://github.com/jcmvbkbc/linux-xtensa:
        xtensa: mm/cache: add missing EXPORT_SYMBOLs
        xtensa: don't limit csum_partial export by CONFIG_NET
        xtensa: fix cache aliasing handling code for WT cache
        xtensa: remove wrapper header for asm/param.h
        xtensa: remove wrapper header for asm/device.h
      623ce345
    • Linus Torvalds's avatar
      Merge tag 'for-linus-20170807' of git://git.infradead.org/linux-mtd · d16b9d22
      Linus Torvalds authored
      Pull MTD fixes from Brian Norris:
       "I missed getting these out for rc4, but here are some MTD fixes.
      
        Just NAND fixes (in both the core handling, and a few drivers). Notes
        stolen from Boris:
      
        Core fixes:
      
         - fix data interface setup for ONFI NANDs that do not support the SET
           FEATURES command
      
         - fix a kernel doc header
      
         - fix potential integer overflow when retrieving timing information
           from the parameter page
      
         - fix wrong OOB layout for small page NANDs
      
        Driver fixes:
      
         - fix potential division-by-zero bug
      
         - fix backward compat with old atmel-nand DT bindings
      
         - fix ->setup_data_interface() in the atmel NAND driver"
      
      * tag 'for-linus-20170807' of git://git.infradead.org/linux-mtd:
        mtd: nand: atmel: Fix EDO mode check
        mtd: nand: Declare tBERS, tR and tPROG as u64 to avoid integer overflow
        mtd: nand: Fix timing setup for NANDs that do not support SET FEATURES
        mtd: nand: Fix a docs build warning
        mtd: nand: sunxi: fix potential divide-by-zero error
        nand: fix wrong default oob layout for small pages using soft ecc
        mtd: nand: atmel: Fix DT backward compatibility in pmecc.c
      d16b9d22
    • Linus Torvalds's avatar
      Merge tag 'xfs-4.13-fixes-3' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux · 1742c0f0
      Linus Torvalds authored
      Pull xfs fixes from Darrick Wong:
       "I have a couple more bug fixes for you today:
      
         - fix memory leak when issuing discard
      
         - fix propagation of the dax inode flag"
      
      * tag 'xfs-4.13-fixes-3' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux:
        xfs: Fix per-inode DAX flag inheritance
        xfs: Fix leak of discard bio
      1742c0f0
  4. 07 Aug, 2017 5 commits
    • Christophe Jaillet's avatar
      qed: Fix a memory allocation failure test in 'qed_mcp_cmd_init()' · eb2a6b80
      Christophe Jaillet authored
      We allocate 'p_info->mfw_mb_cur' and 'p_info->mfw_mb_shadow' but we check
      'p_info->mfw_mb_addr' instead of 'p_info->mfw_mb_cur'.
      
      'p_info->mfw_mb_addr' is never 0, because it is initialized a few lines
      above in 'qed_load_mcp_offsets()'.
      
      Update the test and check the result of the 2 'kzalloc()' instead.
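
      A user-space sketch of the corrected check, assuming calloc() in place
      of kzalloc() and a hypothetical struct mb_info that borrows the two
      field names mentioned above:

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical mailbox info holding the two buffers named above. */
struct mb_info {
    void *mfw_mb_cur;
    void *mfw_mb_shadow;
};

/* Allocate both buffers and test the pointers that were just assigned. */
static int mb_info_init(struct mb_info *info, size_t len)
{
    assert(len > 0);
    info->mfw_mb_cur = calloc(1, len);
    info->mfw_mb_shadow = calloc(1, len);
    if (!info->mfw_mb_cur || !info->mfw_mb_shadow) {
        free(info->mfw_mb_cur);       /* free(NULL) is a no-op */
        free(info->mfw_mb_shadow);
        info->mfw_mb_cur = info->mfw_mb_shadow = NULL;
        return -1;                    /* the kernel would return -ENOMEM */
    }
    return 0;
}
```
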
      Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
      Acked-by: Tomer Tayar <Tomer.Tayar@cavium.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      eb2a6b80
    • Anton Volkov's avatar
      hysdn: fix to a race condition in put_log_buffer · b925ef37
      Anton Volkov authored
      The synchronization type that was used earlier to guard the loop that
      deletes unused log buffers may lead to a situation that prevents any
      thread from going through the loop.
      
      The patch removes the previously used synchronization mechanism and moves
      the loop under the spin_lock so that similar cases cannot occur in
      the future.
      
      Found by Linux Driver Verification project (linuxtesting.org).
      Signed-off-by: Anton Volkov <avolkov@ispras.ru>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      b925ef37
    • Julian Wiedmann's avatar
      s390/qeth: fix L3 next-hop in xmit qeth hdr · ec2c6726
      Julian Wiedmann authored
      On L3, the qeth_hdr struct needs to be filled with the next-hop
      IP address.
      The current code accesses rtable->rt_gateway without checking that
      rtable is a valid address. The accidental access to a lowcore area
      results in a random next-hop address in the qeth_hdr.
      rtable (or more precisely, skb_dst(skb)) can be NULL in rare cases
      (for instance together with AF_PACKET sockets).
      This patch adds the missing NULL-ptr checks.
      Signed-off-by: Julian Wiedmann <jwi@linux.vnet.ibm.com>
      Signed-off-by: Ursula Braun <ubraun@linux.vnet.ibm.com>
      Fixes: 87e7597b ("qeth: Move away from using neighbour entries in qeth_l3_fill_header()")
      Signed-off-by: David S. Miller <davem@davemloft.net>
      ec2c6726
    • Doug Ledford's avatar
      Merge tag 'rdma-rc-2017-07-26' of... · 48107c4e
      Doug Ledford authored
      Merge tag 'rdma-rc-2017-07-26' of git://git.kernel.org/pub/scm/linux/kernel/git/leon/linux-rdma into leon-ipoib
      
      IPoIB fixes for 4.13
      
      The patchset provides various fixes for IPoIB. It is a combination of
      fixes to various issues discovered during verification along with
      static checkers cleanup patches.
      
      Most of the patches are from pre-git era and hence lack Fixes lines.
      
      There is one exception in this IPoIB group, the addition of a revert patch:
      Revert "IB/core: Allow QP state transition from reset to error". It is
      followed by a proper fix for the annoying print, so I thought it
      appropriate to include it.
      Signed-off-by: Doug Ledford <dledford@redhat.com>
      48107c4e
    • David S. Miller's avatar
      Merge branch 'asix-Improve-robustness' · c0e0fb83
      David S. Miller authored
      Dean Jenkins says:
      
      ====================
      asix: Improve robustness
      
      Please consider taking these patches to improve the robustness of the ASIX USB
      to Ethernet driver.
      
      Failures prompting an ASIX driver code review
      =============================================
      
      On an ARM i.MX6 embedded platform some strange one-off and two-off failures were
      observed in and around the ASIX USB to Ethernet driver. This was observed on a
      highly modified kernel 3.14 with the ASIX driver containing back-ported changes
      from kernel.org up to kernel 4.8 approximately.
      
      a) A one-off failure in asix_rx_fixup_internal():
      
      There was an occurrence of an attempt to write off the end of the netdev buffer
      which was trapped by skb_over_panic() in skb_put().
      
      [20030.846440] skbuff: skb_over_panic: text:7f2271c0 len:120 put:60 head:8366ecc0 data:8366ed02 tail:0x8366ed7a end:0x8366ed40 dev:eth0
      [20030.863007] Kernel BUG at 8044ce38 [verbose debug info unavailable]
      
      [20031.215345] Backtrace:
      [20031.217884] [<8044cde0>] (skb_panic) from [<8044d50c>] (skb_put+0x50/0x5c)
      [20031.227408] [<8044d4bc>] (skb_put) from [<7f2271c0>] (asix_rx_fixup_internal+0x1c4/0x23c [asix])
      [20031.242024] [<7f226ffc>] (asix_rx_fixup_internal [asix]) from [<7f22724c>] (asix_rx_fixup_common+0x14/0x18 [asix])
      [20031.260309] [<7f227238>] (asix_rx_fixup_common [asix]) from [<7f21f7d4>] (usbnet_bh+0x74/0x224 [usbnet])
      [20031.269879] [<7f21f760>] (usbnet_bh [usbnet]) from [<8002f834>] (call_timer_fn+0xa4/0x1f0)
      [20031.283961] [<8002f790>] (call_timer_fn) from [<80030834>] (run_timer_softirq+0x230/0x2a8)
      [20031.302782] [<80030604>] (run_timer_softirq) from [<80028780>] (__do_softirq+0x15c/0x37c)
      [20031.321511] [<80028624>] (__do_softirq) from [<80028c38>] (irq_exit+0x8c/0xe8)
      [20031.339298] [<80028bac>] (irq_exit) from [<8000e9c8>] (handle_IRQ+0x8c/0xc8)
      [20031.350038] [<8000e93c>] (handle_IRQ) from [<800085c8>] (gic_handle_irq+0xb8/0xf8)
      [20031.365528] [<80008510>] (gic_handle_irq) from [<8050de80>] (__irq_svc+0x40/0x70)
      
      Analysis of the logic of the ASIX driver (containing backported changes from
      kernel.org up to kernel 4.8 approximately) suggested that the software could not
      trigger skb_over_panic(). The analysis of the kernel BUG() crash information
      suggested that the netdev buffer was written with 2 minimal 60 octet length
      Ethernet frames (ASIX hardware drops the 4 octet FCS field) and the 2nd Ethernet
      frame attempted to write off the end of the netdev buffer.
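
      The pointer values in the skb_over_panic line are consistent with this
      reading. From data (0x8366ed02) to end (0x8366ed40) there are 62 bytes,
      so one 60 octet frame fits with 2 bytes to spare, and a second one
      overruns by 58 bytes, giving the reported tail of 0x8366ed7a. A small
      model of that arithmetic follows; struct skb_model, tailroom() and
      model_put() are hypothetical names, not the kernel API.

```c
#include <assert.h>
#include <stdint.h>

/* Minimal model of the four sk_buff pointers printed by skb_over_panic(). */
struct skb_model {
    uint32_t head, data, tail, end;
};

/* Bytes still free between tail and end. */
static uint32_t tailroom(const struct skb_model *skb)
{
    assert(skb->tail <= skb->end);
    return skb->end - skb->tail;
}

/* Append len bytes; return 0 on success or the number of bytes that do not fit. */
static uint32_t model_put(struct skb_model *skb, uint32_t len)
{
    uint32_t room = tailroom(skb);

    if (len > room)
        return len - room;    /* the kernel would call skb_over_panic() here */
    skb->tail += len;
    return 0;
}
```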
      
      Note that the netdev buffer should only contain 1 Ethernet frame so if an
      attempt to write 2 Ethernet frames into the buffer is made then that is wrong.
      However, the logic of the asix_rx_fixup_internal() only allows 1 Ethernet frame
      to be written into the netdev buffer.
      
      Potentially this failure was due to memory corruption because it was only seen
      once.
      
      b) Two-off failures in the NAPI layer's backlog queue:
      
      There were 2 crashes in the NAPI layer's backlog queue presumably after
      asix_rx_fixup_internal() called usbnet_skb_return().
      
      [24097.273945] Unable to handle kernel NULL pointer dereference at virtual address 00000004
      
      [24097.398944] PC is at process_backlog+0x80/0x16c
      
      [24097.569466] Backtrace:
      [24097.572007] [<8045ad98>] (process_backlog) from [<8045b64c>] (net_rx_action+0xcc/0x248)
      [24097.591631] [<8045b580>] (net_rx_action) from [<80028780>] (__do_softirq+0x15c/0x37c)
      [24097.610022] [<80028624>] (__do_softirq) from [<800289cc>] (run_ksoftirqd+0x2c/0x84)
      
      and
      
      [ 1059.828452] Unable to handle kernel NULL pointer dereference at virtual address 00000000
      
      [ 1059.953715] PC is at process_backlog+0x84/0x16c
      
      [ 1060.140896] Backtrace:
      [ 1060.143434] [<8045ad98>] (process_backlog) from [<8045b64c>] (net_rx_action+0xcc/0x248)
      [ 1060.163075] [<8045b580>] (net_rx_action) from [<80028780>] (__do_softirq+0x15c/0x37c)
      [ 1060.181474] [<80028624>] (__do_softirq) from [<80028c38>] (irq_exit+0x8c/0xe8)
      [ 1060.199256] [<80028bac>] (irq_exit) from [<8000e9c8>] (handle_IRQ+0x8c/0xc8)
      [ 1060.210006] [<8000e93c>] (handle_IRQ) from [<800085c8>] (gic_handle_irq+0xb8/0xf8)
      [ 1060.225492] [<80008510>] (gic_handle_irq) from [<8050de80>] (__irq_svc+0x40/0x70)
      
      The embedded board was only using an ASIX USB to Ethernet adaptor eth0.
      
      Analysis suggested that the doubly-linked list pointers of the backlog queue had
      been corrupted because one of the link pointers was NULL.
      
      Potentially this failure was due to memory corruption because it was only seen
      twice.
      
      Results of the ASIX driver code review
      ======================================
      
      During the code review some weaknesses were observed in the ASIX driver and the
      following patches have been created to improve the robustness.
      
      Brief overview of the patches
      -----------------------------
      
      1. asix: Add rx->ax_skb = NULL after usbnet_skb_return()
      
      The current ASIX driver sends the received Ethernet frame to the NAPI layer of
      the network stack via the call to usbnet_skb_return() in
      asix_rx_fixup_internal() but retains the rx->ax_skb pointer to the netdev
      buffer. The driver no longer needs the rx->ax_skb pointer at this point because
      the NAPI layer now has the Ethernet frame.
      
      This means that asix_rx_fixup_internal() must not use rx->ax_skb after the call
      to usbnet_skb_return() because it could corrupt the handling of the Ethernet
      frame within the network layer.
      
      Therefore, to remove the risk of erroneous usage of rx->ax_skb, set rx->ax_skb
      to NULL after the call to usbnet_skb_return(). This avoids potential erroneous
      freeing of rx->ax_skb and erroneous writing to the netdev buffer.  If the
      software now somehow inappropriately reused rx->ax_skb, then a NULL pointer
      dereference of rx->ax_skb would occur which makes investigation easier.
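
      The ownership hand-off can be sketched as below, assuming simplified
      stand-ins: struct buf, struct rx_state, the delivered variable and
      rx_deliver() are hypothetical and only model the idea of dropping the
      driver's pointer once the stack owns the buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the receive state and the netdev buffer. */
struct buf { int len; };

struct rx_state {
    struct buf *ax_skb;       /* buffer being assembled, owned by the driver */
};

static struct buf *delivered; /* stands in for the network stack's queue */

/* Hand the finished buffer to the stack and drop the driver's reference. */
static void rx_deliver(struct rx_state *rx)
{
    assert(rx->ax_skb != NULL);
    delivered = rx->ax_skb;   /* ownership moves to the stack */
    rx->ax_skb = NULL;        /* any later misuse now faults immediately */
}
```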
      
      2. asix: Ensure asix_rx_fixup_info members are all reset
      
      This patch creates reset_asix_rx_fixup_info() to allow all the
      asix_rx_fixup_info structure members to be consistently reset to initial
      conditions.
      
      Call reset_asix_rx_fixup_info() upon each detectable error condition so that the
      next URB is processed from a known state.
      
      Otherwise, there is a risk that some members of the asix_rx_fixup_info structure
      may be incorrect after an error occurred so potentially leading to a
      malfunction.
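
      A reset helper of this kind might look like the sketch below. The
      struct layout and field names are assumptions made for illustration,
      free() stands in for releasing a netdev buffer, and rx_fixup_reset()
      is a hypothetical name.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical per-device receive state; field names are illustrative only. */
struct rx_fixup_state {
    void *partial;            /* partially assembled frame, if any */
    unsigned int header;      /* last parsed length header */
    unsigned int remaining;   /* bytes still expected for the current frame */
    bool split_head;          /* header was split across two URBs */
};

/* Return every member to its initial value, releasing any partial frame. */
static void rx_fixup_reset(struct rx_fixup_state *st)
{
    assert(st != NULL);
    free(st->partial);        /* free(NULL) is a no-op */
    st->partial = NULL;
    st->header = 0;
    st->remaining = 0;
    st->split_head = false;
}
```

      Keeping the reset in one function means every error path leaves the
      state identical, and calling it twice is harmless.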
      
      3. asix: Fix small memory leak in ax88772_unbind()
      
      This patch creates asix_rx_fixup_common_free() to allow the rx->ax_skb to be
      freed when necessary.
      
      asix_rx_fixup_common_free() is called from ax88772_unbind() before the parent
      private data structure is freed.
      
      Without this patch, there is a risk of a small netdev buffer memory leak each
      time ax88772_unbind() is called during the reception of an Ethernet frame that
      spans across 2 URBs.
      
      Testing
      =======
      
      The patches have been sanity tested on a 64-bit Linux laptop running kernel
      4.13-rc2 with the 3 patches applied on top.
      
      The ASIX USB to Ethernet adaptor used for testing was (output of lsusb):
      ID 0b95:772b ASIX Electronics Corp. AX88772B
      
      Test #1
      -------
      
      The test ran a flood ping test script which slowly incremented the ICMP Echo
      Request's payload from 0 to 5000 octets. This eventually causes IPv4
      fragmentation to occur which causes Ethernet frames to be sent very close to
      each other so increases the probability that an Ethernet frame will span 2 URBs.
      The test showed that all pings were successful. The test took about 15 minutes
      to complete.
      
      Test #2
      -------
      
      A script was run on the laptop to periodically run ifdown and ifup every second
      so that the ASIX USB to Ethernet adaptor was up for 1 second and down for 1 second.
      
      From a Linux PC connected to the laptop, the following ping command was used:
      ping -f -s 5000 <ip address of laptop>
      
      The large ICMP payload causes IPv4 fragmentation resulting in multiple
      Ethernet frames per original IP packet.
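
      As a rough worked example, assuming a 1500 byte MTU, a 20 byte IPv4
      header without options and the 8 byte ICMP header (none of which are
      stated above), a 5000 octet payload needs 4 fragments: each fragment
      carries at most 1480 data bytes and 5008 bytes have to be sent. The
      helper icmp_fragments() is a hypothetical name.

```c
#include <assert.h>

/* Assumed link parameters: 1500-byte MTU, 20-byte IPv4 header, 8-byte ICMP header. */
enum { MTU = 1500, IP_HDR = 20, ICMP_HDR = 8 };

/* Number of IPv4 fragments needed for an ICMP Echo with the given payload size. */
static unsigned int icmp_fragments(unsigned int payload)
{
    /* Every fragment except the last carries a multiple of 8 data bytes. */
    unsigned int per_frag = ((MTU - IP_HDR) / 8) * 8;
    unsigned int total = payload + ICMP_HDR;

    assert(per_frag > 0);
    return (total + per_frag - 1) / per_frag;   /* round up */
}
```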
      
      Kernel debug within the ASIX driver was enabled to see whether any ASIX errors
      were generated. The test was run for about 24 hours and no ASIX errors were
      seen.
      
      Patches
      =======
      
      The 3 patches have been rebased off the net-next repo master branch with HEAD
      fbbeefdd net: fec: Allow reception of frames bigger than 1522 bytes
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
      c0e0fb83