1. 27 Sep, 2017 22 commits
    • MIPS: math-emu: <MAX|MAXA|MIN|MINA>.<D|S>: Fix cases of both inputs zero · 6acd1d26
      Aleksandar Markovic authored
      commit 15560a58 upstream.
      
      Fix the value returned by <MAX|MAXA|MIN|MINA>.<D|S>, if both inputs
      are zeros. The right behavior in such cases is stated in the
      instruction reference manual and is as follows:
      
         fs  ft       MAX     MIN       MAXA    MINA
        ---------------------------------------------
          0   0        0       0         0       0
          0  -0        0      -0         0      -0
         -0   0        0      -0         0      -0
         -0  -0       -0      -0        -0      -0
      
      Prior to this patch, some of the above cases were yielding correct
      results. However, for the sake of code consistency, all such cases
      are rewritten in this patch.
      
      A relevant example:
      
      MAX.S fd,fs,ft:
        If fs contains +0.0, and ft contains -0.0, fd is going to contain
        +0.0 (without this patch, it used to contain -0.0).
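
      A minimal sketch of the table above, assuming hypothetical helper
      names (this is not the kernel's ieee754 emulation code). It only
      covers the case where both inputs are zeros: MAX and MAXA yield -0
      only when both inputs are -0, while MIN and MINA yield -0 when
      either input is -0.

```c
#include <math.h>
#include <stdbool.h>

/* Hypothetical helper: build a zero with the requested sign. */
static float zero_with_sign(bool negative)
{
    return negative ? -0.0f : 0.0f;
}

/* MAX.S / MAXA.S with two zero inputs: -0 only if both inputs are -0. */
float max_of_zeros(float fs, float ft)
{
    return zero_with_sign(signbit(fs) && signbit(ft));
}

/* MIN.S / MINA.S with two zero inputs: -0 if at least one input is -0. */
float min_of_zeros(float fs, float ft)
{
    return zero_with_sign(signbit(fs) || signbit(ft));
}
```

      The sign is tested with signbit() because +0.0 and -0.0 compare
      equal under ==.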
      
      Fixes: a79f5f9b ("MIPS: math-emu: Add support for the MIPS R6 MAX{, A} FPU instruction")
      Fixes: 4e9561b2 ("MIPS: math-emu: Add support for the MIPS R6 MIN{, A} FPU instruction")
      Signed-off-by: Miodrag Dinic <miodrag.dinic@imgtec.com>
      Signed-off-by: Goran Ferenc <goran.ferenc@imgtec.com>
      Signed-off-by: Aleksandar Markovic <aleksandar.markovic@imgtec.com>
      Reviewed-by: James Hogan <james.hogan@imgtec.com>
      Cc: Bo Hu <bohu@google.com>
      Cc: Douglas Leung <douglas.leung@imgtec.com>
      Cc: Jin Qian <jinqian@google.com>
      Cc: Paul Burton <paul.burton@imgtec.com>
      Cc: Petar Jovanovic <petar.jovanovic@imgtec.com>
      Cc: Raghu Gandham <raghu.gandham@imgtec.com>
      Cc: linux-mips@linux-mips.org
      Cc: linux-kernel@vger.kernel.org
      Patchwork: https://patchwork.linux-mips.org/patch/16881/
      Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • MIPS: math-emu: <MAX|MAXA|MIN|MINA>.<D|S>: Fix quiet NaN propagation · b6c818d8
      Aleksandar Markovic authored
      commit e78bf0dc upstream.
      
      Fix the value returned by <MAX|MAXA|MIN|MINA>.<D|S> fd,fs,ft, if both
      inputs are quiet NaNs. The <MAX|MAXA|MIN|MINA>.<D|S> specifications
      state that the returned value in such cases should be the quiet NaN
      contained in register fs.
      
      A relevant example:
      
      MAX.S fd,fs,ft:
        If fs contains qNaN1, and ft contains qNaN2, fd is going to contain
        qNaN1 (without this patch, it used to contain qNaN2).
      
      Fixes: a79f5f9b ("MIPS: math-emu: Add support for the MIPS R6 MAX{, A} FPU instruction")
      Fixes: 4e9561b2 ("MIPS: math-emu: Add support for the MIPS R6 MIN{, A} FPU instruction")
      Signed-off-by: Miodrag Dinic <miodrag.dinic@imgtec.com>
      Signed-off-by: Goran Ferenc <goran.ferenc@imgtec.com>
      Signed-off-by: Aleksandar Markovic <aleksandar.markovic@imgtec.com>
      Reviewed-by: James Hogan <james.hogan@imgtec.com>
      Cc: Bo Hu <bohu@google.com>
      Cc: Douglas Leung <douglas.leung@imgtec.com>
      Cc: Jin Qian <jinqian@google.com>
      Cc: Paul Burton <paul.burton@imgtec.com>
      Cc: Petar Jovanovic <petar.jovanovic@imgtec.com>
      Cc: Raghu Gandham <raghu.gandham@imgtec.com>
      Cc: linux-mips@linux-mips.org
      Cc: linux-kernel@vger.kernel.org
      Patchwork: https://patchwork.linux-mips.org/patch/16880/
      Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • Input: i8042 - add Gigabyte P57 to the keyboard reset table · bf592dde
      Kai-Heng Feng authored
      commit 697c5d8a upstream.
      
      Similar to other Gigabyte laptops, the P57 requires a keyboard
      reset for its Elantech touchpad to be detected correctly.
      
      BugLink: https://bugs.launchpad.net/bugs/1594214
      Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
      Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • tty: fix __tty_insert_flip_char regression · c13c5c7e
      Arnd Bergmann authored
      commit 8a5a90a2 upstream.
      
      Sergey noticed a small but fatal mistake in __tty_insert_flip_char,
      leading to an oops in an interrupt handler when using any serial
      port.
      
      The problem is that I accidentally took the tty_buffer pointer
      before calling __tty_buffer_request_room(), which replaces the
      buffer. This moves the pointer lookup to the right place after
      allocating the new buffer space.
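
      The ordering issue can be illustrated with a small user-space
      sketch. All names here (flip_buf, flip_port, request_room,
      insert_char, BUF_SIZE) are hypothetical stand-ins, not the tty
      layer's real structures; the point is only that the tail pointer
      must be read after the room request, because that request may
      install a new tail buffer.

```c
#include <stddef.h>
#include <stdlib.h>

#define BUF_SIZE 4 /* tiny on purpose, so the tail is replaced often */

struct flip_buf {
    struct flip_buf *next;
    size_t used;
    char data[BUF_SIZE];
};

struct flip_port {
    struct flip_buf *head;
    struct flip_buf *tail;
};

/* Make sure the tail has room for n bytes; may install a new tail. */
static int request_room(struct flip_port *port, size_t n)
{
    struct flip_buf *nb;

    if (n > BUF_SIZE)
        return 0;
    if (port->tail && BUF_SIZE - port->tail->used >= n)
        return 1;

    nb = calloc(1, sizeof(*nb));
    if (!nb)
        return 0;
    if (port->tail)
        port->tail->next = nb;
    else
        port->head = nb;
    port->tail = nb;
    return 1;
}

/* Returns the number of characters stored (0 or 1). */
int insert_char(struct flip_port *port, char ch)
{
    struct flip_buf *tb;

    if (!request_room(port, 1))
        return 0;

    /* Look up the tail only now: request_room() may have replaced it.
     * Reading it before the call would leave a pointer to a full buffer. */
    tb = port->tail;
    tb->data[tb->used++] = ch;
    return 1;
}
```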
      
      Fixes: 979990c6 ("tty: improve tty_insert_flip_char() fast path")
      Reported-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
      Tested-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • tty: improve tty_insert_flip_char() slow path · 077933dc
      Arnd Bergmann authored
      commit 065ea0a7 upstream.
      
      While working on improving the fast path of tty_insert_flip_char(),
      I noticed that by calling tty_buffer_request_room(), we needlessly
      move to the separate flag buffer mode for the tty, even when all
      characters use TTY_NORMAL as the flag.
      
      This changes the code to call __tty_buffer_request_room() with the
      correct flag, which will then allocate a regular buffer when it runs
      out of space but no special flags have been used. I'm guessing that
      this is the behavior that Peter Hurley intended when he introduced
      the compacted flip buffers.
      
      Fixes: acc0f67f ("tty: Halve flip buffer GFP_ATOMIC memory consumption")
      Cc: Peter Hurley <peter@hurleysoftware.com>
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • tty: improve tty_insert_flip_char() fast path · e1e6620f
      Arnd Bergmann authored
      commit 979990c6 upstream.
      
      kernelci.org reports a crazy stack usage for the VT code when CONFIG_KASAN
      is enabled:
      
      drivers/tty/vt/keyboard.c: In function 'kbd_keycode':
      drivers/tty/vt/keyboard.c:1452:1: error: the frame size of 2240 bytes is larger than 2048 bytes [-Werror=frame-larger-than=]
      
      The problem is that tty_insert_flip_char() gets inlined many times into
      kbd_keycode(), and also into other functions, and each copy requires 128
      bytes for stack redzone to check for a possible out-of-bounds access on
      the 'ch' and 'flags' arguments that are passed into
      tty_insert_flip_string_flags as a variable-length string.
      
      This introduces a new __tty_insert_flip_char() function for the slow
      path, which receives the two arguments by value. This completely avoids
      the problem and the stack usage goes back down to around 100 bytes.
      
      Without KASAN, this is also slightly better, as we don't have to
      spill the arguments to the stack but can simply pass 'ch' and 'flag'
      in registers, saving a few bytes in .text for each call site.
      
      This should be backported to linux-4.0 or later, which first introduced
      the stack sanitizer in the kernel.
      
      Fixes: c420f167 ("kasan: enable stack instrumentation")
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • mm: prevent double decrease of nr_reserved_highatomic · c576160f
      Minchan Kim authored
      commit 4855e4a7 upstream.
      
      There is a race between page freeing and unreserving highatomic
      pageblocks.
      
       CPU 0				    CPU 1
      
          free_hot_cold_page
            mt = get_pfnblock_migratetype
            set_pcppage_migratetype(page, mt)
          				    unreserve_highatomic_pageblock
          				    spin_lock_irqsave(&zone->lock)
          				    move_freepages_block
          				    set_pageblock_migratetype(page)
          				    spin_unlock_irqrestore(&zone->lock)
            free_pcppages_bulk
              __free_one_page(mt) <- mt is stale
      
      Because of the race above, a page on CPU 0 can end up on a
      non-highorderatomic free list, since the pageblock's type has changed.
      As a result, the highorderatomic unreserve logic can decrease the
      reserved count for the same pageblock several times, which creates a
      mismatch between nr_reserved_highatomic and the number of reserved
      pageblocks.
      
      So, this patch verifies whether the pageblock is highatomic or not and
      decreases the count only if the pageblock is highatomic.
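
      A minimal sketch of that check, assuming simplified stand-in types
      (zone_sketch, block_type) and an assumed block size of 512 pages;
      it is not the actual page allocator code. The counter is touched
      only while the block is still marked highatomic, so a second
      unreserve attempt on the same block becomes a no-op.

```c
#include <stdbool.h>

enum block_type { BLOCK_MOVABLE, BLOCK_HIGHATOMIC };

#define BLOCK_NR_PAGES 512UL /* assumed pageblock size, in pages */

struct zone_sketch {
    unsigned long nr_reserved_highatomic;
};

/* Returns true if the block was unreserved and the counter adjusted. */
bool unreserve_block(struct zone_sketch *zone, enum block_type *type)
{
    unsigned long dec;

    /* Someone else already changed the type: do not decrement again. */
    if (*type != BLOCK_HIGHATOMIC)
        return false;

    /* Clamp the decrement so the counter can never wrap below zero. */
    dec = BLOCK_NR_PAGES < zone->nr_reserved_highatomic
              ? BLOCK_NR_PAGES
              : zone->nr_reserved_highatomic;
    zone->nr_reserved_highatomic -= dec;
    *type = BLOCK_MOVABLE;
    return true;
}
```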
      
      Link: http://lkml.kernel.org/r/1476259429-18279-3-git-send-email-minchan@kernel.org
      Signed-off-by: Minchan Kim <minchan@kernel.org>
      Acked-by: Vlastimil Babka <vbabka@suse.cz>
      Acked-by: Mel Gorman <mgorman@techsingularity.net>
      Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
      Cc: Sangseok Lee <sangseok.lee@lge.com>
      Cc: Michal Hocko <mhocko@suse.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Miles Chen <miles.chen@mediatek.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • nfsd: Fix general protection fault in release_lock_stateid() · 6ea627b2
      Chuck Lever authored
      commit f46c445b upstream.
      
      When I push NFSv4.1 / RDMA hard, (xfstests generic/089, for example),
      I get this crash on the server:
      
      Oct 28 22:04:30 klimt kernel: general protection fault: 0000 [#1] SMP DEBUG_PAGEALLOC
      Oct 28 22:04:30 klimt kernel: Modules linked in: cts rpcsec_gss_krb5 iTCO_wdt iTCO_vendor_support sb_edac edac_core x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm btrfs irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel lrw gf128mul glue_helper ablk_helper cryptd xor pcspkr raid6_pq i2c_i801 i2c_smbus lpc_ich mfd_core sg mei_me mei ioatdma shpchp wmi ipmi_si ipmi_msghandler rpcrdma ib_ipoib rdma_ucm acpi_power_meter acpi_pad ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm nfsd auth_rpcgss nfs_acl lockd grace sunrpc ip_tables xfs libcrc32c mlx4_ib mlx4_en ib_core sr_mod cdrom sd_mod ast drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops ttm drm crc32c_intel igb ahci libahci ptp mlx4_core pps_core dca libata i2c_algo_bit i2c_core dm_mirror dm_region_hash dm_log dm_mod
      Oct 28 22:04:30 klimt kernel: CPU: 7 PID: 1558 Comm: nfsd Not tainted 4.9.0-rc2-00005-g82cd754 #8
      Oct 28 22:04:30 klimt kernel: Hardware name: Supermicro Super Server/X10SRL-F, BIOS 1.0c 09/09/2015
      Oct 28 22:04:30 klimt kernel: task: ffff880835c3a100 task.stack: ffff8808420d8000
      Oct 28 22:04:30 klimt kernel: RIP: 0010:[<ffffffffa05a759f>]  [<ffffffffa05a759f>] release_lock_stateid+0x1f/0x60 [nfsd]
      Oct 28 22:04:30 klimt kernel: RSP: 0018:ffff8808420dbce0  EFLAGS: 00010246
      Oct 28 22:04:30 klimt kernel: RAX: ffff88084e6660f0 RBX: ffff88084e667020 RCX: 0000000000000000
      Oct 28 22:04:30 klimt kernel: RDX: 0000000000000007 RSI: 0000000000000000 RDI: ffff88084e667020
      Oct 28 22:04:30 klimt kernel: RBP: ffff8808420dbcf8 R08: 0000000000000001 R09: 0000000000000000
      Oct 28 22:04:30 klimt kernel: R10: ffff880835c3a100 R11: ffff880835c3aca8 R12: 6b6b6b6b6b6b6b6b
      Oct 28 22:04:30 klimt kernel: R13: ffff88084e6670d8 R14: ffff880835f546f0 R15: ffff880835f1c548
      Oct 28 22:04:30 klimt kernel: FS:  0000000000000000(0000) GS:ffff88087bdc0000(0000) knlGS:0000000000000000
      Oct 28 22:04:30 klimt kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      Oct 28 22:04:30 klimt kernel: CR2: 00007ff020389000 CR3: 0000000001c06000 CR4: 00000000001406e0
      Oct 28 22:04:30 klimt kernel: Stack:
      Oct 28 22:04:30 klimt kernel: ffff88084e667020 0000000000000000 ffff88084e6670d8 ffff8808420dbd20
      Oct 28 22:04:30 klimt kernel: ffffffffa05ac80d ffff880835f54548 ffff88084e640008 ffff880835f545b0
      Oct 28 22:04:30 klimt kernel: ffff8808420dbd70 ffffffffa059803d ffff880835f1c768 0000000000000870
      Oct 28 22:04:30 klimt kernel: Call Trace:
      Oct 28 22:04:30 klimt kernel: [<ffffffffa05ac80d>] nfsd4_free_stateid+0xfd/0x1b0 [nfsd]
      Oct 28 22:04:30 klimt kernel: [<ffffffffa059803d>] nfsd4_proc_compound+0x40d/0x690 [nfsd]
      Oct 28 22:04:30 klimt kernel: [<ffffffffa0583114>] nfsd_dispatch+0xd4/0x1d0 [nfsd]
      Oct 28 22:04:30 klimt kernel: [<ffffffffa047bbf9>] svc_process_common+0x3d9/0x700 [sunrpc]
      Oct 28 22:04:30 klimt kernel: [<ffffffffa047ca64>] svc_process+0xf4/0x330 [sunrpc]
      Oct 28 22:04:30 klimt kernel: [<ffffffffa05827ca>] nfsd+0xfa/0x160 [nfsd]
      Oct 28 22:04:30 klimt kernel: [<ffffffffa05826d0>] ? nfsd_destroy+0x170/0x170 [nfsd]
      Oct 28 22:04:30 klimt kernel: [<ffffffff810b367b>] kthread+0x10b/0x120
      Oct 28 22:04:30 klimt kernel: [<ffffffff810b3570>] ? kthread_stop+0x280/0x280
      Oct 28 22:04:30 klimt kernel: [<ffffffff8174e8ba>] ret_from_fork+0x2a/0x40
      Oct 28 22:04:30 klimt kernel: Code: c3 66 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 55 48 89 e5 41 55 41 54 53 48 8b 87 b0 00 00 00 48 89 fb 4c 8b a0 98 00 00 00 <49> 8b 44 24 20 48 8d b8 80 03 00 00 e8 10 66 1a e1 48 89 df e8
      Oct 28 22:04:30 klimt kernel: RIP  [<ffffffffa05a759f>] release_lock_stateid+0x1f/0x60 [nfsd]
      Oct 28 22:04:30 klimt kernel: RSP <ffff8808420dbce0>
      Oct 28 22:04:30 klimt kernel: ---[ end trace cf5d0b371973e167 ]---
      
      Jeff Layton says:
      > Hm...now that I look though, this is a little suspicious:
      >
      >    struct nfs4_openowner *oo = openowner(stp->st_openstp->st_stateowner);
      >
      > I wonder if it's possible for the openstateid to have already been
      > destroyed at this point.
      >
      > We might be better off doing something like this to get the client pointer:
      >
      >    stp->st_stid.sc_client;
      >
      > ...which should be more direct and less dependent on other stateids
      > staying valid.
      
      With the suggested change, I am no longer able to reproduce the above oops.
      
      v2: Fix unhash_lock_stateid() as well
      Fix-suggested-by: Jeff Layton <jlayton@redhat.com>
      Fixes: 42691398 ('nfsd: Fix race between FREE_STATEID and LOCK')
      Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
      Reviewed-by: Jeff Layton <jlayton@redhat.com>
      Cc: Christian Theune <ct@flyingcircus.io>
      Signed-off-by: J. Bruce Fields <bfields@redhat.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • md/raid5: release/flush io in raid5_do_work() · d5c59ee8
      Song Liu authored
      commit 9c72a18e upstream.
      
      In raid5, there are scenarios where some IOs are deferred to a later
      time, and some IOs need a flush to complete. To make sure we make
      progress with these IOs, we need to call the following functions:
      
          flush_deferred_bios(conf);
          r5l_flush_stripe_to_raid(conf->log);
      
      Both of these functions are called in raid5d(), but are missing in
      raid5_do_work(). As a result, these functions are not called
      when multi-threading (group_thread_cnt > 0) is enabled. This patch
      adds calls to these functions to raid5_do_work().
      
      Note for stable branches:
      
        r5l_flush_stripe_to_raid(conf->log) is needed for 4.4+
        flush_deferred_bios(conf) is only needed for 4.11+
      Signed-off-by: Song Liu <songliubraving@fb.com>
      Signed-off-by: Shaohua Li <shli@fb.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • x86/fsgsbase/64: Report FSBASE and GSBASE correctly in core dumps · e21d6604
      Andy Lutomirski authored
      commit 9584d98b upstream.
      
      In ELF_COPY_CORE_REGS, we're copying from the current task, so
      accessing thread.fsbase and thread.gsbase makes no sense.  Just read
      the values from the CPU registers.
      
      In practice, the old code would have been correct most of the time
      simply because thread.fsbase and thread.gsbase usually matched the
      CPU registers.
      Signed-off-by: Andy Lutomirski <luto@kernel.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Borislav Petkov <bpetkov@suse.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Chang Seok <chang.seok.bae@intel.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • f2fs: check hot_data for roll-forward recovery · 53e5f7b8
      Jaegeuk Kim authored
      commit 125c9fb1 upstream.
      
      We need to check HOT_DATA to truncate any previous data block when doing
      roll-forward recovery.
      Reviewed-by: Chao Yu <yuchao0@huawei.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • ipv6: fix typo in fib6_net_exit() · be999481
      Eric Dumazet authored
      
      [ Upstream commit 32a805ba ]
      
      IPv6 FIB should use FIB6_TABLE_HASHSZ, not FIB_TABLE_HASHSZ.
      
      Fixes: ba1cc08d ("ipv6: fix memory leak with multiple tables during netns destruction")
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • ipv6: fix memory leak with multiple tables during netns destruction · 70479eaf
      Sabrina Dubroca authored
      
      [ Upstream commit ba1cc08d ]
      
      fib6_net_exit only frees the main and local tables. If another table was
      created with fib6_alloc_table, we leak it when the netns is destroyed.
      
      Fix this in the same way ip_fib_net_exit cleans up tables, by walking
      through the whole hashtable of fib6_table's. We can get rid of the
      special cases for local and main, since they're also part of the
      hashtable.
      
      Reproducer:
          ip netns add x
          ip -net x -6 rule add from 6003:1::/64 table 100
          ip netns del x
      Reported-by: Jianlin Shi <jishi@redhat.com>
      Fixes: 58f09b78 ("[NETNS][IPV6] ip6_fib - make it per network namespace")
      Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • gianfar: Fix Tx flow control deactivation · 9b5e5d8a
      Claudiu Manoil authored
      
      [ Upstream commit 5d621672 ]
      
      The wrong register is checked for the Tx flow control bit,
      it should have been maccfg1 not maccfg2.
      This went unnoticed for so long probably because the impact is
      hardly visible, not to mention the tangled code from adjust_link().
      First, link flow control (i.e. handling of Rx/Tx link level pause frames)
      is disabled by default (needs to be enabled via 'ethtool -A').
      Secondly, maccfg2 always returns 0 for tx_flow_oldval (except for a few
      old boards), which results in Tx flow control remaining always on
      once activated.
      
      Fixes: 45b679c9 ("gianfar: Implement PAUSE frame generation support")
      Signed-off-by: Claudiu Manoil <claudiu.manoil@nxp.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • Revert "net: fix percpu memory leaks" · 5f529e0d
      Jesper Dangaard Brouer authored
      
      [ Upstream commit 5a63643e ]
      
      This reverts commit 1d6119ba.
      
      After reverting commit 6d7b857d ("net: use lib/percpu_counter API
      for fragmentation mem accounting"), there is no need for this
      fix-up patch.  As percpu_counter is no longer used, it can no
      longer leak memory.
      
      Fixes: 6d7b857d ("net: use lib/percpu_counter API for fragmentation mem accounting")
      Fixes: 1d6119ba ("net: fix percpu memory leaks")
      Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • Revert "net: use lib/percpu_counter API for fragmentation mem accounting" · 40bc5355
      Jesper Dangaard Brouer authored
      
      [ Upstream commit fb452a1a ]
      
      This reverts commit 6d7b857d.
      
      There is a bug in the fragmentation code's use of the percpu_counter
      API that can cause issues on systems with many CPUs.

      The frag_mem_limit() just reads the global counter (fbc->count),
      without considering that other CPUs can each hold up to the batch size
      (130K) that hasn't been subtracted yet.  Due to the 3MBytes lower
      thresh limit, this becomes dangerous at >=24 CPUs
      (3*1024*1024/130000=24).
      
      The correct API usage would be to use __percpu_counter_compare() which
      does the right thing, and takes into account the number of (online)
      CPUs and batch size, to account for this and call __percpu_counter_sum()
      when needed.
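
      The arithmetic behind the 24-CPU figure can be written out as a
      small sketch. The function names are hypothetical, and the batch
      value 130000 is simply the "130K" quoted above; this is not the
      percpu_counter implementation.

```c
/* Worst-case number of bytes that can sit in per-CPU deltas, unseen by a
 * reader that only looks at the global count. */
unsigned long max_hidden_bytes(unsigned int ncpus, unsigned long batch)
{
    return (unsigned long)ncpus * batch;
}

/* Number of full per-CPU batches that fit into the threshold
 * (integer division, as in the commit message). */
unsigned int cpus_to_reach(unsigned long thresh, unsigned long batch)
{
    return (unsigned int)(thresh / batch);
}
```

      With thresh = 3*1024*1024 and batch = 130000, cpus_to_reach()
      gives 24, and from 25 CPUs onward the hidden amount can exceed
      the whole threshold.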
      
      We choose to revert the use of the lib/percpu_counter API for frag
      memory accounting for several reasons:
      
      1) On systems with CPUs > 24, the heavier fully locked
         __percpu_counter_sum() is always invoked, which will be more
         expensive than the atomic_t that is reverted to.
      
      Given systems with more than 24 CPUs are becoming common this doesn't
      seem like a good option.  To mitigate this, the batch size could be
      decreased and thresh be increased.
      
      2) The add_frag_mem_limit+sub_frag_mem_limit pairs happen on the RX
         CPU, before SKBs are pushed into sockets on remote CPUs.  Given
         NICs can only hash on L2 part of the IP-header, the NIC-RXq's will
         likely be limited.  Thus, a fair chance that atomic add+dec happen
         on the same CPU.
      
      Revert note: commit 1d6119ba ("net: fix percpu memory leaks")
      removed init_frag_mem_limit() and instead uses inet_frags_init_net().
      After this revert, inet_frags_uninit_net() becomes empty.
      
      Fixes: 6d7b857d ("net: use lib/percpu_counter API for fragmentation mem accounting")
      Fixes: 1d6119ba ("net: fix percpu memory leaks")
      Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
      Acked-by: Florian Westphal <fw@strlen.de>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • tcp: initialize rcv_mss to TCP_MIN_MSS instead of 0 · 611a98c8
      Wei Wang authored
      
      [ Upstream commit 499350a5 ]
      
      When tcp_disconnect() is called, inet_csk_delack_init() sets
      icsk->icsk_ack.rcv_mss to 0.
      This could potentially cause the tcp_recvmsg() => tcp_cleanup_rbuf() =>
      __tcp_select_window() call path to divide by zero.
      So this patch initializes rcv_mss to TCP_MIN_MSS instead of 0.
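
      A simplified sketch of why a zero rcv_mss is dangerous, assuming
      hypothetical stand-ins (ack_state, delack_init, round_window) for
      the real socket structures. The window is rounded down to a
      multiple of the MSS, which involves a division by it; the value 88
      mirrors the kernel's TCP_MIN_MSS definition.

```c
#define TCP_MIN_MSS_SKETCH 88U /* mirrors the kernel's TCP_MIN_MSS */

struct ack_state {
    unsigned int rcv_mss;
};

/* Initialize to a small non-zero MSS rather than 0. */
void delack_init(struct ack_state *ack)
{
    ack->rcv_mss = TCP_MIN_MSS_SKETCH;
}

/* Round the free space down to a whole number of segments.
 * With rcv_mss == 0 this division would trap. */
unsigned int round_window(const struct ack_state *ack, unsigned int free_space)
{
    return (free_space / ack->rcv_mss) * ack->rcv_mss;
}
```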
      Reported-by: Andrey Konovalov <andreyknvl@google.com>
      Signed-off-by: Wei Wang <weiwan@google.com>
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: Neal Cardwell <ncardwell@google.com>
      Signed-off-by: Yuchung Cheng <ycheng@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • Revert "net: phy: Correctly process PHY_HALTED in phy_stop_machine()" · 081be8c9
      Florian Fainelli authored
      
      [ Upstream commit ebc8254a ]
      
      This reverts commit 7ad813f2 ("net: phy:
      Correctly process PHY_HALTED in phy_stop_machine()") because it is
      creating the possibility for a NULL pointer dereference.
      
      David Daney provided the following call trace and diagram of events:
      
      When ndo_stop() is called we call:
      
       phy_disconnect()
          +---> phy_stop_interrupts() implies: phydev->irq = PHY_POLL;
          +---> phy_stop_machine()
          |      +---> phy_state_machine()
          |              +----> queue_delayed_work(): Work queued.
          +--->phy_detach() implies: phydev->attached_dev = NULL;
      
      Now at a later time the queued work does:
      
       phy_state_machine()
          +---->netif_carrier_off(phydev->attached_dev): Oh no! It is NULL:
      
       CPU 12 Unable to handle kernel paging request at virtual address
      0000000000000048, epc == ffffffff80de37ec, ra == ffffffff80c7c
      Oops[#1]:
      CPU: 12 PID: 1502 Comm: kworker/12:1 Not tainted 4.9.43-Cavium-Octeon+ #1
      Workqueue: events_power_efficient phy_state_machine
      task: 80000004021ed100 task.stack: 8000000409d70000
      $ 0   : 0000000000000000 ffffffff84720060 0000000000000048 0000000000000004
      $ 4   : 0000000000000000 0000000000000001 0000000000000004 0000000000000000
      $ 8   : 0000000000000000 0000000000000000 00000000ffff98f3 0000000000000000
      $12   : 8000000409d73fe0 0000000000009c00 ffffffff846547c8 000000000000af3b
      $16   : 80000004096bab68 80000004096babd0 0000000000000000 80000004096ba800
      $20   : 0000000000000000 0000000000000000 ffffffff81090000 0000000000000008
      $24   : 0000000000000061 ffffffff808637b0
      $28   : 8000000409d70000 8000000409d73cf0 80000000271bd300 ffffffff80c7804c
      Hi    : 000000000000002a
      Lo    : 000000000000003f
      epc   : ffffffff80de37ec netif_carrier_off+0xc/0x58
      ra    : ffffffff80c7804c phy_state_machine+0x48c/0x4f8
      Status: 14009ce3        KX SX UX KERNEL EXL IE
      Cause : 00800008 (ExcCode 02)
      BadVA : 0000000000000048
      PrId  : 000d9501 (Cavium Octeon III)
      Modules linked in:
      Process kworker/12:1 (pid: 1502, threadinfo=8000000409d70000,
      task=80000004021ed100, tls=0000000000000000)
      Stack : 8000000409a54000 80000004096bab68 80000000271bd300 80000000271c1e00
              0000000000000000 ffffffff808a1708 8000000409a54000 80000000271bd300
              80000000271bd320 8000000409a54030 ffffffff80ff0f00 0000000000000001
              ffffffff81090000 ffffffff808a1ac0 8000000402182080 ffffffff84650000
              8000000402182080 ffffffff84650000 ffffffff80ff0000 8000000409a54000
              ffffffff808a1970 0000000000000000 80000004099e8000 8000000402099240
              0000000000000000 ffffffff808a8598 0000000000000000 8000000408eeeb00
              8000000409a54000 00000000810a1d00 0000000000000000 8000000409d73de8
              8000000409d73de8 0000000000000088 000000000c009c00 8000000409d73e08
              8000000409d73e08 8000000402182080 ffffffff808a84d0 8000000402182080
              ...
      Call Trace:
      [<ffffffff80de37ec>] netif_carrier_off+0xc/0x58
      [<ffffffff80c7804c>] phy_state_machine+0x48c/0x4f8
      [<ffffffff808a1708>] process_one_work+0x158/0x368
      [<ffffffff808a1ac0>] worker_thread+0x150/0x4c0
      [<ffffffff808a8598>] kthread+0xc8/0xe0
      [<ffffffff808617f0>] ret_from_kernel_thread+0x14/0x1c
      
      The original motivation for this change originated from Marc Gonzales
      indicating that his network driver did not have its adjust_link callback
      executing with phydev->link = 0 while he was expecting it.
      
      PHYLIB has never made any such guarantee because phy_stop() merely
      tells the workqueue to move into PHY_HALTED state, which will happen
      asynchronously.
      Reported-by: Geert Uytterhoeven <geert+renesas@glider.be>
      Reported-by: David Daney <ddaney.cavm@gmail.com>
      Fixes: 7ad813f2 ("net: phy: Correctly process PHY_HALTED in phy_stop_machine()")
      Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • qlge: avoid memcpy buffer overflow · 6d8c8fd1
      Arnd Bergmann authored
      
      [ Upstream commit e58f9583 ]
      
      gcc-8.0.0 (snapshot) points out that we copy a variable-length string
      into a fixed length field using memcpy() with the destination length,
      and that ends up copying whatever follows the string:
      
          inlined from 'ql_core_dump' at drivers/net/ethernet/qlogic/qlge/qlge_dbg.c:1106:2:
      drivers/net/ethernet/qlogic/qlge/qlge_dbg.c:708:2: error: 'memcpy' reading 15 bytes from a region of size 14 [-Werror=stringop-overflow=]
        memcpy(seg_hdr->description, desc, (sizeof(seg_hdr->description)) - 1);
      
      Changing it to use strncpy() will instead zero-pad the destination,
      which seems to be the right thing to do here.
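
      A minimal sketch of the difference, assuming a hypothetical header
      struct with a 16-byte description field (the real field size is
      not stated here). strncpy() stops reading at the source's
      terminator and zero-pads the rest of the destination, whereas
      memcpy() with the destination length reads past the end of a
      shorter source string.

```c
#include <string.h>

#define DESC_LEN 16 /* assumed field size, for illustration only */

struct seg_hdr_sketch {
    char description[DESC_LEN];
};

void set_description(struct seg_hdr_sketch *hdr, const char *desc)
{
    /* Copies at most DESC_LEN - 1 bytes and zero-pads after a short desc. */
    strncpy(hdr->description, desc, sizeof(hdr->description) - 1);
    /* Guarantee termination even when desc fills the whole field. */
    hdr->description[sizeof(hdr->description) - 1] = '\0';
}
```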
      
      The bug is probably harmless, but it seems like a good idea to address
      it in stable kernels as well, if only for the purpose of building with
      gcc-8 without warnings.
      
      Fixes: a61f8026 ("qlge: Add ethtool register dump function.")
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • ipv6: fix sparse warning on rt6i_node · 354d36b7
      Wei Wang authored
      
      [ Upstream commit 4e587ea7 ]
      
      Commit c5cff856 adds rcu grace period before freeing fib6_node. This
      generates a new sparse warning on rt->rt6i_node related code:
        net/ipv6/route.c:1394:30: error: incompatible types in comparison
        expression (different address spaces)
        ./include/net/ip6_fib.h:187:14: error: incompatible types in comparison
        expression (different address spaces)
      
      This commit adds "__rcu" tag for rt6i_node and makes sure corresponding
      rcu API is used for it.
      After this fix, sparse no longer generates the above warning.
      
      Fixes: c5cff856 ("ipv6: add rcu grace period before freeing fib6_node")
      Signed-off-by: Wei Wang <weiwan@google.com>
      Acked-by: Eric Dumazet <edumazet@google.com>
      Acked-by: Martin KaFai Lau <kafai@fb.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • ipv6: add rcu grace period before freeing fib6_node · e51bf99b
      Wei Wang authored
      
      [ Upstream commit c5cff856 ]
      
      We currently keep rt->rt6i_node pointing to the fib6_node for the route.
      And some functions make use of this pointer to dereference the fib6_node
      from rt structure, e.g. rt6_check(). However, as there is neither
      refcount nor rcu taken when dereferencing rt->rt6i_node, it could
      potentially cause crashes as rt->rt6i_node could be set to NULL by other
      CPUs when doing a route deletion.
      This patch introduces an rcu grace period before freeing fib6_node and
      makes sure the functions that dereference it take rcu_read_lock().

      Note: there is no "Fixes" tag because this bug has been there since a
      very early stage.
      Signed-off-by: Wei Wang <weiwan@google.com>
      Acked-by: Eric Dumazet <edumazet@google.com>
      Acked-by: Martin KaFai Lau <kafai@fb.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • ipv6: accept 64k - 1 packet length in ip6_find_1stfragopt() · 6eb7ae12
      Stefano Brivio authored
      
      [ Upstream commit 3de33e1b ]
      
      A packet length of exactly IPV6_MAXPLEN is allowed; we should
      refuse parsing options only if the size is 64KiB or more.
      
      While at it, remove one extra variable and one assignment which
      were also introduced by the commit that introduced the size
      check. Checking the sum 'offset + len' and only later adding
      'len' to 'offset' doesn't provide any advantage over directly
      summing to 'offset' and checking it.
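
      A sketch of the boundary check described above, using a
      hypothetical helper (advance_offset) rather than the real
      ip6_find_1stfragopt() loop. The option length is added to the
      offset first, and only a result strictly greater than
      IPV6_MAXPLEN (65535) is rejected.

```c
#define IPV6_MAXPLEN 65535U

/* Adds optlen to *offset; returns 0 on success, -1 if the result
 * exceeds the maximum IPv6 payload length. */
int advance_offset(unsigned int *offset, unsigned int optlen)
{
    *offset += optlen;
    if (*offset > IPV6_MAXPLEN)
        return -1;
    return 0;
}
```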
      
      Fixes: 6399f1fa ("ipv6: avoid overflow of offset in ip6_find_1stfragopt")
      Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
  2. 13 Sep, 2017 18 commits