1. 16 Jun, 2018 12 commits
    • Andy Lutomirski's avatar
      x86/fpu: Fix math emulation in eager fpu mode · 85191ed0
      Andy Lutomirski authored
      commit 4ecd16ec upstream.
      
      Systems without an FPU are generally old and therefore use lazy FPU
      switching. Unsurprisingly, math emulation in eager FPU mode is a
      bit buggy. Fix it.
      
      There were two bugs involving kernel code trying to use the FPU
      registers in eager mode even if they didn't exist and one BUG_ON()
      that was incorrect.
      Signed-off-by: default avatarAndy Lutomirski <luto@kernel.org>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: Fenghua Yu <fenghua.yu@intel.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Quentin Casasnovas <quentin.casasnovas@oracle.com>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Sai Praneeth Prakhya <sai.praneeth.prakhya@intel.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: yu-cheng yu <yu-cheng.yu@intel.com>
      Link: http://lkml.kernel.org/r/b4b8d112436bd6fab866e1b4011131507e8d7fbe.1453675014.git.luto@kernel.orgSigned-off-by: default avatarIngo Molnar <mingo@kernel.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      85191ed0
    • Andy Lutomirski's avatar
      x86/fpu: Fix FNSAVE usage in eagerfpu mode · a2dd2844
      Andy Lutomirski authored
      commit 5ed73f40 upstream.
      
      In eager fpu mode, having deactivated FPU without immediately
      reloading some other context is illegal.  Therefore, to recover from
      FNSAVE, we can't just deactivate the state -- we need to reload it
      if we're not actively context switching.
      
      We had this wrong in fpu__save() and fpu__copy().  Fix both.
      __kernel_fpu_begin() was fine -- add a comment.
      
      This fixes a warning triggerable with nofxsr eagerfpu=on.
      Signed-off-by: default avatarAndy Lutomirski <luto@kernel.org>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: Fenghua Yu <fenghua.yu@intel.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Quentin Casasnovas <quentin.casasnovas@oracle.com>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Sai Praneeth Prakhya <sai.praneeth.prakhya@intel.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: yu-cheng yu <yu-cheng.yu@intel.com>
      Link: http://lkml.kernel.org/r/60662444e13c76f06e23c15c5dcdba31b4ac3d67.1453675014.git.luto@kernel.orgSigned-off-by: default avatarIngo Molnar <mingo@kernel.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      a2dd2844
    • Andy Lutomirski's avatar
      x86/fpu: Hard-disable lazy FPU mode · 7c3adb3c
      Andy Lutomirski authored
      commit ca6938a1 upstream.
      
      Since commit:
      
        58122bf1 ("x86/fpu: Default eagerfpu=on on all CPUs")
      
      ... in Linux 4.6, eager FPU mode has been the default on all x86
      systems, and no one has reported any regressions.
      
      This patch removes the ability to enable lazy mode: use_eager_fpu()
      becomes "return true" and all of the FPU mode selection machinery is
      removed.
      Signed-off-by: default avatarAndy Lutomirski <luto@kernel.org>
      Signed-off-by: default avatarRik van Riel <riel@redhat.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: Fenghua Yu <fenghua.yu@intel.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Quentin Casasnovas <quentin.casasnovas@oracle.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: pbonzini@redhat.com
      Link: http://lkml.kernel.org/r/1475627678-20788-3-git-send-email-riel@redhat.comSigned-off-by: default avatarIngo Molnar <mingo@kernel.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      7c3adb3c
    • Borislav Petkov's avatar
      x86/fpu: Fix eager-FPU handling on legacy FPU machines · 8edb1d7e
      Borislav Petkov authored
      commit 6e686709 upstream.
      
      i486 derived cores like Intel Quark support only the very old,
      legacy x87 FPU (FSAVE/FRSTOR, CPUID bit FXSR is not set), and
      our FPU code wasn't handling the saving and restoring there
      properly in the 'eagerfpu' case.
      
      So after we made eagerfpu the default for all CPU types:
      
        58122bf1 x86/fpu: Default eagerfpu=on on all CPUs
      
      these old FPU designs broke. First, Andy Shevchenko reported a splat:
      
        WARNING: CPU: 0 PID: 823 at arch/x86/include/asm/fpu/internal.h:163 fpu__clear+0x8c/0x160
      
      which was us trying to execute FXRSTOR on those machines even though
      they don't support it.
      
      After taking care of that, Bryan O'Donoghue reported that a simple FPU
      test still failed because we weren't initializing the FPU state properly
      on those machines.
      
      Take care of all that.
      Reported-and-tested-by: default avatarBryan O'Donoghue <pure.logic@nexus-software.ie>
      Reported-by: default avatarAndy Shevchenko <andy.shevchenko@gmail.com>
      Signed-off-by: default avatarBorislav Petkov <bp@suse.de>
      Acked-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: Fenghua Yu <fenghua.yu@intel.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Quentin Casasnovas <quentin.casasnovas@oracle.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Yu-cheng <yu-cheng.yu@intel.com>
      Link: http://lkml.kernel.org/r/20160311113206.GD4312@pd.tnicSigned-off-by: default avatarIngo Molnar <mingo@kernel.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      8edb1d7e
    • Yu-cheng Yu's avatar
      x86/fpu: Revert ("x86/fpu: Disable AVX when eagerfpu is off") · 93deddfd
      Yu-cheng Yu authored
      commit a65050c6 upstream.
      
      Leonid Shatz noticed that the SDM interpretation of the following
      recent commit:
      
        394db20c ("x86/fpu: Disable AVX when eagerfpu is off")
      
      ... is incorrect and that the original behavior of the FPU code was correct.
      
      Because AVX is not stated in CR0 TS bit description, it was mistakenly
      believed to be not supported for lazy context switch. This turns out
      to be false:
      
        Intel Software Developer's Manual Vol. 3A, Sec. 2.5 Control Registers:
      
         'TS Task Switched bit (bit 3 of CR0) -- Allows the saving of the x87 FPU/
          MMX/SSE/SSE2/SSE3/SSSE3/SSE4 context on a task switch to be delayed until
          an x87 FPU/MMX/SSE/SSE2/SSE3/SSSE3/SSE4 instruction is actually executed
          by the new task.'
      
        Intel Software Developer's Manual Vol. 2A, Sec. 2.4 Instruction Exception
        Specification:
      
         'AVX instructions refer to exceptions by classes that include #NM
          "Device Not Available" exception for lazy context switch.'
      
      So revert the commit.
      Reported-by: default avatarLeonid Shatz <leonid.shatz@ravellosystems.com>
      Signed-off-by: default avatarYu-cheng Yu <yu-cheng.yu@intel.com>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: Fenghua Yu <fenghua.yu@intel.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Ravi V. Shankar <ravi.v.shankar@intel.com>
      Cc: Sai Praneeth Prakhya <sai.praneeth.prakhya@intel.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/1457569734-3785-1-git-send-email-yu-cheng.yu@intel.comSigned-off-by: default avatarIngo Molnar <mingo@kernel.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      93deddfd
    • Andy Lutomirski's avatar
      x86/fpu: Fix 'no387' regression · 0e005aa5
      Andy Lutomirski authored
      commit f363938c upstream.
      
      After fixing FPU option parsing, we now parse the 'no387' boot option
      too early: no387 clears X86_FEATURE_FPU before it's even probed, so
      the boot CPU promptly re-enables it.
      
      I suspect it gets even more confused on SMP.
      
      Fix the probing code to leave X86_FEATURE_FPU off if it's been
      disabled by setup_clear_cpu_cap().
      Signed-off-by: default avatarAndy Lutomirski <luto@kernel.org>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: Fenghua Yu <fenghua.yu@intel.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Quentin Casasnovas <quentin.casasnovas@oracle.com>
      Cc: Sai Praneeth Prakhya <sai.praneeth.prakhya@intel.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: yu-cheng yu <yu-cheng.yu@intel.com>
      Fixes: 4f81cbaf ("x86/fpu: Fix early FPU command-line parsing")
      Signed-off-by: default avatarIngo Molnar <mingo@kernel.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      0e005aa5
    • Andy Lutomirski's avatar
      x86/fpu: Default eagerfpu=on on all CPUs · 1d9af7dc
      Andy Lutomirski authored
      commit 58122bf1 upstream.
      
      We have eager and lazy FPU modes, introduced in:
      
        304bceda ("x86, fpu: use non-lazy fpu restore for processors supporting xsave")
      
      The result is rather messy.  There are two code paths in almost all
      of the FPU code, and only one of them (the eager case) is tested
      frequently, since most kernel developers have new enough hardware
      that we use eagerfpu.
      
      It seems that, on any remotely recent hardware, eagerfpu is a win:
      glibc uses SSE2, so laziness is probably overoptimistic, and, in any
      case, manipulating TS is far slower that saving and restoring the
      full state.  (Stores to CR0.TS are serializing and are poorly
      optimized.)
      
      To try to shake out any latent issues on old hardware, this changes
      the default to eager on all CPUs.  If no performance or functionality
      problems show up, a subsequent patch could remove lazy mode entirely.
      Signed-off-by: default avatarAndy Lutomirski <luto@kernel.org>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: Fenghua Yu <fenghua.yu@intel.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Quentin Casasnovas <quentin.casasnovas@oracle.com>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Sai Praneeth Prakhya <sai.praneeth.prakhya@intel.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: yu-cheng yu <yu-cheng.yu@intel.com>
      Link: http://lkml.kernel.org/r/ac290de61bf08d9cfc2664a4f5080257ffc1075a.1453675014.git.luto@kernel.orgSigned-off-by: default avatarIngo Molnar <mingo@kernel.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      1d9af7dc
    • yu-cheng yu's avatar
      x86/fpu: Disable AVX when eagerfpu is off · 747fd4e6
      yu-cheng yu authored
      commit 394db20c upstream.
      
      When "eagerfpu=off" is given as a command-line input, the kernel
      should disable AVX support.
      
      The Task Switched bit used for lazy context switching does not
      support AVX. If AVX is enabled without eagerfpu context
      switching, one task's AVX state could become corrupted or leak
      to other tasks. This is a bug and has bad security implications.
      
      This only affects systems that have AVX/AVX2/AVX512 and this
      issue will be found only when one actually uses AVX/AVX2/AVX512
      _AND_ does eagerfpu=off.
      
      Reference: Intel Software Developer's Manual Vol. 3A
      
      Sec. 2.5 Control Registers:
      TS Task Switched bit (bit 3 of CR0) -- Allows the saving of the
      x87 FPU/ MMX/SSE/SSE2/SSE3/SSSE3/SSE4 context on a task switch
      to be delayed until an x87 FPU/MMX/SSE/SSE2/SSE3/SSSE3/SSE4
      instruction is actually executed by the new task.
      
      Sec. 13.4.1 Using the TS Flag to Control the Saving of the X87
      FPU and SSE State
      When the TS flag is set, the processor monitors the instruction
      stream for x87 FPU, MMX, SSE instructions. When the processor
      detects one of these instructions, it raises a
      device-not-available exeception (#NM) prior to executing the
      instruction.
      Signed-off-by: default avatarYu-cheng Yu <yu-cheng.yu@intel.com>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: Fenghua Yu <fenghua.yu@intel.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Quentin Casasnovas <quentin.casasnovas@oracle.com>
      Cc: Ravi V. Shankar <ravi.v.shankar@intel.com>
      Cc: Sai Praneeth Prakhya <sai.praneeth.prakhya@intel.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: yu-cheng yu <yu-cheng.yu@intel.com>
      Link: http://lkml.kernel.org/r/1452119094-7252-5-git-send-email-yu-cheng.yu@intel.comSigned-off-by: default avatarIngo Molnar <mingo@kernel.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      747fd4e6
    • yu-cheng yu's avatar
      x86/fpu: Disable MPX when eagerfpu is off · 63b20af8
      yu-cheng yu authored
      commit a5fe93a5 upstream.
      
      This issue is a fallout from the command-line parsing move.
      
      When "eagerfpu=off" is given as a command-line input, the kernel
      should disable MPX support. The decision for turning off MPX was
      made in fpu__init_system_ctx_switch(), which is after the
      selection of the XSAVE format. This patch fixes it by getting
      that decision done earlier in fpu__init_system_xstate().
      Signed-off-by: default avatarYu-cheng Yu <yu-cheng.yu@intel.com>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: Fenghua Yu <fenghua.yu@intel.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Quentin Casasnovas <quentin.casasnovas@oracle.com>
      Cc: Ravi V. Shankar <ravi.v.shankar@intel.com>
      Cc: Sai Praneeth Prakhya <sai.praneeth.prakhya@intel.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: yu-cheng yu <yu-cheng.yu@intel.com>
      Link: http://lkml.kernel.org/r/1452119094-7252-4-git-send-email-yu-cheng.yu@intel.comSigned-off-by: default avatarIngo Molnar <mingo@kernel.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      63b20af8
    • Borislav Petkov's avatar
      x86/cpufeature: Remove unused and seldomly used cpu_has_xx macros · 082efbb0
      Borislav Petkov authored
      commit 362f924b upstream.
      
      Those are stupid and code should use static_cpu_has_safe() or
      boot_cpu_has() instead. Kill the least used and unused ones.
      
      The remaining ones need more careful inspection before a conversion can
      happen. On the TODO.
      Signed-off-by: default avatarBorislav Petkov <bp@suse.de>
      Link: http://lkml.kernel.org/r/1449481182-27541-4-git-send-email-bp@alien8.de
      Cc: David Sterba <dsterba@suse.com>
      Cc: Herbert Xu <herbert@gondor.apana.org.au>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Matt Mackall <mpm@selenic.com>
      Cc: Chris Mason <clm@fb.com>
      Cc: Josef Bacik <jbacik@fb.com>
      Signed-off-by: default avatarThomas Gleixner <tglx@linutronix.de>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      082efbb0
    • Juergen Gross's avatar
    • yu-cheng yu's avatar
      x86/fpu: Fix early FPU command-line parsing · d1e6f6d1
      yu-cheng yu authored
      commit 4f81cbaf upstream.
      
      The function fpu__init_system() is executed before
      parse_early_param(). This causes wrong FPU configuration. This
      patch fixes this issue by parsing boot_command_line in the
      beginning of fpu__init_system().
      
      With all four patches in this series, each parameter disables
      features as the following:
      
      eagerfpu=off: eagerfpu, avx, avx2, avx512, mpx
      no387: fpu
      nofxsr: fxsr, fxsropt, xmm
      noxsave: xsave, xsaveopt, xsaves, xsavec, avx, avx2, avx512,
      mpx, xgetbv1 noxsaveopt: xsaveopt
      noxsaves: xsaves
      Signed-off-by: default avatarYu-cheng Yu <yu-cheng.yu@intel.com>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: Fenghua Yu <fenghua.yu@intel.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Quentin Casasnovas <quentin.casasnovas@oracle.com>
      Cc: Ravi V. Shankar <ravi.v.shankar@intel.com>
      Cc: Sai Praneeth Prakhya <sai.praneeth.prakhya@intel.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: yu-cheng yu <yu-cheng.yu@intel.com>
      Link: http://lkml.kernel.org/r/1452119094-7252-2-git-send-email-yu-cheng.yu@intel.comSigned-off-by: default avatarIngo Molnar <mingo@kernel.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      d1e6f6d1
  2. 13 Jun, 2018 25 commits
    • Greg Kroah-Hartman's avatar
      Linux 4.4.137 · ed90fd0c
      Greg Kroah-Hartman authored
      ed90fd0c
    • Eric Dumazet's avatar
      net: metrics: add proper netlink validation · 7ab4c1a1
      Eric Dumazet authored
      [ Upstream commit 5b5e7a0d ]
      
      Before using nla_get_u32(), better make sure the attribute
      is of the proper size.
      
      Code recently was changed, but bug has been there from beginning
      of git.
      
      BUG: KMSAN: uninit-value in rtnetlink_put_metrics+0x553/0x960 net/core/rtnetlink.c:746
      CPU: 1 PID: 14139 Comm: syz-executor6 Not tainted 4.17.0-rc5+ #103
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
      Call Trace:
       __dump_stack lib/dump_stack.c:77 [inline]
       dump_stack+0x185/0x1d0 lib/dump_stack.c:113
       kmsan_report+0x149/0x260 mm/kmsan/kmsan.c:1084
       __msan_warning_32+0x6e/0xc0 mm/kmsan/kmsan_instr.c:686
       rtnetlink_put_metrics+0x553/0x960 net/core/rtnetlink.c:746
       fib_dump_info+0xc42/0x2190 net/ipv4/fib_semantics.c:1361
       rtmsg_fib+0x65f/0x8c0 net/ipv4/fib_semantics.c:419
       fib_table_insert+0x2314/0x2b50 net/ipv4/fib_trie.c:1287
       inet_rtm_newroute+0x210/0x340 net/ipv4/fib_frontend.c:779
       rtnetlink_rcv_msg+0xa32/0x1560 net/core/rtnetlink.c:4646
       netlink_rcv_skb+0x378/0x600 net/netlink/af_netlink.c:2448
       rtnetlink_rcv+0x50/0x60 net/core/rtnetlink.c:4664
       netlink_unicast_kernel net/netlink/af_netlink.c:1310 [inline]
       netlink_unicast+0x1678/0x1750 net/netlink/af_netlink.c:1336
       netlink_sendmsg+0x104f/0x1350 net/netlink/af_netlink.c:1901
       sock_sendmsg_nosec net/socket.c:629 [inline]
       sock_sendmsg net/socket.c:639 [inline]
       ___sys_sendmsg+0xec0/0x1310 net/socket.c:2117
       __sys_sendmsg net/socket.c:2155 [inline]
       __do_sys_sendmsg net/socket.c:2164 [inline]
       __se_sys_sendmsg net/socket.c:2162 [inline]
       __x64_sys_sendmsg+0x331/0x460 net/socket.c:2162
       do_syscall_64+0x152/0x230 arch/x86/entry/common.c:287
       entry_SYSCALL_64_after_hwframe+0x44/0xa9
      RIP: 0033:0x455a09
      RSP: 002b:00007faae5fd8c68 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
      RAX: ffffffffffffffda RBX: 00007faae5fd96d4 RCX: 0000000000455a09
      RDX: 0000000000000000 RSI: 0000000020000000 RDI: 0000000000000013
      RBP: 000000000072bea0 R08: 0000000000000000 R09: 0000000000000000
      R10: 0000000000000000 R11: 0000000000000246 R12: 00000000ffffffff
      R13: 00000000000005d0 R14: 00000000006fdc20 R15: 0000000000000000
      
      Uninit was stored to memory at:
       kmsan_save_stack_with_flags mm/kmsan/kmsan.c:279 [inline]
       kmsan_save_stack mm/kmsan/kmsan.c:294 [inline]
       kmsan_internal_chain_origin+0x12b/0x210 mm/kmsan/kmsan.c:685
       __msan_chain_origin+0x69/0xc0 mm/kmsan/kmsan_instr.c:529
       fib_convert_metrics net/ipv4/fib_semantics.c:1056 [inline]
       fib_create_info+0x2d46/0x9dc0 net/ipv4/fib_semantics.c:1150
       fib_table_insert+0x3e4/0x2b50 net/ipv4/fib_trie.c:1146
       inet_rtm_newroute+0x210/0x340 net/ipv4/fib_frontend.c:779
       rtnetlink_rcv_msg+0xa32/0x1560 net/core/rtnetlink.c:4646
       netlink_rcv_skb+0x378/0x600 net/netlink/af_netlink.c:2448
       rtnetlink_rcv+0x50/0x60 net/core/rtnetlink.c:4664
       netlink_unicast_kernel net/netlink/af_netlink.c:1310 [inline]
       netlink_unicast+0x1678/0x1750 net/netlink/af_netlink.c:1336
       netlink_sendmsg+0x104f/0x1350 net/netlink/af_netlink.c:1901
       sock_sendmsg_nosec net/socket.c:629 [inline]
       sock_sendmsg net/socket.c:639 [inline]
       ___sys_sendmsg+0xec0/0x1310 net/socket.c:2117
       __sys_sendmsg net/socket.c:2155 [inline]
       __do_sys_sendmsg net/socket.c:2164 [inline]
       __se_sys_sendmsg net/socket.c:2162 [inline]
       __x64_sys_sendmsg+0x331/0x460 net/socket.c:2162
       do_syscall_64+0x152/0x230 arch/x86/entry/common.c:287
       entry_SYSCALL_64_after_hwframe+0x44/0xa9
      Uninit was created at:
       kmsan_save_stack_with_flags mm/kmsan/kmsan.c:279 [inline]
       kmsan_internal_poison_shadow+0xb8/0x1b0 mm/kmsan/kmsan.c:189
       kmsan_kmalloc+0x94/0x100 mm/kmsan/kmsan.c:315
       kmsan_slab_alloc+0x10/0x20 mm/kmsan/kmsan.c:322
       slab_post_alloc_hook mm/slab.h:446 [inline]
       slab_alloc_node mm/slub.c:2753 [inline]
       __kmalloc_node_track_caller+0xb32/0x11b0 mm/slub.c:4395
       __kmalloc_reserve net/core/skbuff.c:138 [inline]
       __alloc_skb+0x2cb/0x9e0 net/core/skbuff.c:206
       alloc_skb include/linux/skbuff.h:988 [inline]
       netlink_alloc_large_skb net/netlink/af_netlink.c:1182 [inline]
       netlink_sendmsg+0x76e/0x1350 net/netlink/af_netlink.c:1876
       sock_sendmsg_nosec net/socket.c:629 [inline]
       sock_sendmsg net/socket.c:639 [inline]
       ___sys_sendmsg+0xec0/0x1310 net/socket.c:2117
       __sys_sendmsg net/socket.c:2155 [inline]
       __do_sys_sendmsg net/socket.c:2164 [inline]
       __se_sys_sendmsg net/socket.c:2162 [inline]
       __x64_sys_sendmsg+0x331/0x460 net/socket.c:2162
       do_syscall_64+0x152/0x230 arch/x86/entry/common.c:287
       entry_SYSCALL_64_after_hwframe+0x44/0xa9
      
      Fixes: a919525a ("net: Move fib_convert_metrics to metrics file")
      Fixes: 1da177e4 ("Linux-2.6.12-rc2")
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Reported-by: default avatarsyzbot <syzkaller@googlegroups.com>
      Cc: David Ahern <dsahern@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      7ab4c1a1
    • Florian Fainelli's avatar
      net: phy: broadcom: Fix bcm_write_exp() · e9e03708
      Florian Fainelli authored
      [ Upstream commit 79fb218d ]
      
      On newer PHYs, we need to select the expansion register to write with
      setting bits [11:8] to 0xf. This was done correctly by bcm7xxx.c prior
      to being migrated to generic code under bcm-phy-lib.c which
      unfortunately used the older implementation from the BCM54xx days.
      
      Fix this by creating an inline stub: bcm_write_exp_sel() which adds the
      correct value (MII_BCM54XX_EXP_SEL_ER) and update both the Cygnus PHY
      and BCM7xxx PHY drivers which require setting these bits.
      
      broadcom.c is unchanged because some PHYs even use a different selector
      method, so let them specify it directly (e.g: SerDes secondary selector).
      
      Fixes: a1cba561 ("net: phy: Add Broadcom phy library for common interfaces")
      Signed-off-by: default avatarFlorian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      e9e03708
    • Eric Dumazet's avatar
      rtnetlink: validate attributes in do_setlink() · d80cbf9f
      Eric Dumazet authored
      [ Upstream commit 644c7eeb ]
      
      It seems that rtnl_group_changelink() can call do_setlink
      while a prior call to validate_linkmsg(dev = NULL, ...) could
      not validate IFLA_ADDRESS / IFLA_BROADCAST
      
      Make sure do_setlink() calls validate_linkmsg() instead
      of letting its callers having this responsibility.
      
      With help from Dmitry Vyukov, thanks a lot !
      
      BUG: KMSAN: uninit-value in is_valid_ether_addr include/linux/etherdevice.h:199 [inline]
      BUG: KMSAN: uninit-value in eth_prepare_mac_addr_change net/ethernet/eth.c:275 [inline]
      BUG: KMSAN: uninit-value in eth_mac_addr+0x203/0x2b0 net/ethernet/eth.c:308
      CPU: 1 PID: 8695 Comm: syz-executor3 Not tainted 4.17.0-rc5+ #103
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
      Call Trace:
       __dump_stack lib/dump_stack.c:77 [inline]
       dump_stack+0x185/0x1d0 lib/dump_stack.c:113
       kmsan_report+0x149/0x260 mm/kmsan/kmsan.c:1084
       __msan_warning_32+0x6e/0xc0 mm/kmsan/kmsan_instr.c:686
       is_valid_ether_addr include/linux/etherdevice.h:199 [inline]
       eth_prepare_mac_addr_change net/ethernet/eth.c:275 [inline]
       eth_mac_addr+0x203/0x2b0 net/ethernet/eth.c:308
       dev_set_mac_address+0x261/0x530 net/core/dev.c:7157
       do_setlink+0xbc3/0x5fc0 net/core/rtnetlink.c:2317
       rtnl_group_changelink net/core/rtnetlink.c:2824 [inline]
       rtnl_newlink+0x1fe9/0x37a0 net/core/rtnetlink.c:2976
       rtnetlink_rcv_msg+0xa32/0x1560 net/core/rtnetlink.c:4646
       netlink_rcv_skb+0x378/0x600 net/netlink/af_netlink.c:2448
       rtnetlink_rcv+0x50/0x60 net/core/rtnetlink.c:4664
       netlink_unicast_kernel net/netlink/af_netlink.c:1310 [inline]
       netlink_unicast+0x1678/0x1750 net/netlink/af_netlink.c:1336
       netlink_sendmsg+0x104f/0x1350 net/netlink/af_netlink.c:1901
       sock_sendmsg_nosec net/socket.c:629 [inline]
       sock_sendmsg net/socket.c:639 [inline]
       ___sys_sendmsg+0xec0/0x1310 net/socket.c:2117
       __sys_sendmsg net/socket.c:2155 [inline]
       __do_sys_sendmsg net/socket.c:2164 [inline]
       __se_sys_sendmsg net/socket.c:2162 [inline]
       __x64_sys_sendmsg+0x331/0x460 net/socket.c:2162
       do_syscall_64+0x152/0x230 arch/x86/entry/common.c:287
       entry_SYSCALL_64_after_hwframe+0x44/0xa9
      RIP: 0033:0x455a09
      RSP: 002b:00007fc07480ec68 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
      RAX: ffffffffffffffda RBX: 00007fc07480f6d4 RCX: 0000000000455a09
      RDX: 0000000000000000 RSI: 00000000200003c0 RDI: 0000000000000014
      RBP: 000000000072bea0 R08: 0000000000000000 R09: 0000000000000000
      R10: 0000000000000000 R11: 0000000000000246 R12: 00000000ffffffff
      R13: 00000000000005d0 R14: 00000000006fdc20 R15: 0000000000000000
      
      Uninit was stored to memory at:
       kmsan_save_stack_with_flags mm/kmsan/kmsan.c:279 [inline]
       kmsan_save_stack mm/kmsan/kmsan.c:294 [inline]
       kmsan_internal_chain_origin+0x12b/0x210 mm/kmsan/kmsan.c:685
       kmsan_memcpy_origins+0x11d/0x170 mm/kmsan/kmsan.c:527
       __msan_memcpy+0x109/0x160 mm/kmsan/kmsan_instr.c:478
       do_setlink+0xb84/0x5fc0 net/core/rtnetlink.c:2315
       rtnl_group_changelink net/core/rtnetlink.c:2824 [inline]
       rtnl_newlink+0x1fe9/0x37a0 net/core/rtnetlink.c:2976
       rtnetlink_rcv_msg+0xa32/0x1560 net/core/rtnetlink.c:4646
       netlink_rcv_skb+0x378/0x600 net/netlink/af_netlink.c:2448
       rtnetlink_rcv+0x50/0x60 net/core/rtnetlink.c:4664
       netlink_unicast_kernel net/netlink/af_netlink.c:1310 [inline]
       netlink_unicast+0x1678/0x1750 net/netlink/af_netlink.c:1336
       netlink_sendmsg+0x104f/0x1350 net/netlink/af_netlink.c:1901
       sock_sendmsg_nosec net/socket.c:629 [inline]
       sock_sendmsg net/socket.c:639 [inline]
       ___sys_sendmsg+0xec0/0x1310 net/socket.c:2117
       __sys_sendmsg net/socket.c:2155 [inline]
       __do_sys_sendmsg net/socket.c:2164 [inline]
       __se_sys_sendmsg net/socket.c:2162 [inline]
       __x64_sys_sendmsg+0x331/0x460 net/socket.c:2162
       do_syscall_64+0x152/0x230 arch/x86/entry/common.c:287
       entry_SYSCALL_64_after_hwframe+0x44/0xa9
      Uninit was created at:
       kmsan_save_stack_with_flags mm/kmsan/kmsan.c:279 [inline]
       kmsan_internal_poison_shadow+0xb8/0x1b0 mm/kmsan/kmsan.c:189
       kmsan_kmalloc+0x94/0x100 mm/kmsan/kmsan.c:315
       kmsan_slab_alloc+0x10/0x20 mm/kmsan/kmsan.c:322
       slab_post_alloc_hook mm/slab.h:446 [inline]
       slab_alloc_node mm/slub.c:2753 [inline]
       __kmalloc_node_track_caller+0xb32/0x11b0 mm/slub.c:4395
       __kmalloc_reserve net/core/skbuff.c:138 [inline]
       __alloc_skb+0x2cb/0x9e0 net/core/skbuff.c:206
       alloc_skb include/linux/skbuff.h:988 [inline]
       netlink_alloc_large_skb net/netlink/af_netlink.c:1182 [inline]
       netlink_sendmsg+0x76e/0x1350 net/netlink/af_netlink.c:1876
       sock_sendmsg_nosec net/socket.c:629 [inline]
       sock_sendmsg net/socket.c:639 [inline]
       ___sys_sendmsg+0xec0/0x1310 net/socket.c:2117
       __sys_sendmsg net/socket.c:2155 [inline]
       __do_sys_sendmsg net/socket.c:2164 [inline]
       __se_sys_sendmsg net/socket.c:2162 [inline]
       __x64_sys_sendmsg+0x331/0x460 net/socket.c:2162
       do_syscall_64+0x152/0x230 arch/x86/entry/common.c:287
       entry_SYSCALL_64_after_hwframe+0x44/0xa9
      
      Fixes: e7ed828f ("netlink: support setting devgroup parameters")
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Reported-by: default avatarsyzbot <syzkaller@googlegroups.com>
      Cc: Dmitry Vyukov <dvyukov@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      d80cbf9f
    • Dan Carpenter's avatar
      team: use netdev_features_t instead of u32 · d3078b97
      Dan Carpenter authored
      [ Upstream commit 25ea6654 ]
      
      This code was introduced in 2011 around the same time that we made
      netdev_features_t a u64 type.  These days a u32 is not big enough to
      hold all the potential features.
      Signed-off-by: default avatarDan Carpenter <dan.carpenter@oracle.com>
      Acked-by: default avatarJiri Pirko <jiri@mellanox.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      d3078b97
    • Jack Morgenstein's avatar
      net/mlx4: Fix irq-unsafe spinlock usage · d22856ee
      Jack Morgenstein authored
      [ Upstream commit d546b67c ]
      
      spin_lock/unlock was used instead of spin_un/lock_irq
      in a procedure used in process space, on a spinlock
      which can be grabbed in an interrupt.
      
      This caused the stack trace below to be displayed (on kernel
      4.17.0-rc1 compiled with Lock Debugging enabled):
      
      [  154.661474] WARNING: SOFTIRQ-safe -> SOFTIRQ-unsafe lock order detected
      [  154.668909] 4.17.0-rc1-rdma_rc_mlx+ #3 Tainted: G          I
      [  154.675856] -----------------------------------------------------
      [  154.682706] modprobe/10159 [HC0[0]:SC0[0]:HE0:SE1] is trying to acquire:
      [  154.690254] 00000000f3b0e495 (&(&qp_table->lock)->rlock){+.+.}, at: mlx4_qp_remove+0x20/0x50 [mlx4_core]
      [  154.700927]
      and this task is already holding:
      [  154.707461] 0000000094373b5d (&(&cq->lock)->rlock/1){....}, at: destroy_qp_common+0x111/0x560 [mlx4_ib]
      [  154.718028] which would create a new lock dependency:
      [  154.723705]  (&(&cq->lock)->rlock/1){....} -> (&(&qp_table->lock)->rlock){+.+.}
      [  154.731922]
      but this new dependency connects a SOFTIRQ-irq-safe lock:
      [  154.740798]  (&(&cq->lock)->rlock){..-.}
      [  154.740800]
      ... which became SOFTIRQ-irq-safe at:
      [  154.752163]   _raw_spin_lock_irqsave+0x3e/0x50
      [  154.757163]   mlx4_ib_poll_cq+0x36/0x900 [mlx4_ib]
      [  154.762554]   ipoib_tx_poll+0x4a/0xf0 [ib_ipoib]
      ...
      to a SOFTIRQ-irq-unsafe lock:
      [  154.815603]  (&(&qp_table->lock)->rlock){+.+.}
      [  154.815604]
      ... which became SOFTIRQ-irq-unsafe at:
      [  154.827718] ...
      [  154.827720]   _raw_spin_lock+0x35/0x50
      [  154.833912]   mlx4_qp_lookup+0x1e/0x50 [mlx4_core]
      [  154.839302]   mlx4_flow_attach+0x3f/0x3d0 [mlx4_core]
      
      Since mlx4_qp_lookup() is called only in process space, we can
      simply replace the spin_un/lock calls with spin_un/lock_irq calls.
      
      Fixes: 6dc06c08 ("net/mlx4: Fix the check in attaching steering rules")
      Signed-off-by: default avatarJack Morgenstein <jackm@dev.mellanox.co.il>
      Signed-off-by: default avatarTariq Toukan <tariqt@mellanox.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      d22856ee
    • Shahed Shaikh's avatar
      qed: Fix mask for physical address in ILT entry · 4119db11
      Shahed Shaikh authored
      [ Upstream commit fdd13dd3 ]
      
      ILT entry requires 12 bit right shifted physical address.
      Existing mask for ILT entry of physical address i.e.
      ILT_ENTRY_PHY_ADDR_MASK is not sufficient to handle 64bit
      address because upper 8 bits of 64 bit address were getting
      masked which resulted in completer abort error on
      PCIe bus due to invalid address.
      
      Fix that mask to handle 64bit physical address.
      
      Fixes: fe56b9e6 ("qed: Add module with basic common support")
      Signed-off-by: default avatarShahed Shaikh <shahed.shaikh@cavium.com>
      Signed-off-by: default avatarAriel Elior <ariel.elior@cavium.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      4119db11
    • Willem de Bruijn's avatar
      packet: fix reserve calculation · 7666b709
      Willem de Bruijn authored
      [ Upstream commit 9aad13b0 ]
      
      Commit b84bbaf7 ("packet: in packet_snd start writing at link
      layer allocation") ensures that packet_snd always starts writing
      the link layer header in reserved headroom allocated for this
      purpose.
      
      This is needed because packets may be shorter than hard_header_len,
      in which case the space up to hard_header_len may be zeroed. But
      that necessary padding is not accounted for in skb->len.
      
      The fix, however, is buggy. It calls skb_push, which grows skb->len
      when moving skb->data back. But in this case packet length should not
      change.
      
      Instead, call skb_reserve, which moves both skb->data and skb->tail
      back, without changing length.
      
      Fixes: b84bbaf7 ("packet: in packet_snd start writing at link layer allocation")
      Reported-by: default avatarTariq Toukan <tariqt@mellanox.com>
      Signed-off-by: default avatarWillem de Bruijn <willemb@google.com>
      Acked-by: default avatarSoheil Hassas Yeganeh <soheil@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      7666b709
    • Daniele Palmas's avatar
      net: usb: cdc_mbim: add flag FLAG_SEND_ZLP · 163258f2
      Daniele Palmas authored
      [ Upstream commit 9f7c7283 ]
      
      Testing Telit LM940 with ICMP packets > 14552 bytes revealed that
      the modem needs FLAG_SEND_ZLP to properly work, otherwise the cdc
      mbim data interface won't be anymore responsive.
      Signed-off-by: default avatarDaniele Palmas <dnlplm@gmail.com>
      Acked-by: default avatarBjørn Mork <bjorn@mork.no>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      163258f2
    • Eric Dumazet's avatar
      net/packet: refine check for priv area size · 2f59e1e8
      Eric Dumazet authored
      [ Upstream commit eb73190f ]
      
      syzbot was able to trick af_packet again [1]
      
      Various commits tried to address the problem in the past,
      but failed to take into account V3 header size.
      
      [1]
      
      tpacket_rcv: packet too big, clamped from 72 to 4294967224. macoff=96
      BUG: KASAN: use-after-free in prb_run_all_ft_ops net/packet/af_packet.c:1016 [inline]
      BUG: KASAN: use-after-free in prb_fill_curr_block.isra.59+0x4e5/0x5c0 net/packet/af_packet.c:1039
      Write of size 2 at addr ffff8801cb62000e by task kworker/1:2/2106
      
      CPU: 1 PID: 2106 Comm: kworker/1:2 Not tainted 4.17.0-rc7+ #77
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
      Workqueue: ipv6_addrconf addrconf_dad_work
      Call Trace:
       __dump_stack lib/dump_stack.c:77 [inline]
       dump_stack+0x1b9/0x294 lib/dump_stack.c:113
       print_address_description+0x6c/0x20b mm/kasan/report.c:256
       kasan_report_error mm/kasan/report.c:354 [inline]
       kasan_report.cold.7+0x242/0x2fe mm/kasan/report.c:412
       __asan_report_store2_noabort+0x17/0x20 mm/kasan/report.c:436
       prb_run_all_ft_ops net/packet/af_packet.c:1016 [inline]
       prb_fill_curr_block.isra.59+0x4e5/0x5c0 net/packet/af_packet.c:1039
       __packet_lookup_frame_in_block net/packet/af_packet.c:1094 [inline]
       packet_current_rx_frame net/packet/af_packet.c:1117 [inline]
       tpacket_rcv+0x1866/0x3340 net/packet/af_packet.c:2282
       dev_queue_xmit_nit+0x891/0xb90 net/core/dev.c:2018
       xmit_one net/core/dev.c:3049 [inline]
       dev_hard_start_xmit+0x16b/0xc10 net/core/dev.c:3069
       __dev_queue_xmit+0x2724/0x34c0 net/core/dev.c:3584
       dev_queue_xmit+0x17/0x20 net/core/dev.c:3617
       neigh_resolve_output+0x679/0xad0 net/core/neighbour.c:1358
       neigh_output include/net/neighbour.h:482 [inline]
       ip6_finish_output2+0xc9c/0x2810 net/ipv6/ip6_output.c:120
       ip6_finish_output+0x5fe/0xbc0 net/ipv6/ip6_output.c:154
       NF_HOOK_COND include/linux/netfilter.h:277 [inline]
       ip6_output+0x227/0x9b0 net/ipv6/ip6_output.c:171
       dst_output include/net/dst.h:444 [inline]
       NF_HOOK include/linux/netfilter.h:288 [inline]
       ndisc_send_skb+0x100d/0x1570 net/ipv6/ndisc.c:491
       ndisc_send_ns+0x3c1/0x8d0 net/ipv6/ndisc.c:633
       addrconf_dad_work+0xbef/0x1340 net/ipv6/addrconf.c:4033
       process_one_work+0xc1e/0x1b50 kernel/workqueue.c:2145
       worker_thread+0x1cc/0x1440 kernel/workqueue.c:2279
       kthread+0x345/0x410 kernel/kthread.c:240
       ret_from_fork+0x3a/0x50 arch/x86/entry/entry_64.S:412
      
      The buggy address belongs to the page:
      page:ffffea00072d8800 count:0 mapcount:-127 mapping:0000000000000000 index:0xffff8801cb620e80
      flags: 0x2fffc0000000000()
      raw: 02fffc0000000000 0000000000000000 ffff8801cb620e80 00000000ffffff80
      raw: ffffea00072e3820 ffffea0007132d20 0000000000000002 0000000000000000
      page dumped because: kasan: bad access detected
      
      Memory state around the buggy address:
       ffff8801cb61ff00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
       ffff8801cb61ff80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
      >ffff8801cb620000: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
                            ^
       ffff8801cb620080: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
       ffff8801cb620100: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
      
      Fixes: 2b6867c2 ("net/packet: fix overflow in check for priv area size")
      Fixes: dc808110 ("packet: handle too big packets for PACKET_V3")
      Fixes: f6fb8f10 ("af-packet: TPACKET_V3 flexible buffer implementation.")
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Reported-by: default avatarsyzbot <syzkaller@googlegroups.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      2f59e1e8
    • Cong Wang's avatar
      netdev-FAQ: clarify DaveM's position for stable backports · 55a2ed39
      Cong Wang authored
      [ Upstream commit 75d4e704 ]
      
      Per discussion with David at netconf 2018, let's clarify
      DaveM's position of handling stable backports in netdev-FAQ.
      
      This is important for people relying on upstream -stable
      releases.
      
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: default avatarCong Wang <xiyou.wangcong@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      55a2ed39
    • Wenwen Wang's avatar
      isdn: eicon: fix a missing-check bug · 4c5ea1dd
      Wenwen Wang authored
      [ Upstream commit 6009d1fe ]
      
      In divasmain.c, the function divas_write() firstly invokes the function
      diva_xdi_open_adapter() to open the adapter that matches with the adapter
      number provided by the user, and then invokes the function diva_xdi_write()
      to perform the write operation using the matched adapter. The two functions
      diva_xdi_open_adapter() and diva_xdi_write() are located in diva.c.
      
      In diva_xdi_open_adapter(), the user command is copied to the object 'msg'
      from the userspace pointer 'src' through the function pointer 'cp_fn',
      which eventually calls copy_from_user() to do the copy. Then, the adapter
      number 'msg.adapter' is used to find out a matched adapter from the
      'adapter_queue'. A matched adapter will be returned if it is found.
      Otherwise, NULL is returned to indicate the failure of the verification on
      the adapter number.
      
      As mentioned above, if a matched adapter is returned, the function
      diva_xdi_write() is invoked to perform the write operation. In this
      function, the user command is copied once again from the userspace pointer
      'src', which is the same as the 'src' pointer in diva_xdi_open_adapter() as
      both of them are from the 'buf' pointer in divas_write(). Similarly, the
      copy is achieved through the function pointer 'cp_fn', which finally calls
      copy_from_user(). After the successful copy, the corresponding command
      processing handler of the matched adapter is invoked to perform the write
      operation.
      
      It is obvious that there are two copies here from userspace, one is in
      diva_xdi_open_adapter(), and one is in diva_xdi_write(). Plus, both of
      these two copies share the same source userspace pointer, i.e., the 'buf'
      pointer in divas_write(). Given that a malicious userspace process can race
      to change the content pointed by the 'buf' pointer, this can pose potential
      security issues. For example, in the first copy, the user provides a valid
      adapter number to pass the verification process and a valid adapter can be
      found. Then the user can modify the adapter number to an invalid number.
      This way, the user can bypass the verification process of the adapter
      number and inject inconsistent data.
      
      This patch reuses the data copied in
      diva_xdi_open_adapter() and passes it to diva_xdi_write(). This way, the
      above issues can be avoided.
      Signed-off-by: default avatarWenwen Wang <wang6495@umn.edu>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      4c5ea1dd
    • Willem de Bruijn's avatar
      ipv4: remove warning in ip_recv_error · b3c91891
      Willem de Bruijn authored
      [ Upstream commit 730c54d5 ]
      
      A precondition check in ip_recv_error triggered on an otherwise benign
      race. Remove the warning.
      
      The warning triggers when passing an ipv6 socket to this ipv4 error
      handling function. RaceFuzzer was able to trigger it due to a race
      in setsockopt IPV6_ADDRFORM.
      
        ---
        CPU0
          do_ipv6_setsockopt
            sk->sk_socket->ops = &inet_dgram_ops;
      
        ---
        CPU1
          sk->sk_prot->recvmsg
            udp_recvmsg
              ip_recv_error
                WARN_ON_ONCE(sk->sk_family == AF_INET6);
      
        ---
        CPU0
          do_ipv6_setsockopt
            sk->sk_family = PF_INET;
      
      This socket option converts a v6 socket that is connected to a v4 peer
      to an v4 socket. It updates the socket on the fly, changing fields in
      sk as well as other structs. This is inherently non-atomic. It races
      with the lockless udp_recvmsg path.
      
      No other code makes an assumption that these fields are updated
      atomically. It is benign here, too, as ip_recv_error cares only about
      the protocol of the skbs enqueued on the error queue, for which
      sk_family is not a precise predictor (thanks to another isue with
      IPV6_ADDRFORM).
      
      Link: http://lkml.kernel.org/r/20180518120826.GA19515@dragonet.kaist.ac.kr
      Fixes: 7ce875e5 ("ipv4: warn once on passing AF_INET6 socket to ip_recv_error")
      Reported-by: default avatarDaeRyong Jeong <threeearcat@gmail.com>
      Suggested-by: default avatarEric Dumazet <edumazet@google.com>
      Signed-off-by: default avatarWillem de Bruijn <willemb@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      b3c91891
    • Sabrina Dubroca's avatar
      ip6mr: only set ip6mr_table from setsockopt when ip6mr_new_table succeeds · 53075e7a
      Sabrina Dubroca authored
      [ Upstream commit 848235ed ]
      
      Currently, raw6_sk(sk)->ip6mr_table is set unconditionally during
      ip6_mroute_setsockopt(MRT6_TABLE). A subsequent attempt at the same
      setsockopt will fail with -ENOENT, since we haven't actually created
      that table.
      
      A similar fix for ipv4 was included in commit 5e1859fb ("ipv4: ipmr:
      various fixes and cleanups").
      
      Fixes: d1db275d ("ipv6: ip6mr: support multiple tables")
      Signed-off-by: default avatarSabrina Dubroca <sd@queasysnail.net>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      53075e7a
    • Govindarajulu Varadarajan's avatar
      enic: set DMA mask to 47 bit · 489f1f04
      Govindarajulu Varadarajan authored
      [ Upstream commit 322eaa06 ]
      
      In commit 624dbf55 ("driver/net: enic: Try DMA 64 first, then
      failover to DMA") DMA mask was changed from 40 bits to 64 bits.
      Hardware actually supports only 47 bits.
      
      Fixes: 624dbf55 ("driver/net: enic: Try DMA 64 first, then failover to DMA")
      Signed-off-by: default avatarGovindarajulu Varadarajan <gvaradar@cisco.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      489f1f04
    • Alexey Kodanev's avatar
      dccp: don't free ccid2_hc_tx_sock struct in dccp_disconnect() · 44f4aec0
      Alexey Kodanev authored
      [ Upstream commit 2677d206 ]
      
      Syzbot reported the use-after-free in timer_is_static_object() [1].
      
      This can happen because the structure for the rto timer (ccid2_hc_tx_sock)
      is removed in dccp_disconnect(), and ccid2_hc_tx_rto_expire() can be
      called after that.
      
      The report [1] is similar to the one in commit 120e9dab ("dccp:
      defer ccid_hc_tx_delete() at dismantle time"). And the fix is the same,
      delay freeing ccid2_hc_tx_sock structure, so that it is freed in
      dccp_sk_destruct().
      
      [1]
      
      ==================================================================
      BUG: KASAN: use-after-free in timer_is_static_object+0x80/0x90
      kernel/time/timer.c:607
      Read of size 8 at addr ffff8801bebb5118 by task syz-executor2/25299
      
      CPU: 1 PID: 25299 Comm: syz-executor2 Not tainted 4.17.0-rc5+ #54
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
      Google 01/01/2011
      Call Trace:
        <IRQ>
        __dump_stack lib/dump_stack.c:77 [inline]
        dump_stack+0x1b9/0x294 lib/dump_stack.c:113
        print_address_description+0x6c/0x20b mm/kasan/report.c:256
        kasan_report_error mm/kasan/report.c:354 [inline]
        kasan_report.cold.7+0x242/0x2fe mm/kasan/report.c:412
        __asan_report_load8_noabort+0x14/0x20 mm/kasan/report.c:433
        timer_is_static_object+0x80/0x90 kernel/time/timer.c:607
        debug_object_activate+0x2d9/0x670 lib/debugobjects.c:508
        debug_timer_activate kernel/time/timer.c:709 [inline]
        debug_activate kernel/time/timer.c:764 [inline]
        __mod_timer kernel/time/timer.c:1041 [inline]
        mod_timer+0x4d3/0x13b0 kernel/time/timer.c:1102
        sk_reset_timer+0x22/0x60 net/core/sock.c:2742
        ccid2_hc_tx_rto_expire+0x587/0x680 net/dccp/ccids/ccid2.c:147
        call_timer_fn+0x230/0x940 kernel/time/timer.c:1326
        expire_timers kernel/time/timer.c:1363 [inline]
        __run_timers+0x79e/0xc50 kernel/time/timer.c:1666
        run_timer_softirq+0x4c/0x70 kernel/time/timer.c:1692
        __do_softirq+0x2e0/0xaf5 kernel/softirq.c:285
        invoke_softirq kernel/softirq.c:365 [inline]
        irq_exit+0x1d1/0x200 kernel/softirq.c:405
        exiting_irq arch/x86/include/asm/apic.h:525 [inline]
        smp_apic_timer_interrupt+0x17e/0x710 arch/x86/kernel/apic/apic.c:1052
        apic_timer_interrupt+0xf/0x20 arch/x86/entry/entry_64.S:863
        </IRQ>
      ...
      Allocated by task 25374:
        save_stack+0x43/0xd0 mm/kasan/kasan.c:448
        set_track mm/kasan/kasan.c:460 [inline]
        kasan_kmalloc+0xc4/0xe0 mm/kasan/kasan.c:553
        kasan_slab_alloc+0x12/0x20 mm/kasan/kasan.c:490
        kmem_cache_alloc+0x12e/0x760 mm/slab.c:3554
        ccid_new+0x25b/0x3e0 net/dccp/ccid.c:151
        dccp_hdlr_ccid+0x27/0x150 net/dccp/feat.c:44
        __dccp_feat_activate+0x184/0x270 net/dccp/feat.c:344
        dccp_feat_activate_values+0x3a7/0x819 net/dccp/feat.c:1538
        dccp_create_openreq_child+0x472/0x610 net/dccp/minisocks.c:128
        dccp_v4_request_recv_sock+0x12c/0xca0 net/dccp/ipv4.c:408
        dccp_v6_request_recv_sock+0x125d/0x1f10 net/dccp/ipv6.c:415
        dccp_check_req+0x455/0x6a0 net/dccp/minisocks.c:197
        dccp_v4_rcv+0x7b8/0x1f3f net/dccp/ipv4.c:841
        ip_local_deliver_finish+0x2e3/0xd80 net/ipv4/ip_input.c:215
        NF_HOOK include/linux/netfilter.h:288 [inline]
        ip_local_deliver+0x1e1/0x720 net/ipv4/ip_input.c:256
        dst_input include/net/dst.h:450 [inline]
        ip_rcv_finish+0x81b/0x2200 net/ipv4/ip_input.c:396
        NF_HOOK include/linux/netfilter.h:288 [inline]
        ip_rcv+0xb70/0x143d net/ipv4/ip_input.c:492
        __netif_receive_skb_core+0x26f5/0x3630 net/core/dev.c:4592
        __netif_receive_skb+0x2c/0x1e0 net/core/dev.c:4657
        process_backlog+0x219/0x760 net/core/dev.c:5337
        napi_poll net/core/dev.c:5735 [inline]
        net_rx_action+0x7b7/0x1930 net/core/dev.c:5801
        __do_softirq+0x2e0/0xaf5 kernel/softirq.c:285
      
      Freed by task 25374:
        save_stack+0x43/0xd0 mm/kasan/kasan.c:448
        set_track mm/kasan/kasan.c:460 [inline]
        __kasan_slab_free+0x11a/0x170 mm/kasan/kasan.c:521
        kasan_slab_free+0xe/0x10 mm/kasan/kasan.c:528
        __cache_free mm/slab.c:3498 [inline]
        kmem_cache_free+0x86/0x2d0 mm/slab.c:3756
        ccid_hc_tx_delete+0xc3/0x100 net/dccp/ccid.c:190
        dccp_disconnect+0x130/0xc66 net/dccp/proto.c:286
        dccp_close+0x3bc/0xe60 net/dccp/proto.c:1045
        inet_release+0x104/0x1f0 net/ipv4/af_inet.c:427
        inet6_release+0x50/0x70 net/ipv6/af_inet6.c:460
        sock_release+0x96/0x1b0 net/socket.c:594
        sock_close+0x16/0x20 net/socket.c:1149
        __fput+0x34d/0x890 fs/file_table.c:209
        ____fput+0x15/0x20 fs/file_table.c:243
        task_work_run+0x1e4/0x290 kernel/task_work.c:113
        tracehook_notify_resume include/linux/tracehook.h:191 [inline]
        exit_to_usermode_loop+0x2bd/0x310 arch/x86/entry/common.c:166
        prepare_exit_to_usermode arch/x86/entry/common.c:196 [inline]
        syscall_return_slowpath arch/x86/entry/common.c:265 [inline]
        do_syscall_64+0x6ac/0x800 arch/x86/entry/common.c:290
        entry_SYSCALL_64_after_hwframe+0x49/0xbe
      
      The buggy address belongs to the object at ffff8801bebb4cc0
        which belongs to the cache ccid2_hc_tx_sock of size 1240
      The buggy address is located 1112 bytes inside of
        1240-byte region [ffff8801bebb4cc0, ffff8801bebb5198)
      The buggy address belongs to the page:
      page:ffffea0006faed00 count:1 mapcount:0 mapping:ffff8801bebb41c0
      index:0xffff8801bebb5240 compound_mapcount: 0
      flags: 0x2fffc0000008100(slab|head)
      raw: 02fffc0000008100 ffff8801bebb41c0 ffff8801bebb5240 0000000100000003
      raw: ffff8801cdba3138 ffffea0007634120 ffff8801cdbaab40 0000000000000000
      page dumped because: kasan: bad access detected
      ...
      ==================================================================
      
      Reported-by: syzbot+5d47e9ec91a6f15dbd6f@syzkaller.appspotmail.com
      Signed-off-by: default avatarAlexey Kodanev <alexey.kodanev@oracle.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      44f4aec0
    • Julia Lawall's avatar
      bnx2x: use the right constant · d6494acb
      Julia Lawall authored
      [ Upstream commit dd612f18 ]
      
      Nearby code that also tests port suggests that the P0 constant should be
      used when port is zero.
      
      The semantic match that finds this problem is as follows:
      (http://coccinelle.lip6.fr/)
      
      // <smpl>
      @@
      expression e,e1;
      @@
      
      * e ? e1 : e1
      // </smpl>
      
      Fixes: 6c3218c6 ("bnx2x: Adjust ETS to 578xx")
      Signed-off-by: default avatarJulia Lawall <Julia.Lawall@lip6.fr>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      d6494acb
    • Stefan Wahren's avatar
      brcmfmac: Fix check for ISO3166 code · a1b993e1
      Stefan Wahren authored
      commit 9b9322db upstream.
      
      The commit "regulatory: add NUL to request alpha2" increases the length of
      alpha2 to 3. This causes a regression on brcmfmac, because
      brcmf_cfg80211_reg_notifier() expect valid ISO3166 codes in the complete
      array. So fix this accordingly.
      
      Fixes: 657308f7 ("regulatory: add NUL to request alpha2")
      Signed-off-by: default avatarStefan Wahren <stefan.wahren@i2se.com>
      Acked-by: default avatarFranky Lin <franky.lin@broadcom.com>
      Signed-off-by: default avatarKalle Valo <kvalo@codeaurora.org>
      [bwh: Backported to 4.4: adjust filename]
      Signed-off-by: default avatarBen Hutchings <ben.hutchings@codethink.co.uk>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      a1b993e1
    • Dave Airlie's avatar
      drm: set FMODE_UNSIGNED_OFFSET for drm files · 12958d0f
      Dave Airlie authored
      commit 76ef6b28 upstream.
      
      Since we have the ttm and gem vma managers using a subset
      of the file address space for objects, and these start at
      0x100000000 they will overflow the new mmap checks.
      
      I've checked all the mmap routines I could see for any
      bad behaviour but overall most people use GEM/TTM VMA
      managers even the legacy drivers have a hashtable.
      
      Reported-and-Tested-by: Arthur Marsh (amarsh04 on #radeon)
      Fixes: be83bbf8 (mmap: introduce sane default mmap limits)
      Signed-off-by: default avatarDave Airlie <airlied@redhat.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      12958d0f
    • Amir Goldstein's avatar
      xfs: fix incorrect log_flushed on fsync · 66824bdf
      Amir Goldstein authored
      commit 47c7d0b1 upstream.
      
      When calling into _xfs_log_force{,_lsn}() with a pointer
      to log_flushed variable, log_flushed will be set to 1 if:
      1. xlog_sync() is called to flush the active log buffer
      AND/OR
      2. xlog_wait() is called to wait on a syncing log buffers
      
      xfs_file_fsync() checks the value of log_flushed after
      _xfs_log_force_lsn() call to optimize away an explicit
      PREFLUSH request to the data block device after writing
      out all the file's pages to disk.
      
      This optimization is incorrect in the following sequence of events:
      
       Task A                    Task B
       -------------------------------------------------------
       xfs_file_fsync()
         _xfs_log_force_lsn()
           xlog_sync()
              [submit PREFLUSH]
                                 xfs_file_fsync()
                                   file_write_and_wait_range()
                                     [submit WRITE X]
                                     [endio  WRITE X]
                                   _xfs_log_force_lsn()
                                     xlog_wait()
              [endio  PREFLUSH]
      
      The write X is not guarantied to be on persistent storage
      when PREFLUSH request in completed, because write A was submitted
      after the PREFLUSH request, but xfs_file_fsync() of task A will
      be notified of log_flushed=1 and will skip explicit flush.
      
      If the system crashes after fsync of task A, write X may not be
      present on disk after reboot.
      
      This bug was discovered and demonstrated using Josef Bacik's
      dm-log-writes target, which can be used to record block io operations
      and then replay a subset of these operations onto the target device.
      The test goes something like this:
      - Use fsx to execute ops of a file and record ops on log device
      - Every now and then fsync the file, store md5 of file and mark
        the location in the log
      - Then replay log onto device for each mark, mount fs and compare
        md5 of file to stored value
      
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: Josef Bacik <jbacik@fb.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: default avatarAmir Goldstein <amir73il@gmail.com>
      Reviewed-by: default avatarDarrick J. Wong <darrick.wong@oracle.com>
      Signed-off-by: default avatarDarrick J. Wong <darrick.wong@oracle.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      66824bdf
    • Nathan Chancellor's avatar
      kconfig: Avoid format overflow warning from GCC 8.1 · 31658909
      Nathan Chancellor authored
      commit 2ae89c7a upstream.
      
      In file included from scripts/kconfig/zconf.tab.c:2485:
      scripts/kconfig/confdata.c: In function ‘conf_write’:
      scripts/kconfig/confdata.c:773:22: warning: ‘%s’ directive writing likely 7 or more bytes into a region of size between 1 and 4097 [-Wformat-overflow=]
        sprintf(newname, "%s%s", dirname, basename);
                            ^~
      scripts/kconfig/confdata.c:773:19: note: assuming directive output of 7 bytes
        sprintf(newname, "%s%s", dirname, basename);
                         ^~~~~~
      scripts/kconfig/confdata.c:773:2: note: ‘sprintf’ output 1 or more bytes (assuming 4104) into a destination of size 4097
        sprintf(newname, "%s%s", dirname, basename);
        ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
      scripts/kconfig/confdata.c:776:23: warning: ‘.tmpconfig.’ directive writing 11 bytes into a region of size between 1 and 4097 [-Wformat-overflow=]
         sprintf(tmpname, "%s.tmpconfig.%d", dirname, (int)getpid());
                             ^~~~~~~~~~~
      scripts/kconfig/confdata.c:776:3: note: ‘sprintf’ output between 13 and 4119 bytes into a destination of size 4097
         sprintf(tmpname, "%s.tmpconfig.%d", dirname, (int)getpid());
         ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
      
      Increase the size of tmpname and newname to make GCC happy.
      Signed-off-by: default avatarNathan Chancellor <natechancellor@gmail.com>
      Signed-off-by: default avatarMasahiro Yamada <yamada.masahiro@socionext.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      31658909
    • Linus Torvalds's avatar
      mmap: relax file size limit for regular files · 6ea1dc96
      Linus Torvalds authored
      commit 423913ad upstream.
      
      Commit be83bbf8 ("mmap: introduce sane default mmap limits") was
      introduced to catch problems in various ad-hoc character device drivers
      doing mmap and getting the size limits wrong.  In the process, it used
      "known good" limits for the normal cases of mapping regular files and
      block device drivers.
      
      It turns out that the "s_maxbytes" limit was less "known good" than I
      thought.  In particular, /proc doesn't set it, but exposes one regular
      file to mmap: /proc/vmcore.  As a result, that file got limited to the
      default MAX_INT s_maxbytes value.
      
      This went unnoticed for a while, because apparently the only thing that
      needs it is the s390 kernel zfcpdump, but there might be other tools
      that use this too.
      
      Vasily suggested just changing s_maxbytes for all of /proc, which isn't
      wrong, but makes me nervous at this stage.  So instead, just make the
      new mmap limit always be MAX_LFS_FILESIZE for regular files, which won't
      affect anything else.  It wasn't the regular file case I was worried
      about.
      
      I'd really prefer for maxsize to have been per-inode, but that is not
      how things are today.
      
      Fixes: be83bbf8 ("mmap: introduce sane default mmap limits")
      Reported-by: default avatarVasily Gorbik <gor@linux.ibm.com>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      6ea1dc96
    • Linus Torvalds's avatar
      mmap: introduce sane default mmap limits · bd2f9ce5
      Linus Torvalds authored
      commit be83bbf8 upstream.
      
      The internal VM "mmap()" interfaces are based on the mmap target doing
      everything using page indexes rather than byte offsets, because
      traditionally (ie 32-bit) we had the situation that the byte offset
      didn't fit in a register.  So while the mmap virtual address was limited
      by the word size of the architecture, the backing store was not.
      
      So we're basically passing "pgoff" around as a page index, in order to
      be able to describe backing store locations that are much bigger than
      the word size (think files larger than 4GB etc).
      
      But while this all makes a ton of sense conceptually, we've been dogged
      by various drivers that don't really understand this, and internally
      work with byte offsets, and then try to work with the page index by
      turning it into a byte offset with "pgoff << PAGE_SHIFT".
      
      Which obviously can overflow.
      
      Adding the size of the mapping to it to get the byte offset of the end
      of the backing store just exacerbates the problem, and if you then use
      this overflow-prone value to check various limits of your device driver
      mmap capability, you're just setting yourself up for problems.
      
      The correct thing for drivers to do is to do their limit math in page
      indices, the way the interface is designed.  Because the generic mmap
      code _does_ test that the index doesn't overflow, since that's what the
      mmap code really cares about.
      
      HOWEVER.
      
      Finding and fixing various random drivers is a sisyphean task, so let's
      just see if we can just make the core mmap() code do the limiting for
      us.  Realistically, the only "big" backing stores we need to care about
      are regular files and block devices, both of which are known to do this
      properly, and which have nice well-defined limits for how much data they
      can access.
      
      So let's special-case just those two known cases, and then limit other
      random mmap users to a backing store that still fits in "unsigned long".
      Realistically, that's not much of a limit at all on 64-bit, and on
      32-bit architectures the only worry might be the GPU drivers, which can
      have big physical address spaces.
      
      To make it possible for drivers like that to say that they are 64-bit
      clean, this patch does repurpose the "FMODE_UNSIGNED_OFFSET" bit in the
      file flags to allow drivers to mark their file descriptors as safe in
      the full 64-bit mmap address space.
      
      [ The timing for doing this is less than optimal, and this should really
        go in a merge window. But realistically, this needs wide testing more
        than it needs anything else, and being main-line is the only way to do
        that.
      
        So the earlier the better, even if it's outside the proper development
        cycle        - Linus ]
      
      Cc: Kees Cook <keescook@chromium.org>
      Cc: Dan Carpenter <dan.carpenter@oracle.com>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Willy Tarreau <w@1wt.eu>
      Cc: Dave Airlie <airlied@redhat.com>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      bd2f9ce5
    • Chris Chiu's avatar
      tpm: self test failure should not cause suspend to fail · 459e0c3b
      Chris Chiu authored
      commit 0803d7be upstream.
      
      The Acer Acer Veriton X4110G has a TPM device detected as:
        tpm_tis 00:0b: 1.2 TPM (device-id 0xFE, rev-id 71)
      
      After the first S3 suspend, the following error appears during resume:
        tpm tpm0: A TPM error(38) occurred continue selftest
      
      Any following S3 suspend attempts will now fail with this error:
        tpm tpm0: Error (38) sending savestate before suspend
        PM: Device 00:0b failed to suspend: error 38
      
      Error 38 is TPM_ERR_INVALID_POSTINIT which means the TPM is
      not in the correct state. This indicates that the platform BIOS
      is not sending the usual TPM_Startup command during S3 resume.
      >From this point onwards, all TPM commands will fail.
      
      The same issue was previously reported on Foxconn 6150BK8MC and
      Sony Vaio TX3.
      
      The platform behaviour seems broken here, but we should not break
      suspend/resume because of this.
      
      When the unexpected TPM state is encountered, set a flag to skip the
      affected TPM_SaveState command on later suspends.
      
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarChris Chiu <chiu@endlessm.com>
      Signed-off-by: default avatarDaniel Drake <drake@endlessm.com>
      Link: http://lkml.kernel.org/r/CAB4CAwfSCvj1cudi+MWaB5g2Z67d9DwY1o475YOZD64ma23UiQ@mail.gmail.com
      Link: https://lkml.org/lkml/2011/3/28/192
      Link: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=591031Reviewed-by: default avatarJarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
      Signed-off-by: default avatarJarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      459e0c3b
    • Enric Balletbo i Serra's avatar
      tpm: do not suspend/resume if power stays on · c7d58182
      Enric Balletbo i Serra authored
      commit b5d0ebc9 upstream.
      
      The suspend/resume behavior of the TPM can be controlled by setting
      "powered-while-suspended" in the DTS. This is useful for the cases
      when hardware does not power-off the TPM.
      Signed-off-by: default avatarSonny Rao <sonnyrao@chromium.org>
      Signed-off-by: default avatarEnric Balletbo i Serra <enric.balletbo@collabora.com>
      Reviewed-by: default avatarJason Gunthorpe <jgunthorpe@obsidianresearch.com>
      Reviewed-by: default avatarJarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
      Signed-off-by: default avatarJarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
      Signed-off-by: default avatarJames Morris <james.l.morris@oracle.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      c7d58182
  3. 06 Jun, 2018 3 commits