1. 21 Jul, 2017 40 commits
    • Marcin Nowakowski's avatar
      kernel/extable.c: mark core_kernel_text notrace · e1a6709a
      Marcin Nowakowski authored
      commit c0d80dda upstream.
      
      core_kernel_text is used by MIPS in its function graph trace processing,
      so having this method traced leads to an infinite set of recursive calls
      such as:
      
        Call Trace:
           ftrace_return_to_handler+0x50/0x128
           core_kernel_text+0x10/0x1b8
           prepare_ftrace_return+0x6c/0x114
           ftrace_graph_caller+0x20/0x44
           return_to_handler+0x10/0x30
           return_to_handler+0x0/0x30
           return_to_handler+0x0/0x30
           ftrace_ops_no_ops+0x114/0x1bc
           core_kernel_text+0x10/0x1b8
           core_kernel_text+0x10/0x1b8
           core_kernel_text+0x10/0x1b8
           ftrace_ops_no_ops+0x114/0x1bc
           core_kernel_text+0x10/0x1b8
           prepare_ftrace_return+0x6c/0x114
           ftrace_graph_caller+0x20/0x44
           (...)
      
      Mark the function notrace to avoid it being traced.
      
      Link: http://lkml.kernel.org/r/1498028607-6765-1-git-send-email-marcin.nowakowski@imgtec.comSigned-off-by: default avatarMarcin Nowakowski <marcin.nowakowski@imgtec.com>
      Reviewed-by: default avatarMasami Hiramatsu <mhiramat@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Meyer <thomas@m3y3r.de>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Daniel Borkmann <daniel@iogearbox.net>
      Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      e1a6709a
    • Kirill A. Shutemov's avatar
      thp, mm: fix crash due race in MADV_FREE handling · 8a6fcf5f
      Kirill A. Shutemov authored
      commit bbf29ffc upstream.
      
      Reinette reported the following crash:
      
        BUG: Bad page state in process log2exe  pfn:57600
        page:ffffea00015d8000 count:0 mapcount:0 mapping:          (null) index:0x20200
        flags: 0x4000000000040019(locked|uptodate|dirty|swapbacked)
        raw: 4000000000040019 0000000000000000 0000000000020200 00000000ffffffff
        raw: ffffea00015d8020 ffffea00015d8020 0000000000000000 0000000000000000
        page dumped because: PAGE_FLAGS_CHECK_AT_FREE flag(s) set
        bad because of flags: 0x1(locked)
        Modules linked in: rfcomm 8021q bnep intel_rapl x86_pkg_temp_thermal coretemp efivars btusb btrtl btbcm pwm_lpss_pci snd_hda_codec_hdmi btintel pwm_lpss snd_hda_codec_realtek snd_soc_skl snd_hda_codec_generic snd_soc_skl_ipc spi_pxa2xx_platform snd_soc_sst_ipc snd_soc_sst_dsp i2c_designware_platform i2c_designware_core snd_hda_ext_core snd_soc_sst_match snd_hda_intel snd_hda_codec mei_me snd_hda_core mei snd_soc_rt286 snd_soc_rl6347a snd_soc_core efivarfs
        CPU: 1 PID: 354 Comm: log2exe Not tainted 4.12.0-rc7-test-test #19
        Hardware name: Intel corporation NUC6CAYS/NUC6CAYB, BIOS AYAPLCEL.86A.0027.2016.1108.1529 11/08/2016
        Call Trace:
         bad_page+0x16a/0x1f0
         free_pages_check_bad+0x117/0x190
         free_hot_cold_page+0x7b1/0xad0
         __put_page+0x70/0xa0
         madvise_free_huge_pmd+0x627/0x7b0
         madvise_free_pte_range+0x6f8/0x1150
         __walk_page_range+0x6b5/0xe30
         walk_page_range+0x13b/0x310
         madvise_free_page_range.isra.16+0xad/0xd0
         madvise_free_single_vma+0x2e4/0x470
         SyS_madvise+0x8ce/0x1450
      
      If somebody frees the page under us and we hold the last reference to
      it, put_page() would attempt to free the page before unlocking it.
      
      The fix is trivial reorder of operations.
      
      Dave said:
       "I came up with the exact same patch.  For posterity, here's the test
        case, generated by syzkaller and trimmed down by Reinette:
      
        	https://www.sr71.net/~dave/intel/log2.c
      
        And the config that helps detect this:
      
        	https://www.sr71.net/~dave/intel/config-log2"
      
      Fixes: b8d3c4c3 ("mm/huge_memory.c: don't split THP page when MADV_FREE syscall is called")
      Link: http://lkml.kernel.org/r/20170628101249.17879-1-kirill.shutemov@linux.intel.comSigned-off-by: default avatarKirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Reported-by: default avatarReinette Chatre <reinette.chatre@intel.com>
      Acked-by: default avatarDave Hansen <dave.hansen@intel.com>
      Acked-by: default avatarMichal Hocko <mhocko@suse.com>
      Acked-by: default avatarMinchan Kim <minchan@kernel.org>
      Cc: Huang Ying <ying.huang@intel.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      8a6fcf5f
    • David Rientjes's avatar
      compiler, clang: always inline when CONFIG_OPTIMIZE_INLINING is disabled · c53c2a4d
      David Rientjes authored
      commit 9a04dbcf upstream.
      
      The motivation for commit abb2ea7d ("compiler, clang: suppress
      warning for unused static inline functions") was to suppress clang's
      warnings about unused static inline functions.
      
      For configs without CONFIG_OPTIMIZE_INLINING enabled, such as any non-x86
      architecture, `inline' in the kernel implies that
      __attribute__((always_inline)) is used.
      
      Some code depends on that behavior, see
        https://lkml.org/lkml/2017/6/13/918:
      
        net/built-in.o: In function `__xchg_mb':
        arch/arm64/include/asm/cmpxchg.h:99: undefined reference to `__compiletime_assert_99'
        arch/arm64/include/asm/cmpxchg.h:99: undefined reference to `__compiletime_assert_99
      
      The full fix would be to identify these breakages and annotate the
      functions with __always_inline instead of `inline'.  But since we are
      late in the 4.12-rc cycle, simply carry forward the forced inlining
      behavior and work toward moving arm64, and other architectures, toward
      CONFIG_OPTIMIZE_INLINING behavior.
      
      Link: http://lkml.kernel.org/r/alpine.DEB.2.10.1706261552200.1075@chino.kir.corp.google.comSigned-off-by: default avatarDavid Rientjes <rientjes@google.com>
      Reported-by: default avatarSodagudi Prasad <psodagud@codeaurora.org>
      Tested-by: default avatarSodagudi Prasad <psodagud@codeaurora.org>
      Tested-by: default avatarMatthias Kaehlcke <mka@chromium.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      c53c2a4d
    • Ben Hutchings's avatar
      tools/lib/lockdep: Reduce MAX_LOCK_DEPTH to avoid overflowing lock_chain/: Depth · 921f4e22
      Ben Hutchings authored
      commit 98dcea0c upstream.
      
      liblockdep has been broken since commit 75dd602a ("lockdep: Fix
      lock_chain::base size"), as that adds a check that MAX_LOCK_DEPTH is
      within the range of lock_chain::depth and in liblockdep it is much
      too large.
      
      That should have resulted in a compiler error, but didn't because:
      
      - the check uses ARRAY_SIZE(), which isn't yet defined in liblockdep
        so is assumed to be an (undeclared) function
      - putting a function call inside a BUILD_BUG_ON() expression quietly
        turns it into some nonsense involving a variable-length array
      
      It did produce a compiler warning, but I didn't notice because
      liblockdep already produces too many warnings if -Wall is enabled
      (which I'll fix shortly).
      
      Even before that commit, which reduced lock_chain::depth from 8 bits
      to 6, MAX_LOCK_DEPTH was too large.
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      Signed-off-by: default avatarSasha Levin <sasha.levin@oracle.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: a.p.zijlstra@chello.nl
      Link: http://lkml.kernel.org/r/20170525130005.5947-3-alexander.levin@verizon.comSigned-off-by: default avatarIngo Molnar <mingo@kernel.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      921f4e22
    • Helge Deller's avatar
      parisc/mm: Ensure IRQs are off in switch_mm() · 56a13c4d
      Helge Deller authored
      commit 649aa242 upstream.
      
      This is because of commit f98db601 ("sched/core: Add switch_mm_irqs_off()
      and use it in the scheduler") in which switch_mm_irqs_off() is called by the
      scheduler, vs switch_mm() which is used by use_mm().
      
      This patch lets the parisc code mirror the x86 and powerpc code, ie. it
      disables interrupts in switch_mm(), and optimises the scheduler case by
      defining switch_mm_irqs_off().
      Signed-off-by: default avatarHelge Deller <deller@gmx.de>
      Signed-off-by: default avatarHelge Deller <deller@gmx.de>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      56a13c4d
    • Thomas Bogendoerfer's avatar
      parisc: DMA API: return error instead of BUG_ON for dma ops on non dma devs · 57b605a6
      Thomas Bogendoerfer authored
      commit 33f9e024 upstream.
      
      Enabling parport pc driver on a B2600 (and probably other 64bit PARISC
      systems) produced following BUG:
      
      CPU: 0 PID: 1 Comm: swapper Not tainted 4.12.0-rc5-30198-g1132d5e7 #156
      task: 000000009e050000 task.stack: 000000009e04c000
      
           YZrvWESTHLNXBCVMcbcbcbcbOGFRQPDI
      PSW: 00001000000001101111111100001111 Not tainted
      r00-03  000000ff0806ff0f 000000009e04c990 0000000040871b78 000000009e04cac0
      r04-07  0000000040c14de0 ffffffffffffffff 000000009e07f098 000000009d82d200
      r08-11  000000009d82d210 0000000000000378 0000000000000000 0000000040c345e0
      r12-15  0000000000000005 0000000040c345e0 0000000000000000 0000000040c9d5e0
      r16-19  0000000040c345e0 00000000f00001c4 00000000f00001bc 0000000000000061
      r20-23  000000009e04ce28 0000000000000010 0000000000000010 0000000040b89e40
      r24-27  0000000000000003 0000000000ffffff 000000009d82d210 0000000040c14de0
      r28-31  0000000000000000 000000009e04ca90 000000009e04cb40 0000000000000000
      sr00-03  0000000000000000 0000000000000000 0000000000000000 0000000000000000
      sr04-07  0000000000000000 0000000000000000 0000000000000000 0000000000000000
      
      IASQ: 0000000000000000 0000000000000000 IAOQ: 00000000404aece0 00000000404aece4
       IIR: 03ffe01f    ISR: 0000000010340000  IOR: 000001781304cac8
       CPU:        0   CR30: 000000009e04c000 CR31: 00000000e2976de2
       ORIG_R28: 0000000000000200
       IAOQ[0]: sba_dma_supported+0x80/0xd0
       IAOQ[1]: sba_dma_supported+0x84/0xd0
       RP(r2): parport_pc_probe_port+0x178/0x1200
      
      Cause is a call to dma_coerce_mask_and_coherenet in parport_pc_probe_port,
      which PARISC DMA API doesn't handle very nicely. This commit gives back
      DMA_ERROR_CODE for DMA API calls, if device isn't capable of DMA
      transaction.
      Signed-off-by: default avatarThomas Bogendoerfer <tsbogend@alpha.franken.de>
      Signed-off-by: default avatarHelge Deller <deller@gmx.de>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      57b605a6
    • Eric Biggers's avatar
      parisc: use compat_sys_keyctl() · fbc8f9a7
      Eric Biggers authored
      commit b0f94efd upstream.
      
      Architectures with a compat syscall table must put compat_sys_keyctl()
      in it, not sys_keyctl().  The parisc architecture was not doing this;
      fix it.
      Signed-off-by: default avatarEric Biggers <ebiggers@google.com>
      Acked-by: default avatarHelge Deller <deller@gmx.de>
      Signed-off-by: default avatarHelge Deller <deller@gmx.de>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      fbc8f9a7
    • Helge Deller's avatar
      parisc: Report SIGSEGV instead of SIGBUS when running out of stack · 3bc74245
      Helge Deller authored
      commit 24746231 upstream.
      
      When a process runs out of stack the parisc kernel wrongly faults with SIGBUS
      instead of the expected SIGSEGV signal.
      
      This example shows how the kernel faults:
      do_page_fault() command='a.out' type=15 address=0xfaac2000 in libc-2.24.so[f8308000+16c000]
      trap #15: Data TLB miss fault, vm_start = 0xfa2c2000, vm_end = 0xfaac2000
      
      The vma->vm_end value is the first address which does not belong to the vma, so
      adjust the check to include vma->vm_end to the range for which to send the
      SIGSEGV signal.
      
      This patch unbreaks building the debian libsigsegv package.
      Signed-off-by: default avatarHelge Deller <deller@gmx.de>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      3bc74245
    • Suzuki K Poulose's avatar
      irqchip/gic-v3: Fix out-of-bound access in gic_set_affinity · 5b2c6af5
      Suzuki K Poulose authored
      commit 866d7c1b upstream.
      
      The GICv3 driver doesn't check if the target CPU for gic_set_affinity
      is valid before going ahead and making the changes. This triggers the
      following splat with KASAN:
      
      [  141.189434] BUG: KASAN: global-out-of-bounds in gic_set_affinity+0x8c/0x140
      [  141.189704] Read of size 8 at addr ffff200009741d20 by task swapper/1/0
      [  141.189958]
      [  141.190158] CPU: 1 PID: 0 Comm: swapper/1 Not tainted 4.12.0-rc7
      [  141.190458] Hardware name: Foundation-v8A (DT)
      [  141.190658] Call trace:
      [  141.190908] [<ffff200008089d70>] dump_backtrace+0x0/0x328
      [  141.191224] [<ffff20000808a1b4>] show_stack+0x14/0x20
      [  141.191507] [<ffff200008504c3c>] dump_stack+0xa4/0xc8
      [  141.191858] [<ffff20000826c19c>] print_address_description+0x13c/0x250
      [  141.192219] [<ffff20000826c5c8>] kasan_report+0x210/0x300
      [  141.192547] [<ffff20000826ad54>] __asan_load8+0x84/0x98
      [  141.192874] [<ffff20000854eeec>] gic_set_affinity+0x8c/0x140
      [  141.193158] [<ffff200008148b14>] irq_do_set_affinity+0x54/0xb8
      [  141.193473] [<ffff200008148d2c>] irq_set_affinity_locked+0x64/0xf0
      [  141.193828] [<ffff200008148e00>] __irq_set_affinity+0x48/0x78
      [  141.194158] [<ffff200008bc48a4>] arm_perf_starting_cpu+0x104/0x150
      [  141.194513] [<ffff2000080d73bc>] cpuhp_invoke_callback+0x17c/0x1f8
      [  141.194783] [<ffff2000080d94ec>] notify_cpu_starting+0x8c/0xb8
      [  141.195130] [<ffff2000080911ec>] secondary_start_kernel+0x15c/0x200
      [  141.195390] [<0000000080db81b4>] 0x80db81b4
      [  141.195603]
      [  141.195685] The buggy address belongs to the variable:
      [  141.196012]  __cpu_logical_map+0x200/0x220
      [  141.196176]
      [  141.196315] Memory state around the buggy address:
      [  141.196586]  ffff200009741c00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
      [  141.196913]  ffff200009741c80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
      [  141.197158] >ffff200009741d00: 00 00 00 00 fa fa fa fa 00 00 00 00 00 00 00 00
      [  141.197487]                                ^
      [  141.197758]  ffff200009741d80: 00 00 00 00 00 00 00 00 fa fa fa fa 00 00 00 00
      [  141.198060]  ffff200009741e00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
      [  141.198358] ==================================================================
      [  141.198609] Disabling lock debugging due to kernel taint
      [  141.198961] CPU1: Booted secondary processor [410fd051]
      
      This patch adds the check to make sure the cpu is valid.
      
      Fixes: commit 021f6537 ("irqchip: gic-v3: Initial support for GICv3")
      Signed-off-by: default avatarSuzuki K Poulose <suzuki.poulose@arm.com>
      Signed-off-by: default avatarMarc Zyngier <marc.zyngier@arm.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      5b2c6af5
    • Alex Williamson's avatar
      kvm-vfio: Decouple only when we match a group · 0aed5da5
      Alex Williamson authored
      commit e323369b upstream.
      
      Unset-KVM and decrement-assignment only when we find the group in our
      list.  Otherwise we can get out of sync if the user triggers this for
      groups that aren't currently on our list.
      Signed-off-by: default avatarAlex Williamson <alex.williamson@redhat.com>
      Reviewed-by: default avatarAlexey Kardashevskiy <aik@ozlabs.ru>
      Reviewed-by: default avatarEric Auger <eric.auger@redhat.com>
      Tested-by: default avatarEric Auger <eric.auger@redhat.com>
      Acked-by: default avatarPaolo Bonzini <pbonzini@redhat.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      0aed5da5
    • Paul Mackerras's avatar
      KVM: PPC: Book3S: Fix typo in XICS-on-XIVE state saving code · e840db55
      Paul Mackerras authored
      commit 00c14757 upstream.
      
      This fixes a typo where the wrong loop index was used to index
      the kvmppc_xive_vcpu.queues[] array in xive_pre_save_scan().
      The variable i contains the vcpu number; we need to index queues[]
      using j, which iterates from 0 to KVMPPC_XIVE_Q_COUNT-1.
      
      The effect of this bug is that things that save the interrupt
      controller state, such as "virsh dump", on a VM with more than
      8 vCPUs, result in xive_pre_save_queue() getting called on a
      bogus queue structure, usually resulting in a crash like this:
      
      [  501.821107] Unable to handle kernel paging request for data at address 0x00000084
      [  501.821212] Faulting instruction address: 0xc008000004c7c6f8
      [  501.821234] Oops: Kernel access of bad area, sig: 11 [#1]
      [  501.821305] SMP NR_CPUS=1024
      [  501.821307] NUMA
      [  501.821376] PowerNV
      [  501.821470] Modules linked in: vhost_net vhost tap xt_CHECKSUM ipt_MASQUERADE nf_nat_masquerade_ipv4 ip6t_rpfilter ip6t_REJECT nf_reject_ipv6 nf_conntrack_ipv6 nf_defrag_ipv6 xt_conntrack ip_set nfnetlink ebtable_nat ebtable_broute bridge stp llc ip6table_mangle ip6table_security ip6table_raw iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack libcrc32c iptable_mangle iptable_security iptable_raw ebtable_filter ebtables ip6table_filter ip6_tables ses enclosure scsi_transport_sas ipmi_powernv ipmi_devintf ipmi_msghandler powernv_op_panel kvm_hv nfsd auth_rpcgss oid_registry nfs_acl lockd grace sunrpc kvm tg3 ptp pps_core
      [  501.822477] CPU: 3 PID: 3934 Comm: live_migration Not tainted 4.11.0-4.git8caa70f.el7.centos.ppc64le #1
      [  501.822633] task: c0000003f9e3ae80 task.stack: c0000003f9ed4000
      [  501.822745] NIP: c008000004c7c6f8 LR: c008000004c7c628 CTR: 0000000030058018
      [  501.822877] REGS: c0000003f9ed7980 TRAP: 0300   Not tainted  (4.11.0-4.git8caa70f.el7.centos.ppc64le)
      [  501.823030] MSR: 9000000000009033 <SF,HV,EE,ME,IR,DR,RI,LE>
      [  501.823047]   CR: 28022244  XER: 00000000
      [  501.823203] CFAR: c008000004c7c77c DAR: 0000000000000084 DSISR: 40000000 SOFTE: 1
      [  501.823203] GPR00: c008000004c7c628 c0000003f9ed7c00 c008000004c91450 00000000000000ff
      [  501.823203] GPR04: c0000003f5580000 c0000003f559bf98 9000000000009033 0000000000000000
      [  501.823203] GPR08: 0000000000000084 0000000000000000 00000000000001e0 9000000000001003
      [  501.823203] GPR12: c00000000008a7d0 c00000000fdc1b00 000000000a9a0000 0000000000000000
      [  501.823203] GPR16: 00000000402954e8 000000000a9a0000 0000000000000004 0000000000000000
      [  501.823203] GPR20: 0000000000000008 c000000002e8f180 c000000002e8f1e0 0000000000000001
      [  501.823203] GPR24: 0000000000000008 c0000003f5580008 c0000003f4564018 c000000002e8f1e8
      [  501.823203] GPR28: 00003ff6e58bdc28 c0000003f4564000 0000000000000000 0000000000000000
      [  501.825441] NIP [c008000004c7c6f8] xive_get_attr+0x3b8/0x5b0 [kvm]
      [  501.825671] LR [c008000004c7c628] xive_get_attr+0x2e8/0x5b0 [kvm]
      [  501.825887] Call Trace:
      [  501.825991] [c0000003f9ed7c00] [c008000004c7c628] xive_get_attr+0x2e8/0x5b0 [kvm] (unreliable)
      [  501.826312] [c0000003f9ed7cd0] [c008000004c62ec4] kvm_device_ioctl_attr+0x64/0xa0 [kvm]
      [  501.826581] [c0000003f9ed7d20] [c008000004c62fcc] kvm_device_ioctl+0xcc/0xf0 [kvm]
      [  501.826843] [c0000003f9ed7d40] [c000000000350c70] do_vfs_ioctl+0xd0/0x8c0
      [  501.827060] [c0000003f9ed7de0] [c000000000351534] SyS_ioctl+0xd4/0xf0
      [  501.827282] [c0000003f9ed7e30] [c00000000000b8e0] system_call+0x38/0xfc
      [  501.827496] Instruction dump:
      [  501.827632] 419e0078 3b760008 e9160008 83fb000c 83db0010 80fb0008 2f280000 60000000
      [  501.827901] 60000000 60420000 419a0050 7be91764 <7d284c2c> 552a0ffe 7f8af040 419e003c
      [  501.828176] ---[ end trace 2d0529a5bbbbafed ]---
      
      Fixes: 5af50993 ("KVM: PPC: Book3S HV: Native usage of the XIVE interrupt controller")
      Acked-by: default avatarBenjamin Herrenschmidt <benh@kernel.crashing.org>
      Signed-off-by: default avatarPaul Mackerras <paulus@ozlabs.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      e840db55
    • Hu Huajun's avatar
      KVM: ARM64: fix phy counter access failure in guest. · 233b62de
      Hu Huajun authored
      commit 02d50cda upstream.
      
      When reading the cntpct_el0 in guest with VHE (Virtual Host Extension)
      enabled in host, the "Unsupported guest sys_reg access" error reported.
      The reason is cnthctl_el2.EL1PCTEN is not enabled, which is expected
      to be done in kvm_timer_init_vhe(). The problem is kvm_timer_init_vhe
      is called by cpu_init_hyp_mode, and which is called when VHE is disabled.
      This patch remove the incorrect call to kvm_timer_init_vhe() from
      cpu_init_hyp_mode(), and calls kvm_timer_init_vhe() to enable
      cnthctl_el2.EL1PCTEN in cpu_hyp_reinit().
      
      Fixes: 488f94d7 ("KVM: arm64: Access CNTHCTL_EL2 bit fields correctly on VHE systems")
      Signed-off-by: default avatarHu Huajun <huhuajun@huawei.com>
      Reviewed-by: default avatarChristoffer Dall <cdall@linaro.org>
      Acked-by: default avatarMarc Zyngier <marc.zyngier@arm.com>
      Signed-off-by: default avatarChristoffer Dall <cdall@linaro.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      233b62de
    • Alex Deucher's avatar
      drm/amdgpu/gfx6: properly cache mc_arb_ramcfg · 20df3657
      Alex Deucher authored
      commit 6653ebd4 upstream.
      
      This was missing for gfx6.
      Acked-by: default avatarHuang Rui <ray.huang@amd.com>
      Signed-off-by: default avatarAlex Deucher <alexander.deucher@amd.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      20df3657
    • Srinivas Dasari's avatar
      cfg80211: Check if NAN service ID is of expected size · 227ea2fd
      Srinivas Dasari authored
      commit 0a27844c upstream.
      
      nla policy checks for only maximum length of the attribute data when the
      attribute type is NLA_BINARY. If userspace sends less data than
      specified, cfg80211 may access illegal memory. When type is NLA_UNSPEC,
      nla policy check ensures that userspace sends minimum specified length
      number of bytes.
      
      Remove type assignment to NLA_BINARY from nla_policy of
      NL80211_NAN_FUNC_SERVICE_ID to make these NLA_UNSPEC and to make sure
      minimum NL80211_NAN_FUNC_SERVICE_ID_LEN bytes are received from
      userspace with NL80211_NAN_FUNC_SERVICE_ID.
      
      Fixes: a442b761 ("cfg80211: add add_nan_func / del_nan_func")
      Signed-off-by: default avatarSrinivas Dasari <dasaris@qti.qualcomm.com>
      Signed-off-by: default avatarJouni Malinen <jouni@qca.qualcomm.com>
      Signed-off-by: default avatarJohannes Berg <johannes.berg@intel.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      227ea2fd
    • Srinivas Dasari's avatar
      cfg80211: Check if PMKID attribute is of expected size · 3853bb15
      Srinivas Dasari authored
      commit 9361df14 upstream.
      
      nla policy checks for only maximum length of the attribute data
      when the attribute type is NLA_BINARY. If userspace sends less
      data than specified, the wireless drivers may access illegal
      memory. When type is NLA_UNSPEC, nla policy check ensures that
      userspace sends minimum specified length number of bytes.
      
      Remove type assignment to NLA_BINARY from nla_policy of
      NL80211_ATTR_PMKID to make this NLA_UNSPEC and to make sure minimum
      WLAN_PMKID_LEN bytes are received from userspace with
      NL80211_ATTR_PMKID.
      
      Fixes: 67fbb16b ("nl80211: PMKSA caching support")
      Signed-off-by: default avatarSrinivas Dasari <dasaris@qti.qualcomm.com>
      Signed-off-by: default avatarJouni Malinen <jouni@qca.qualcomm.com>
      Signed-off-by: default avatarJohannes Berg <johannes.berg@intel.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      3853bb15
    • Srinivas Dasari's avatar
      cfg80211: Validate frequencies nested in NL80211_ATTR_SCAN_FREQUENCIES · be35aac0
      Srinivas Dasari authored
      commit d7f13f74 upstream.
      
      validate_scan_freqs() retrieves frequencies from attributes
      nested in the attribute NL80211_ATTR_SCAN_FREQUENCIES with
      nla_get_u32(), which reads 4 bytes from each attribute
      without validating the size of data received. Attributes
      nested in NL80211_ATTR_SCAN_FREQUENCIES don't have an nla policy.
      
      Validate size of each attribute before parsing to avoid potential buffer
      overread.
      
      Fixes: 2a519311 ("cfg80211/nl80211: scanning (and mac80211 update to use it)")
      Signed-off-by: default avatarSrinivas Dasari <dasaris@qti.qualcomm.com>
      Signed-off-by: default avatarJouni Malinen <jouni@qca.qualcomm.com>
      Signed-off-by: default avatarJohannes Berg <johannes.berg@intel.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      be35aac0
    • Srinivas Dasari's avatar
      cfg80211: Define nla_policy for NL80211_ATTR_LOCAL_MESH_POWER_MODE · b9582dbe
      Srinivas Dasari authored
      commit 8feb69c7 upstream.
      
      Buffer overread may happen as nl80211_set_station() reads 4 bytes
      from the attribute NL80211_ATTR_LOCAL_MESH_POWER_MODE without
      validating the size of data received when userspace sends less
      than 4 bytes of data with NL80211_ATTR_LOCAL_MESH_POWER_MODE.
      Define nla_policy for NL80211_ATTR_LOCAL_MESH_POWER_MODE to avoid
      the buffer overread.
      
      Fixes: 3b1c5a53 ("{cfg,nl}80211: mesh power mode primitives and userspace access")
      Signed-off-by: default avatarSrinivas Dasari <dasaris@qti.qualcomm.com>
      Signed-off-by: default avatarJouni Malinen <jouni@qca.qualcomm.com>
      Signed-off-by: default avatarJohannes Berg <johannes.berg@intel.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      b9582dbe
    • Daniel Kiper's avatar
      efi: Process the MEMATTR table only if EFI_MEMMAP is enabled · c43499cd
      Daniel Kiper authored
      commit 457ea3f7 upstream.
      
      Otherwise e.g. Xen dom0 on x86_64 EFI platforms crashes.
      
      In theory we can check EFI_PARAVIRT too, however,
      EFI_MEMMAP looks more targeted and covers more cases.
      Signed-off-by: default avatarDaniel Kiper <daniel.kiper@oracle.com>
      Reviewed-by: default avatarArd Biesheuvel <ard.biesheuvel@linaro.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: andrew.cooper3@citrix.com
      Cc: boris.ostrovsky@oracle.com
      Cc: jgross@suse.com
      Cc: linux-efi@vger.kernel.org
      Cc: matt@codeblueprint.co.uk
      Cc: xen-devel@lists.xenproject.org
      Link: http://lkml.kernel.org/r/1498128697-12943-2-git-send-email-daniel.kiper@oracle.comSigned-off-by: default avatarIngo Molnar <mingo@kernel.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      c43499cd
    • Peter S. Housel's avatar
      brcmfmac: Fix glom_skb leak in brcmf_sdiod_recv_chain · c9e07e2f
      Peter S. Housel authored
      commit 5ea59db8 upstream.
      
      An earlier change to this function (3bdae810) fixed a leak in the
      case of an unsuccessful call to brcmf_sdiod_buffrw(). However, the
      glom_skb buffer, used for emulating a scattering read, is never used
      or referenced after its contents are copied into the destination
      buffers, and therefore always needs to be freed by the end of the
      function.
      
      Fixes: 3bdae810 ("brcmfmac: Fix glob_skb leak in brcmf_sdiod_recv_chain")
      Fixes: a413e39a ("brcmfmac: fix brcmf_sdcard_recv_chain() for host without sg support")
      Signed-off-by: default avatarPeter S. Housel <housel@acm.org>
      Signed-off-by: default avatarArend van Spriel <arend.vanspriel@broadcom.com>
      Signed-off-by: default avatarKalle Valo <kvalo@codeaurora.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      c9e07e2f
    • Christophe Jaillet's avatar
      brcmfmac: Fix a memory leak in error handling path in 'brcmf_cfg80211_attach' · c3439585
      Christophe Jaillet authored
      commit 57c00f2f upstream.
      
      If 'wiphy_new()' fails, we leak 'ops'. Add a new label in the error
      handling path to free it in such a case.
      
      Fixes: 5c22fb85 ("brcmfmac: add wowl gtk rekeying offload support")
      Signed-off-by: default avatarChristophe JAILLET <christophe.jaillet@wanadoo.fr>
      Signed-off-by: default avatarKalle Valo <kvalo@codeaurora.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      c3439585
    • Nitin Gupta's avatar
      sparc64: Fix gup_huge_pmd · 14a86f75
      Nitin Gupta authored
      
      [ Upstream commit dbd2667a ]
      
      The function assumes that each PMD points to head of a
      huge page. This is not correct as a PMD can point to
      start of any 8M region with a, say 256M, hugepage. The
      fix ensures that it points to the correct head of any PMD
      huge page.
      
      Cc: Julian Calaby <julian.calaby@gmail.com>
      Signed-off-by: default avatarNitin Gupta <nitin.m.gupta@oracle.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      14a86f75
    • Nagarathnam Muthusamy's avatar
      Adding the type of exported symbols · 7b3b294b
      Nagarathnam Muthusamy authored
      
      [ Upstream commit f5a651f1 ]
      
      Missing symbol type for few functions prevents genksyms from generating
      symbol versions for those functions. This patch fixes them.
      Signed-off-by: default avatarNagarathnam Muthusamy <nagarathnam.muthusamy@oracle.com>
      Reviewed-by: default avatarBabu Moger <babu.moger@oracle.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      7b3b294b
    • Nagarathnam Muthusamy's avatar
      sed regex in Makefile.build requires line break between exported symbols · 7bf934b8
      Nagarathnam Muthusamy authored
      
      [ Upstream commit d16c0649 ]
      
      The following regex in Makefile.build matches only one ___EXPORT_SYMBOL per line.
      
      sed
      's/.*___EXPORT_SYMBOL[[:space:]]*\([a-zA-Z0-9_]*\)[[:space:]]*,.*/EXPORT_SYMBOL(\1);/'
      
      ATOMIC_OPS macro in atomic_64.S expands multiple symbols in same line hence
      version generation is done only for the last matched symbol. This patch adds
      new line between the symbol expansions.
      Signed-off-by: default avatarNagarathnam Muthusamy <nagarathnam.muthusamy@oracle.com>
      Reviewed-by: default avatarBabu Moger <babu.moger@oracle.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      7bf934b8
    • Nagarathnam Muthusamy's avatar
      Adding asm-prototypes.h for genksyms to generate crc · ca2c0e7e
      Nagarathnam Muthusamy authored
      
      [ Upstream commit bdca8cc0 ]
      
      This patch adds the prototypes of assembly defined functions to asm-prototypes.h.
      Some prototypes are directly added as they are not present in any existing header
      files.
      Signed-off-by: default avatarNagarathnam Muthusamy <nagarathnam.muthusamy@oracle.com>
      Reviewed-by: default avatarBabu Moger <babu.moger@oracle.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      ca2c0e7e
    • Bert Kenward's avatar
      sfc: don't read beyond unicast address list · 28cc60e9
      Bert Kenward authored
      
      [ Upstream commit c70d6815 ]
      
      If we have more than 32 unicast MAC addresses assigned to an interface
      we will read beyond the end of the address table in the driver when
      adding filters. The next 256 entries store multicast addresses, so we
      will end up attempting to insert duplicate filters, which is mostly
      harmless. If we add more than 288 unicast addresses we will then read
      past the multicast address table, which is likely to be more exciting.
      
      Fixes: 12fb0da4 ("sfc: clean fallbacks between promisc/normal in efx_ef10_filter_sync_rx_mode")
      Signed-off-by: default avatarBert Kenward <bkenward@solarflare.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      28cc60e9
    • Arend van Spriel's avatar
      brcmfmac: fix possible buffer overflow in brcmf_cfg80211_mgmt_tx() · f888b9ad
      Arend van Spriel authored
      
      [ Upstream commit 8f44c9a4 ]
      
      The lower level nl80211 code in cfg80211 ensures that "len" is between
      25 and NL80211_ATTR_FRAME (2304).  We subtract DOT11_MGMT_HDR_LEN (24) from
      "len" so thats's max of 2280.  However, the action_frame->data[] buffer is
      only BRCMF_FIL_ACTION_FRAME_SIZE (1800) bytes long so this memcpy() can
      overflow.
      
      	memcpy(action_frame->data, &buf[DOT11_MGMT_HDR_LEN],
      	       le16_to_cpu(action_frame->len));
      
      Cc: stable@vger.kernel.org # 3.9.x
      Fixes: 18e2f61d ("brcmfmac: P2P action frame tx.")
      Reported-by: default avatar"freenerguo(郭大兴)" <freenerguo@tencent.com>
      Signed-off-by: default avatarArend van Spriel <arend.vanspriel@broadcom.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      f888b9ad
    • Eduardo Valentin's avatar
      bridge: mdb: fix leak on complete_info ptr on fail path · fdb5c326
      Eduardo Valentin authored
      
      [ Upstream commit 1bfb1596 ]
      
      We currently get the following kmemleak report:
      unreferenced object 0xffff8800039d9820 (size 32):
        comm "softirq", pid 0, jiffies 4295212383 (age 792.416s)
        hex dump (first 32 bytes):
          00 0c e0 03 00 88 ff ff ff 02 00 00 00 00 00 00  ................
          00 00 00 01 ff 11 00 02 86 dd 00 00 ff ff ff ff  ................
        backtrace:
          [<ffffffff8152b4aa>] kmemleak_alloc+0x4a/0xa0
          [<ffffffff811d8ec8>] kmem_cache_alloc_trace+0xb8/0x1c0
          [<ffffffffa0389683>] __br_mdb_notify+0x2a3/0x300 [bridge]
          [<ffffffffa038a0ce>] br_mdb_notify+0x6e/0x70 [bridge]
          [<ffffffffa0386479>] br_multicast_add_group+0x109/0x150 [bridge]
          [<ffffffffa0386518>] br_ip6_multicast_add_group+0x58/0x60 [bridge]
          [<ffffffffa0387fb5>] br_multicast_rcv+0x1d5/0xdb0 [bridge]
          [<ffffffffa037d7cf>] br_handle_frame_finish+0xcf/0x510 [bridge]
          [<ffffffffa03a236b>] br_nf_hook_thresh.part.27+0xb/0x10 [br_netfilter]
          [<ffffffffa03a3738>] br_nf_hook_thresh+0x48/0xb0 [br_netfilter]
          [<ffffffffa03a3fb9>] br_nf_pre_routing_finish_ipv6+0x109/0x1d0 [br_netfilter]
          [<ffffffffa03a4400>] br_nf_pre_routing_ipv6+0xd0/0x14c [br_netfilter]
          [<ffffffffa03a3c27>] br_nf_pre_routing+0x197/0x3d0 [br_netfilter]
          [<ffffffff814a2952>] nf_iterate+0x52/0x60
          [<ffffffff814a29bc>] nf_hook_slow+0x5c/0xb0
          [<ffffffffa037ddf4>] br_handle_frame+0x1a4/0x2c0 [bridge]
      
      This happens when switchdev_port_obj_add() fails. This patch
      frees complete_info object in the fail path.
      Reviewed-by: default avatarVallish Vaidyeshwara <vallish@amazon.com>
      Signed-off-by: default avatarEduardo Valentin <eduval@amazon.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      fdb5c326
    • WANG Cong's avatar
      tap: convert a mutex to a spinlock · 09a279f3
      WANG Cong authored
      
      [ Upstream commit ffa423fb ]
      
      We are not allowed to block on the RCU reader side, so can't
      just hold the mutex as before. As a quick fix, convert it to
      a spinlock.
      
      Fixes: d9f1f61c ("tap: Extending tap device create/destroy APIs")
      Reported-by: default avatarChristian Borntraeger <borntraeger@de.ibm.com>
      Tested-by: default avatarChristian Borntraeger <borntraeger@de.ibm.com>
      Cc: Sainath Grandhi <sainath.grandhi@intel.com>
      Signed-off-by: default avatarCong Wang <xiyou.wangcong@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      09a279f3
    • Guilherme G. Piccoli's avatar
      cxgb4: fix BUG() on interrupt deallocating path of ULD · c92947f9
      Guilherme G. Piccoli authored
      
      [ Upstream commit 6a146f3a ]
      
      Since the introduction of ULD (Upper-Layer Drivers), the MSI-X
      deallocating path changed in cxgb4: the driver frees the interrupts
      of ULD when unregistering it or on shutdown PCI handler.
      
      Problem is that if a MSI-X is not freed before deallocated in the PCI
      layer, it will trigger a BUG() due to still "alive" interrupt being
      tentatively quiesced.
      
      The below trace was observed when doing a simple unbind of Chelsio's
      adapter PCI function, like:
        "echo 001e:80:00.4 > /sys/bus/pci/drivers/cxgb4/unbind"
      
      Trace:
      
        kernel BUG at drivers/pci/msi.c:352!
        Oops: Exception in kernel mode, sig: 5 [#1]
        ...
        NIP [c0000000005a5e60] free_msi_irqs+0xa0/0x250
        LR [c0000000005a5e50] free_msi_irqs+0x90/0x250
        Call Trace:
        [c0000000005a5e50] free_msi_irqs+0x90/0x250 (unreliable)
        [c0000000005a72c4] pci_disable_msix+0x124/0x180
        [d000000011e06708] disable_msi+0x88/0xb0 [cxgb4]
        [d000000011e06948] free_some_resources+0xa8/0x160 [cxgb4]
        [d000000011e06d60] remove_one+0x170/0x3c0 [cxgb4]
        [c00000000058a910] pci_device_remove+0x70/0x110
        [c00000000064ef04] device_release_driver_internal+0x1f4/0x2c0
        ...
      
      This patch fixes the issue by refactoring the shutdown path of ULD on
      cxgb4 driver, by properly freeing and disabling interrupts on PCI
      remove handler too.
      
      Fixes: 0fbc81b3 ("Allocate resources dynamically for all cxgb4 ULD's")
      Reported-by: default avatarHarsha Thyagaraja <hathyaga@in.ibm.com>
      Signed-off-by: default avatarGuilherme G. Piccoli <gpiccoli@linux.vnet.ibm.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      c92947f9
    • Huy Nguyen's avatar
      net/mlx5e: Initialize CEE's getpermhwaddr address buffer to 0xff · 3399c4c4
      Huy Nguyen authored
      
      [ Upstream commit d968f0f2 ]
      
      Latest change in open-lldp code uses bytes 6-11 of perm_addr buffer
      as the Ethernet source address for the host TLV packet.
      Since our driver does not fill these bytes, they stay at zero and
      the open-lldp code ends up sending the TLV packet with zero source
      address and the switch drops this packet.
      
      The fix is to initialize these bytes to 0xff. The open-lldp code
      considers 0xff:ff:ff:ff:ff:ff as the invalid address and falls back to
      use the host's mac address as the Ethernet source address.
      
      Fixes: 3a6a931d ("net/mlx5e: Support DCBX CEE API")
      Signed-off-by: default avatarHuy Nguyen <huyn@mellanox.com>
      Reviewed-by: default avatarDaniel Jurgens <danielj@mellanox.com>
      Signed-off-by: default avatarSaeed Mahameed <saeedm@mellanox.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      3399c4c4
    • Sowmini Varadhan's avatar
      rds: tcp: use sock_create_lite() to create the accept socket · e8c11223
      Sowmini Varadhan authored
      
      [ Upstream commit 0933a578 ]
      
      There are two problems with calling sock_create_kern() from
      rds_tcp_accept_one()
      1. it sets up a new_sock->sk that is wasteful, because this ->sk
         is going to get replaced by inet_accept() in the subsequent ->accept()
      2. The new_sock->sk is a leaked reference in sock_graft() which
         expects to find a null parent->sk
      
      Avoid these problems by calling sock_create_lite().
      Signed-off-by: default avatarSowmini Varadhan <sowmini.varadhan@oracle.com>
      Acked-by: default avatarSantosh Shilimkar <santosh.shilimkar@oracle.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      e8c11223
    • Jason Wang's avatar
      virtio-net: fix leaking of ctx array · 354934e1
      Jason Wang authored
      
      [ Upstream commit 55281621 ]
      
      Fixes: commit d45b897b ("virtio_net: allow specifying context for rx")
      Signed-off-by: default avatarJason Wang <jasowang@redhat.com>
      Acked-by: default avatarMichael S. Tsirkin <mst@redhat.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      354934e1
    • Nikolay Aleksandrov's avatar
      vrf: fix bug_on triggered by rx when destroying a vrf · 34e7c498
      Nikolay Aleksandrov authored
      
      [ Upstream commit f630c38e ]
      
      When destroying a VRF device we cleanup the slaves in its ndo_uninit()
      function, but that causes packets to be switched (skb->dev == vrf being
      destroyed) even though we're pass the point where the VRF should be
      receiving any packets while it is being dismantled. This causes a BUG_ON
      to trigger if we have raw sockets (trace below).
      The reason is that the inetdev of the VRF has been destroyed but we're
      still sending packets up the stack with it, so let's free the slaves in
      the dellink callback as David Ahern suggested.
      
      Note that this fix doesn't prevent packets from going up when the VRF
      device is admin down.
      
      [   35.631371] ------------[ cut here ]------------
      [   35.631603] kernel BUG at net/ipv4/fib_frontend.c:285!
      [   35.631854] invalid opcode: 0000 [#1] SMP
      [   35.631977] Modules linked in:
      [   35.632081] CPU: 2 PID: 22 Comm: ksoftirqd/2 Not tainted 4.12.0-rc7+ #45
      [   35.632247] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.7.5-20140531_083030-gandalf 04/01/2014
      [   35.632477] task: ffff88005ad68000 task.stack: ffff88005ad64000
      [   35.632632] RIP: 0010:fib_compute_spec_dst+0xfc/0x1ee
      [   35.632769] RSP: 0018:ffff88005ad67978 EFLAGS: 00010202
      [   35.632910] RAX: 0000000000000001 RBX: ffff880059a7f200 RCX: 0000000000000000
      [   35.633084] RDX: 0000000000000000 RSI: 0000000000000001 RDI: ffffffff82274af0
      [   35.633256] RBP: ffff88005ad679f8 R08: 000000000001ef70 R09: 0000000000000046
      [   35.633430] R10: ffff88005ad679f8 R11: ffff880037731cb0 R12: 0000000000000001
      [   35.633603] R13: ffff8800599e3000 R14: 0000000000000000 R15: ffff8800599cb852
      [   35.634114] FS:  0000000000000000(0000) GS:ffff88005d900000(0000) knlGS:0000000000000000
      [   35.634306] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [   35.634456] CR2: 00007f3563227095 CR3: 000000000201d000 CR4: 00000000000406e0
      [   35.634632] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      [   35.634865] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      [   35.635055] Call Trace:
      [   35.635271]  ? __lock_acquire+0xf0d/0x1117
      [   35.635522]  ipv4_pktinfo_prepare+0x82/0x151
      [   35.635831]  raw_rcv_skb+0x17/0x3c
      [   35.636062]  raw_rcv+0xe5/0xf7
      [   35.636287]  raw_local_deliver+0x169/0x1d9
      [   35.636534]  ip_local_deliver_finish+0x87/0x1c4
      [   35.636820]  ip_local_deliver+0x63/0x7f
      [   35.637058]  ip_rcv_finish+0x340/0x3a1
      [   35.637295]  ip_rcv+0x314/0x34a
      [   35.637525]  __netif_receive_skb_core+0x49f/0x7c5
      [   35.637780]  ? lock_acquire+0x13f/0x1d7
      [   35.638018]  ? lock_acquire+0x15e/0x1d7
      [   35.638259]  __netif_receive_skb+0x1e/0x94
      [   35.638502]  ? __netif_receive_skb+0x1e/0x94
      [   35.638748]  netif_receive_skb_internal+0x74/0x300
      [   35.639002]  ? dev_gro_receive+0x2ed/0x411
      [   35.639246]  ? lock_is_held_type+0xc4/0xd2
      [   35.639491]  napi_gro_receive+0x105/0x1a0
      [   35.639736]  receive_buf+0xc32/0xc74
      [   35.639965]  ? detach_buf+0x67/0x153
      [   35.640201]  ? virtqueue_get_buf_ctx+0x120/0x176
      [   35.640453]  virtnet_poll+0x128/0x1c5
      [   35.640690]  net_rx_action+0x103/0x343
      [   35.640932]  __do_softirq+0x1c7/0x4b7
      [   35.641171]  run_ksoftirqd+0x23/0x5c
      [   35.641403]  smpboot_thread_fn+0x24f/0x26d
      [   35.641646]  ? sort_range+0x22/0x22
      [   35.641878]  kthread+0x129/0x131
      [   35.642104]  ? __list_add+0x31/0x31
      [   35.642335]  ? __list_add+0x31/0x31
      [   35.642568]  ret_from_fork+0x2a/0x40
      [   35.642804] Code: 05 bd 87 a3 00 01 e8 1f ef 98 ff 4d 85 f6 48 c7 c7 f0 4a 27 82 41 0f 94 c4 31 c9 31 d2 41 0f b6 f4 e8 04 71 a1 ff 45 84 e4 74 02 <0f> 0b 0f b7 93 c4 00 00 00 4d 8b a5 80 05 00 00 48 03 93 d0 00
      [   35.644342] RIP: fib_compute_spec_dst+0xfc/0x1ee RSP: ffff88005ad67978
      
      Fixes: 193125db ("net: Introduce VRF device driver")
      Reported-by: default avatarChris Cormier <chriscormier@cumulusnetworks.com>
      Signed-off-by: default avatarNikolay Aleksandrov <nikolay@cumulusnetworks.com>
      Acked-by: default avatarDavid Ahern <dsahern@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      34e7c498
    • David Ahern's avatar
      net: ipv6: Compare lwstate in detecting duplicate nexthops · f3ef3b80
      David Ahern authored
      
      [ Upstream commit f06b7549 ]
      
      Lennert reported a failure to add different mpls encaps in a multipath
      route:
      
        $ ip -6 route add 1234::/16 \
              nexthop encap mpls 10 via fe80::1 dev ens3 \
              nexthop encap mpls 20 via fe80::1 dev ens3
        RTNETLINK answers: File exists
      
      The problem is that the duplicate nexthop detection does not compare
      lwtunnel configuration. Add it.
      
      Fixes: 19e42e45 ("ipv6: support for fib route lwtunnel encap attributes")
      Signed-off-by: default avatarDavid Ahern <dsahern@gmail.com>
      Reported-by: default avatarJoão Taveira Araújo <joao.taveira@gmail.com>
      Reported-by: default avatarLennert Buytenhek <buytenh@wantstofly.org>
      Acked-by: default avatarRoopa Prabhu <roopa@cumulusnetworks.com>
      Tested-by: default avatarLennert Buytenhek <buytenh@wantstofly.org>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      f3ef3b80
    • Derek Chickles's avatar
      liquidio: fix bug in soft reset failure detection · 1e28d50c
      Derek Chickles authored
      
      [ Upstream commit 05a6b4ca ]
      
      The code that detects a failed soft reset of Octeon is comparing the wrong
      value against the reset value of the Octeon SLI_SCRATCH_1 register,
      resulting in an inability to detect a soft reset failure.  Fix it by using
      the correct value in the comparison, which is any non-zero value.
      
      Fixes: f21fb3ed ("Add support of Cavium Liquidio ethernet adapters")
      Fixes: c0eab5b3 ("liquidio: CN23XX firmware download")
      Signed-off-by: default avatarDerek Chickles <derek.chickles@cavium.com>
      Signed-off-by: default avatarSatanand Burla <satananda.burla@cavium.com>
      Signed-off-by: default avatarRaghu Vatsavayi <raghu.vatsavayi@cavium.com>
      Signed-off-by: default avatarFelix Manlunas <felix.manlunas@cavium.com>
      Reviewed-by: default avatarLeon Romanovsky <leonro@mellanox.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      1e28d50c
    • Alban Browaeys's avatar
      net: core: Fix slab-out-of-bounds in netdev_stats_to_stats64 · 1fe829df
      Alban Browaeys authored
      
      [ Upstream commit 9af9959e ]
      
      commit 9256645a ("net/core: relax BUILD_BUG_ON in
      netdev_stats_to_stats64") made an attempt to read beyond
      the size of the source a possibility.
      
      Fix to only copy src size to dest. As dest might be bigger than src.
      
       ==================================================================
       BUG: KASAN: slab-out-of-bounds in netdev_stats_to_stats64+0xe/0x30 at addr ffff8801be248b20
       Read of size 192 by task VBoxNetAdpCtl/6734
       CPU: 1 PID: 6734 Comm: VBoxNetAdpCtl Tainted: G           O    4.11.4prahal+intel+ #118
       Hardware name: LENOVO 20CDCTO1WW/20CDCTO1WW, BIOS GQET52WW (1.32 ) 05/04/2017
       Call Trace:
        dump_stack+0x63/0x86
        kasan_object_err+0x1c/0x70
        kasan_report+0x270/0x520
        ? netdev_stats_to_stats64+0xe/0x30
        ? sched_clock_cpu+0x1b/0x190
        ? __module_address+0x3e/0x3b0
        ? unwind_next_frame+0x1ea/0xb00
        check_memory_region+0x13c/0x1a0
        memcpy+0x23/0x50
        netdev_stats_to_stats64+0xe/0x30
        dev_get_stats+0x1b9/0x230
        rtnl_fill_stats+0x44/0xc00
        ? nla_put+0xc6/0x130
        rtnl_fill_ifinfo+0xe9e/0x3700
        ? rtnl_fill_vfinfo+0xde0/0xde0
        ? sched_clock+0x9/0x10
        ? sched_clock+0x9/0x10
        ? sched_clock_local+0x120/0x130
        ? __module_address+0x3e/0x3b0
        ? unwind_next_frame+0x1ea/0xb00
        ? sched_clock+0x9/0x10
        ? sched_clock+0x9/0x10
        ? sched_clock_cpu+0x1b/0x190
        ? VBoxNetAdpLinuxIOCtlUnlocked+0x14b/0x280 [vboxnetadp]
        ? depot_save_stack+0x1d8/0x4a0
        ? depot_save_stack+0x34f/0x4a0
        ? depot_save_stack+0x34f/0x4a0
        ? save_stack+0xb1/0xd0
        ? save_stack_trace+0x16/0x20
        ? save_stack+0x46/0xd0
        ? kasan_slab_alloc+0x12/0x20
        ? __kmalloc_node_track_caller+0x10d/0x350
        ? __kmalloc_reserve.isra.36+0x2c/0xc0
        ? __alloc_skb+0xd0/0x560
        ? rtmsg_ifinfo_build_skb+0x61/0x120
        ? rtmsg_ifinfo.part.25+0x16/0xb0
        ? rtmsg_ifinfo+0x47/0x70
        ? register_netdev+0x15/0x30
        ? vboxNetAdpOsCreate+0xc0/0x1c0 [vboxnetadp]
        ? vboxNetAdpCreate+0x210/0x400 [vboxnetadp]
        ? VBoxNetAdpLinuxIOCtlUnlocked+0x14b/0x280 [vboxnetadp]
        ? do_vfs_ioctl+0x17f/0xff0
        ? SyS_ioctl+0x74/0x80
        ? do_syscall_64+0x182/0x390
        ? __alloc_skb+0xd0/0x560
        ? __alloc_skb+0xd0/0x560
        ? save_stack_trace+0x16/0x20
        ? init_object+0x64/0xa0
        ? ___slab_alloc+0x1ae/0x5c0
        ? ___slab_alloc+0x1ae/0x5c0
        ? __alloc_skb+0xd0/0x560
        ? sched_clock+0x9/0x10
        ? kasan_unpoison_shadow+0x35/0x50
        ? kasan_kmalloc+0xad/0xe0
        ? __kmalloc_node_track_caller+0x246/0x350
        ? __alloc_skb+0xd0/0x560
        ? kasan_unpoison_shadow+0x35/0x50
        ? memset+0x31/0x40
        ? __alloc_skb+0x31f/0x560
        ? napi_consume_skb+0x320/0x320
        ? br_get_link_af_size_filtered+0xb7/0x120 [bridge]
        ? if_nlmsg_size+0x440/0x630
        rtmsg_ifinfo_build_skb+0x83/0x120
        rtmsg_ifinfo.part.25+0x16/0xb0
        rtmsg_ifinfo+0x47/0x70
        register_netdevice+0xa2b/0xe50
        ? __kmalloc+0x171/0x2d0
        ? netdev_change_features+0x80/0x80
        register_netdev+0x15/0x30
        vboxNetAdpOsCreate+0xc0/0x1c0 [vboxnetadp]
        vboxNetAdpCreate+0x210/0x400 [vboxnetadp]
        ? vboxNetAdpComposeMACAddress+0x1d0/0x1d0 [vboxnetadp]
        ? kasan_check_write+0x14/0x20
        VBoxNetAdpLinuxIOCtlUnlocked+0x14b/0x280 [vboxnetadp]
        ? VBoxNetAdpLinuxOpen+0x20/0x20 [vboxnetadp]
        ? lock_acquire+0x11c/0x270
        ? __audit_syscall_entry+0x2fb/0x660
        do_vfs_ioctl+0x17f/0xff0
        ? __audit_syscall_entry+0x2fb/0x660
        ? ioctl_preallocate+0x1d0/0x1d0
        ? __audit_syscall_entry+0x2fb/0x660
        ? kmem_cache_free+0xb2/0x250
        ? syscall_trace_enter+0x537/0xd00
        ? exit_to_usermode_loop+0x100/0x100
        SyS_ioctl+0x74/0x80
        ? do_sys_open+0x350/0x350
        ? do_vfs_ioctl+0xff0/0xff0
        do_syscall_64+0x182/0x390
        entry_SYSCALL64_slow_path+0x25/0x25
       RIP: 0033:0x7f7e39a1ae07
       RSP: 002b:00007ffc6f04c6d8 EFLAGS: 00000206 ORIG_RAX: 0000000000000010
       RAX: ffffffffffffffda RBX: 00007ffc6f04c730 RCX: 00007f7e39a1ae07
       RDX: 00007ffc6f04c730 RSI: 00000000c0207601 RDI: 0000000000000007
       RBP: 00007ffc6f04c700 R08: 00007ffc6f04c780 R09: 0000000000000008
       R10: 0000000000000541 R11: 0000000000000206 R12: 0000000000000007
       R13: 00000000c0207601 R14: 00007ffc6f04c730 R15: 0000000000000012
       Object at ffff8801be248008, in cache kmalloc-4096 size: 4096
       Allocated:
       PID = 6734
        save_stack_trace+0x16/0x20
        save_stack+0x46/0xd0
        kasan_kmalloc+0xad/0xe0
        __kmalloc+0x171/0x2d0
        alloc_netdev_mqs+0x8a7/0xbe0
        vboxNetAdpOsCreate+0x65/0x1c0 [vboxnetadp]
        vboxNetAdpCreate+0x210/0x400 [vboxnetadp]
        VBoxNetAdpLinuxIOCtlUnlocked+0x14b/0x280 [vboxnetadp]
        do_vfs_ioctl+0x17f/0xff0
        SyS_ioctl+0x74/0x80
        do_syscall_64+0x182/0x390
        return_from_SYSCALL_64+0x0/0x6a
       Freed:
       PID = 5600
        save_stack_trace+0x16/0x20
        save_stack+0x46/0xd0
        kasan_slab_free+0x73/0xc0
        kfree+0xe4/0x220
        kvfree+0x25/0x30
        single_release+0x74/0xb0
        __fput+0x265/0x6b0
        ____fput+0x9/0x10
        task_work_run+0xd5/0x150
        exit_to_usermode_loop+0xe2/0x100
        do_syscall_64+0x26c/0x390
        return_from_SYSCALL_64+0x0/0x6a
       Memory state around the buggy address:
        ffff8801be248a80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
        ffff8801be248b00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
       >ffff8801be248b80: 00 00 00 00 00 00 00 00 00 00 00 07 fc fc fc fc
                                                           ^
        ffff8801be248c00: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
        ffff8801be248c80: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
       ==================================================================
      Signed-off-by: default avatarAlban Browaeys <alban.browaeys@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      1fe829df
    • Jiri Benc's avatar
      geneve: fix hlist corruption · 40d16256
      Jiri Benc authored
      
      [ Upstream commit 4b4c21fa ]
      
      It's not a good idea to add the same hlist_node to two different hash lists.
      This leads to various hard to debug memory corruptions.
      
      Fixes: 8ed66f0e ("geneve: implement support for IPv6-based tunnels")
      Cc: John W. Linville <linville@tuxdriver.com>
      Signed-off-by: default avatarJiri Benc <jbenc@redhat.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      40d16256
    • Jiri Benc's avatar
      vxlan: fix hlist corruption · b7b3a67e
      Jiri Benc authored
      
      [ Upstream commit 69e76661 ]
      
      It's not a good idea to add the same hlist_node to two different hash lists.
      This leads to various hard to debug memory corruptions.
      
      Fixes: b1be00a6 ("vxlan: support both IPv4 and IPv6 sockets in a single vxlan device")
      Signed-off-by: default avatarJiri Benc <jbenc@redhat.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      b7b3a67e
    • Sabrina Dubroca's avatar
      ipv6: dad: don't remove dynamic addresses if link is down · 55db4f9f
      Sabrina Dubroca authored
      
      [ Upstream commit ec8add2a ]
      
      Currently, when the link for $DEV is down, this command succeeds but the
      address is removed immediately by DAD (1):
      
          ip addr add 1111::12/64 dev $DEV valid_lft 3600 preferred_lft 1800
      
      In the same situation, this will succeed and not remove the address (2):
      
          ip addr add 1111::12/64 dev $DEV
          ip addr change 1111::12/64 dev $DEV valid_lft 3600 preferred_lft 1800
      
      The comment in addrconf_dad_begin() when !IF_READY makes it look like
      this is the intended behavior, but doesn't explain why:
      
           * If the device is not ready:
           * - keep it tentative if it is a permanent address.
           * - otherwise, kill it.
      
      We clearly cannot prevent userspace from doing (2), but we can make (1)
      work consistently with (2).
      
      addrconf_dad_stop() is only called in two cases: if DAD failed, or to
      skip DAD when the link is down. In that second case, the fix is to avoid
      deleting the address, like we already do for permanent addresses.
      
      Fixes: 3c21edbd ("[IPV6]: Defer IPv6 device initialization until the link becomes ready.")
      Signed-off-by: default avatarSabrina Dubroca <sd@queasysnail.net>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      55db4f9f
    • Gal Pressman's avatar
      net/mlx5e: Fix TX carrier errors report in get stats ndo · 419c4e86
      Gal Pressman authored
      
      [ Upstream commit 8ff93de7 ]
      
      Symbol error during carrier counter from PPCNT was mistakenly reported as
      TX carrier errors in get_stats ndo, although it's an RX counter.
      
      Fixes: 269e6b3a ("net/mlx5e: Report additional error statistics in get stats ndo")
      Signed-off-by: default avatarGal Pressman <galp@mellanox.com>
      Signed-off-by: default avatarSaeed Mahameed <saeedm@mellanox.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      419c4e86